Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-31177

SHOW SLAVE STATUS Last_SQL_Errno Race Condition on Errored Slave Restart

    XMLWordPrintable

Details

    Description

      From knielsen:

      There is a small window between when the SQL thread reports itself running, and when it clears any error set by a previous error stop. Thus a race exists where include/rpl_end.inc can see the previous error being set, causing it to fail the test.

      This can be observed by the following test failure with rpl.rpl_xa_prepare_gtid_fail:

      https://buildbot.mariadb.org/#/builders/208/builds/12453

      10.6 fe89df426

      rpl.rpl_xa_prepare_gtid_fail 'innodb,row' w17 [ fail ]
              Test ended at 2023-05-02 10:11:23
       
      CURRENT_TEST: rpl.rpl_xa_prepare_gtid_fail
      mysqltest: In included file "./include/check_slave_param.inc": 
      included from ./include/check_slave_no_error.inc at line 24:
      included from ./include/rpl_end.inc at line 74:
      included from /home/buildbot/amd64-ubuntu-1804-bigtest/build/mysql-test/suite/rpl/t/rpl_xa_prepare_gtid_fail.test at line 106:
      At line 31: Wrong value for slave parameter Last_SQL_Errno
       
      The result from queries just before the failure was:
      < snip >
      master-bin.000001	590	Table_map	1	636	table_id: 32 (test.t1)
      master-bin.000001	636	Write_rows_v1	1	678	table_id: 32 flags: STMT_END_F
      master-bin.000001	678	Xid	1	709	COMMIT /* xid=154 */
      master-bin.000001	709	Gtid	1	754	XA START X'31',X'',1 GTID 0-1-100
      master-bin.000001	754	Annotate_rows	1	807	update t1 set b=b+10 where a=1
      master-bin.000001	807	Table_map	1	853	table_id: 32 (test.t1)
      master-bin.000001	853	Update_rows_v1	1	905	table_id: 32 flags: STMT_END_F
      master-bin.000001	905	Query	1	990	XA END X'31',X'',1
      master-bin.000001	990	XA_prepare	1	1027	XA PREPARE X'31',X'',1
      master-bin.000001	1027	Gtid	1	1070	GTID 0-1-101
      master-bin.000001	1070	Query	1	1158	XA COMMIT X'31',X'',1
      master-bin.000001	1158	Gtid	1	1200	GTID 0-1-102
      master-bin.000001	1200	Query	1	1317	use `test`; DROP TABLE `t1` /* generated by server */
       
      **** SHOW RELAYLOG EVENTS on server_1 ****
      relaylog_name = 'No such row'
      SHOW RELAYLOG EVENTS IN 'No such row';
      Log_name	Pos	Event_type	Server_id	End_log_pos	Info
      connection server_2;
      Wrong value for Last_SQL_Errno. Expected '0', got '1942'
      

      The slave error log shows an abnormal server shutdown with "initiated by: unknown":

      2023-05-02 10:11:23 0 [Note] /home/buildbot/amd64-ubuntu-1804-bigtest/build/sql/mariadbd (initiated by: unknown): Normal shutdown
      2023-05-02 10:11:23 24 [Note] Error reading relay log event: slave SQL thread was killed
      2023-05-02 10:11:23 24 [Note] Slave SQL thread exiting, replication stopped in log 'master-bin.000001' at position 1317; GTID position '0-1-102', master: 127.0.0.1:16300
      2023-05-02 10:11:23 23 [Note] Slave I/O thread killed during or after a reconnect done to recover from failed read
      2023-05-02 10:11:23 23 [Note] Slave I/O thread exiting, read up to log 'master-bin.000001', position 1317; GTID position 0-1-102, master 127.0.0.1:16300
      2023-05-02 10:11:23 0 [Note] InnoDB: FTS optimize thread exiting.
      2023-05-02 10:11:23 0 [Note] InnoDB: Starting shutdown...
      

      Attachments

        Issue Links

          Activity

            People

              bnestere Brandon Nesterenko
              angelique.sklavounos Angelique Sklavounos (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.