  1. MariaDB Server
  2. MDEV-28378

galera.galera_as_slave_ctas fails with a timeout

Details

    Description

      The test fails for me with a timeout:

      CURRENT_TEST: galera.galera_as_slave_ctas
      --- /mariadb/10.4/mysql-test/suite/galera/r/galera_as_slave_ctas.result	2022-01-24 10:38:52.049546080 +0200
      +++ /mariadb/10.4/mysql-test/suite/galera/r/galera_as_slave_ctas.reject	2022-04-21 11:22:43.595270214 +0300
      @@ -13,7 +13,32 @@
       CREATE TABLE source (f1 INTEGER PRIMARY KEY) ENGINE=InnoDB;
       CREATE TABLE target AS SELECT * FROM source;
       connection node_1;
      +Timeout in wait_condition.inc for SELECT COUNT(*) = 1 FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'target';
      +Id	User	Host	db	Command	Time	State	Info	Progress
      +1	system user		NULL	Sleep	36	wsrep aborter idle	NULL	0.000
      +2	system user		NULL	Sleep	36	Closing tables	NULL	0.000
      +3	system user		NULL	Daemon	NULL	InnoDB purge coordinator	NULL	0.000
      +4	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
      +5	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
      +6	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
      +7	system user		NULL	Daemon	NULL	InnoDB shutdown handler	NULL	0.000
      +17	root	localhost	test	Sleep	32		NULL	0.000
      +18	root	localhost:52110	test	Query	0	Init	show full processlist	0.000
      +19	system user		NULL	Slave_IO	30	Waiting for master to send event	NULL	0.000
      +20	system user		test	Slave_SQL	30	Slave has read all relay log; waiting for the slave I/O thread to update it	CREATE TABLE `target` (
      +  `f1` int(11) NOT NULL
      +)	0.000
       connection node_2;
      +Timeout in wait_condition.inc for SELECT COUNT(*) = 1 FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'target';
      +Id	User	Host	db	Command	Time	State	Info	Progress
      +1	system user		NULL	Sleep	65	wsrep aborter idle	NULL	0.000
      +2	system user		NULL	Sleep	60	After apply log event	NULL	0.000
      +4	system user		NULL	Daemon	NULL	InnoDB purge coordinator	NULL	0.000
      +5	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
      +6	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
      +7	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
      +8	system user		NULL	Daemon	NULL	InnoDB shutdown handler	NULL	0.000
      +16	root	localhost:41988	test	Query	0	Init	show full processlist	0.000
       connection node_3;
       DROP TABLE target;
       INSERT INTO source VALUES(1);
       
      mysqltest: Result length mismatch
      

      The table target will not appear in INFORMATION_SCHEMA.TABLES even after a delay of several seconds. I double-checked it by editing the test.

          Activity

            janlindstrom Jan Lindström added a comment - I could reproduce the timeout after ALTER TABLE mysql.gtid_slave_pos ENGINE=InnoDB; was added to the test case, because we need it.

            janlindstrom Jan Lindström added a comment - denis.protivensky I could not fully figure this out; I do not know whether some earlier change to CTAS is causing the problem. The original timeout can be fixed in wsrep_peak_event (slave.cc) by adding

            +    // Make sure that future position is not larger than EOF
            +    // because e.g. in CTAS where selected table is empty
            +    // there will be no row events.
            +    if (future_pos >= rgi->rli->cur_log->end_of_file)
            +      break;
            +

            before we seek to the future position. However, this is not enough. There are two cases: (1) the mysql.gtid_slave_pos engine is InnoDB (I think this is the default in releases) and (2) it is MyISAM (not sure if we need to fix this). I have attached a test case to this MDEV.

            denis.protivensky Denis Protivensky added a comment - janlindstrom You were close.

            Indeed, the hang happens because we tried to read an event past the end of the log. But the underlying reason is that the QUERY_EVENT we were looking for was not there; there was only an XID_EVENT, which we skipped, and then we hung. I think that is okay, as we have already applied part of the event, and for an empty CTAS there is nothing left but the XID, which we peeked at and skipped.

            The other important aspect is that this fix requires MDEV-32633 to be merged first.
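The failure mode discussed in the two comments above can be illustrated with a small standalone sketch. It is not the server code: `SimLog`, `Event`, `EventType` and `peek_for_query` are hypothetical names invented for this illustration, and the log is just an in-memory vector. It only shows the idea of stopping a forward peek once the future position reaches the end of the log, so that a log holding only an XID event after the already-applied part is skipped without reading past EOF.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical event kinds, loosely named after the binlog event types
// mentioned in the comments. This is not the server's real enum.
enum class EventType { Gtid, Query, Xid };

struct Event {
  EventType type;
  std::size_t length;  // bytes the event occupies in the simulated log
};

// Simulated relay log: events are laid out back to back.
struct SimLog {
  std::vector<Event> events;

  // Total size of the simulated log in bytes.
  std::size_t end_of_file() const {
    std::size_t total = 0;
    for (const Event &e : events) total += e.length;
    return total;
  }

  // Returns the event that starts exactly at pos, or nullptr if none does.
  const Event *read_at(std::size_t pos) const {
    std::size_t offset = 0;
    for (const Event &e : events) {
      if (offset == pos) return &e;
      offset += e.length;
    }
    return nullptr;
  }
};

// Peeks forward from start_pos looking for a Query event and skips
// everything else. Returns true if a Query event is found before EOF.
bool peek_for_query(const SimLog &log, std::size_t start_pos) {
  std::size_t future_pos = start_pos;
  for (;;) {
    // Guard in the spirit of the proposed patch: never peek at or past EOF.
    // In this sketch the loop simply ends; per the discussion above, the
    // real reader hung at this point.
    if (future_pos >= log.end_of_file()) return false;

    const Event *ev = log.read_at(future_pos);
    if (ev == nullptr) return false;  // position is not an event boundary
    if (ev->type == EventType::Query) return true;
    future_pos += ev->length;  // skip non-Query events such as Xid
  }
}
```

With a simulated log of a 10-byte GTID event followed by a 5-byte XID event, `peek_for_query(log, 10)` skips the XID, reaches the end of the log and returns false. Whether the real fix should break out of the loop at this point or handle the XID differently is exactly what the comments above discuss.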

            janlindstrom Jan Lindström added a comment - denis.protivensky I see. My later problem was that the nodes' GTIDs did not remain consistent, both for CTAS from an empty table and for CTAS from a table with rows.

            People

              sysprg Julius Goryavsky
              marko Marko Mäkelä