[MDEV-4726] Race in mysql-test/suite/rpl/t/rpl_gtid_stop_start.test Created: 2013-06-28  Updated: 2014-02-10  Resolved: 2014-02-10

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 10.0.3
Fix Version/s: 10.0.9

Type: Bug Priority: Major
Reporter: Jeremy Cole Assignee: Kristian Nielsen
Resolution: Fixed Votes: 0
Labels: None

Attachments: Text File mdev_4726.patch    
Issue Links:
Blocks
is blocked by MDEV-4984 Implement MASTER_GTID_WAIT() Closed

 Description   

There's a race in mysql-test/suite/rpl/t/rpl_gtid_stop_start.test as follows:

The rpl_gtid_stop_start test used SELECT COUNT(*) as a wait_condition when waiting for rows to appear from the replication stream. Since the table it is waiting on is MyISAM, rows appearing there do not guarantee in any way that the GTID information will have been updated by the slave thread (and can't guarantee that). This causes a race in the test which is especially exacerbated by making the gtid_slave_pos table InnoDB, since then the COMMIT that is done as part of recording the GTID is bound to take some tens of milliseconds.



 Comments   
Comment by Jeremy Cole [ 2013-06-28 ]

I attached a patch which changes the wait_condition to wait on the proper GTID to appear instead of waiting on a row count. I believe this to be the only actual way to wait without a race when using MyISAM for gtid_slave_pos; nonetheless this is safe and effective anyway.

Comment by Jeremy Cole [ 2013-06-28 ]

We've had another failing test this morning which is also due to using SELECT COUNT as a wait_condition. Really, all of the tests in the rpl suite should be reviewed to not use SELECT COUNT as a wait_condition, lest they may all flake eventually.

Comment by Elena Stepanova [ 2013-06-28 ]

I'm not quite sure Kristian meant it to be a MyISAM table, maybe it just happened this way. I'll check with him before modifying tests.

Comment by Elena Stepanova [ 2013-06-28 ]

Hi Kristian,

See the above – did you mean the tables to be MyISAM?
If you did, I can modify the tests as Jeremy suggested. If you didn't, I can add explicit engine=InnoDB in all cases when there is no engine in the DDL.

Comment by Kristian Nielsen [ 2013-06-29 ]

Thanks Jeremy for the nice analysis!

I do not explicitly need the table to be MyISAM, on the other hand I did try
in the various test cases to use a mix of MyISAM and InnoDB, to improve
coverage of different combinations.

The wait_condition.inc with COUNT was anyway meant as a workaround for not
yet having proper MASTER_GTID_WAIT(), so would be good to fix.

The best solution is probably to implement include/save_master_gtid_pos.inc
and include/sync_with_master_gtid.inc, doing basically what Jeremy
suggests. And then rewrite include/sync_with_master_gtid.inc when
MASTER_GTID_WAIT() is implemented to use it. If noone beats me to it I can
look into that after the parallel replication / MDEV-4506 stuff (which
currently has priority).

Comment by Kristian Nielsen [ 2014-02-10 ]

Should be fixed now. The synchronisation is now made with MASTER_GTID_WAIT() in the places where SELECT COUNT=N was used before. Let me know if I missed some places.

Generated at Thu Feb 08 06:58:40 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.