Details
- Type: Bug
- Status: Open
- Priority: Minor
- Resolution: Unresolved
- Affects Version/s: 10.4(EOL), 10.5, 10.6, 10.11, 11.0(EOL), 11.1(EOL), 11.2, 11.3(EOL), 11.4
- Fix Version/s: None
Description
When a replica is configured with CHANGE MASTER TO IGNORE_DOMAIN_IDS=(ign_gtids) and a connection uses master_pos_wait(), which waits on binlog coordinates rather than GTIDs, transactions that the slave has already ignored can be re-executed if a previously ignored domain is later removed from IGNORE_DOMAIN_IDS. That is, if a connection issues master_pos_wait() to wait on the coordinates of a transaction that is ignored by domain ID, and the slave is then immediately stopped (STOP SLAVE alone is enough; the server does not need to be restarted), the replica's gtid_slave_pos state may or may not contain the GTID of the ignored transaction. As a result, if that domain ID is removed from the IGNORE_DOMAIN_IDS list and the slave is restarted, it may or may not fetch and execute the previously ignored transaction.
This is due to an inconsistency in how the replication state is set when an ignored transaction is encountered. The IO thread updates the binlog coordinates state immediately upon seeing an ignored transaction, whereas the GTID state is updated by the SQL thread. If the slave is stopped between these two updates, the coordinates and the GTID state become out of sync.
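The window described above can be illustrated with a small toy model. This is only a sketch and not MariaDB's implementation: the class `ToyReplica`, its methods, and the `run` helper are hypothetical names, and the model assumes that a GTID update still pending when the slave stops is simply lost.

```python
from dataclasses import dataclass, field


@dataclass
class ToyReplica:
    """Hypothetical toy model of a replica; not MariaDB code."""
    ignored_domains: set
    coord_pos: int = 0  # binlog coordinate state, the value master_pos_wait() waits on
    gtid_slave_pos: dict = field(default_factory=dict)  # domain_id -> last seq_no
    pending: list = field(default_factory=list)  # ignored GTIDs not yet recorded

    def io_thread_receive(self, domain, seq_no, end_pos):
        # The IO thread advances the coordinates as soon as it sees an
        # ignored transaction; the GTID update is left for the SQL thread.
        if domain in self.ignored_domains:
            self.coord_pos = end_pos
            self.pending.append((domain, seq_no))

    def sql_thread_step(self):
        # The SQL thread records the GTIDs of ignored transactions.
        while self.pending:
            domain, seq_no = self.pending.pop(0)
            self.gtid_slave_pos[domain] = seq_no

    def stop_slave(self):
        # Assumption of this model: pending GTID updates are lost on stop.
        self.pending.clear()

    def would_refetch(self, domain, seq_no):
        # After a restart, anything beyond gtid_slave_pos is requested again.
        return self.gtid_slave_pos.get(domain, 0) < seq_no


def run(stop_before_sql_update):
    replica = ToyReplica(ignored_domains={1})
    replica.io_thread_receive(domain=1, seq_no=5, end_pos=1000)
    wait_satisfied = replica.coord_pos >= 1000  # a coordinate-based wait returns here
    if not stop_before_sql_update:
        replica.sql_thread_step()
    replica.stop_slave()
    replica.ignored_domains.discard(1)  # domain removed from the ignore list
    return wait_satisfied, replica.would_refetch(1, 5)
```

In this model, `run(True)` and `run(False)` both satisfy the coordinate-based wait, but only the run that stops before the SQL thread step would fetch the ignored transaction again after the domain is removed from the ignore list.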
This is highlighted by the test failures reported in MDEV-10684 and MDEV-14357.
Issue Links
- causes
  - MDEV-10684 rpl.rpl_domain_id_filter_restart fails in buildbot (Closed)
  - MDEV-14357 rpl.rpl_domain_id_filter_io_crash failed in buildbot with wrong result (Closed)