Details
Description
If the workers of a parallel replica are busy (potentially with long queues) but the SQL thread has no events left to distribute, the SQL thread goes idle. The next event that arrives from the primary then updates last_master_timestamp (LMT) with its own timestamp, even though the workers may still be far behind.
The proposed fix is for the SQL thread to additionally check whether there are uncommitted events. That is, we should add an atomic counter (displayable as a new system status variable) that the SQL thread increments when it reads an event and that the workers decrement on commit. The SQL thread should update last_master_timestamp with the MDEV-29639 logic only if this counter is 0.
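The counter gate could look roughly like the following. This is a minimal standalone sketch, not the server code: `ReplicaLagState`, `on_sql_thread_read`, and `on_worker_commit` are hypothetical names, the MDEV-29639 conditions are reduced to a plain store, and any timestamp updates done by the workers themselves are left out.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical illustration only; none of these names exist in the server.
struct ReplicaLagState {
  // Events read by the SQL thread that no worker has committed yet.
  std::atomic<std::uint64_t> uncommitted_events{0};
  // Stand-in for last_master_timestamp (seconds since the epoch).
  std::atomic<std::int64_t> last_master_timestamp{0};

  // SQL thread: called for each event it reads from the relay log.
  void on_sql_thread_read(std::int64_t event_timestamp) {
    // fetch_add returns the value *before* the increment, i.e. the number
    // of events that were still uncommitted when this one arrived.
    std::uint64_t pending_before = uncommitted_events.fetch_add(1);
    if (pending_before == 0) {
      // Workers are fully caught up, so this event's timestamp is a fair
      // baseline. (Assumption: the real MDEV-29639 checks would go here.)
      last_master_timestamp.store(event_timestamp);
    }
    // Otherwise leave the timestamp alone: workers are still behind.
  }

  // Worker thread: called once an event's transaction has committed.
  void on_worker_commit() {
    std::uint64_t before = uncommitted_events.fetch_sub(1);
    assert(before > 0 && "commit without a matching read");
    (void)before;  // silence unused-variable warnings when NDEBUG is set
  }
};
```

With this shape, an event that arrives while earlier events are still queued or executing does not move the timestamp forward; only an event read when the counter is 0 does. Exposing the counter as a status variable would then amount to reading `uncommitted_events`.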
Issue Links
- causes
  - MDEV-31749 New test rpl.rpl_parallel_sbm in bb-10.4-MDEV-30619 sporadically fails in various locations (prepatch: lines 100, 177, 184) (postpatch_1: lines 180, 187) [Closed]
- includes
  - MDEV-31749 New test rpl.rpl_parallel_sbm in bb-10.4-MDEV-30619 sporadically fails in various locations (prepatch: lines 100, 177, 184) (postpatch_1: lines 180, 187) [Closed]
- is caused by
  - MDEV-29639 Seconds_Behind_Master is incorrect for Delayed, Parallel Replicas [Closed]
- relates to
  - MDEV-23021 rpl.rpl_parallel_optimistic_until fails on BB with various pattern [Closed]
  - MDEV-30608 rpl.rpl_delayed_parallel_slave_sbm sometimes fails with Seconds_Behind_Master should not have used second transaction timestamp [Closed]
  - MDEV-31895 Report a Replica's Time Difference with its Primary [Closed]
  - MDEV-32265 seconds_behind_master is inaccurate for Delayed replication [Closed]