Details
Description
If the workers of a parallel replica are busy (potentially with long queues) but the SQL thread has no events left to distribute, the SQL thread goes idle. The next event that arrives from the primary then updates last_master_timestamp (LMT) with its own timestamp, even though the workers may still be quite far behind.
The proposed fix is for the SQL thread to additionally check whether there are uncommitted events. That is, we should add an atomic counter (displayable as a new system status variable) that the SQL thread increments on reads and that the workers decrement on commits. The SQL thread should update last_master_timestamp with the MDEV-29639 logic only if this counter is 0.
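To illustrate the counter idea, here is a minimal, standalone C++ sketch. It is not the server code: the struct and function names (`ReplicaLagState`, `on_sql_thread_read`, `on_worker_commit`) are hypothetical, and it assumes that workers publish the timestamp of each event they commit. Only the increment-on-read, decrement-on-commit, and update-only-when-0 rules come from the description above.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical model of the proposed fix; names do not match the server source.
struct ReplicaLagState {
  // Events the SQL thread has read that no worker has committed yet.
  std::atomic<std::uint64_t> pending_events{0};
  // Timestamp used to compute Seconds_Behind_Master.
  std::atomic<std::int64_t> last_master_timestamp{0};

  // SQL (driver) thread: called for each event read and handed to a worker.
  void on_sql_thread_read(std::int64_t event_ts) {
    // fetch_add returns the previous value. 0 means the workers had nothing
    // outstanding, so the replica was caught up and the new event's
    // timestamp is a valid reference point.
    if (pending_events.fetch_add(1, std::memory_order_acq_rel) == 0)
      last_master_timestamp.store(event_ts, std::memory_order_release);
  }

  // Worker thread: called when the event has been committed.
  // Assumption: the worker publishes the committed event's timestamp.
  void on_worker_commit(std::int64_t event_ts) {
    last_master_timestamp.store(event_ts, std::memory_order_release);
    pending_events.fetch_sub(1, std::memory_order_acq_rel);
  }
};
```

With this shape, an event that arrives while the workers still have queued work leaves last_master_timestamp untouched. The value only moves forward once a worker commits, or once the counter has drained back to 0.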
Issue Links
- causes
  - MDEV-31749 New test rpl.rpl_parallel_sbm in bb-10.4-MDEV-30619 sporadically fails in various locations (prepatch: lines 100, 177, 184) (postpatch_1: lines 180, 187) (Closed)
- includes
  - MDEV-31749 New test rpl.rpl_parallel_sbm in bb-10.4-MDEV-30619 sporadically fails in various locations (prepatch: lines 100, 177, 184) (postpatch_1: lines 180, 187) (Closed)
- is caused by
  - MDEV-29639 Seconds_Behind_Master is incorrect for Delayed, Parallel Replicas (Closed)
- relates to
  - MDEV-23021 rpl.rpl_parallel_optimistic_until fails on BB with various pattern (Closed)
  - MDEV-30608 rpl.rpl_delayed_parallel_slave_sbm sometimes fails with Seconds_Behind_Master should not have used second transaction timestamp (Closed)
  - MDEV-31895 Report a Replica's Time Difference with its Primary (Closed)
  - MDEV-32265 seconds_behind_master is inaccurate for Delayed replication (Closed)
  - MDEV-17516 Replication lag issue using parallel replication (Stalled)
Activity
Field | Original Value | New Value |
---|---|---|
Link | This issue is caused by |
Link | This issue relates to |
Fix Version/s | 10.7 [ 24805 ] |
Fix Version/s | 10.3 [ 22126 ] |
Fix Version/s | 10.8 [ 26121 ] |
Status | Open [ 1 ] | In Progress [ 3 ] |
Assignee | Brandon Nesterenko [ JIRAUSER48702 ] | Andrei Elkin [ elkin ] |
Status | In Progress [ 3 ] | In Review [ 10002 ] |
Labels | CS0610910 |
Link | This issue blocks TODO-4054 [ TODO-4054 ] |
Status | In Review [ 10002 ] | In Testing [ 10301 ] |
Labels | CS0610910 |
Assignee | Andrei Elkin [ elkin ] | Roel Van de Paar [ roel ] |
Description | If the workers of a parallel replica are busy (potentially with long queues), but the SQL thread has no events left to distribute (so it goes idle). Then the next event that comes from the primary will update lmt with its timestamp, even if the workers may be quite far behind. Proposed fix is for the SQL thread to additionally check if there are uncommitted events. That is, we should add an atomic counter (displayable as a new system status variable), which the SQL thread increments on reads, and that the workers decrement on commits. last_master_timestamp should only be updated by the SQL thread with the | If the workers of a parallel replica are busy (potentially with long queues), but the SQL thread has no events left to distribute (so it goes idle). Then the next event that comes from the primary will update LMT with its timestamp, even if the workers may be quite far behind. Proposed fix is for the SQL thread to additionally check if there are uncommitted events. That is, we should add an atomic counter (displayable as a new system status variable), which the SQL thread increments on reads, and that the workers decrement on commits. last_master_timestamp should only be updated by the SQL thread with the |
Link | This issue causes |
Assignee | Roel Van de Paar [ roel ] | Brandon Nesterenko [ JIRAUSER48702 ] |
Assignee | Brandon Nesterenko [ JIRAUSER48702 ] | Roel Van de Paar [ roel ] |
Assignee | Roel Van de Paar [ roel ] | Andrei Elkin [ elkin ] |
Status | In Testing [ 10301 ] | Stalled [ 10000 ] |
Link | This issue relates to |
Fix Version/s | 10.4.31 [ 29010 ] | |
Fix Version/s | 10.5.22 [ 29011 ] | |
Fix Version/s | 10.6.15 [ 29013 ] | |
Fix Version/s | 10.9.8 [ 29015 ] | |
Fix Version/s | 10.10.6 [ 29017 ] | |
Fix Version/s | 10.11.5 [ 29019 ] | |
Fix Version/s | 11.0.3 [ 28920 ] | |
Fix Version/s | 11.1.2 [ 28921 ] | |
Fix Version/s | 11.2.1 [ 29034 ] | |
Fix Version/s | 10.8.8 [ 28518 ] | |
Fix Version/s | 10.4 [ 22408 ] | |
Fix Version/s | 10.5 [ 23123 ] | |
Fix Version/s | 10.6 [ 24028 ] | |
Fix Version/s | 10.9 [ 26905 ] | |
Fix Version/s | 10.10 [ 27530 ] | |
Resolution | Fixed [ 1 ] | |
Status | Stalled [ 10000 ] | Closed [ 6 ] |
Link | This issue includes |
Link | This issue relates to |
Link | This issue relates to |
Zendesk Related Tickets | 202138 |
Link | This issue relates to MDEV-17516 [ MDEV-17516 ] |
Hi Andrei!
This is ready for review as PR-2682