Details
Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
10.0.12
None
Description
When the master crashes in the middle of writing a multi-statement transaction to the binlog, after the slave has already received some events from this transaction, the IO thread reconnects to the restarted master and assumes that it will re-download the same binlog events. But the master will actually either send nothing (if no new transactions have been executed) or send completely different events from new transactions, which leaves the slave with data that differs from the data on the master.
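To make the failure mode concrete, here is a minimal toy model of the scenario, not MariaDB code. The `IoThread` type, its `skip_count` field, and the event strings are all hypothetical; the only assumption taken from the description is that after a reconnect the IO thread skips as many events as it had already received from the partial group.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: these types and names are hypothetical, not MariaDB code.
struct IoThread {
    std::vector<std::string> relay_log;  // events queued for the SQL thread
    std::size_t skip_count = 0;          // events to drop after a reconnect

    // Queue one event from the master, or drop it if we are still skipping.
    void receive(const std::string& ev) {
        if (skip_count > 0) { --skip_count; return; }
        relay_log.push_back(ev);
    }

    // On reconnect, assume the master will resend the partial group, so plan
    // to skip as many events as were already received from it.
    void reconnect(std::size_t events_in_partial_group) {
        skip_count = events_in_partial_group;
    }
};

// Replays the crash scenario and returns the slave's relay log.
std::vector<std::string> simulate_crash() {
    IoThread io;
    // The master crashes after sending two events of transaction T1.
    io.receive("GTID(T1)");
    io.receive("INSERT a (T1)");
    io.reconnect(2);
    // The restarted master never committed T1; it sends new transaction T2.
    for (const char* ev : {"GTID(T2)", "INSERT b (T2)", "COMMIT (T2)"})
        io.receive(ev);
    return io.relay_log;
}
```

In this model the relay log ends up holding the two T1 events followed by the commit of T2, so the slave would commit T1's insert while the master only ever committed T2's.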
I'd bet the root cause of the problem is in how the IO thread reconnects when GTID-based replication is turned on, and in these few lines of code starting at sql/slave.cc:5310:
    /*
      Do not queue any format description event that we receive after a
      reconnect where we are skipping over a partial event group received
      before the reconnect.

      (If we queued such an event, and it was the first format_description
      event after master restart, the slave SQL thread would think that
      the partial event group before it in the relay log was from a
      previous master crash and should be rolled back).
    */
    if (unlikely(mi->gtid_reconnect_event_skip_count && !mi->gtid_event_seen))
      gtid_skip_enqueue= true;
In the scenario I described above, the SQL thread actually must roll back the active transaction.
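The behaviour I would expect can be sketched as follows. This is only an illustration of the idea, not the code in sql/slave.cc and not the attached patch: `Event`, `RelayState`, the `master_restarted` flag, and `on_first_event_after_reconnect` are all hypothetical names, and how a real slave would detect a master restart is left as an assumption.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy types; none of these names exist in sql/slave.cc.
struct Event {
    std::string name;
    bool master_restarted;  // assumed flag: event marks a fresh master start
};

struct RelayState {
    std::vector<std::string> pending;  // events of the open (partial) group
    std::size_t skip_count = 0;        // planned skips after a reconnect
    bool rolled_back = false;          // set once the partial group is dropped
};

// Handle the first event after a reconnect. If it shows that the master
// restarted, the partial group can never be completed: discard it and cancel
// the skip instead of dropping the events that follow.
void on_first_event_after_reconnect(RelayState& st, const Event& ev) {
    if (st.skip_count > 0 && ev.master_restarted) {
        st.pending.clear();
        st.skip_count = 0;
        st.rolled_back = true;
    }
}
```

With an ordinary reconnect (no master restart) the skip count is left alone, so the existing behaviour of skipping the re-sent events is preserved in this sketch.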
The attached patch makes it possible to emulate this scenario. Apply it, run the rpl_gtid_crash test, and look at the results of the last two SELECTs: they will differ between master and slave.
I will look into a way to fix this problem myself, but would appreciate any help. I'll attach a patch if I manage to find a fix before anyone on the MariaDB side does.
Attachments
Activity
Field | Original Value | New Value |
---|---|---|
Fix Version/s | 10.0 [ 16000 ] | |
Assignee | Kristian Nielsen [ knielsen ] | |
Labels | gtid |
Attachment | fix_reconnect_crashed_master.txt [ 32400 ] |
Attachment | fix_reconnect_crashed_master.txt [ 32400 ] |
Attachment | fix_reconnect_crashed_master.txt [ 32600 ] |
Status | Open [ 1 ] | In Progress [ 3 ] |
Fix Version/s | 10.0.14 [ 17101 ] | |
Fix Version/s | 10.0 [ 16000 ] |
Comment | [ When I looked at the code I had the impression that LOCK_log is held only when binlog dump threads sleep waiting for new events to arrive. But if the dump thread is actively reading the binlog file without stopping, it can easily move from the previous event group into the next, incomplete event group if that one has already been partially written (that event group probably has to be big, so that it is not fully cached in the IO_CACHE during writing). So it is still possible for the slave to receive a partial event group before the master has crashed. Did I miss something? So I'd claim that binlog truncation won't solve this bug completely, but could be a useful addition anyway. ] |
Assignee | Kristian Nielsen [ knielsen ] | Michael Widenius [ monty ] |
Status | In Progress [ 3 ] | In Review [ 10002 ] |
Summary | Slave replicating using GTID doesn't recover correctly when master crashes in the middle of transaction | NEED REVIEW: Slave replicating using GTID doesn't recover correctly when master crashes in the middle of transaction |
Status | In Review [ 10002 ] | Stalled [ 10000 ] |
Assignee | Michael Widenius [ monty ] | Kristian Nielsen [ knielsen ] |
Resolution | Fixed [ 1 ] | |
Status | Stalled [ 10000 ] | Closed [ 6 ] |
Summary | NEED REVIEW: Slave replicating using GTID doesn't recover correctly when master crashes in the middle of transaction | Slave replicating using GTID doesn't recover correctly when master crashes in the middle of transaction |
Workflow | MariaDB v2 [ 50420 ] | MariaDB v3 [ 64183 ] |
Workflow | MariaDB v3 [ 64183 ] | MariaDB v4 [ 148045 ] |
Attaching a proposed fix together with a revised and expanded test case. The fix seems to work well, but please check whether it is appropriate for non-standard use cases, e.g. for a parallel slave.