Details
Bug
Status: Open
Major
Resolution: Unresolved
10.5
None
Description
A follow-up on MDEV-33602, see discussion here: https://lists.mariadb.org/hyperkitty/list/developers@lists.mariadb.org/thread/D7ZTCHGEF2IAXFCDKGNNZDEQ4JEUKGQ4/
In GTID mode, the old-style position is still updated, so that GTID can be switched off and replication can continue from the old-style position. At the start of a GTID connect, the master dump thread sends a few events from the start of the binlog file (Format_description, Binlog_checkpoint, etc.). When it finds the target GTID position, it sends a fake Gtid_list event containing both the GTID and the old-style position. Once the slave has received the fake Gtid_list event, it can set the correct old-style position.
However, the current code also updates the old-style position from the events received before the fake Gtid_list, so the old-style position is briefly wrong. If the slave threads are stopped before the fake Gtid_list is received, the wrong old-style position remains in place; if GTID mode is switched off at that point, old-style replication will continue from the wrong point.
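The window described above can be illustrated with a small standalone model. This is only a sketch: `Event`, `OldStylePos`, `connect_stream` and `apply_unconditionally` are hypothetical names, not MariaDB classes or functions, and the file name and offsets are made up for illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified event model; not MariaDB's real classes.
enum class EventType { FormatDescription, BinlogCheckpoint, FakeGtidList };

struct Event {
  EventType type;
  std::uint64_t log_pos;  // old-style position carried by the event
};

struct OldStylePos {
  std::string file;
  std::uint64_t pos;
};

// Made-up stream for one GTID connect: two events from the start of the
// binlog file, then the fake Gtid_list carrying the real position.
std::vector<Event> connect_stream() {
  return {
      {EventType::FormatDescription, 256},
      {EventType::BinlogCheckpoint, 299},
      {EventType::FakeGtidList, 5000},
  };
}

// Models the behaviour reported in this issue: every received event
// overwrites the old-style position, including those sent before the
// fake Gtid_list.
OldStylePos apply_unconditionally(OldStylePos start,
                                  const std::vector<Event>& received) {
  for (const Event& ev : received) {
    start.pos = ev.log_pos;
  }
  return start;
}
```

In this model, a slave whose saved position was 5000 and which is stopped after only the first two events ends up at 299, a point near the start of the file. Only once the fake Gtid_list has been applied is the position back at 5000.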
Suggested fix, due to Elkin:
As it's Gtid_list_log_event::log_pos that makes the file:pos (old-style coordinates) state valid, why not keep the coordinates intact until that event has arrived and been processed? Say that situation is remembered in `RLI::seen_gtid_log_list_event`. Then, for instance, in the serial case the fix would look like:
--- a/sql/rpl_rli.cc
+++ b/sql/rpl_rli.cc
@@ -1030,7 +1030,7 @@ void Relay_log_info::inc_group_relay_log_pos(ulonglong log_pos,
           rgi->last_master_timestamp > last_master_timestamp)
         last_master_timestamp= rgi->last_master_timestamp;
     }
-    else
+    else if (!mi->using_gtid || seen_gtid_log_list_event)
     {
       /* Non-parallel case. */
       group_relay_log_pos= event_relay_log_pos;
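The effect of the proposed guard can be sketched outside the server as follows. `RliModel` and its members are hypothetical stand-ins for the relevant `Relay_log_info` state, a single `old_style_pos` field replaces the several position fields the real function maintains, and resetting the flag at each GTID connect is an assumption not shown in the diff.

```cpp
#include <cstdint>

// Hypothetical stand-in for the relevant Relay_log_info state.
struct RliModel {
  bool using_gtid;
  bool seen_gtid_list_event;
  std::uint64_t old_style_pos;

  // Assumption: the flag is cleared whenever a new GTID connect starts.
  void on_gtid_connect() { seen_gtid_list_event = false; }

  // The fake Gtid_list carries the valid old-style position.
  void on_fake_gtid_list(std::uint64_t log_pos) {
    seen_gtid_list_event = true;
    old_style_pos = log_pos;
  }

  // Same condition as the patched `else if` in the diff: outside GTID
  // mode always advance; in GTID mode advance only after the fake
  // Gtid_list has been seen.
  void on_other_event(std::uint64_t log_pos) {
    if (!using_gtid || seen_gtid_list_event) {
      old_style_pos = log_pos;
    }
  }
};
```

With this guard, events received before the fake Gtid_list leave the saved position untouched, so stopping the slave threads in that window would not corrupt the old-style coordinates. Non-GTID replication behaves as before.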