[MDEV-4473] GTID doesn't work in face of master failovers Created: 2013-05-03  Updated: 2013-05-27  Resolved: 2013-05-03

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 10.0.2
Fix Version/s: 10.0.3

Type: Bug Priority: Critical
Reporter: Pavel Ivanov Assignee: Kristian Nielsen
Resolution: Fixed Votes: 0
Labels: None

Attachments: File rpl_gtid_two_masters.cnf     File rpl_gtid_two_masters.test    

 Description   

I've attached the test exposing the problem.

What the test does: it sets up the replication topology 1->2, 1->3, executes some statements, shuts down server 3, performs a failover to change the topology to 2->1, executes some more statements, rotates the binlogs, and connects server 3 as a slave to server 2. At this point server 2 hits the ASSERT at sql/sql_repl.cc:1001.

The problem is that the GTID implementation has an inconsistent view of what a replication domain is and what data constitutes the replication domain's state. For example, contains_all_slave_gtid() assumes that a Gtid_list contains only one gtid per domain, yet it compares seq_no only when the gtid has the same server_id as the one requested by the slave, which is wrong. That wrongness is what I wanted to catch with my test, but instead I discovered that a Gtid_list actually contains one gtid for each domain-server pair, which seems even more wrong: for how long will a non-existent old server_id be stored and passed from one Gtid_list to the next on every binlog rotation?

mysqld hits the ASSERT in the test because gtid_find_binlog_file() makes the same assumptions as contains_all_slave_gtid(), but is written in a way that cannot cope with a repeated domain_id in the Gtid_list.

I think the fix should be to remove the server_id comparisons from places like gtid_find_binlog_file() and contains_all_slave_gtid(), and to make sure that a Gtid_list contains only one gtid per domain. After all, server_id exists only to de-duplicate events with the same seq_no that belong to alternate futures.



 Comments   
Comment by Kristian Nielsen [ 2013-05-03 ]

Hm, that assert is wrong:

const rpl_gtid *gtid= state->find(glev->list[i].domain_id);
if (!gtid)
{
  /* contains_all_slave_gtid() would have returned false if so. */
  DBUG_ASSERT(0);
  continue;
}

It is correct in the first iteration of the loop, but since we delete from the
`state' hash inside the loop, the condition may no longer hold in subsequent
iterations. So the assert should just be removed.

I will check that everything else is ok with the supplied test case (thanks
for supplying that, made it very easy to reproduce the problem).

Comment by Pavel Ivanov [ 2013-05-03 ]

Note, though, that contains_all_slave_gtid() returns true for slave_log.00003 when server 3 tries to connect to server 2, which is wrong because some events in slave_log.00002 end up skipped.

Comment by Kristian Nielsen [ 2013-05-03 ]

Yes, I've noticed, thanks.

I'll fix.

Comment by Kristian Nielsen [ 2013-05-03 ]

BTW, I noticed that your test has --sync_with_master for server_3, but not for server_2 (for the initial events on server 1). I assume this is just an oversight, because without it the test becomes non-deterministic (it will fail if server 2's slave threads happen to stop before they have replicated everything from server 1).

Comment by Kristian Nielsen [ 2013-05-03 ]

Ok, so there was a deeper issue here.

Suppose binlog file X has in its Gtid_list_event: 0-1-3,0-2-5, and suppose the
slave requests to start replicating after 0-1-3.

In this case the bug was that the master would start sending events from the
start of X. This is wrong, because 0-2-4 and 0-2-5 are contained in X-1 and
are needed by the slave, so these events were lost.

On the other hand, if the slave requested 0-2-5, then it is correct to start
sending from the beginning of binlog file X, because 0-2-5 is the last GTID
logged in earlier binlogs. The difference is that 0-2-5 is the last of the
GTIDs in the Gtid_list_event. The problem was that the code did not check that
the matched GTID was the last one in the list.

Fixed by checking whether the slave-requested gtid that matches a gtid in the
Gtid_list_event is the last gtid for that domain in the list. If not, go back
to a prior binlog to ensure all needed events are sent to the slave.

I pushed the fix to 10.0-base.
Patch: revid:knielsen@knielsen-hq.org-20130503092729-gedp152b19k5wdnj

Comment by Pavel Ivanov [ 2013-05-03 ]

Kristian,
So why do you think it is necessary to store a gtid for each server_id in the gtid_list? Why not store one gtid per domain?

And am I right that if some server was a master, then failed over to another master, and then ceased to exist altogether, a gtid with its server_id will live on indefinitely in the gtid_list in the binlogs of every server? Don't you think that is a problem and a way to accumulate a lot of unnecessary garbage? How will everything behave if the gtid_list contains 1,000 or 10,000 different gtid records?

Comment by Pavel Ivanov [ 2013-05-03 ]

OK, I see your response on the maria-developers list. I'll move the discussion there.

Generated at Thu Feb 08 06:56:42 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.