MariaDB Server / MDEV-4473

GTID doesn't work in face of master failovers

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 10.0.2
    • Fix Version/s: 10.0.3
    • Component/s: None
    • Labels: None

    Description

      I've attached the test exposing the problem.

      What the test does: it sets up the replication topology 1->2, 1->3, executes some statements, shuts down server 3, performs a failover to make the topology 2->1, executes some more statements, rotates the binlogs, and connects server 3 as a slave to server 2. At this point server 2 hits the ASSERT at sql/sql_repl.cc:1001.

      The problem is that the GTID implementation has an inconsistent view of what a replication domain is and what data constitutes the replication domain state. E.g. the function contains_all_slave_gtid() assumes that Gtid_list contains only one gtid per domain, yet it compares seq_no only if the gtid has the same server_id as requested by the slave, which is wrong. This "wrong" is what I wanted to catch with my test, but I discovered that Gtid_list actually contains one gtid for each domain-server pair, which seems even more wrong: for how long will a non-existent old server_id be stored and passed from one Gtid_list to another on each binlog rotation?

      mysqld hits the ASSERT in the test because gtid_find_binlog_file() makes the same assumptions as contains_all_slave_gtid(), but it is written in a way that prohibits any work if a domain_id is repeated in the Gtid_list.

      I think the fix should be to remove the server_id comparisons from places like gtid_find_binlog_file() and contains_all_slave_gtid() and to make sure that Gtid_list contains only one gtid per domain. After all, server_id exists only for de-duplication of events with the same seq_no belonging to alternate futures.
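
      For illustration, here is a minimal C++ sketch of the "one gtid per domain" model proposed above. The class and names are hypothetical stand-ins, not the actual MariaDB data structures: the state is keyed by domain_id alone, and a new gtid in a domain replaces the previous entry regardless of which server_id produced it.

        #include <cstdint>
        #include <unordered_map>

        // Simplified stand-in for MariaDB's rpl_gtid triple (domain-server-seq_no).
        struct rpl_gtid {
          uint32_t domain_id;
          uint32_t server_id;
          uint64_t seq_no;
        };

        // Hypothetical "one gtid per domain" state: exactly one entry per
        // replication domain, namely the most recently logged gtid in it.
        // An old, long-gone master's server_id is simply overwritten the
        // next time any server logs a gtid in the same domain.
        class gtid_list_per_domain {
          std::unordered_map<uint32_t, rpl_gtid> last_in_domain;  // keyed by domain_id
        public:
          void update(const rpl_gtid &gtid) {
            last_in_domain[gtid.domain_id] = gtid;  // replaces any previous server_id
          }
          const rpl_gtid *find(uint32_t domain_id) const {
            auto it = last_in_domain.find(domain_id);
            return it == last_in_domain.end() ? nullptr : &it->second;
          }
        };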

      Attachments

        Activity

          knielsen Kristian Nielsen added a comment -

          Hm, that assert is wrong:

          const rpl_gtid *gtid= state->find(glev->list[i].domain_id);
          if (!gtid)
          {
            /* contains_all_slave_gtid() would have returned false if so. */
            DBUG_ASSERT(0);
            continue;
          }

          It is correct in the first iteration of the loop, but as we delete from the
          `state' hash in the loop, the condition may no longer hold in subsequent
          iterations of the loop. So the assert should just be removed.

          I will check that everything else is ok with the supplied test case (thanks
          for supplying that, made it very easy to reproduce the problem).
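
          A self-contained sketch of why that assert cannot hold, with std::unordered_map standing in for the server's HASH and made-up example values; once the first iteration deletes the domain from the state, the second lookup legitimately fails:

            #include <cstdint>
            #include <cstdio>
            #include <unordered_map>
            #include <vector>

            // Simplified stand-in for MariaDB's rpl_gtid triple (domain-server-seq_no).
            struct rpl_gtid {
              uint32_t domain_id;
              uint32_t server_id;
              uint64_t seq_no;
            };

            int main() {
              // Hypothetical Gtid_list_event with two entries in the same domain 0
              // (one per server_id, as described in the bug report).
              std::vector<rpl_gtid> glev = { {0, 1, 3}, {0, 2, 5} };

              // Slave state with one entry per domain, keyed by domain_id.
              std::unordered_map<uint32_t, rpl_gtid> state = { {0, {0, 2, 5}} };

              for (const rpl_gtid &g : glev) {
                auto it = state.find(g.domain_id);
                if (it == state.end()) {
                  // The first iteration deleted domain 0 from `state', so this
                  // lookup legitimately fails on the second iteration; asserting
                  // here would fire even though nothing is wrong.
                  std::printf("domain %u already handled, skipping\n",
                              (unsigned)g.domain_id);
                  continue;
                }
                std::printf("handling domain %u up to seq_no %llu\n",
                            (unsigned)g.domain_id,
                            (unsigned long long)it->second.seq_no);
                state.erase(it);  // delete from the state hash inside the loop
              }
              return 0;
            }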

          pivanof Pavel Ivanov added a comment -

          Note though that contains_all_slave_gtid() returns true for slave_log.00003 when server 3 tries to connect to server 2, which is wrong because some events get skipped in slave_log.00002.


          knielsen Kristian Nielsen added a comment -

          Yes, I've noticed, thanks.

          I'll fix.


          knielsen Kristian Nielsen added a comment -

          BTW, I noticed that your test has --sync_with_master in server_3, but not in server_2 (for the initial events on server 1). I assume this is just an oversight, because without it the test becomes non-deterministic (it will fail if the server 2 slave threads happen to stop before they have replicated everything from server 1).


          knielsen Kristian Nielsen added a comment -

          Ok, so there was a deeper issue here.

          Suppose binlog file X has in its Gtid_list_event: 0-1-3,0-2-5, and suppose the
          slave requests to start replicating after 0-1-3.

          In this case the bug was that master would start sending events from the start
          of X. This is wrong, because 0-2-4 and 0-2-5 are contained in X-1, and are
          needed by the slave. So these events were lost.

          On the other hand, if the slave requested 0-2-5, then it is correct to start
          sending from the beginning of binlog file X, because 0-2-5 is the last GTID
          logged in earlier binlogs. The difference is that 0-2-5 is the last of the
          GTIDs in the Gtid_list_event. The problem was that the code did not check that
          the matched GTID was the last one in the list.

          Fixed by checking if the gtid requested by slave that matches a gtid in the
          Gtid_list_event is the last event for that domain in the list. If not, go back
          to a prior binlog to ensure all needed events are sent to slave.

          I pushed the fix to 10.0-base.
          Patch: revid:knielsen@knielsen-hq.org-20130503092729-gedp152b19k5wdnj
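
          A minimal sketch of the check described in this comment, written against a simplified rpl_gtid stand-in rather than the real server code (the function name and signature are hypothetical):

            #include <cstddef>
            #include <cstdint>
            #include <vector>

            // Simplified stand-in for MariaDB's rpl_gtid triple (domain-server-seq_no).
            struct rpl_gtid {
              uint32_t domain_id;
              uint32_t server_id;
              uint64_t seq_no;
            };

            // True if it is safe to start sending from the binlog file whose
            // Gtid_list_event is `glev' to a slave that already has `slave_gtid':
            // the matched gtid must be the *last* one listed for its domain,
            // otherwise later gtids of that domain (0-2-4, 0-2-5 in the example
            // above) live in an earlier binlog and would be skipped.
            bool safe_to_start_from_this_binlog(const std::vector<rpl_gtid> &glev,
                                                const rpl_gtid &slave_gtid) {
              bool matched_is_last = false;
              for (size_t i = 0; i < glev.size(); i++) {
                if (glev[i].domain_id != slave_gtid.domain_id)
                  continue;
                // Each entry for the slave's domain overrides the previous one, so
                // after the loop this reflects the last entry for that domain.
                matched_is_last = (glev[i].server_id == slave_gtid.server_id &&
                                   glev[i].seq_no == slave_gtid.seq_no);
              }
              return matched_is_last;
            }

          With the Gtid_list 0-1-3,0-2-5 from the example, a slave at 0-1-3 gets false (the master must go back to an earlier binlog), while a slave at 0-2-5 gets true.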

          pivanof Pavel Ivanov added a comment -

          Kristian,
          So why do you think it's necessary to store gtid for each server_id in gtid_list? Why not store one gtid per domain?

          And am I right that if some server was a master, then a failover was made to another master, and then the old server ceased to exist altogether, the gtid with its server_id will exist indefinitely in the gtid_list in the binlogs of every server? Don't you think that is a problem and a way to accumulate a lot of unnecessary garbage? How will everything behave if the gtid_list contains 1000 or 10,000 different gtid records?

          pivanof Pavel Ivanov added a comment -

          OK, I see your response on maria-developers list. I'll move the discussions there.


          People

            Assignee: knielsen Kristian Nielsen
            Reporter: pivanof Pavel Ivanov
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved:
