[MDEV-4275] GTID: If slave was set up with master_gtid_pos=auto, IO thread restart makes it start from the beginning of the binlog
Created: 2013-03-15 Updated: 2013-03-21 Resolved: 2013-03-21 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Elena Stepanova | Assignee: | Kristian Nielsen |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | replication | ||
| Issue Links: |
|
| Description |
|
I run CHANGE MASTER ... master_gtid_pos=auto; and later restart the IO thread => the slave attempts to re-execute previous statements.
Test case:
Result:
|
| Comments |
| Comment by Kristian Nielsen [ 2013-03-18 ] |
|
Right, this is an important issue, thanks for catching it. The underlying issue here is that when the IO thread connects (or re-connects), it needs to request position So there are several possibilities for fetching again something that the SQL thread is in the middle of executing, or similar races. My current code does not handle this at all. It can be especially tricky as the SQL thread may be running while the IO thread loses the connection to the master and needs to automatically reconnect. I think I need to make it so that the SQL thread remembers what it executed, so that it can skip stuff that gets duplicate-fetched into relay logs. This is not too hard, as it only needs to be done in-memory. Whenever the slave server is restarted or CHANGE MASTER is executed, we can just drop the existing relay logs (which we need to do anyway). Still, it needs to be done carefully to handle all cases properly. |
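The idea of having the SQL thread remember what it executed, in memory only, can be illustrated with a small Python sketch. This is not MariaDB server code. The names `Gtid`, `ExecutedTracker` and `apply_relay_log` are hypothetical, and the sketch assumes GTIDs of the form domain-server-sequence whose sequence numbers start at 1 and only increase within a domain.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Gtid:
    """Hypothetical GTID holder, assuming the text form domain-server-seq."""
    domain_id: int
    server_id: int
    seq_no: int

    @classmethod
    def parse(cls, text):
        domain, server, seq = (int(part) for part in text.split("-"))
        return cls(domain, server, seq)


@dataclass
class ExecutedTracker:
    """In-memory record of what the SQL thread has already applied."""
    # Highest sequence number applied so far, keyed by replication domain.
    last_seq_no: dict = field(default_factory=dict)

    def should_skip(self, gtid):
        # Assumption: sequence numbers only grow within one domain, so
        # anything at or below the recorded value was fetched twice.
        return gtid.seq_no <= self.last_seq_no.get(gtid.domain_id, 0)

    def record(self, gtid):
        self.last_seq_no[gtid.domain_id] = gtid.seq_no

    def reset(self):
        # On server restart or CHANGE MASTER the relay logs are dropped,
        # so the in-memory state can be discarded as well.
        self.last_seq_no.clear()


def apply_relay_log(events, tracker):
    """Return the events that would actually be executed."""
    applied = []
    for text in events:
        gtid = Gtid.parse(text)
        if tracker.should_skip(gtid):
            continue  # duplicate-fetched after an IO thread reconnect
        tracker.record(gtid)
        applied.append(text)
    return applied
```

For example, if the relay log first delivers `0-1-1` through `0-1-3` and a reconnect then re-fetches from `0-1-2` up to `0-1-4`, only `0-1-4` is applied the second time. Keeping the state in memory only is enough under the stated assumption that relay logs are discarded on restart or CHANGE MASTER.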
| Comment by Kristian Nielsen [ 2013-03-21 ] |
|
I found a better approach to fix this. The first connect of the I/O thread after CHANGE MASTER or restart removes any |