[MDEV-4605] GTID slave_pos gets corrupted after slave crash recovery, replication starts from the beginning Created: 2013-05-31 Updated: 2013-06-04 Resolved: 2013-06-03 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 10.0.3 |
| Type: | Bug | Priority: | Major |
| Reporter: | Elena Stepanova | Assignee: | Kristian Nielsen |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Description |
|
This is the problem we discussed earlier on IRC, but I also promised to describe how to reproduce it, so I'll do it here.
Scenario:
The test executes additional checks between the main steps:
The logging shows that the position reset usually happens after START SLAVE. After server startup and before START SLAVE, the gtid_*_pos variables are empty, while mysql.gtid_slave_pos contains the correct value. After START SLAVE, gtid_current_pos, gtid_slave_pos and the table get updated to contain the wrong position.
To reproduce,
With the given parameters, the test won't shut down the servers after it finishes. You need to do it manually, otherwise the test won't run next time.
This is what typical log output looks like.
Normal cycle between intentional crashes (nothing bad happens):
Erroneous startup:
At this point the test continues running the flow on the master until the end of the duration, but you can interrupt it here.
bzr version-info
|
| Comments |
| Comment by Kristian Nielsen [ 2013-06-03 ] |
|
Fix pushed to 10.0-base. Thanks for finding this serious issue!
The main problem here was that the code for loading the position
In addition to fixing these places in the code, I also now made
Elena, I did not try to run the RQG, so I hope you will try that |
| Comment by Elena Stepanova [ 2013-06-03 ] |
|
Yes, I've run the test several times, no problems so far. Also, slave_pos is now initialized right away after server restart, before START SLAVE (as expected). |
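The check described in the report (the slave position must not move backwards between server startup and START SLAVE) can be sketched roughly as follows. This is only an illustration, not the code of the actual test: `parse_gtid_pos` and `find_regressions` are hypothetical helper names, the position strings are assumed to have been collected already (for example from `SELECT @@gtid_slave_pos` before and after START SLAVE), and the sketch assumes that a lower sequence number within a replication domain means the position went backwards.

```python
# Sketch only: these helpers are hypothetical and are not part of MariaDB
# or of the test used in this report.

def parse_gtid_pos(pos):
    """Parse 'domain-server-seq[,domain-server-seq...]' into
    {domain_id: (server_id, seq_no)}. An empty string gives {}."""
    result = {}
    for item in pos.split(","):
        item = item.strip()
        if not item:
            continue
        domain, server, seq = (int(part) for part in item.split("-"))
        result[domain] = (server, seq)
    return result


def find_regressions(before, after):
    """Return the domain ids whose position in `after` is missing or has
    a lower seq_no than in `before` (assumed to mean 'went backwards')."""
    old = parse_gtid_pos(before)
    new = parse_gtid_pos(after)
    bad = []
    for domain, (_, old_seq) in sorted(old.items()):
        if domain not in new or new[domain][1] < old_seq:
            bad.append(domain)
    return bad


if __name__ == "__main__":
    # Made-up positions for illustration, not taken from the real logs.
    table_pos_after_restart = "0-1-1500"
    variable_pos_after_start_slave = "0-1-12"
    print(find_regressions(table_pos_after_restart,
                           variable_pos_after_start_slave))
```

A non-empty result would correspond to the erroneous startup case, where the position stored in mysql.gtid_slave_pos was correct after restart but got replaced by an earlier one after START SLAVE.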