[MDEV-7888] ANALYZE TABLE does wakeup_subsequent_commits(), causing wrong binlog order and parallel replication hang Created: 2015-03-31 Updated: 2015-04-08 Resolved: 2015-04-08 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Replication |
| Affects Version/s: | 10.0.17, 10.1 |
| Fix Version/s: | 10.0.18, 10.1.4 |
| Type: | Bug | Priority: | Major |
| Reporter: | Elena Stepanova | Assignee: | Kristian Nielsen |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | parallelslave | ||
| Attachments: |
|
||||||||
| Issue Links: |
|
||||||||
| Description |
|
All threads seem to be waiting for some 'prior transaction' which is not there.
Full binlog is attached as mysql-bin.000001. When I feed the same binlog to a slave via a master again, I get it stuck in a slightly different way:
Again, bad things seem to happen around CREATE OR REPLACE TABLE, so maybe it's related. To reproduce:
Not reproducible without the optimistic mode. |
| Comments |
| Comment by Kristian Nielsen [ 2015-04-01 ] |
|
OK, the problem here seems to be ANALYZE TABLE. It does ha_trans_commit()
(The bug should be in 10.0 / "conservative" mode also, but probably harder to
So a workaround is to not binlog ANALYZE TABLE; meanwhile I'll find a way to |
| Comment by Elena Stepanova [ 2015-04-02 ] |
|
Just keep in mind that ANALYZE TABLE, among other things, might affect persistent statistics. I'm not saying that not binlogging it is a bad idea (in fact, I was always wondering why it was binlogged, and we just recently had a request from danblack for not doing it), but it can stop collecting persistent statistics on the slave in setups which now rely on it. |
| Comment by Daniel Black [ 2015-04-02 ] |
|
Clarifying: I was more after an option to binlog the results of ANALYZE TABLE but not the command itself. ANALYZE TABLE already has NO_WRITE_TO_BINLOG | LOCAL. ( |
| Comment by Kristian Nielsen [ 2015-04-08 ] |
|
http://lists.askmonty.org/pipermail/commits/2015-April/007723.html |
| Comment by Kristian Nielsen [ 2015-04-08 ] |
|
10.1-specific part: http://lists.askmonty.org/pipermail/commits/2015-April/007724.html |
| Comment by Elena Stepanova [ 2015-04-08 ] |
|
Interestingly, I hit this problem a lot with optimistic mode, but not even once with conservative mode, with otherwise the same flow and options. So apparently something about optimistic mode seriously increases the probability. |
| Comment by Kristian Nielsen [ 2015-04-08 ] |
|
Yes, looking further, I was not actually able to trigger the observed problem directly in 10.0 (or in 10.1 conservative mode). The logical problem exists in the 10.0 code (and I have fixed it there). However, ANALYZE cannot actually be binlogged in the same group commit as another event group, so it may not be possible to trigger it directly in conservative mode. (In the test case, I artificially simulate a shared group commit using DBUG injection.) So I agree this particular hang may not be possible to get in 10.0/conservative. I think another consequence of the bug might be observable in 10.1 / conservative: out-of-order binlogging (a different binlog order on the slave compared to the master). But I haven't actually checked that this is possible, and if it is, the window of opportunity for the race should be very small, so it would be hard to trigger. |
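
To make the out-of-order binlogging scenario concrete, here is a toy model in Python. It is only a sketch and not MariaDB code: the function `binlog_order`, the step names, and the scheduling policy are all hypothetical. It assumes, as the issue title and the first comment suggest, that the commit performed inside ANALYZE TABLE is what wakes the following transaction too early.

```python
def binlog_order(wake_on_internal_commit):
    """Toy model: return the order in which transactions reach the binlog.

    wake_on_internal_commit=True models the assumed bug: a commit in the
    middle of a statement already wakes the next transaction.
    """
    # Remaining steps per transaction, listed in master commit order.
    # "internal_commit" stands in for the commit ANALYZE TABLE performs
    # before its event group is binlogged (an assumption of this sketch).
    pending = {
        "T1_analyze": ["internal_commit", "binlog"],
        "T2_insert": ["binlog"],
    }
    names = list(pending)
    woken = {names[0]}  # the first transaction waits for nobody
    order = []
    while any(pending.values()):
        # Try later transactions first: worst-case scheduling for the race.
        for i in reversed(range(len(names))):
            name = names[i]
            if not pending[name]:
                continue
            step = pending[name][0]
            if step == "binlog" and name not in woken:
                continue  # still waiting for the prior transaction
            pending[name].pop(0)
            if step == "binlog":
                order.append(name)
            # Correct behaviour: wake the next transaction only after binlogging.
            signals = step == "binlog" or wake_on_internal_commit
            if signals and i + 1 < len(names):
                woken.add(names[i + 1])
    return order


print(binlog_order(wake_on_internal_commit=True))   # ['T2_insert', 'T1_analyze']
print(binlog_order(wake_on_internal_commit=False))  # ['T1_analyze', 'T2_insert']
```

With the early wakeup, the later transaction reaches the binlog first, which is the "different binlog order on slave compared to master" effect described above. The model does not attempt to reproduce the hang itself.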