Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Version/s: 10.1.10, 10.1.11, 10.1.13
Environment: CentOS release 6.7 (Final) x64 on Dell PowerEdge R510
Description
Since we switched from MariaDB 10.0.x to MariaDB 10.1, we have been having trouble with the statement 'stop slave;': it hangs and does not return, even after hours. A 'show slave status\G' statement issued from another connection also blocks at that point and produces no output.
This happens randomly. Under low load on the server it is quite hard to reproduce the issue: stopping and starting the slave in short intervals may succeed 30 times or more without problems.
If the server is under heavy load, it takes only a few tries to reproduce it. I can reproduce it very quickly while table checksums are being created with pt-table-checksum.
The only way to stop MariaDB when 'stop slave' is hanging is 'kill -9'.
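The stop/start cycling described above could be automated roughly as follows. This is only a sketch, not the script used for this report: it assumes the `mysql` command-line client is on the PATH and picks up credentials from an option file, and the names `run_sql` and `cycle_slave` as well as the timeout and cycle counts are made up for illustration. A statement that does not return within the timeout is treated as the hang.

```python
import subprocess
import time
from typing import Callable, Optional

# Assumption: the mysql client is installed and reads credentials from an
# option file such as ~/.my.cnf. Adjust the command for the local setup.
MYSQL_CMD = ["mysql", "-e"]


def run_sql(statement: str, timeout: float) -> None:
    """Run one statement through the mysql client.

    Raises subprocess.TimeoutExpired if the client does not return in time,
    and subprocess.CalledProcessError if the client exits with an error.
    """
    subprocess.run(MYSQL_CMD + [statement], timeout=timeout, check=True,
                   stdout=subprocess.DEVNULL)


def cycle_slave(run: Callable[[str, float], None], cycles: int = 50,
                timeout: float = 60.0, pause: float = 1.0) -> Optional[int]:
    """Stop and start the slave repeatedly.

    Returns the 1-based number of the cycle in which a statement timed out,
    or None if all cycles completed.
    """
    for cycle in range(1, cycles + 1):
        for statement in ("STOP SLAVE", "START SLAVE"):
            try:
                run(statement, timeout)
            except subprocess.TimeoutExpired:
                # The client was killed after the timeout; the server-side
                # statement may still be stuck.
                return cycle
        time.sleep(pause)
    return None
```

Calling `cycle_slave(run_sql)` on the slave host while the server is under load would return the cycle number at which 'stop slave' or 'start slave' stopped responding. Killing the client after the timeout does not release the server, so the hung instance still has to be dealt with separately.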
We are using parallel replication, as you can see in the my.cnf attached.
In addition, a backtrace is attached, created as described on mariadb.org. If necessary, I could repeat it with a DEBUG build.
I also attached the list of mysql processes running at that moment.
We never had this issue with MariaDB 10.0.x with the same configuration, except for slave_run_triggers_for_rbr = 1 of course, since it is available in MariaDB 10.1 only.
Please let me know if I can provide any more details that might be helpful.
Attachments
Issue Links
- includes
  - MDEV-10644 One of parallel replication threads remains active after STOP SLAVE SQL_THREAD completes (Closed)
- relates to
  - MDEV-12104 Testing for MDEV-9573 and extra replication bugfixes (Stalled)
  - MDEV-17346 parallel slave start and stop races to workers disappeared (Closed)
  - MDEV-31572 STOP SLAVE hangs on 10.3.39 (Closed)