Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
5.5.37, 5.5.39
-
Linux 2.6.32-5-amd64 #1 SMP Fri May 10 08:43:19 UTC 2013 x86_64 GNU/Linux
> show global variables like '%version%';
+-------------------------+---------------------+
| Variable_name | Value |
+-------------------------+---------------------+
| innodb_version | 5.5.38-MariaDB-35.2 |
| protocol_version | 10 |
| slave_type_conversions | |
| version | 5.5.39-MariaDB-log |
| version_comment | MariaDB Server |
| version_compile_machine | x86_64 |
| version_compile_os | Linux |
Linux 2.6.32-5-amd64 #1 SMP Fri May 10 08:43:19 UTC 2013 x86_64 GNU/Linux > show global variables like '%version%'; +-------------------------+---------------------+ | Variable_name | Value | +-------------------------+---------------------+ | innodb_version | 5.5.38-MariaDB-35.2 | | protocol_version | 10 | | slave_type_conversions | | | version | 5.5.39-MariaDB-log | | version_comment | MariaDB Server | | version_compile_machine | x86_64 | | version_compile_os | Linux |
Description
Hello,
This problem I noticed after replacing MySQL 5.5.x with MariaDB 5.5.
It occured on both MariaDB 5.5.39, 5.5.37 as well as some older MariaDB 5.5 versions.
It is possible to render MariaDB instance unusable by putting it into some kind of internal "deadlock state":
How to reproduce:
1. Use master-slave setup with statement based replication and InnoDB as storage engine.
2. Wait until slave server replicates and executes a DELETE query replicated from master.
3. Issue "STOP SLAVE" command on the slave server before execution of the DELETE query is completed by the replication sql_thread.
Effect:
MariaDB on slave server will enter "deadlock state" (kind of):
- STOP SLAVE stalls on State: "Killing slave"
- DELETE query stalls on State: "Freeing items"
- CPU load generated by mysqld drops to nearly 0,
- IO load generated by mysqld drops to 0,
Last entry in mysq.err:
140909 3:05:02 [Note] Slave SQL thread exiting, replication stopped in log 'mysql-bin.034913' at position 53766638
|
Output head from "SHOW PROCESSLIST" (after "deadlock state" is in place for a few hours here, normally our DELETE queries take much less time to complete):
|
Id User Host db Command Time State Info Progress
|
24611 system user NULL Connect 89203 Waiting for master to send event NULL 0.000
|
24612 system user NULL Connect 17161 freeing items NULL 0.000
|
35399 [user1-namechanged] [ip-removed]:43814 NULL Query 15421 Killing slave stop slave 0.000
|
35400 [user2-namechanged] [ip-removed]:50631 NULL Query 15420 executing SHOW /*!50002 GLOBAL */ STATUS 0.000
|
35401 [user2-namechanged] [ip-removed]:51717 NULL Sleep 15396 NULL 0.000
|
35402 [user2-namechanged] [ip-removed]:51718 NULL Sleep 15371 NULL 0.000
|
35403 [user3-namechanged] [ip-removed]:52953 NULL Query 15347 executing show global status 0.000
|
35404 [user2-namechanged] [ip-removed]:51719 NULL Sleep 15346 NULL 0.000
|
35405 [user2-namechanged] [ip-removed]:51720 NULL Sleep 15321 NULL 0.000
|
(full processlist attached)
Output head from running Poor Man's Profiler on mysqld in the "deadlock state":
855 __lll_lock_wait,_L_lock_1172,pthread_mutex_lock,inline_mysql_mutex_lock,at,at,one_thread_per_connection_end,do_handle_one_connection,handle_one_connection,start_thread,clone,??
|
2 __lll_lock_wait,_L_lock_1172,pthread_mutex_lock,inline_mysql_mutex_lock,at,mysql_parse,dispatch_command,do_command,do_handle_one_connection,handle_one_connection,start_thread,clone,??
|
1 read,vio_read,vio_read_buff,my_real_read,my_net_read,cli_safe_read,handle_slave_io,start_thread,clone,??
|
1 pthread_cond_wait@@GLIBC_2.3.2,os_cond_wait,reset_sig_count=5),srv_purge_thread,start_thread,clone,??
|
1 pthread_cond_wait@@GLIBC_2.3.2,os_cond_wait,reset_sig_count=43),srv_master_thread,start_thread,clone,??
|
1 pthread_cond_timedwait@@GLIBC_2.3.2,os_cond_wait_timed,time_in_usec=<value,srv_monitor_thread,start_thread,clone,??
|
1 pthread_cond_timedwait@@GLIBC_2.3.2,os_cond_wait_timed,time_in_usec=<value,srv_LRU_dump_restore_thread,start_thread,clone,??
|
1 pthread_cond_timedwait@@GLIBC_2.3.2,os_cond_wait_timed,time_in_usec=<value,srv_lock_timeout_thread,start_thread,clone,??
|
1 pthread_cond_timedwait@@GLIBC_2.3.2,os_cond_wait_timed,time_in_usec=<value,srv_error_monitor_thread,start_thread,clone,??
|
1 pthread_cond_timedwait@@GLIBC_2.3.2,inline_mysql_cond_timedwait,term_lock=0x7f7a1cb26d48,,terminate_slave_threads,stop_slave,mysql_execute_command,mysql_parse,dispatch_command,do_command,do_handle_one_connection,handle_one_connection,start_thread,clone,??
|
1 pthread_cond_timedwait@@GLIBC_2.3.2,inline_mysql_cond_timedwait,sleep_time=<value,ma_checkpoint_background,start_thread,clone,??
|
1 poll,handle_connections_sockets,mysqld_main,__libc_start_main,_start
|
1 __lll_lock_wait,_L_lock_1172,pthread_mutex_lock,inline_mysql_mutex_lock,tables=0x7f8e36dd2338,,get_schema_tables_result,JOIN::exec,mysql_select,handle_select,execute_sqlcom_select,execute_show_status,at,mysql_parse,dispatch_command,do_command,do_handle_one_connection,handle_one_connection,start_thread,clone,??
|
1 __lll_lock_wait,_L_lock_1172,pthread_mutex_lock,inline_mysql_mutex_lock,tables=0x7f8e36cd5338,,get_schema_tables_result,JOIN::exec,mysql_select,handle_select,execute_sqlcom_select,execute_show_status,at,mysql_parse,dispatch_command,do_command,do_handle_one_connection,handle_one_connection,start_thread,clone,??
|
(full output attached)