[MDEV-6825] mysqld server "deadlocks" after issuing STOP SLAVE command while replicated DELETE query is executing (freeing items vs killing slave) - Jira

XML

Word

Printable

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Fixed
Affects Version/s: 5.5.37, 5.5.39
Fix Version/s: 5.5.43
Component/s: Replication, Storage Engine - InnoDB
Labels:
- innodb
- replication
Environment:

Hide
Linux 2.6.32-5-amd64 #1 SMP Fri May 10 08:43:19 UTC 2013 x86_64 GNU/Linux

> show global variables like '%version%';
+-------------------------+---------------------+
| Variable_name | Value |
+-------------------------+---------------------+
| innodb_version | 5.5.38-MariaDB-35.2 |
| protocol_version | 10 |
| slave_type_conversions | |
| version | 5.5.39-MariaDB-log |
| version_comment | MariaDB Server |
| version_compile_machine | x86_64 |
| version_compile_os | Linux |

Show
Linux 2.6.32-5-amd64 #1 SMP Fri May 10 08:43:19 UTC 2013 x86_64 GNU/Linux > show global variables like '%version%'; +-------------------------+---------------------+ | Variable_name | Value | +-------------------------+---------------------+ | innodb_version | 5.5.38-MariaDB-35.2 | | protocol_version | 10 | | slave_type_conversions | | | version | 5.5.39-MariaDB-log | | version_comment | MariaDB Server | | version_compile_machine | x86_64 | | version_compile_os | Linux |

Description

Hello,

This problem I noticed after replacing MySQL 5.5.x with MariaDB 5.5.
It occured on both MariaDB 5.5.39, 5.5.37 as well as some older MariaDB 5.5 versions.

It is possible to render MariaDB instance unusable by putting it into some kind of internal "deadlock state":

How to reproduce:

1. Use master-slave setup with statement based replication and InnoDB as storage engine.
2. Wait until slave server replicates and executes a DELETE query replicated from master.
3. Issue "STOP SLAVE" command on the slave server before execution of the DELETE query is completed by the replication sql_thread.

Effect:
MariaDB on slave server will enter "deadlock state" (kind of):

STOP SLAVE stalls on State: "Killing slave"
DELETE query stalls on State: "Freeing items"
CPU load generated by mysqld drops to nearly 0,
IO load generated by mysqld drops to 0,

Last entry in mysq.err:

140909  3:05:02 [Note] Slave SQL thread exiting, replication stopped in log 'mysql-bin.034913' at position 53766638

Output head from "SHOW PROCESSLIST" (after "deadlock state" is in place for a few hours here, normally our DELETE queries take much less time to complete):

Id      User    Host    db      Command Time    State   Info    Progress

24611   system user             NULL    Connect 89203   Waiting for master to send event        NULL    0.000

24612   system user             NULL    Connect 17161   freeing items   NULL    0.000

35399   [user1-namechanged]  [ip-removed]:43814    NULL    Query   15421   Killing slave   stop slave      0.000

35400   [user2-namechanged]       [ip-removed]:50631    NULL    Query   15420   executing       SHOW /*!50002 GLOBAL */ STATUS  0.000

35401   [user2-namechanged]       [ip-removed]:51717    NULL    Sleep   15396           NULL    0.000

35402   [user2-namechanged]       [ip-removed]:51718    NULL    Sleep   15371           NULL    0.000

35403   [user3-namechanged] [ip-removed]:52953    NULL    Query   15347   executing       show global status      0.000

35404   [user2-namechanged]       [ip-removed]:51719    NULL    Sleep   15346           NULL    0.000

35405   [user2-namechanged]       [ip-removed]:51720    NULL    Sleep   15321           NULL    0.000

(full processlist attached)

Output head from running Poor Man's Profiler on mysqld in the "deadlock state":

    855 __lll_lock_wait,_L_lock_1172,pthread_mutex_lock,inline_mysql_mutex_lock,at,at,one_thread_per_connection_end,do_handle_one_connection,handle_one_connection,start_thread,clone,??

      2 __lll_lock_wait,_L_lock_1172,pthread_mutex_lock,inline_mysql_mutex_lock,at,mysql_parse,dispatch_command,do_command,do_handle_one_connection,handle_one_connection,start_thread,clone,??

      1 read,vio_read,vio_read_buff,my_real_read,my_net_read,cli_safe_read,handle_slave_io,start_thread,clone,??

      1 pthread_cond_wait@@GLIBC_2.3.2,os_cond_wait,reset_sig_count=5),srv_purge_thread,start_thread,clone,??

      1 pthread_cond_wait@@GLIBC_2.3.2,os_cond_wait,reset_sig_count=43),srv_master_thread,start_thread,clone,??

      1 pthread_cond_timedwait@@GLIBC_2.3.2,os_cond_wait_timed,time_in_usec=<value,srv_monitor_thread,start_thread,clone,??

      1 pthread_cond_timedwait@@GLIBC_2.3.2,os_cond_wait_timed,time_in_usec=<value,srv_LRU_dump_restore_thread,start_thread,clone,??

      1 pthread_cond_timedwait@@GLIBC_2.3.2,os_cond_wait_timed,time_in_usec=<value,srv_lock_timeout_thread,start_thread,clone,??

      1 pthread_cond_timedwait@@GLIBC_2.3.2,os_cond_wait_timed,time_in_usec=<value,srv_error_monitor_thread,start_thread,clone,??

      1 pthread_cond_timedwait@@GLIBC_2.3.2,inline_mysql_cond_timedwait,term_lock=0x7f7a1cb26d48,,terminate_slave_threads,stop_slave,mysql_execute_command,mysql_parse,dispatch_command,do_command,do_handle_one_connection,handle_one_connection,start_thread,clone,??

      1 pthread_cond_timedwait@@GLIBC_2.3.2,inline_mysql_cond_timedwait,sleep_time=<value,ma_checkpoint_background,start_thread,clone,??

      1 poll,handle_connections_sockets,mysqld_main,__libc_start_main,_start

      1 __lll_lock_wait,_L_lock_1172,pthread_mutex_lock,inline_mysql_mutex_lock,tables=0x7f8e36dd2338,,get_schema_tables_result,JOIN::exec,mysql_select,handle_select,execute_sqlcom_select,execute_show_status,at,mysql_parse,dispatch_command,do_command,do_handle_one_connection,handle_one_connection,start_thread,clone,??

      1 __lll_lock_wait,_L_lock_1172,pthread_mutex_lock,inline_mysql_mutex_lock,tables=0x7f8e36cd5338,,get_schema_tables_result,JOIN::exec,mysql_select,handle_select,execute_sqlcom_select,execute_show_status,at,mysql_parse,dispatch_command,do_command,do_handle_one_connection,handle_one_connection,start_thread,clone,??

(full output attached)

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

hanged_pmp.txt
55 kB
2014-10-01 19:38
hanged_processlist_edited.txt
55 kB
2014-10-02 12:11
master1.my.cnf
1 kB
2014-10-02 14:23
slave1.my.cnf
1 kB
2014-10-02 14:23

Activity

People

Assignee:: Sergei Golubchik

Reporter:: Maciej Zalewski

Votes:: 0 Vote for this issue

Watchers:: 3 Start watching this issue

Dates

Created:: 2014-10-01 19:38

Updated:: 2015-05-04 21:28

Resolved:: 2015-05-04 21:28

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.