Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
10.2.2, 10.3.0, 10.4.0, 10.2.27, 10.5.0, 10.2.32
Description
The function thd_query_safe() is used in the implementation of the following INFORMATION_SCHEMA views:
- information_schema.innodb_trx
- information_schema.innodb_locks
- information_schema.innodb_lock_waits
- information_schema.rocksdb_trx
The implementation of the InnoDB views is in trx_i_s_common_fill_table(). This function invokes trx_i_s_possibly_fetch_data_into_cache(), which will acquire lock_sys->mutex and trx_sys->mutex in order to protect the set of active transactions and explicit locks. While holding those mutexes, it will traverse the collection of InnoDB transactions. For each transaction, thd_query_safe() will be invoked. That function is acquiring THD::LOCK_thd_data while holding the InnoDB locks. This is violates the latching order and causes a deadlock with THD::awake() (such as executing KILL QUERY):
mariadb-10.2.27 |
Thread 34 (Thread 0x7fba01f27700 (LWP 15746)):
|
#0 __lll_lock_wait (futex=futex@entry=0x7fb9b00024e8, private=0) at lowlevellock.c:52
|
#1 0x00007fbae78f3a25 in __GI___pthread_mutex_lock (mutex=0x7fb9b00024e8) at ../nptl/pthread_mutex_lock.c:135
|
#2 0x000055efc101b5c8 in inline_mysql_mutex_lock (that=0x7fb9b00024e8, src_file=<optimized out>, src_line=4536) at /mariadb/10.3o/include/mysql/psi/mysql_thread.h:683
|
#3 thd_query_safe (thd=0x7fb9b0000c48, buf=0x7fba01f24480 "\350\026\001\240\271\177", buflen=1025) at /mariadb/10.3o/sql/sql_class.cc:4536
|
#4 0x000055efc14625db in fill_trx_row (row=0x7fb9a00248e0, trx=0x7fbae44c4090, requested_lock_row=0x0, cache=0x55efc1d89988 <trx_i_s_cache_static>) at /mariadb/10.3o/storage/innobase/trx/trx0i_s.cc:488
|
#5 fetch_data_into_cache_low (cache=0x55efc1d89988 <trx_i_s_cache_static>, read_write=<optimized out>, trx_list=0x55efc712e460) at /mariadb/10.3o/storage/innobase/trx/trx0i_s.cc:1287
|
#6 0x000055efc1461903 in fetch_data_into_cache (cache=0x55efc1d89988 <trx_i_s_cache_static>) at /mariadb/10.3o/storage/innobase/trx/trx0i_s.cc:1313
|
#7 trx_i_s_possibly_fetch_data_into_cache (cache=0x55efc1d89988 <trx_i_s_cache_static>) at /mariadb/10.3o/storage/innobase/trx/trx0i_s.cc:1341
|
#8 0x000055efc1389a41 in trx_i_s_common_fill_table (thd=0x7fb9a0000c48, tables=0x7fb9a000f500) at /mariadb/10.3o/storage/innobase/handler/i_s.cc:1225
|
#9 0x000055efc10b050e in get_schema_tables_result (join=0x7fb9a00116e8, executed_place=PROCESSED_BY_JOIN_EXEC) at /mariadb/10.3o/sql/sql_show.cc:8431
|
#10 0x000055efc107f835 in JOIN::exec_inner (this=0x7fb9a00116e8) at /mariadb/10.3o/sql/sql_select.cc:3591
|
#11 0x000055efc106ad98 in JOIN::exec (this=0x7fb9a00116e8) at /mariadb/10.3o/sql/sql_select.cc:3422
|
#12 mysql_select (thd=0x7fb9a0000c48, tables=<optimized out>, wild_num=<optimized out>, fields=..., conds=<optimized out>, og_num=<optimized out>, order=<optimized out>, group=<optimized out>, having=<optimized out>, proc_param=<optimized out>, select_options=<optimized out>, result=<optimized out>, unit=<optimized out>, select_lex=<optimized out>) at /mariadb/10.3o/sql/sql_select.cc:3822
|
#13 0x000055efc106aa2f in handle_select (thd=0x7fb9a0000c48, lex=0x7fb9a00045c0, result=0x7fb9a00116c8, setup_tables_done_option=<optimized out>) at /mariadb/10.3o/sql/sql_select.cc:365
|
#14 0x000055efc104c2b6 in execute_sqlcom_select (thd=0x7fb9a0000c48, all_tables=0x7fb9a000f500) at /mariadb/10.3o/sql/sql_parse.cc:6226
|
#15 0x000055efc10454ae in mysql_execute_command (thd=<optimized out>) at /mariadb/10.3o/sql/sql_parse.cc:3533
|
#16 0x000055efc1042ee0 in mysql_parse (thd=0x7fb9a0000c48, rawbuf=0x7fb9a000f2c0 "SELECT * FROM INFORMATION_SCHEMA.innodb_trx", length=<optimized out>, parser_state=0x7fba01f26340, is_com_multi=<optimized out>, is_next_command=false) at /mariadb/10.3o/sql/sql_parse.cc:7760
|
#17 0x000055efc1040911 in dispatch_command (command=<optimized out>, thd=0x7fb9a0000c48, packet=0x7fb9a0006fa9 "SELECT * FROM INFORMATION_SCHEMA.innodb_trx", packet_length=2684416704, is_com_multi=false, is_next_command=false) at /mariadb/10.3o/sql/sql_parse.cc:1831
|
…
|
|
Thread 33 (Thread 0x7fba01f72700 (LWP 15570)):
|
…
|
#8 0x000055efc13af778 in PolicyMutex<TTASEventMutex<GenericPolicy> >::enter (this=<optimized out>, n_spins=30, n_delay=6, name=<optimized out>, line=6580) at /mariadb/10.3o/storage/innobase/include/ib0mutex.h:637
|
#9 lock_trx_handle_wait (trx=0x7fbae44c4090) at /mariadb/10.3o/storage/innobase/lock/lock0lock.cc:6580
|
#10 0x000055efc11d7dd4 in kill_handlerton (thd=0x7fb9b0000c48, plugin=<optimized out>, level=0x7fba01f702b4) at /mariadb/10.3o/sql/handler.cc:843
|
#11 0x000055efc10555af in plugin_foreach_with_mask (thd=0x7fb9b0000c48, func=0x55efc11d7da0 <kill_handlerton(THD*, st_plugin_int*, void*)>, type=<optimized out>, state_mask=8, arg=0x7fba01f702b4) at /mariadb/10.3o/sql/sql_plugin.cc:2402
|
#12 0x000055efc11d7d90 in ha_kill_query (thd=0x55efc712bee4, level=THD_ABORT_ASAP) at /mariadb/10.3o/sql/handler.cc:850
|
#13 0x000055efc1011b4d in THD::awake (this=0x7fb9b0000c48, state_to_set=KILL_QUERY_HARD) at /mariadb/10.3o/sql/sql_class.cc:1723
|
#14 0x000055efc104fbf2 in kill_one_thread (thd=0x7fb99c000c48, id=<optimized out>, kill_signal=KILL_QUERY_HARD, type=KILL_TYPE_ID) at /mariadb/10.3o/sql/sql_parse.cc:8916
|
#15 0x000055efc1049a98 in sql_kill (thd=0x7fb99c000c48, id=8, state=NOT_KILLED, type=(KILL_TYPE_USER | unknown: 3884938724)) at /mariadb/10.3o/sql/sql_parse.cc:9022
|
#16 mysql_execute_command (thd=<optimized out>) at /mariadb/10.3o/sql/sql_parse.cc:5220
|
#17 0x000055efc1042ee0 in mysql_parse (thd=0x7fb99c000c48, rawbuf=0x7fb99c00f2c0 "kill query 8", length=<optimized out>, parser_state=0x7fba01f71340, is_com_multi=<optimized out>, is_next_command=false) at /mariadb/10.3o/sql/sql_parse.cc:7760
|
…
|
Thread 34 (fetch_data_into_cache_low() for information_schema.innodb_trx) is holding lock_sys->mutex, trx_sys->mutex and trying to acquire THD::LOCK_thd_data in thd_query_safe().
Thread 33 (THD::awake() for KILL QUERY) is holding THD::LOCK_thd_data on the killee connection (which Thread 34 is waiting for) while invoking lock_trx_handle_wait(), which is stuck, trying to acquire lock_sys->mutex (which Thread 34 is holding).
I can reproduce the hang with 4 client connections:
- A session that will be repeatedly killed
- A loop executing SHOW PROCESSLIST;
- A loop of KILL QUERY of Connection 1
- SELECT * FROM information_schema.…
I do not know why, but I was unable to reproduce the hang without executing SHOW PROCESSLIST.
It is unclear to me how information_schema.rocksdb_trx is protected against concurrent creation or destruction of transactions. I did not notice any mutex acquisition there. Thus, the server hang may be limited to the 3 information_schema.innodb_ views only.
Attachments
Issue Links
- relates to
-
MDEV-13983 Mariadb becomes unresponsive
- Closed