[MDEV-10245] Two concurrent OPTIMIZE on InnoDB table can cause a long timed wait Created: 2016-06-16  Updated: 2022-10-03  Resolved: 2022-10-03

Status: Closed
Project: MariaDB Server
Component/s: Admin statements, Storage Engine - InnoDB
Affects Version/s: 10.0, 10.1, 10.2
Fix Version/s: 10.2.3

Type: Bug Priority: Minor
Reporter: Elena Stepanova Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-7660 MySQL WL#6671 "Improve scalability by... Closed
relates to MDEV-13379 subsequent OPTIMIZEs on the same tabl... Closed

 Description   

When two threads execute OPTIMIZE on an InnoDB table simultaneously, they can easily lock each other. Technically it is not a deadlock, it times out upon lock_wait_timeout, but given that the default value for this variable is 1 year, it is as good as a genuine deadlock.

Reproducible on 10.0, 10.1, 10.2, both with XtraDB and InnoDB.
Not reproducible on 5.5, MySQL 5.6/5.7.

RQG grammar mdev10245.yy

query_init:
    CREATE TABLE IF NOT EXISTS `t1` (
      `entity_type` VARCHAR(32) NOT NULL DEFAULT '',
      `entity_id` INT(10) UNSIGNED NOT NULL DEFAULT '0',
      `revision_id` INT(10) UNSIGNED NOT NULL DEFAULT '0',
      `language` VARCHAR(32) NOT NULL DEFAULT '',
      `data` LONGBLOB NOT NULL,
      PRIMARY KEY (`entity_type`,`entity_id`,`revision_id`,`language`),
      KEY `type_revision` (`entity_type`,`revision_id`),
      KEY `myind1` (`entity_type`,`language`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8
;
 
query:
    OPTIMIZE TABLE `t1`;

RQG command line

perl ./runall-new.pl --grammar=mdev10245.yy --threads=2 --queries=100M --skip-gendata --duration=300 --basedir=<your basedir> --vardir=<your vardir>

Thread 3 (Thread 0x7f4cc785db00 (LWP 30394)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007f4cc7209a1e in safe_cond_timedwait (cond=0x7f4ca075f218, mp=0x7f4ca075f168, abstime=0x7f4cc7859350, file=0x7f4cc72b9700 "/data/src/10.1/include/mysql/psi/mysql_thread.h", line=1202) at /data/src/10.1/mysys/thr_mutex.c:547
#2  0x00007f4cc6aadd8d in inline_mysql_cond_timedwait (that=0x7f4ca075f218, mutex=0x7f4ca075f168, abstime=0x7f4cc7859350, src_file=0x7f4cc72b995a "/data/src/10.1/sql/mdl.cc", src_line=1083) at /data/src/10.1/include/mysql/psi/mysql_thread.h:1202
#3  0x00007f4cc6aaf073 in MDL_wait::timed_wait (this=0x7f4ca075f168, owner=0x7f4ca075f120, abs_timeout=0x7f4cc7859350, set_status_on_timeout=false, wait_state_name=0x7f4cc7a1fff0 <MDL_key::m_namespace_to_wait_state_name+48>) at /data/src/10.1/sql/mdl.cc:1083
#4  0x00007f4cc6ab0a07 in MDL_context::acquire_lock (this=0x7f4ca075f168, mdl_request=0x7f4cc7859410, lock_wait_timeout=31536000) at /data/src/10.1/sql/mdl.cc:2057
#5  0x00007f4cc6ab1138 in MDL_context::upgrade_shared_lock (this=0x7f4ca075f168, mdl_ticket=0x7f4c9713f080, new_type=MDL_EXCLUSIVE, lock_wait_timeout=31536000) at /data/src/10.1/sql/mdl.cc:2250
#6  0x00007f4cc690b559 in wait_while_table_is_used (thd=0x7f4ca075f070, table=0x7f4c9774da70, function=HA_EXTRA_PREPARE_FOR_RENAME) at /data/src/10.1/sql/sql_base.cc:1877
#7  0x00007f4cc6a35d27 in mysql_inplace_alter_table (thd=0x7f4ca075f070, table_list=0x7f4c9704f198, table=0x7f4c9774da70, altered_table=0x7f4c9716f070, ha_alter_info=0x7f4cc7859830, inplace_supported=HA_ALTER_INPLACE_NO_LOCK_AFTER_PREPARE, target_mdl_request=0x7f4cc78598a0, alter_ctx=0x7f4cc785a440) at /data/src/10.1/sql/sql_table.cc:7130
#8  0x00007f4cc6a3a556 in mysql_alter_table (thd=0x7f4ca075f070, new_db=0x0, new_name=0x0, create_info=0x7f4cc785b020, table_list=0x7f4c9704f198, alter_info=0x7f4cc785af90, order_num=0, order=0x0, ignore=false) at /data/src/10.1/sql/sql_table.cc:8935
#9  0x00007f4cc6a3ca4f in mysql_recreate_table (thd=0x7f4ca075f070, table_list=0x7f4c9704f198, table_copy=false) at /data/src/10.1/sql/sql_table.cc:9684
#10 0x00007f4cc6ab41db in admin_recreate_table (thd=0x7f4ca075f070, table_list=0x7f4c9704f198) at /data/src/10.1/sql/sql_admin.cc:58
#11 0x00007f4cc6ab70ea in mysql_admin_table(THD *, TABLE_LIST *, HA_CHECK_OPT *, const char *, thr_lock_type, bool, bool, uint, int (*)(THD *, TABLE_LIST *, HA_CHECK_OPT *), struct {...}, int (*)(THD *, TABLE_LIST *, HA_CHECK_OPT *)) (thd=0x7f4ca075f070, tables=0x7f4c9704f198, check_opt=0x7f4ca07638a0, operator_name=0x7f4cc72bb448 "optimize", lock_type=TL_WRITE, open_for_modify=true, repair_table_use_frm=false, extra_open_options=0, prepare_func=0x0, operator_func=(int (handler::*)(handler * const, THD *, HA_CHECK_OPT *)) 0x7f4cc6bb3da4 <handler::ha_optimize(THD*, st_ha_check_opt*)>, view_operator_func=0x0) at /data/src/10.1/sql/sql_admin.cc:954
#12 0x00007f4cc6ab832e in Sql_cmd_optimize_table::execute (this=0x7f4c9704f790, thd=0x7f4ca075f070) at /data/src/10.1/sql/sql_admin.cc:1296
#13 0x00007f4cc697d7ad in mysql_execute_command (thd=0x7f4ca075f070) at /data/src/10.1/sql/sql_parse.cc:5669
#14 0x00007f4cc6981aba in mysql_parse (thd=0x7f4ca075f070, rawbuf=0x7f4c9704f088 "OPTIMIZE TABLE `t1` /* QUERY_NO 3 CON_ID 6 */", length=45, parser_state=0x7f4cc785c630) at /data/src/10.1/sql/sql_parse.cc:7314
#15 0x00007f4cc69704e4 in dispatch_command (command=COM_QUERY, thd=0x7f4ca075f070, packet=0x7f4ca0765071 "", packet_length=46) at /data/src/10.1/sql/sql_parse.cc:1486
#16 0x00007f4cc696f21b in do_command (thd=0x7f4ca075f070) at /data/src/10.1/sql/sql_parse.cc:1107
#17 0x00007f4cc6aa4e45 in do_handle_one_connection (thd_arg=0x7f4ca075f070) at /data/src/10.1/sql/sql_connect.cc:1350
#18 0x00007f4cc6aa4ba9 in handle_one_connection (arg=0x7f4ca075f070) at /data/src/10.1/sql/sql_connect.cc:1262
#19 0x00007f4cc60700a4 in start_thread (arg=0x7f4cc785db00) at pthread_create.c:309
#20 0x00007f4cc422887d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
 
Thread 2 (Thread 0x7f4cc7813b00 (LWP 30396)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007f4cc7209a1e in safe_cond_timedwait (cond=0x7f4c9780e188, mp=0x7f4c9785d108, abstime=0x7f4cc7810ef0, file=0x7f4cc746f370 "/data/src/10.1/include/mysql/psi/mysql_thread.h", line=1202) at /data/src/10.1/mysys/thr_mutex.c:547
#2  0x00007f4cc72060a3 in inline_mysql_cond_timedwait (that=0x7f4c9780e188, mutex=0x7f4c9785d108, abstime=0x7f4cc7810ef0, src_file=0x7f4cc746f3a0 "/data/src/10.1/mysys/thr_lock.c", src_line=586) at /data/src/10.1/include/mysql/psi/mysql_thread.h:1202
#3  0x00007f4cc7206719 in wait_for_lock (wait=0x7f4c9785d1d8, data=0x7f4c97183cc0, in_wait_list=0 '\000', lock_wait_timeout=31536000) at /data/src/10.1/mysys/thr_lock.c:586
#4  0x00007f4cc720726d in thr_lock (data=0x7f4c97183cc0, owner=0x7f4ca07818a8, lock_wait_timeout=31536000) at /data/src/10.1/mysys/thr_lock.c:1023
#5  0x00007f4cc7207b2e in thr_multi_lock (data=0x7f4c97824d20, count=1, owner=0x7f4ca07818a8, lock_wait_timeout=31536000) at /data/src/10.1/mysys/thr_lock.c:1302
#6  0x00007f4cc6c979db in mysql_lock_tables (thd=0x7f4ca0780070, sql_lock=0x7f4c97824d00, flags=0) at /data/src/10.1/sql/lock.cc:323
#7  0x00007f4cc6c9780e in mysql_lock_tables (thd=0x7f4ca0780070, tables=0x7f4c97822268, count=1, flags=0) at /data/src/10.1/sql/lock.cc:275
#8  0x00007f4cc691121f in open_ltable (thd=0x7f4ca0780070, table_list=0x7f4c97822198, lock_type=TL_WRITE, lock_flags=0) at /data/src/10.1/sql/sql_base.cc:5198
#9  0x00007f4cc6ab7215 in mysql_admin_table(THD *, TABLE_LIST *, HA_CHECK_OPT *, const char *, thr_lock_type, bool, bool, uint, int (*)(THD *, TABLE_LIST *, HA_CHECK_OPT *), struct {...}, int (*)(THD *, TABLE_LIST *, HA_CHECK_OPT *)) (thd=0x7f4ca0780070, tables=0x7f4c97822198, check_opt=0x7f4ca07848a0, operator_name=0x7f4cc72bb448 "optimize", lock_type=TL_WRITE, open_for_modify=true, repair_table_use_frm=false, extra_open_options=0, prepare_func=0x0, operator_func=(int (handler::*)(handler * const, THD *, HA_CHECK_OPT *)) 0x7f4cc6bb3da4 <handler::ha_optimize(THD*, st_ha_check_opt*)>, view_operator_func=0x0) at /data/src/10.1/sql/sql_admin.cc:970
#10 0x00007f4cc6ab832e in Sql_cmd_optimize_table::execute (this=0x7f4c97822790, thd=0x7f4ca0780070) at /data/src/10.1/sql/sql_admin.cc:1296
#11 0x00007f4cc697d7ad in mysql_execute_command (thd=0x7f4ca0780070) at /data/src/10.1/sql/sql_parse.cc:5669
#12 0x00007f4cc6981aba in mysql_parse (thd=0x7f4ca0780070, rawbuf=0x7f4c97822088 "OPTIMIZE TABLE `t1` /* QUERY_NO 3 CON_ID 7 */", length=45, parser_state=0x7f4cc7812630) at /data/src/10.1/sql/sql_parse.cc:7314
#13 0x00007f4cc69704e4 in dispatch_command (command=COM_QUERY, thd=0x7f4ca0780070, packet=0x7f4ca0786071 "", packet_length=46) at /data/src/10.1/sql/sql_parse.cc:1486
#14 0x00007f4cc696f21b in do_command (thd=0x7f4ca0780070) at /data/src/10.1/sql/sql_parse.cc:1107
#15 0x00007f4cc6aa4e45 in do_handle_one_connection (thd_arg=0x7f4ca0780070) at /data/src/10.1/sql/sql_connect.cc:1350
#16 0x00007f4cc6aa4ba9 in handle_one_connection (arg=0x7f4ca0780070) at /data/src/10.1/sql/sql_connect.cc:1262
#17 0x00007f4cc60700a4 in start_thread (arg=0x7f4cc7813b00) at pthread_create.c:309
#18 0x00007f4cc422887d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Processlist

+----+------+-----------------+-------+---------+------+---------------------------------+-----------------------------------------------+----------+
| Id | User | Host            | db    | Command | Time | State                           | Info                                          | Progress |
+----+------+-----------------+-------+---------+------+---------------------------------+-----------------------------------------------+----------+
|  2 | rqg  | localhost:32805 | mysql | Sleep   |   98 |                                 | NULL                                          |    0.000 |
|  6 | rqg  | localhost:32811 | test  | Query   |   97 | Waiting for table metadata lock | OPTIMIZE TABLE `t1` /* QUERY_NO 3 CON_ID 6 */ |    0.000 |
|  7 | rqg  | localhost:32812 | test  | Query   |   98 | Waiting for table level lock    | OPTIMIZE TABLE `t1` /* QUERY_NO 3 CON_ID 7 */ |    0.000 |
|  8 | root | localhost:44261 | NULL  | Query   |    0 | init                            | show processlist                              |    0.000 |
+----+------+-----------------+-------+---------+------+---------------------------------+-----------------------------------------------+----------+



 Comments   
Comment by Elena Stepanova [ 2016-06-17 ]

Also happens with OPTIMIZE + ALTER;

Comment by Marko Mäkelä [ 2020-12-17 ]

I think that the problem must reside outside InnoDB, possibly in the MDL code, which I understand is now maintained by wlad.

Comment by Elena Stepanova [ 2022-10-03 ]

The failure stopped happening on 10.2 after this commit in 10.2.3:

commit d665e79c5b8582f44dc280e5e6df4a8ff4945623
Author: Sergey Vojtovich
Date:   Fri May 6 13:44:07 2016 +0400
 
    MDEV-7660 - MySQL WL#6671 "Improve scalability by not using thr_lock.c locks
                for InnoDB tables"

Generated at Thu Feb 08 07:40:45 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.