Details
-
Bug
-
Status: Open (View Workflow)
-
Major
-
Resolution: Unresolved
-
10.6, 10.7(EOL)
Description
The MTR-like test below reproduces the failure for me quite easily, but it can depend on a build, for example, I had a problem reproducing it on certain ASAN builds. If it happens, any tool providing better concurrency can help.
When it is reproducible, it is also rr-able.
The deadlock obeys innodb_lock_wait_timeout (which is somewhat counter-intuitive, I would expect lock_wait_timeout. It doesn't obey max_statement_time or KILL QUERY, in the sense that the killed query stays in the process list.
Not for the regression suite!
--source include/have_innodb.inc
|
|
CREATE TABLE t1 (pk INT PRIMARY KEY) ENGINE=InnoDB; |
|
--connect (con1,localhost,root,,)
|
|
--let run= 2000
|
while ($run)
|
{
|
--send ALTER TABLE t1 FORCE |
--connection default |
--error 0,ER_CANT_CREATE_TABLE |
--eval CREATE OR REPLACE TABLE t2 (pk INT PRIMARY KEY, FOREIGN KEY (pk) REFERENCES t1 (pk)) ENGINE=InnoDB /* $run left */ |
--connection con1 |
--reap |
--dec $run |
}
|
10.6 58fe6b47 |
Thread 13 (Thread 0x7f9b89eff700 (LWP 168021)):
|
#0 futex_abstimed_wait_cancelable (private=<optimized out>, abstime=0x7f9b89efa320, clockid=<optimized out>, expected=0, futex_word=0x7f9b8b19a5b0) at ../sysdeps/nptl/futex-internal.h:320
|
#1 __pthread_cond_wait_common (abstime=0x7f9b89efa320, clockid=<optimized out>, mutex=0x5621c073a4a8 <lock_sys+232>, cond=0x7f9b8b19a588) at pthread_cond_wait.c:520
|
#2 __pthread_cond_timedwait (cond=0x7f9b8b19a588, mutex=0x5621c073a4a8 <lock_sys+232>, abstime=0x7f9b89efa320) at pthread_cond_wait.c:656
|
#3 0x00005621bf1ef5d0 in safe_cond_timedwait (cond=0x7f9b8b19a588, mp=0x5621c073a480 <lock_sys+192>, abstime=0x7f9b89efa320, file=0x5621bf62a740 "/data/src/10.6/storage/innobase/lock/lock0lock.cc", line=1831) at /data/src/10.6/mysys/thr_mutex.c:546
|
#4 0x00005621bede285e in lock_wait (thr=0x7f9b589dfbf8) at /data/src/10.6/storage/innobase/lock/lock0lock.cc:1831
|
#5 0x00005621beef387a in row_mysql_handle_errors (new_err=0x7f9b89efa574, trx=0x7f9b8b19a540, thr=0x7f9b589dfbf8, savept=0x0) at /data/src/10.6/storage/innobase/row/row0mysql.cc:686
|
#6 0x00005621bede9cbf in lock_table_for_trx (table=0x7f9b6421e168, trx=0x7f9b8b19a540, mode=LOCK_X) at /data/src/10.6/storage/innobase/lock/lock0lock.cc:3668
|
#7 0x00005621bed8783e in ha_innobase::commit_inplace_alter_table (this=0x7f9b589be810, altered_table=0x7f9b89efb270, ha_alter_info=0x7f9b89efb1b0, commit=true) at /data/src/10.6/storage/innobase/handler/handler0alter.cc:10864
|
#8 0x00005621be903fb0 in handler::ha_commit_inplace_alter_table (this=0x7f9b589be810, altered_table=0x7f9b89efb270, ha_alter_info=0x7f9b89efb1b0, commit=true) at /data/src/10.6/sql/handler.cc:5194
|
#9 0x00005621be65fe94 in mysql_inplace_alter_table (thd=0x7f9b58000db8, table_list=0x7f9b58014230, table=0x7f9b589d8cd8, altered_table=0x7f9b89efb270, ha_alter_info=0x7f9b89efb1b0, target_mdl_request=0x7f9b89efba70, ddl_log_state=0x7f9b89efb150, trigger_param=0x7f9b89efb640, alter_ctx=0x7f9b89efc5d0) at /data/src/10.6/sql/sql_table.cc:7436
|
#10 0x00005621be668674 in mysql_alter_table (thd=0x7f9b58000db8, new_db=0x7f9b580059b8, new_name=0x7f9b58005dd0, create_info=0x7f9b89efd3e0, table_list=0x7f9b58014230, alter_info=0x7f9b89efd2f0, order_num=0, order=0x0, ignore=false, if_exists=false) at /data/src/10.6/sql/sql_table.cc:10204
|
#11 0x00005621be71e35f in Sql_cmd_alter_table::execute (this=0x7f9b58014918, thd=0x7f9b58000db8) at /data/src/10.6/sql/sql_alter.cc:550
|
#12 0x00005621be563b07 in mysql_execute_command (thd=0x7f9b58000db8, is_called_from_prepared_stmt=false) at /data/src/10.6/sql/sql_parse.cc:5997
|
#13 0x00005621be569cb3 in mysql_parse (thd=0x7f9b58000db8, rawbuf=0x7f9b58014160 "ALTER TABLE t1 FORCE", length=20, parser_state=0x7f9b89efe480) at /data/src/10.6/sql/sql_parse.cc:8030
|
#14 0x00005621be5560e3 in dispatch_command (command=COM_QUERY, thd=0x7f9b58000db8, packet=0x7f9b5800b879 "ALTER TABLE t1 FORCE", packet_length=20, blocking=true) at /data/src/10.6/sql/sql_parse.cc:1896
|
#15 0x00005621be554a7f in do_command (thd=0x7f9b58000db8, blocking=true) at /data/src/10.6/sql/sql_parse.cc:1404
|
#16 0x00005621be713123 in do_handle_one_connection (connect=0x5621c1b77408, put_in_cache=true) at /data/src/10.6/sql/sql_connect.cc:1418
|
#17 0x00005621be712db3 in handle_one_connection (arg=0x5621c1b77408) at /data/src/10.6/sql/sql_connect.cc:1312
|
#18 0x00005621bec4404d in pfs_spawn_thread (arg=0x5621c1baaba8) at /data/src/10.6/storage/perfschema/pfs.cc:2201
|
#19 0x00007f9b9523c609 in start_thread (arg=<optimized out>) at pthread_create.c:477
|
#20 0x00007f9b94e0f293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
|
|
Thread 12 (Thread 0x7f9b89f4a700 (LWP 168018)):
|
#0 syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
|
#1 0x00005621bef6b07a in ssux_lock_impl<false>::wait (this=0x5621bfce8188 <dict_sys+72>, lk=2147483649) at /data/src/10.6/storage/innobase/sync/srw_lock.cc:354
|
#2 0x00005621bef6aa31 in ssux_lock_impl<false>::wr_wait (this=0x5621bfce8188 <dict_sys+72>, lk=2147483649) at /data/src/10.6/storage/innobase/sync/srw_lock.cc:509
|
#3 0x00005621bed5902e in ssux_lock_impl<false>::wr_lock (this=0x5621bfce8188 <dict_sys+72>) at /data/src/10.6/storage/innobase/include/srw_lock.h:232
|
#4 0x00005621bef6ac5d in srw_lock_impl<false>::psi_wr_lock (this=0x5621bfce8180 <dict_sys+64>, file=0x5621bf5ff328 "/data/src/10.6/storage/innobase/handler/ha_innodb.cc", line=13549) at /data/src/10.6/storage/innobase/sync/srw_lock.cc:598
|
#5 0x00005621bee3c974 in srw_lock_impl<false>::wr_lock (this=0x5621bfce8180 <dict_sys+64>, file=0x5621bf5ff328 "/data/src/10.6/storage/innobase/handler/ha_innodb.cc", line=13549) at /data/src/10.6/storage/innobase/include/srw_lock.h:494
|
#6 0x00005621bf066c87 in dict_sys_t::lock_wait (this=0x5621bfce8140 <dict_sys>, file=0x5621bf5ff328 "/data/src/10.6/storage/innobase/handler/ha_innodb.cc", line=13549) at /data/src/10.6/storage/innobase/dict/dict0dict.cc:961
|
#7 0x00005621bed53079 in dict_sys_t::lock (this=0x5621bfce8140 <dict_sys>, file=0x5621bf5ff328 "/data/src/10.6/storage/innobase/handler/ha_innodb.cc", line=13549) at /data/src/10.6/storage/innobase/include/dict0dict.h:1571
|
#8 0x00005621bed3dfd9 in ha_innobase::delete_table (this=0x7f9b640168f8, name=0x7f9b89f47820 "./test/t2") at /data/src/10.6/storage/innobase/handler/ha_innodb.cc:13549
|
#9 0x00005621be8f5712 in hton_drop_table (hton=0x5621c15ce578, path=0x7f9b89f47820 "./test/t2") at /data/src/10.6/sql/handler.cc:576
|
#10 0x00005621be8fbb0b in ha_delete_table (thd=0x7f9b64000db8, hton=0x5621c15ce578, path=0x7f9b89f47820 "./test/t2", db=0x7f9b89f475f0, alias=0x7f9b89f47600, generate_warning=false) at /data/src/10.6/sql/handler.cc:3100
|
#11 0x00005621be64dd07 in mysql_rm_table_no_locks (thd=0x7f9b64000db8, tables=0x7f9b89f47d70, current_db=0x7f9b64000e50, ddl_log_state=0x7f9b89f48810, if_exists=false, drop_temporary=false, drop_view=false, drop_sequence=false, dont_log_query=true, dont_free_locks=true) at /data/src/10.6/sql/sql_table.cc:1469
|
#12 0x00005621be656cea in create_table_impl (thd=0x7f9b64000db8, ddl_log_state_create=0x7f9b89f487f0, ddl_log_state_rm=0x7f9b89f48810, orig_db=..., orig_table_name=..., db=..., table_name=..., path=..., options=..., create_info=0x7f9b89f48ac0, alter_info=0x7f9b89f489d0, create_table_mode=0, is_trans=0x7f9b89f487ce, key_info=0x7f9b89f48530, key_count=0x7f9b89f48524, frm=0x7f9b89f48550) at /data/src/10.6/sql/sql_table.cc:4307
|
#13 0x00005621be65788f in mysql_create_table_no_lock (thd=0x7f9b64000db8, ddl_log_state_create=0x7f9b89f487f0, ddl_log_state_rm=0x7f9b89f48810, db=0x7f9b64015780, table_name=0x7f9b64015790, create_info=0x7f9b89f48ac0, alter_info=0x7f9b89f489d0, is_trans=0x7f9b89f487ce, create_table_mode=0, table_list=0x7f9b64015768) at /data/src/10.6/sql/sql_table.cc:4545
|
#14 0x00005621be657e20 in mysql_create_table (thd=0x7f9b64000db8, create_table=0x7f9b64015768, create_info=0x7f9b89f48ac0, alter_info=0x7f9b89f489d0) at /data/src/10.6/sql/sql_table.cc:4657
|
#15 0x00005621be66daa7 in Sql_cmd_create_table_like::execute (this=0x7f9b64015708, thd=0x7f9b64000db8) at /data/src/10.6/sql/sql_table.cc:11777
|
#16 0x00005621be563b07 in mysql_execute_command (thd=0x7f9b64000db8, is_called_from_prepared_stmt=false) at /data/src/10.6/sql/sql_parse.cc:5997
|
#17 0x00005621be569cb3 in mysql_parse (thd=0x7f9b64000db8, rawbuf=0x7f9b640155c0 "CREATE OR REPLACE TABLE t2 (pk INT PRIMARY KEY, FOREIGN KEY (pk) REFERENCES t1 (pk)) ENGINE=InnoDB /* 1948 left */", length=114, parser_state=0x7f9b89f49480) at /data/src/10.6/sql/sql_parse.cc:8030
|
#18 0x00005621be5560e3 in dispatch_command (command=COM_QUERY, thd=0x7f9b64000db8, packet=0x7f9b6400b879 "CREATE OR REPLACE TABLE t2 (pk INT PRIMARY KEY, FOREIGN KEY (pk) REFERENCES t1 (pk)) ENGINE=InnoDB /* 1948 left */", packet_length=114, blocking=true) at /data/src/10.6/sql/sql_parse.cc:1896
|
#19 0x00005621be554a7f in do_command (thd=0x7f9b64000db8, blocking=true) at /data/src/10.6/sql/sql_parse.cc:1404
|
#20 0x00005621be713123 in do_handle_one_connection (connect=0x5621c1b77328, put_in_cache=true) at /data/src/10.6/sql/sql_connect.cc:1418
|
#21 0x00005621be712db3 in handle_one_connection (arg=0x5621c1b29738) at /data/src/10.6/sql/sql_connect.cc:1312
|
#22 0x00005621bec4404d in pfs_spawn_thread (arg=0x5621c1ba9378) at /data/src/10.6/storage/perfschema/pfs.cc:2201
|
#23 0x00007f9b9523c609 in start_thread (arg=<optimized out>) at pthread_create.c:477
|
#24 0x00007f9b94e0f293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
|
All threads are attached.
The failure started happening on 10.6 after this commit:
commit c3c53926c467c95386ae98d61ada87294bd61478
|
Author: Marko Mäkelä
|
Date: Mon Oct 18 18:03:12 2021 +0300
|
|
MDEV-26554: Races between INSERT on child and DDL on parent table
|
Attachments
Issue Links
- is caused by
-
MDEV-26554 Table-rebuilding DDL on parent table causes crash for INSERT into child table
-
- Closed
-
- relates to
-
MDEV-27276 CREATE TABLE with FOREIGN KEY constraint fails to apply in parallel
-
- Closed
-
When I run the test with
./mtr --mysqld=--innodb-lock-wait-timeout=2 innodb.dl
it will fail more quickly as follows:
10.6 dbd6c6dc01228fe6e63f3f7dc695eb56ca8cd28d
CREATE OR REPLACE TABLE t2 (pk INT PRIMARY KEY, FOREIGN KEY (pk) REFERENCES t1 (pk)) ENGINE=InnoDB /* 1983 left */;
connection con1;
innodb.dl 'innodb' [ fail ]
Test ended at 2021-10-29 11:06:00
CURRENT_TEST: innodb.dl
mysqltest: At line 17: query 'reap' failed: ER_LOCK_WAIT_TIMEOUT (1205): Lock wait timeout exceeded; try restarting transaction
This is caused by
MDEV-26554, where I implemented an InnoDB work-around of a deficiency of the SQL layer: DDL operations fail to acquire exclusive metadata locks (MDL) on all referencing (child).I was forced to do that, because
MDEV-26554was a huge pain in our internal stress testing, with enormous consequences (data dictionary corruption), and it looked like implementing a proper fix (acquiring the necessary MDL on DDL) would not be available soon. Furthermore, an alternative of revertingMDEV-21175would not only have caused a performance regression but also forced us to revert or rewrite many other changes, such asMDEV-24258andMDEV-25919.The work-around is: InnoDB DDL operations will acquire exclusive table locks on the child tables. As you may know, MDL and InnoDB locks are not part of the same deadlock detection. So, no deadlock will be detected or reported, and an InnoDB lock wait timeout will be reported instead (unless the timeout is infinite: innodb_lock_wait_timeout=100000000).
I think that this deadlock on ALTER TABLE t1 FORCE can only be avoided by acquiring the necessary exclusive MDL in the SQL layer and reverting the InnoDB work-around. That fix will have to be validated with the mtr stress test case that I posted to
MDEV-26554(executing TRUNCATE TABLE and two forms of ALTER TABLE concurrently with INSERT into a child table).