Details
-
Bug
-
Status: Confirmed (View Workflow)
-
Major
-
Resolution: Unresolved
-
10.6.18, 10.6, 10.11, 11.0(EOL), 11.1(EOL), 11.2(EOL), 11.4, 11.5(EOL), 11.6(EOL), 11.7(EOL), 11.8
Description
When a group of tables have circular foreign key relationships, dropping those tables concurrently (with foreign_key_checks=OFF) can result in the DROP TABLE statements becoming blocked indefinitely. Both lock_wait_timeout and innodb_lock_wait_timeout get ignored in this state. Even using DROP TABLE ... NOWAIT makes no difference, the NOWAIT is effectively ignored and the DROP TABLE still blocks.
I can only repro this in MariaDB 10.6.18+ / May 2024 releases onwards, so I assume it is caused by MDEV-32899. Arguably this is a duplicate of MDEV-34253 but the circumstances are a bit different here, so I thought it might warrant a separate report.
The repro is inconsistent / non-deterministic, but running the pathological case below repeatedly in a loop seems to cause it quickly.
Setup:
SET foreign_key_checks = OFF; |
CREATE TABLE tbl0 ( |
id0 int unsigned not null, |
id1 int unsigned not null, |
id2 int unsigned not null, |
primary key (id0), |
foreign key fk01 (id1) references tbl1 (id1), |
foreign key fk02 (id2) references tbl2 (id2) |
) ENGINE=InnoDB;
|
CREATE TABLE tbl1 ( |
id1 int unsigned not null, |
id2 int unsigned not null, |
id3 int unsigned not null, |
primary key (id1), |
foreign key fk12 (id2) references tbl2 (id2), |
foreign key fk13 (id3) references tbl3 (id3) |
) ENGINE=InnoDB;
|
CREATE TABLE tbl2 ( |
id2 int unsigned not null, |
id3 int unsigned not null, |
id4 int unsigned not null, |
primary key (id2), |
foreign key fk23 (id3) references tbl3 (id3), |
foreign key fk24 (id4) references tbl4 (id4) |
) ENGINE=InnoDB;
|
CREATE TABLE tbl3 ( |
id3 int unsigned not null, |
id4 int unsigned not null, |
id5 int unsigned not null, |
primary key (id3), |
foreign key fk34 (id4) references tbl4 (id4), |
foreign key fk35 (id5) references tbl5 (id5) |
) ENGINE=InnoDB;
|
CREATE TABLE tbl4 ( |
id4 int unsigned not null, |
id5 int unsigned not null, |
id0 int unsigned not null, |
primary key (id4), |
foreign key fk45 (id5) references tbl5 (id5), |
foreign key fk40 (id0) references tbl0 (id0) |
) ENGINE=InnoDB;
|
CREATE TABLE tbl5 ( |
id5 int unsigned not null, |
id0 int unsigned not null, |
id1 int unsigned not null, |
primary key (id5), |
foreign key fk50 (id0) references tbl0 (id0), |
foreign key fk51 (id1) references tbl1 (id1) |
) ENGINE=InnoDB;
|
Then using 6 separate connections, concurrently drop the tables:
SET foreign_key_checks = OFF, lock_wait_timeout = 5, innodb_lock_wait_timeout = 5; |
DROP TABLE tbl{$n}; -- optionally add NOWAIT to demonstrate that doesn't help... |
Attachments
Issue Links
- is caused by
-
MDEV-32899 InnoDB is holding shared dict_sys.latch while waiting for FOREIGN KEY child table lock on DDL
-
- Closed
-
- relates to
-
MDEV-34253 ALTER .. FOREIGN KEY stuck in "Committing alter table to storage engine" does not obey innodb_lock_wait_timeout
-
- Confirmed
-
Thank you for the test case. I was lucky to reproduce a hang on the first run of the following mtr test case:
--source include/have_innodb.inc
) ENGINE=InnoDB;
) ENGINE=InnoDB;
) ENGINE=InnoDB;
) ENGINE=InnoDB;
) ENGINE=InnoDB;
) ENGINE=InnoDB;
--connect con1,localhost,root
--connect con2,localhost,root
--connect con3,localhost,root
--connect con4,localhost,root
--connect con5,localhost,root
--disconnect con5
--connection con4
--reap
--disconnect con4
--connection con3
--reap
--disconnect con3
--connection con2
--reap
--disconnect con2
--connection con1
--reap
--disconnect con1
--connection default
There are several threads stuck in a metadata lock (MDL) wait like this:
#6 0x000055bf0b8455dd in inline_mysql_cond_timedwait (that=0x7f79a8000f60, mutex=0x7f79a8000eb0, abstime=0x7f79f416ec90, src_file=0x55bf0c4a201a "/mariadb/10.11/sql/mdl.cc", src_line=0x4b9) at /mariadb/10.11/include/mysql/psi/mysql_thread.h:1088
#7 0x000055bf0b846bd1 in MDL_wait::timed_wait (this=0x7f79a8000eb0, owner=0x7f79a8000e40, abs_timeout=0x7f79f416ec90, set_status_on_timeout=0x0, wait_state_name=0x55bf0cf15830 <MDL_key::m_namespace_to_wait_state_name+48>) at /mariadb/10.11/sql/mdl.cc:1209
#8 0x000055bf0b848960 in MDL_context::acquire_lock (this=0x7f79a8000eb0, mdl_request=0x7f79f416f050, lock_wait_timeout=86400) at /mariadb/10.11/sql/mdl.cc:2408
#9 0x000055bf0c17c859 in dict_acquire_mdl_shared<false> (table=0x7f79b81889b0, mdl_context=0x7f79a8000eb0, mdl=0x7f79f416f350, table_op=DICT_TABLE_OP_NORMAL) at /mariadb/10.11/storage/innobase/dict/dict0dict.cc:733
#10 0x000055bf0be812d5 in lock_table_children (table=0x7f79b818e930, trx=0x7f79f617c380) at /mariadb/10.11/storage/innobase/lock/lock0lock.cc:4144
#11 0x000055bf0bdbdd76 in ha_innobase::delete_table (this=0x7f79a80142d0, name=0x7f79f4170720 "./test/tbl4") at /mariadb/10.11/storage/innobase/handler/ha_innodb.cc:13534
#12 0x000055bf0ba3b748 in hton_drop_table (hton=0x55bf37fb3078, path=0x7f79f4170720 "./test/tbl4") at /mariadb/10.11/sql/handler.cc:575
#13 0x000055bf0ba4192e in ha_delete_table (thd=0x7f79a8000d58, hton=0x55bf37fb3078, path=0x7f79f4170720 "./test/tbl4", db=0x7f79f4170500, alias=0x7f79f4170510, generate_warning=0x1) at /mariadb/10.11/sql/handler.cc:3231
#14 0x000055bf0b776ac6 in mysql_rm_table_no_locks (thd=0x7f79a8000d58, tables=0x7f79a8013968, current_db=0x7f79a8000df8, ddl_log_state=0x7f79f4170540, if_exists=0x0, drop_temporary=0x0, drop_view=0x0, drop_sequence=0x0, dont_log_query=0x0, dont_free_locks=0x0) at /mariadb/10.11/sql/sql_table.cc:1644
#15 0x000055bf0b775845 in mysql_rm_table (thd=0x7f79a8000d58, tables=0x7f79a8013968, if_exists=0x0, drop_temporary=0x0, drop_sequence=0x0, dont_log_query=0x0) at /mariadb/10.11/sql/sql_table.cc:1211
#16 0x000055bf0b669fd9 in mysql_execute_command (thd=0x7f79a8000d58, is_called_from_prepared_stmt=0x0) at /mariadb/10.11/sql/sql_parse.cc:5071
#17 0x000055bf0b673da6 in mysql_parse (thd=0x7f79a8000d58, rawbuf=0x7f79a8013640 "SET STATEMENT foreign_key_checks=off FOR DROP TABLE tbl4", length=0x38, parser_state=0x7f79f41713e0) at /mariadb/10.11/sql/sql_parse.cc:8188
#18 0x000055bf0b65f509 in dispatch_command (command=COM_QUERY, thd=0x7f79a8000d58, packet=0x7f79a800b0c9 "SET STATEMENT foreign_key_checks=off FOR DROP TABLE tbl4", packet_length=0x38, blocking=0x1) at /mariadb/10.11/sql/sql_parse.cc:1905
I believe that this should be fixed as part of fixing MDEV-34253. The inline_mysql_cond_timedwait() would return 110 (ETIMEDOUT), and MDL_wait::timed_wait() returns MDL_wait::VICTIM to MDL_context::acquire_lock(), which in turn only returns true to dict_acquire_mdl_shared<false>(). InnoDB cannot know that it must stop retrying the lock acquisition, and therefore each dict_acquire_mdl_shared<false>() remains in an infinite loop.
Notably, the infinite wait loop cannot be interrupted by attempting to shut down the server (killall -QUIT mariadbd).