Details
-
Bug
-
Status: Open (View Workflow)
-
Minor
-
Resolution: Unresolved
-
10.5, 10.6, 10.11
-
None
Description
In the scenario below, simple DML is performed on an InnoDB table and the transaction is committed. Occasionally there is a table metadata lock held by a system thread after that.
Note that the table is created with InnoDB persistent statistics turned off.
The behavior apparently changed after this commit in 10.5.1
commit ea37b144094a0c2ebfc6774047fd473c1b2a8658
|
Author: Marko Mäkelä
|
Date: Tue Dec 10 15:42:50 2019 +0200
|
|
MDEV-16678 Prefer MDL to dict_sys.latch for innodb background tasks
|
The lock can be long enough to cause, for example, ER_LOCK_WAIT_TIMEOUT on non-waiting operations, even though from the user's point of view it is a single-threaded action. Possibly in more complicated concurrent scenarios unexpected deadlocks or at least delays (causing performance issues) can also occur.
The test cases are highly non-deterministic and need to be run with --repeat=N.
The test case with metadata lock info needs a result file to be recorded (with the empty result set upon SELECT from I_S), so that --repeat=N fails when a lock is found. It fails reasonably quickly for me on 10.5 (within ~5 tries). It takes much longer on 10.6, several hundred attempts, even longer on 10.11, although I got it a couple times there, and I've never seen it on 11.4+ so far (although given the probability trend, it doesn't mean much).
The test case with non-waiting drop table doesn't need a result file, but it fails more rarely.
METADATA_LOCK_INFO |
--source include/have_innodb.inc
|
|
install soname 'metadata_lock_info'; |
|
create table t (a int) engine=InnoDB stats_persistent=0; |
|
start transaction; |
insert into t set a = 1; |
select * from t; |
commit; |
|
select thread_id, lock_mode, lock_type, table_name from information_schema.metadata_lock_info; |
|
drop table t; |
uninstall soname 'metadata_lock_info'; |
10.5 fdb6db6b47f1825eabffde76c29d9b94545f1ef4 |
commit; |
select thread_id, lock_mode, lock_type, table_name from information_schema.metadata_lock_info; |
thread_id lock_mode lock_type table_name
|
+0 MDL_SHARED Table metadata lock t |
drop table t; |
uninstall soname 'metadata_lock_info'; |
|
Result length mismatch
|
Timeout upon DROP TABLE |
--source include/have_innodb.inc
|
|
create table t (a int) engine=InnoDB stats_persistent=0; |
set lock_wait_timeout=0, innodb_lock_wait_timeout=0; |
|
start transaction; |
insert into t set a = 1; |
select * from t; |
commit; |
|
drop table t; |
bug.t1 [ 20 fail ]
|
Test ended at 2024-12-04 14:29:07 |
|
CURRENT_TEST: bug.t1
|
mysqltest: At line 11: query 'drop table t' failed: 1205: Lock wait timeout exceeded; try restarting transaction |
Attachments
Issue Links
- is caused by
-
MDEV-16678 Use MDL for innodb background threads instead of dict_operation_lock
-
- Closed
-
Activity
Field | Original Value | New Value |
---|---|---|
Description |
In the scenario below, simple DML is performed on an InnoDB table and the transaction is committed. Occasionally there is a table metadata lock held by a system thread after that.
Note that the table is created with InnoDB persistent statistics turned off. The behavior apparently changed after this commit in 10.5.1 {noformat} commit ea37b144094a0c2ebfc6774047fd473c1b2a8658 Author: Marko Mäkelä Date: Tue Dec 10 15:42:50 2019 +0200 {noformat} The lock can be long enough to cause, for example, ER_LOCK_WAIT_TIMEOUT on non-waiting operations, even though from the user's point of view it is a single-threaded action. Possibly in more complicated concurrent scenarios unexpected deadlocks or at least delays (causing performance issues) can also occur. The test cases are highly non-deterministic and need to be run with --repeat=N. The test case with metadata lock info needs a result file to be recorded (with the empty result set upon SELECT from I_S), so that --repeat=N fails when a lock is found. It fails reasonably quickly for me on 10.5 (within ~5 tries). It takes *much* longer on 10.6, several hundred attempts, even longer on 10.11, although I got it a couple times there, and I've never seen it on 11.4+ so far (although given the probability trend, it doesn't mean much). The test case with non-waiting drop table doesn't need a result file, but it fails more rarely. {code:sql|title=METADATA_LOCK_INFO} --source include/have_innodb.inc install soname 'metadata_lock_info'; create table t (a int) engine=InnoDB stats_persistent=0; start transaction; insert into t set a = 1; select * from t; commit; select thread_id, lock_mode, lock_type, table_name from information_schema.metadata_lock_info; drop table t; uninstall soname 'metadata_lock_info'; {code} {code:sql|title=10.5 fdb6db6b47f1825eabffde76c29d9b94545f1ef4} commit; select thread_id, lock_mode, lock_type, table_name from information_schema.metadata_lock_info; thread_id lock_mode lock_type table_name +0 MDL_SHARED Table metadata lock t drop table t; uninstall soname 'metadata_lock_info'; Result length mismatch {code} {code:sql|title=Timeout upon DROP TABLE} --source include/have_innodb.inc create table t (a int) engine=InnoDB stats_persistent=0; set lock_wait_timeout=0, innodb_lock_wait_timeout=0; start transaction; insert into t set a = 1; select * from t; commit; drop table t; {code} {code:sql} bug.t1 [ 20 fail ] Test ended at 2024-12-04 14:29:07 CURRENT_TEST: bug.t1 mysqltest: At line 11: query 'drop table t' failed: 1205: Lock wait timeout exceeded; try restarting transaction {code} |
In the scenario below, simple DML is performed on an InnoDB table and the transaction is committed. Occasionally there is a table metadata lock held by a system thread after that.
Note that the table is created with *InnoDB persistent statistics turned off*. The behavior apparently changed after this commit in 10.5.1 {noformat} commit ea37b144094a0c2ebfc6774047fd473c1b2a8658 Author: Marko Mäkelä Date: Tue Dec 10 15:42:50 2019 +0200 {noformat} The lock can be long enough to cause, for example, ER_LOCK_WAIT_TIMEOUT on non-waiting operations, even though from the user's point of view it is a single-threaded action. Possibly in more complicated concurrent scenarios unexpected deadlocks or at least delays (causing performance issues) can also occur. The test cases are highly non-deterministic and need to be run with --repeat=N. The test case with metadata lock info needs a result file to be recorded (with the empty result set upon SELECT from I_S), so that --repeat=N fails when a lock is found. It fails reasonably quickly for me on 10.5 (within ~5 tries). It takes *much* longer on 10.6, several hundred attempts, even longer on 10.11, although I got it a couple times there, and I've never seen it on 11.4+ so far (although given the probability trend, it doesn't mean much). The test case with non-waiting drop table doesn't need a result file, but it fails more rarely. {code:sql|title=METADATA_LOCK_INFO} --source include/have_innodb.inc install soname 'metadata_lock_info'; create table t (a int) engine=InnoDB stats_persistent=0; start transaction; insert into t set a = 1; select * from t; commit; select thread_id, lock_mode, lock_type, table_name from information_schema.metadata_lock_info; drop table t; uninstall soname 'metadata_lock_info'; {code} {code:sql|title=10.5 fdb6db6b47f1825eabffde76c29d9b94545f1ef4} commit; select thread_id, lock_mode, lock_type, table_name from information_schema.metadata_lock_info; thread_id lock_mode lock_type table_name +0 MDL_SHARED Table metadata lock t drop table t; uninstall soname 'metadata_lock_info'; Result length mismatch {code} {code:sql|title=Timeout upon DROP TABLE} --source include/have_innodb.inc create table t (a int) engine=InnoDB stats_persistent=0; set lock_wait_timeout=0, innodb_lock_wait_timeout=0; start transaction; insert into t set a = 1; select * from t; commit; drop table t; {code} {code:sql} bug.t1 [ 20 fail ] Test ended at 2024-12-04 14:29:07 CURRENT_TEST: bug.t1 mysqltest: At line 11: query 'drop table t' failed: 1205: Lock wait timeout exceeded; try restarting transaction {code} |
Summary | Unexpected able metadata locks held by background operations can cause phantom ER_LOCK_WAIT_TIMEOUT and possibly other errors | Unexpected table metadata locks held by background operations can cause phantom ER_LOCK_WAIT_TIMEOUT and possibly other errors |
Link |
This issue is caused by |
The error could be reported due an InnoDB persistent statistics table name, or due to the name of the being-dropped table itself. If stats_persistent=0 had been explicitly specified for the table, we could streamline ha_innobase::delete_table() and avoid opening and locking the persistent statistics tables so that any entries could be deleted from them. If no stats_persistent is specified, then we should still assume that innodb_stats_persistent had been enabled at some point in the history, and try to delete the persistent statistics.