Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
10.4(EOL), 10.5, 10.6
-
None
Description
Run the following, in continually random SQL order, using 150 or 200 threads, perpetually:
DROP DATABASE test;
|
CREATE DATABASE test;
|
USE test;
|
CREATE TABLE t1 (a INT) ENGINE=InnoDB;
|
INSERT INTO t1 VALUES (NULL);
|
XA BEGIN 'x';
|
UPDATE t1 SET a = 1;
|
DELETE FROM t1;
|
ALTER TABLE t1 ADD COLUMN extra INT;
|
INSERT INTO t1 (a) VALUES (2);
|
ROLLBACK;
|
DELETE FROM t1;
|
INSERT INTO t1 (a) VALUES (2);
|
XA END 'x';
|
XA ROLLBACK 'x';
|
DROP TABLE t1;
|
After about 50k-500k executions, and sufficient time to hit the 600sec timeout, this leads to:
10.6.0 5d4599f9750140f92cfdbbe4d292ae1b8dd456f8 (Debug) |
2020-11-17 10:53:27 0 [Note] /test/MD211020-mariadb-10.6.0-linux-x86_64-dbg/bin/mysqld: ready for connections.
|
Version: '10.6.0-MariaDB-debug' socket: '/test/MD211020-mariadb-10.6.0-linux-x86_64-dbg/socket.sock' port: 12191 MariaDB Server
|
2020-11-17 10:53:35 499 [Note] InnoDB: Deferring DROP TABLE `test`.`t1`; renaming to test/#sql-ib33
|
2020-11-17 10:57:37 0 [Warning] InnoDB: A long semaphore wait:
|
--Thread 22831188305664 has waited at dict0dict.cc line 943 for 241.00 seconds the semaphore:
|
Mutex at 0x55e77c0b2a00, Mutex DICT_SYS created /test/10.6_dbg/storage/innobase/dict/dict0dict.cc:1032, lock var 2
|
Last time reserved in file /test/10.6_dbg/storage/innobase/row/row0mysql.cc line 3921
|
2020-11-17 10:57:37 0 [Note] InnoDB: A semaphore wait:
|
--Thread 22831188305664 has waited at dict0dict.cc line 943 for 241.00 seconds the semaphore:
|
Mutex at 0x55e77c0b2a00, Mutex DICT_SYS created /test/10.6_dbg/storage/innobase/dict/dict0dict.cc:1032, lock var 2
|
Last time reserved in file /test/10.6_dbg/storage/innobase/row/row0mysql.cc line 3921
|
InnoDB: ###### Starts InnoDB Monitor for 30 secs to print diagnostic info:
|
InnoDB: Pending reads 0, writes 0
|
InnoDB: ###### Diagnostic info printed to the standard error stream
|
... many more (all on the same dict0dict.cc:943, dict0dict.cc:1032, and row0mysql.cc line 3921)...
|
2020-11-17 11:03:47 0 [ERROR] [FATAL] InnoDB: Semaphore wait has lasted > 600 seconds. We intentionally crash the server because it appears to be hung.
|
201117 11:03:47 [ERROR] mysqld got signal 6 ;
|
10.6.0 5d4599f9750140f92cfdbbe4d292ae1b8dd456f8 (Debug) |
Core was generated by `/test/MD211020-mariadb-10.6.0-linux-x86_64-dbg/bin/mysqld --no-defaults --core-'.
|
Program terminated with signal SIGABRT, Aborted.
|
#0 __pthread_kill (threadid=<optimized out>, signo=signo@entry=6)
|
at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
|
[Current thread is 1 (Thread 0x14c3ce1f0700 (LWP 827875))]
|
(gdb) bt
|
#0 __pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
|
#1 0x000055e77b698021 in my_write_core (sig=sig@entry=6) at /test/10.6_dbg/mysys/stacktrace.c:424
|
#2 0x000055e77addf321 in handle_fatal_signal (sig=6) at /test/10.6_dbg/sql/signal_handler.cc:330
|
#3 <signal handler called>
|
#4 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
|
#5 0x000014c48b161859 in __GI_abort () at abort.c:79
|
#6 0x000055e77b4574b9 in ib::fatal::~fatal (this=<optimized out>, __in_chrg=<optimized out>) at /test/10.6_dbg/storage/innobase/ut/ut0ut.cc:580
|
#7 0x000055e77b3e5b04 in srv_error_monitor_task () at /test/10.6_dbg/storage/innobase/include/ut0ut.h:346
|
#8 0x000055e77b623746 in tpool::thread_pool_generic::timer_generic::run (this=0x55e77edb1af0) at /test/10.6_dbg/tpool/tpool_generic.cc:299
|
#9 tpool::thread_pool_generic::timer_generic::execute (arg=0x55e77edb1af0) at /test/10.6_dbg/tpool/tpool_generic.cc:317
|
#10 0x000055e77b62467f in tpool::task::execute (this=0x55e77edb1b30) at /test/10.6_dbg/tpool/task.cc:52
|
#11 0x000055e77b62325f in tpool::thread_pool_generic::worker_main (this=0x55e77e3f1b20, thread_var=0x55e77e42c4d0) at /test/10.6_dbg/tpool/tpool_generic.cc:518
|
#12 0x000055e77b623598 in std::__invoke_impl<void, void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> (__t=<optimized out>, __f=<optimized out>) at /usr/include/c++/9/bits/invoke.h:89
|
#13 std::__invoke<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> (__fn=<optimized out>) at /usr/include/c++/9/bits/invoke.h:95
|
#14 std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> >::_M_invoke<0ul, 1ul, 2ul> (this=<optimized out>) at /usr/include/c++/9/thread:244
|
#15 std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> >::operator() (this=<optimized out>) at /usr/include/c++/9/thread:251
|
#16 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> > >::_M_run (this=<optimized out>) at /usr/include/c++/9/thread:195
|
#17 0x000014c48b553d84 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
|
#18 0x000014c48b66f609 in start_thread (arg=<optimized out>) at pthread_create.c:477
|
#19 0x000014c48b25e293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
|
10.6.0 5d4599f9750140f92cfdbbe4d292ae1b8dd456f8 (Optimized) |
Core was generated by `/test/MD201020-mariadb-10.6.0-linux-x86_64-opt/bin/mysqld --no-defaults --core-'.
|
Program terminated with signal SIGABRT, Aborted.
|
#0 __pthread_kill (threadid=<optimized out>, signo=signo@entry=6)
|
at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
|
[Current thread is 1 (Thread 0x148ea2df6700 (LWP 1128501))]
|
(gdb) bt
|
#0 __pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
|
#1 0x000055b083224a4f in my_write_core (sig=sig@entry=6) at /test/10.6_opt/mysys/stacktrace.c:424
|
#2 0x000055b082c48130 in handle_fatal_signal (sig=6) at /test/10.6_opt/sql/signal_handler.cc:330
|
#3 <signal handler called>
|
#4 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
|
#5 0x0000148fb2181859 in __GI_abort () at abort.c:79
|
#6 0x000055b08290d1f7 in ib::fatal::~fatal (this=<optimized out>, __in_chrg=<optimized out>) at /test/10.6_opt/storage/innobase/ut/ut0ut.cc:580
|
#7 0x000055b082902fde in srv_error_monitor_task () at /test/10.6_opt/storage/innobase/include/ut0ut.h:346
|
#8 0x000055b0831b7159 in tpool::thread_pool_generic::timer_generic::run (this=0x55b08599f5a0) at /test/10.6_opt/tpool/tpool_generic.cc:299
|
#9 tpool::thread_pool_generic::timer_generic::execute (arg=0x55b08599f5a0) at /test/10.6_opt/tpool/tpool_generic.cc:317
|
#10 0x000055b0831b815a in tpool::task::execute (this=0x55b08599f5e0) at /test/10.6_opt/tpool/task.cc:52
|
#11 tpool::task::execute (this=0x55b08599f5e0) at /test/10.6_opt/tpool/task.cc:42
|
#12 0x000055b0831b6c47 in tpool::thread_pool_generic::worker_main (this=0x55b0853a52e0, thread_var=0x55b0853dec00) at /test/10.6_opt/tpool/tpool_generic.cc:518
|
#13 0x0000148fb2573d84 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
|
#14 0x0000148fb268f609 in start_thread (arg=<optimized out>) at pthread_create.c:477
|
#15 0x0000148fb227e293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
|
Attachments
Issue Links
- relates to
-
MDEV-21175 Remove dict_table_t::n_foreign_key_checks_running from InnoDB
-
- Closed
-
-
MDEV-21602 CREATE TABLE…PRIMARY KEY…SELECT workaround causes DROP TABLE to ignore locks
-
- Closed
-
-
MDEV-15532 XA: Assertion `!log->same_pk' failed in row_log_table_apply_delete
-
- Closed
-
In both MDEV-24225.txt
and MDEV-24225_10.5.9_7f30a5c423249c7fcfc9f9ac1578d06263aa37ac_dbg.txt
, the progress of most InnoDB threads is blocked by dict_sys.mutex and dict_sys.latch that are being held by a thread that does not seem to be blocked. In MDEV-24225_10.5.9_7f30a5c423249c7fcfc9f9ac1578d06263aa37ac_dbg.txt
that thread is
10.5 7f30a5c423249c7fcfc9f9ac1578d06263aa37ac
Thread 2 (Thread 0x1507ac1ad700 (LWP 4193329)):
#0 0x000055579e9f0e76 in MutexDebug<TTASEventMutex<GenericPolicy> >::is_owned (this=0x55579f897270 <dict_sys+16>) at /test/10.5_dbg/storage/innobase/include/sync0policy.h:158
#1 0x000055579ed2e505 in PolicyMutex<TTASEventMutex<GenericPolicy> >::is_owned (this=<optimized out>) at /test/10.5_dbg/storage/innobase/include/ib0mutex.h:635
#2 dict_get_first_table_name_in_db (name=name@entry=0x150774025478 "test/") at /test/10.5_dbg/storage/innobase/dict/dict0load.cc:223
#3 0x000055579eb6d82b in row_drop_database_for_mysql (name=<optimized out>, name@entry=0x150774025478 "test/", trx=trx@entry=0x1507a7c03690, found=found@entry=0x1507ac1ab368) at /test/10.5_dbg/storage/innobase/row/row0mysql.cc:3921
#4 0x000055579e9c68e9 in innobase_drop_database (hton=<optimized out>, path=<optimized out>) at /test/10.5_dbg/storage/innobase/handler/ha_innodb.cc:13427
This is blocking (among other threads) the master task that is executing row_drop_tables_for_mysql_in_background().
Likewise, in MDEV-24225.txt
, row_drop_tables_for_mysql_in_background() and many others are waiting for the completion of row_drop_database_for_mysql() in Thread 4, which does not seem to be blocked.
It is hard to say what trips the InnoDB watchdog. There seemingly is some progress, and not everything is stuck.
But, in any case, the entire "background drop table queue" is an InnoDB workaround for bugs at the storage engine layer. I think that this should be fixed by
MDEV-15532,MDEV-21175,MDEV-21602, or similar. The entire workaround should be made redundant and removed.