[MDEV-35154] dict_sys_t::load_table() is holding exclusive dict_sys.latch for unnecessarily long time - Jira

Details

Type: Bug
Status: Confirmed (View Workflow)
Priority: Major
Resolution: Unresolved
Affects Version/s: 10.6, 10.11, 11.4, 10.5(EOL), 11.6(EOL)
Fix Version/s: 10.6, 10.11, 11.4
Component/s: Storage Engine - InnoDB
Labels:
- performance

Description

Server crashes when using more than 8K partitions in a table. Startup options used to reproduce the issue --sql_mode= --deadlock-timeout-short=10 --deadlock-timeout-long=10 --deadlock-search-depth-short=10 --deadlock-search-depth-long=33 --innodb-fatal-semaphore-wait-threshold=2 --innodb-read-io-threads=1

Could not reproduce the issue in `rr`. The issue seems to be system specific and reproduced only on busy server

Attached test case in.sql and full back trace full_back_trace.log .

CS 11.6.2 ba7088d462c326c9df7de97a46fe69419cd7e116 (Debug)
Core was generated by `/test/MD031024-mariadb-11.6.2-linux-x86_64-dbg/bin/mariadbd --no-defaults --max'.
Program terminated with signal SIGABRT, Aborted.
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
[Current thread is 1 (Thread 0x14bf685f2700 (LWP 1603475))]
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x000014bf92473859 in __GI_abort () at abort.c:79
#2 0x000055a9eebdf701 in ib::fatal::~fatal (this=<optimized out>, __in_chrg=<optimized out>) at /test/11.6_dbg/storage/innobase/ut/ut0ut.cc:488
#3 0x000055a9eec95a58 in dict_sys_t::lock_wait (this=this@entry=0x55a9efa07d40 <dict_sys>, file=file@entry=0x55a9ef2c8d40 "/test/11.6_dbg/storage/innobase/trx/trx0purge.cc", line=line@entry=1135) at /test/11.6_dbg/storage/innobase/include/ut0ut.h:323
#4 0x000055a9eeb9cabb in dict_sys_t::lock (line=1135, file=0x55a9ef2c8d40 "/test/11.6_dbg/storage/innobase/trx/trx0purge.cc", this=<optimized out>) at /test/11.6_dbg/storage/innobase/include/dict0dict.h:1494
#5 trx_purge_table_open (table_id=18, mdl_context=mdl_context@entry=0x55a9f1f49ee0, mdl=mdl@entry=0x14bf685f1aa8) at /test/11.6_dbg/storage/innobase/trx/trx0purge.cc:1135
#6 0x000055a9eeba2ea2 in trx_purge_attach_undo_recs (n_work_items=<synthetic pointer>, thd=<optimized out>) at /test/11.6_dbg/storage/innobase/trx/trx0purge.cc:1250
#7 trx_purge (n_tasks=n_tasks@entry=4, history_size=<optimized out>) at /test/11.6_dbg/storage/innobase/trx/trx0purge.cc:1368
#8 0x000055a9eeb8c1b1 in purge_coordinator_state::do_purge (this=0x55a9f036cc20 <purge_state>) at /test/11.6_dbg/storage/innobase/srv/srv0srv.cc:1426
#9 purge_coordinator_callback () at /test/11.6_dbg/storage/innobase/srv/srv0srv.cc:1510
#10 0x000055a9eed65e6d in tpool::task_group::execute (this=0x55a9f036ca80 <purge_coordinator_task_group>, t=t@entry=0x55a9f036c9e0 <purge_coordinator_task>) at /test/11.6_dbg/tpool/task_group.cc:73
#11 0x000055a9eed65ef5 in tpool::task::execute (this=0x55a9f036c9e0 <purge_coordinator_task>) at /test/11.6_dbg/tpool/task.cc:32
#12 0x000055a9eed6433d in tpool::thread_pool_generic::worker_main (this=0x55a9f1772930, thread_var=0x55a9f1ad2e20) at /test/11.6_dbg/tpool/tpool_generic.cc:583
#13 0x000055a9eed64f78 in std::__invoke_impl<void, void (tpool::thread_pool_generic::)(tpool::worker_data), tpool::thread_pool_generic, tpool::worker_data> (__t=<optimized out>, __f=<optimized out>) at /usr/include/c++/9/bits/invoke.h:89
#14 std::__invoke<void (tpool::thread_pool_generic::)(tpool::worker_data), tpool::thread_pool_generic, tpool::worker_data> (__fn=<optimized out>) at /usr/include/c++/9/bits/invoke.h:95
#15 std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::)(tpool::worker_data), tpool::thread_pool_generic, tpool::worker_data> >::_M_invoke<0ul, 1ul, 2ul> (this=<optimized out>) at /usr/include/c++/9/thread:244
#16 std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::)(tpool::worker_data), tpool::thread_pool_generic, tpool::worker_data> >::operator() (this=<optimized out>) at /usr/include/c++/9/thread:251
#17 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::)(tpool::worker_data), tpool::thread_pool_generic, tpool::worker_data> > >::_M_run (this=<optimized out>) at /usr/include/c++/9/thread:195
#18 0x000014bf9286ade4 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#19 0x000014bf92984609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#20 0x000014bf92570133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

full_back_trace.log
63 kB
2024-10-15 03:38
in.sql
405 kB
2024-10-15 03:38

Issue Links

blocks

MDEV-34988 InnoDB locks dict_sys.latch for a long time during ALTER TABLE

Open

relates to

MDEV-28730 Remove internal parser usage from InnoDB fts

Stalled

MDEV-35424 False alarm/crash: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch

Needs Feedback

MDEV-17805 Do not add temporary tables to dict_sys->table_hash

Stalled

MDEV-33594 Invoking log_free_check() while holding exclusive dictionary latch may block most InnoDB threads for a long time

Confirmed

MDEV-34999 ha_innobase::open() should not acquire dict_sys.latch twice

Open

(1 relates to)

dict_sys_t::load_table() is holding exclusive dict_sys.latch for unnecessarily long time

Details

Description

Attachments

Attachments

Issue Links

Activity

People

Dates