Details
-
Bug
-
Status: Closed (View Workflow)
-
Minor
-
Resolution: Duplicate
-
10.6.18
-
CloudLinux 8
Virtual Machine on AMD
80G RAM
32 Cores
Description
The server hangs with a lot of queries in 'Killed' status. The main culprit to the hang seems to be a massive update on a wp_options (wordpress) table. This in itself is not odd.
What is weird is that MariaDB stops accepting new queries and when the max_connection limit is reach it's restarted. The restart does not work, ends up crashing with error code 6. No core dump dumped as of yet, but I did manage to produce stack traces for all threads (attachment).
[ERROR] [FATAL] InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch. Please refer to https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ |
240623 8:17:35 [ERROR] mysqld got signal 6 ;
|
Sorry, we probably made a mistake, and this is a bug.
|
|
Your assistance in bug reporting will enable us to fix this for the next release. |
To report this bug, see https://mariadb.com/kb/en/reporting-bugs |
|
We will try our best to scrape up some info that will hopefully help
|
diagnose the problem, but since we have already crashed,
|
something is definitely wrong and this may fail.
|
|
Server version: 10.6.18-MariaDB-cll-lve source revision: 887bb3f73555ff8a50138a580ca8308b9b5c069c |
key_buffer_size=402653184
|
read_buffer_size=2097152
|
max_used_connections=801
|
max_threads=802
|
thread_count=802
|
It is possible that mysqld could use up to
|
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 3699409 K bytes of memory
|
Hope that's ok; if not, decrease some variables in the equation. |
|
Thread pointer: 0x0
|
Attempting backtrace. You can use the following information to find out |
where mysqld died. If you see no messages after this, something went
|
terribly wrong...
|
stack_bottom = 0x0 thread_stack 0x49000
|
/usr/sbin/mariadbd(my_print_stacktrace+0x2e)[0x55fdf610648e] |
mysys/stacktrace.c:216(my_print_stacktrace)[0x55fdf5be1d45] |
/lib64/libpthread.so.0(+0x12d20)[0x7f0ff4915d20] |
/lib64/libc.so.6(gsignal+0x10f)[0x7f0ff3c6052f] |
/lib64/libc.so.6(abort+0x127)[0x7f0ff3c33e65] |
/usr/sbin/mariadbd(+0x678245)[0x55fdf5883245] |
ut/ut0ut.cc:476(Wsrep_server_service::log_dummy_write_set(wsrep::client_state&, wsrep::ws_meta const&))[0x55fdf587bcdc] |
srv/srv0srv.cc:1233(Wsrep_server_service::log_dummy_write_set(wsrep::client_state&, wsrep::ws_meta const&))[0x55fdf609a1c8] |
tpool/tpool_generic.cc:346(tpool::thread_pool_generic::timer_generic::execute(void*))[0x55fdf609af0b] |
tpool/task.cc:38(tpool::task::execute())[0x55fdf6098a7f] |
/lib64/libstdc++.so.6(+0xc2b23)[0x7f0ff442cb23] |
/lib64/libpthread.so.0(+0x81ca)[0x7f0ff490b1ca] |
/lib64/libc.so.6(clone+0x43)[0x7f0ff3c4b8d3] |
The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mariadbd/ contains |
information that should help you find out what is causing the crash. |
Writing a core file... |
Working directory at /var/lib/mysql |
Resource Limits:
|
Limit Soft Limit Hard Limit Units
|
Max cpu time unlimited unlimited seconds |
Max file size unlimited unlimited bytes |
Max data size unlimited unlimited bytes
|
Max stack size 8388608 unlimited bytes
|
Max core file size unlimited unlimited bytes |
Max resident set unlimited unlimited bytes |
Max processes 642336 642336 processes
|
Max open files 80000 80000 files |
Max locked memory 65536 65536 bytes
|
Max address space unlimited unlimited bytes
|
Max file locks unlimited unlimited locks |
Max pending signals 642336 642336 signals
|
Max msgqueue size 819200 819200 bytes
|
Max nice priority 0 0 |
Max realtime priority 0 0
|
Max realtime timeout unlimited unlimited us
|
Core pattern: |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h %e |
Attachments
Issue Links
- duplicates
-
MDEV-34265 Possible hang during IO burst with innodb_flush_sync enabled
-
- Closed
-
- relates to
-
MDEV-34469 mariadb10.6.18 InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch.
-
- Closed
-
-
MDEV-29660 [ERROR] [FATAL] InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch
-
- Closed
-
I analysed the latest stack trace in output.log.anon.gz in details.
There are total 784 threads.
1. 9 Background threads waiting for appropriate trigger [all expected]
mysqld_main, handle_manager, timer_handler, signal_hand, ma_checkpoint_background, tpool::aio_linux::getevent_thread_routine, tpool::thread_pool_generic::wait_for_tasks, tpool::thread_pool_generic::wait_for_tasks, trx_purge_wait_for_workers_to_complete
2. 1 Server monitor task raising fatal error after detecting too long wait on dictionary system lock
Thread 1 (Thread 0x7fb949ffb700 (LWP 2630941)):
#5 0x00007fbb34a58e65 in __GI_abort () at abort.c:79
#6 0x000055b467cbb245 in ib::fatal::~fatal (this=<optimized out>, __in_chrg=<optimized out>)
at /usr/src/debug/MariaDB-/src_0/storage/innobase/ut/ut0ut.cc:476
#7 0x000055b467cb3cdc in srv_monitor_task ()
3. 627 user DML/QUERY threads waiting on dict_sys_t::freeze : S lock dictionary system
#4 ssux_lock_impl<false>::rd_wait (this=this@entry=0x55b468f93788 <dict_sys+72>)
at /usr/src/debug/MariaDB-/src_0/storage/innobase/sync/srw_lock.cc:397
...
#8 dict_sys_t::freeze (line=line@entry=1039,
file=0x55b4687ecab0 "/builddir/build/BUILD/cl-MariaDB106-10.6.18/mariadb-10.6.18/storage/innobase/dict/dict0dict.cc", this=0x55b468f93740 <dict_sys>)
at /usr/src/debug/MariaDB-/src_0/storage/innobase/dict/dict0dict.cc:1006
#9 0x000055b468455342 in dict_table_open_on_name (
table_name=0x7fb9ea187b20 "anonymous1", dict_locked=<optimized out>,
ignore_err=DICT_ERR_IGNORE_FK_NOKEY)
4. 13 DML/QUERY threads waiting dict_sys_t::lock : X lock for dictionary system
#6 dict_sys_t::lock_wait (this=0x55b468f93740 <dict_sys>, file=<optimized out>, line=225)
at /usr/src/debug/MariaDB-/src_0/storage/innobase/dict/dict0dict.cc:995
#7 0x000055b4684551f0 in dict_sys_t::lock (
file=0x55b4687ecab0 "/builddir/build/BUILD/cl-MariaDB106-10.6.18/mariadb-10.6.18/storage/innobase/dict/dict0dict.cc", line=225, this=0x55b468f93740 <dict_sys>)
at /usr/src/debug/MariaDB-/src_0/storage/innobase/include/dict0dict.h:1525
#8 dict_table_close (table=0x7fb62c0bfb60)
5. 48 user DML/QUERY threads waiting on S/X lock for buffer pool pages
#1 0x000055b4683de748 in srw_mutex_impl<true>::wait (lk=<optimized out>, this=0x7fba88130898)
at /usr/src/debug/MariaDB-/src_0/storage/innobase/sync/srw_lock.cc:243
...
#5 sux_lock<ssux_lock_impl<true> >::x_lock (for_io=false, this=0x7fba88130898)
at /usr/src/debug/MariaDB-/src_0/storage/innobase/include/sux_lock.h:395
#6 buf_page_get_low (page_id=..., zip_size=<optimized out>, rw_latch=<optimized out>,
guess=<optimized out>, mode=<optimized out>, mtr=<optimized out>, err=<optimized out>,
allow_ibuf_merge=<optimized out>)
6. 84 DML/QUERY threads waiting to free Page in the middle of a mini-transaction. Blocking threads in [5] waiting for page lock.
#2 __pthread_cond_timedwait (cond=cond@entry=0x55b468f926b0 <buf_pool+17328>,
mutex=mutex@entry=0x55b468f8e300 <buf_pool>, abstime=abstime@entry=0x7fba7f2f15e0)
at pthread_cond_wait.c:667
#3 0x000055b468445e02 in buf_LRU_get_free_block (have_mutex=have_mutex@entry=false)
at /usr/src/debug/MariaDB-/src_0/storage/innobase/buf/buf0lru.cc:452
#4 0x000055b467d1c937 in buf_page_init_for_read (mode=132, page_id=..., zip_size=0,
unzip=<optimized out>) at /usr/src/debug/MariaDB-/src_0/storage/innobase/buf/buf0rea.cc:121
#5 0x000055b46844629f in buf_read_page_low (space=0x7fba70060c28, sync=true, mode=132, page_id=...,
zip_size=0, unzip=<optimized out>)
7. 1 DML/QUERY thread is holding X dictionary system lock and waiting for free page. This one blocks all threads in [3] and [4] asking for dictionary system locks eventually causing the fatal error.
Thread 296 (Thread 0x7fba7ec09700 (LWP 2781974)):
#2 __pthread_cond_timedwait (cond=cond@entry=0x55b468f926b0 <buf_pool+17328>,
mutex=mutex@entry=0x55b468f8e300 <buf_pool>, abstime=abstime@entry=0x7fba7ec047d0)
at pthread_cond_wait.c:667
#3 0x000055b468445e02 in buf_LRU_get_free_block (have_mutex=have_mutex@entry=false)
at /usr/src/debug/MariaDB-/src_0/storage/innobase/buf/buf0lru.cc:452
#4 0x000055b467d1c937 in buf_page_init_for_read (mode=132, page_id=..., zip_size=0,
unzip=<optimized out>) at /usr/src/debug/MariaDB-/src_0/storage/innobase/buf/buf0rea.cc:121
...
#8 0x000055b46841ec06 in btr_cur_t::search_leaf (this=this@entry=0x7fba7ec05810,
tuple=tuple@entry=0x7fba7ec05750, mode=mode@entry=PAGE_CUR_GE,
latch_mode=latch_mode@entry=BTR_SEARCH_LEAF, mtr=mtr@entry=0x7fba7ec05c20)
at /usr/src/debug/MariaDB-/src_0/storage/innobase/btr/btr0cur.cc:1219
...
#11 dict_load_table_one (name=..., ignore_err=<optimized out>, fk_tables=...)
at /usr/src/debug/MariaDB-/src_0/storage/innobase/dict/dict0load.cc:2368
#12 0x000055b468462943 in dict_sys_t::load_table (this=this@entry=0x55b468f93740 <dict_sys>, name=...,
ignore=ignore@entry=DICT_ERR_IGNORE_FK_NOKEY)
at /usr/src/debug/MariaDB-/src_0/storage/innobase/dict/dict0load.cc:2570
anonymous403
dict_locked=<optimized out>, ignore_err=DICT_ERR_IGNORE_FK_NOKEY)
at /usr/src/debug/MariaDB-/src_0/storage/innobase/dict/dict0dict.cc:1060
#14 0x000055b4682fd741 in ha_innobase::open(char const*, int, unsigned int) ()
at /usr/src/debug/MariaDB-/src_0/storage/innobase/handler/ha_innodb.cc:6191
8. 1 Page cleaner background thread is in "furious flush mode". This is blocking all threads in [6] and [7] by not generating any free page. The thread itself is stuck in synchronous flush as there are blocked mtrs holding page latches.
Thread 8 (Thread 0x7fba84e9e700 (LWP 3086274)):
#0 0x000055b4684412f8 in buf_do_flush_list_batch (lsn=76535143611932, max_n=2000)
at /usr/src/debug/MariaDB-/src_0/storage/innobase/buf/buf0flu.cc:1586
#1 buf_flush_list_holding_mutex (lsn=76535143611932, max_n=2000)
at /usr/src/debug/MariaDB-/src_0/storage/innobase/buf/buf0flu.cc:1586
#2 buf_flush_list_holding_mutex (max_n=2000, lsn=76535143611932)
at /usr/src/debug/MariaDB-/src_0/storage/innobase/buf/buf0flu.cc:1567
#3 0x000055b467cc3331 in buf_flush_list (max_n=2000, lsn=lsn@entry=76535143611932)
at /usr/src/debug/MariaDB-/src_0/storage/innobase/buf/buf0flu.cc:1612
#4 0x000055b467cc372d in buf_flush_sync_for_checkpoint (lsn=76535143611932)
at /usr/src/debug/MariaDB-/src_0/storage/innobase/buf/buf0flu.cc:2023
#5 buf_flush_page_cleaner () at /usr/src/debug/MariaDB-/src_0/storage/innobase/buf/buf0flu.cc:2294
It is clear from the analysis that the symptom is exactly same as
MDEV-34265fixed in 10.6.19 and there are not any other visible symptoms. Please upgrade to 10.6.19 and confirm.