[MDEV-29660] [ERROR] [FATAL] InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch
Created: 2022-09-28  Updated: 2023-09-18  Resolved: 2023-05-30

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 10.6.7
Fix Version/s: N/A

Type: Bug
Priority: Major
Reporter: Claudio Nanni
Assignee: Claudio Nanni
Resolution: Incomplete
Votes: 1
Labels: None

Issue Links:
Relates
relates to MDEV-29835 Partial server freeze Closed
relates to MDEV-29843 Server hang in thd_decrement_pending_... Closed
relates to MDEV-29883 Deadlock between InnoDB statistics up... Closed
relates to MDEV-32187 InnoDB: innodb_fatal_semaphore_wait_t... Open
relates to MDEV-27026 innodb_fts.concurrent_insert failed i... Closed

 Description   

Similar to MDEV-27026 but still in 10.6.7.
No core dump available.

2022-09-28  1:32:37 0 [ERROR] [FATAL] InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch. Please refer to https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/
220928  1:32:37 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
 
Server version: 10.6.7-MariaDB-1:10.6.7+maria~buster-log
key_buffer_size=16777216
read_buffer_size=131072
max_used_connections=130
max_threads=282
thread_count=131
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 637332 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x30000
mariadbd: /home/buildbot/buildbot/build/mariadb-10.6.7/tpool/task_group.cc:86: tpool::task_group::~task_group(): Assertion `m_queue.empty()' failed.
Printing to addr2line failed
/usr/sbin/mariadbd(my_print_stacktrace+0x2e)[0x55ecb25bd36e]
/usr/sbin/mariadbd(handle_fatal_signal+0x485)[0x55ecb20897b5]
mariadbd: /home/buildbot/buildbot/build/mariadb-10.6.7/tpool/task_group.cc:86: tpool::task_group::~task_group(): Assertion `m_queue.empty()' failed.
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12730)[0x7fbfd459f730]
mariadbd: /home/buildbot/buildbot/build/mariadb-10.6.7/tpool/task_group.cc:86: tpool::task_group::~task_group(): Assertion `m_queue.empty()' failed.
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x10b)[0x7fbfd40f67bb]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x121)[0x7fbfd40e1535]
mariadbd: /home/buildbot/buildbot/build/mariadb-10.6.7/tpool/task_group.cc:86: tpool::task_group::~task_group(): Assertion `m_queue.empty()' failed.
/usr/sbin/mariadbd(+0x651ec7)[0x55ecb1d43ec7]
/usr/sbin/mariadbd(+0x64a105)[0x55ecb1d3c105]
/usr/sbin/mariadbd(tpool::thread_pool_generic::timer_generic::execute(void*)+0x35)[0x55ecb25516b5]
/usr/sbin/mariadbd(tpool::task::execute()+0x2b)[0x55ecb25525eb]
/usr/sbin/mariadbd(tpool::thread_pool_generic::worker_main(tpool::worker_data*)+0x4f)[0x55ecb25511ff]
mariadbd: /home/buildbot/buildbot/build/mariadb-10.6.7/tpool/task_group.cc:86: tpool::task_group::~task_group(): Assertion `m_queue.empty()' failed.
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xbbb2f)[0x7fbfd44bfb2f]
mariadbd: /home/buildbot/buildbot/build/mariadb-10.6.7/tpool/task_group.cc:86: tpool::task_group::~task_group(): Assertion `m_queue.empty()' failed.
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7fa3)[0x7fbfd4594fa3]
mariadbd: /home/buildbot/buildbot/build/mariadb-10.6.7/tpool/task_group.cc:86: tpool::task_group::~task_group(): Assertion `m_queue.empty()' failed.
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7fbfd41b7eff]
The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
information that should help you find out what is causing the crash.
Writing a core file...
Working directory at /var/lib/mysql



 Comments   
Comment by Marko Mäkelä [ 2022-09-28 ]

claudio.nanni, without the stack traces of all threads that were active at the time the watchdog task killed the process, we cannot possibly diagnose the source of the hang.

The stack trace of the watchdog task itself is totally uninteresting. We need to know what the other threads in the process were doing.
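If a core file is available, one way to collect the stack traces of all threads is to run gdb in batch mode with `thread apply all backtrace`. The following is a minimal sketch, assuming gdb and debug symbols matching the server binary are installed. The helper names and the example paths are hypothetical and not part of this report.

```python
import subprocess


def gdb_all_threads_command(binary, core):
    """Build a gdb batch command that prints the backtrace of every thread."""
    return [
        "gdb", "--batch",
        "-ex", "set pagination off",
        "-ex", "thread apply all backtrace",
        binary, core,
    ]


def dump_all_threads(binary, core, out_path):
    """Run gdb and save its output to out_path (requires gdb on PATH)."""
    with open(out_path, "w") as out:
        subprocess.run(
            gdb_all_threads_command(binary, core),
            stdout=out, stderr=subprocess.STDOUT, check=True,
        )


if __name__ == "__main__":
    # Hypothetical paths; adjust them to the actual binary and core file.
    cmd = gdb_all_threads_command("/usr/sbin/mariadbd", "/var/lib/mysql/core")
    print(" ".join(cmd))
```

The resulting text file is usually far smaller than the core dump itself, which makes it easier to attach to a ticket.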

Comment by Claudio Nanni [ 2022-09-28 ]

marko, that is all that is available, and the crash has happened only once so far. I can't predict when it will happen again. I can ask them to set things up so that we get a core dump next time.

Comment by Claudio Nanni [ 2022-10-03 ]

Hello Julien, sure, I agree. They had another crash but did not provide a core dump; I reminded them that we need one to proceed.

Comment by Marko Mäkelä [ 2022-10-25 ]

This might be the same as MDEV-29843.

Comment by Marko Mäkelä [ 2022-10-29 ]

Even more likely explanations of this hang are MDEV-29835 and MDEV-29883.

Comment by Marko Mäkelä [ 2023-03-06 ]

Part of MDEV-29835 was fixed in MDEV-30400. There have also been other hangs fixed in more recent releases of MariaDB 10.6, such as MDEV-29883.

Can you please provide fully resolved stack traces of the hung server?
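Frames such as `/usr/sbin/mariadbd(+0x651ec7)[0x55ecb1d43ec7]` in the log above carry only a module-relative offset. A rough sketch of how such frames could be turned into `addr2line` invocations follows. It assumes a position-independent executable and matching debug symbols; the function names are hypothetical. It does not replace the stack traces of all threads requested here.

```python
import re

# Matches frames like "/usr/sbin/mariadbd(+0x651ec7)[0x55ecb1d43ec7]" and
# "/usr/sbin/mariadbd(my_print_stacktrace+0x2e)[0x55ecb25bd36e]".
FRAME_RE = re.compile(
    r"^(?P<module>[^\s(]+)\((?P<symbol>.*)\+(?P<offset>0x[0-9a-fA-F]+)\)"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


def parse_frame(line):
    """Return module, symbol, offset and address, or None if not a frame."""
    match = FRAME_RE.match(line.strip())
    return match.groupdict() if match else None


def addr2line_command(frame):
    """Build an addr2line call for a frame that has no symbol name.

    Frames with a symbol are skipped because their offset is relative
    to that symbol, not to the start of the module.
    """
    if frame["symbol"]:
        return None
    return ["addr2line", "-f", "-C", "-e", frame["module"], frame["offset"]]


if __name__ == "__main__":
    frame = parse_frame("/usr/sbin/mariadbd(+0x651ec7)[0x55ecb1d43ec7]")
    print(" ".join(addr2line_command(frame)))
```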

Comment by Marko Mäkelä [ 2023-04-26 ]

Are there more stack traces of hanging threads?

Note that for releases between 10.6.9 and 10.6.12, MDEV-29835 is a very popular source of InnoDB hangs. That as well as MDEV-31132 will be fixed in the upcoming 10.6.13 release.
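As a quick illustration, the version range named above can be checked against the "Server version" line of a crash log. This is only a sketch: the function names are hypothetical, and it covers only the 10.6 range mentioned in this comment. A version outside the range can still be affected by other hangs.

```python
import re

# Range taken from the comment above: 10.6.9 through 10.6.12.
AFFECTED_FIRST = (10, 6, 9)
AFFECTED_LAST = (10, 6, 12)


def parse_version(version_string):
    """Extract (major, minor, patch) from a string like '10.6.7-MariaDB-...'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError("unrecognized version: " + version_string)
    return tuple(int(part) for part in match.groups())


def in_mdev_29835_range(version_string):
    """True if the version lies in the 10.6.9 to 10.6.12 range."""
    return AFFECTED_FIRST <= parse_version(version_string) <= AFFECTED_LAST


if __name__ == "__main__":
    for version in ("10.6.7-MariaDB-1:10.6.7+maria~buster-log", "10.6.15-MariaDB"):
        print(version, in_mdev_29835_range(version))
```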

Comment by Andras [ 2023-09-14 ]

I got here by searching for the error message.
I got the same error. I have a dump, but it is 32 GB.

2023-09-14 10:50:31 0 [ERROR] [FATAL] InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch. Please refer to https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/
230914 10:50:31 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.6.15-MariaDB source revision: 0d16eb35bc981023ce2f4912e8ecde68ca381f4e
key_buffer_size=29360128
read_buffer_size=1048576
max_used_connections=1001
max_threads=1002
thread_count=1001
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2107012 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x49000
/usr/sbin/mariadbd(my_print_stacktrace+0x2e)[0x5579d4abf37e]
/usr/sbin/mariadbd(handle_fatal_signal+0x307)[0x5579d4511cc7]
sigaction.c:0(__restore_rt)[0x7ff507390630]
/lib64/libc.so.6(gsignal+0x37)[0x7ff5067db387]
/lib64/libc.so.6(abort+0x148)[0x7ff5067dca78]
/usr/sbin/mariadbd(+0xe23fd0)[0x5579d4949fd0]
/usr/sbin/mariadbd(+0xddb4a6)[0x5579d49014a6]
/usr/sbin/mariadbd(_ZN5tpool19thread_pool_generic13timer_generic7executeEPv+0x40)[0x5579d4a490d0]
/usr/sbin/mariadbd(_ZN5tpool4task7executeEv+0x2b)[0x5579d4a4a05b]
/usr/sbin/mariadbd(_ZN5tpool19thread_pool_generic11worker_mainEPNS_11worker_dataE+0x61)[0x5579d4a47441]
/lib64/libstdc++.so.6(+0xb5330)[0x7ff506f2a330]
pthread_create.c:0(start_thread)[0x7ff507388ea5]
/lib64/libc.so.6(clone+0x6d)[0x7ff5068a3b0d]
The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
information that should help you find out what is causing the crash.
Writing a core file...
Working directory at /var/lib/mysql
Resource Limits:
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             256582               256582               processes
Max open files            1048576              1048576              files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       256582               256582               signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
Core pattern: core

Kernel version: Linux version 3.10.0-1160.59.1.el7.x86_64 (mockbuild@kbuilder.bsys.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC) ) #1 SMP Wed Feb 23 16:47:03 UTC 2022

Comment by Olivier LEVILLAIN [ 2023-09-16 ]

Same for me on 11.1.2 on OCI, VM.Standard.E2.1.Micro with Oracle Linux 8, MariaDB running in Docker

{{Version: '11.1.2-MariaDB-1:11.1.2+maria~ubu2204' socket: '/run/mysqld/mysqld.sock' port: 3306 mariadb.org binary distribution
2023-09-12 5:45:33 0 [ERROR] [FATAL] InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch. Please refer to https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/
230912 6:05:04 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 11.1.2-MariaDB-1:11.1.2+maria~ubu2204 source revision: 9bc25d98209df6810f7a7d5e7dd3ae677a313ab5
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=140
max_threads=153
thread_count=114
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 468041 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x49000
mariadbd(my_print_stacktrace+0x32)[0x55b02f35c7c2]
Printing to addr2line failed
mariadbd(handle_fatal_signal+0x488)[0x55b02ee35cf8]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f8640fcf520]
/lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c)[0x7f8641023a7c]
/lib/x86_64-linux-gnu/libc.so.6(raise+0x16)[0x7f8640fcf476]
/lib/x86_64-linux-gnu/libc.so.6(abort+0xd3)[0x7f8640fb57f3]
mariadbd(+0x6b1a21)[0x55b02ea41a21]
mariadbd(+0x6a7eea)[0x55b02ea37eea]
mariadbd(_ZN5tpool19thread_pool_generic13timer_generic7executeEPv+0x40)[0x55b02f2f1d20]
mariadbd(_ZN5tpool4task7executeEv+0x3a)[0x55b02f2f27da]
mariadbd(_ZN5tpool19thread_pool_generic11worker_mainEPNS_11worker_dataE+0x57)[0x55b02f2f0867]
/lib/x86_64-linux-gnu/libstdc++.so.6(+0xdc253)[0x7f8641378253]
/lib/x86_64-linux-gnu/libc.so.6(+0x94b43)[0x7f8641021b43]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x44)[0x7f86410b2bb4]
The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
information that should help you find out what is causing the crash.
Writing a core file...
Working directory at /var/lib/mysql
Resource Limits:
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             unlimited            unlimited            processes
Max open files            1048576              1048576              files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       3385                 3385                 signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
Core pattern: |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h %e

Kernel version: Linux version 5.15.0-103.114.4.el8uek.x86_64 (mockbuild@host-100-100-224-28) (gcc (GCC) 11.2.1 20220127 (Red Hat 11.2.1-9.1.0.3), GNU ld version 2.36.1-2.0.1.el8) #2 SMP Mon Jun 26 10:13:01 PDT 2023
}}
I can't find the core dump
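The "Core pattern" line in the log hints at where the kernel delivers the core. On Linux, a pattern that starts with `|` is piped to a helper program instead of being written as a file. A pattern without a leading slash is a file name relative to the working directory of the crashed process. The sketch below classifies a pattern along those lines. The function name is hypothetical, and `/var/lib/mysql` is simply the working directory printed in the logs above.

```python
def describe_core_pattern(pattern, working_directory="/var/lib/mysql"):
    """Describe where a core dump is expected to go for a given core pattern."""
    pattern = pattern.strip()
    if pattern.startswith("|"):
        # The rest of the pattern is the command line of a helper program.
        parts = pattern[1:].split()
        helper = parts[0] if parts else "(none)"
        return "piped to helper program " + helper
    if pattern.startswith("/"):
        return "written to absolute path template " + pattern
    return "written under " + working_directory + " using template " + pattern


if __name__ == "__main__":
    print(describe_core_pattern("core"))
    print(describe_core_pattern(
        "|/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h %e"))
```

With the `systemd-coredump` pattern shown in this log, the dump would typically be listed by `coredumpctl list` rather than appear in the working directory. For a server running in Docker, the core pattern is normally a host-wide kernel setting, so the dump may end up on the host instead of inside the container. This is an assumption about this setup, not something the log confirms.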

Comment by Marko Mäkelä [ 2023-09-18 ]

In MDEV-32049, stack traces were provided from a Docker container. That could be the hang that you are experiencing. Without seeing the stack traces of all threads, it is impossible to conclude anything.

Generated at Thu Feb 08 10:10:19 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.