[MDEV-34469] mariadb10.6.18 InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch. - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Incomplete
Affects Version/s: 10.6.18
Fix Version/s: 10.6.19, 10.11.9, 11.1.6, 11.2.5, 11.4.3
Component/s: Storage Engine - InnoDB
Labels:
- hang
Environment:
windows

Description

A dump file appeared during the runtime of mysqld.exe，Please refer to the attachment for specific information.

2024-06-25 1:46:21 103 [Warning] Access denied for user 'ޔ^p?'?Vf?J>???'@'192.168.1.91' (using password: YES)
2024-06-25 1:46:31 112 [Warning] Access denied for user 'ޔ^p?'?Vf?J>???'@'192.168.1.91' (using password: YES)
2024-06-25 1:46:31 113 [Warning] Access denied for user 'ޔ^p?'?Vf?J>???'@'192.168.1.91' (using password: YES)
2024-06-25 1:46:41 133 [Warning] Access denied for user 'ޔ^p?'?Vf?J>???'@'192.168.1.91' (using password: YES)
2024-06-25 1:46:51 139 [Warning] Access denied for user 'ޔ^p?'?Vf?J>???'@'192.168.1.91' (using password: YES)
2024-06-25 1:47:01 140 [Warning] Access denied for user 'ޔ^p?'?Vf?J>???'@'192.168.1.91' (using password: YES)
2024-06-25 1:47:11 141 [Warning] Access denied for user 'ޔ^p?'?Vf?J>???'@'192.168.1.91' (using password: YES)
2024-06-25 1:47:21 143 [Warning] Access denied for user 'ޔ^p?'?Vf?J>???'@'192.168.1.91' (using password: YES)
2024-06-25 2:21:00 197 [Warning] InnoDB: Could not free any blocks in the buffer pool! 129856 blocks are in use and 0 free. Consider increasing innodb_buffer_pool_size.
2024-06-25 2:33:44 0 [Warning] InnoDB: Long wait (150 seconds) for dict_sys.latch
2024-06-25 2:36:14 0 [Warning] InnoDB: Long wait (300 seconds) for dict_sys.latch
2024-06-25 2:38:44 0 [Warning] InnoDB: Long wait (450 seconds) for dict_sys.latch
2024-06-25 2:41:14 0 [ERROR] [FATAL] InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch. Please refer to https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/
240625 2:41:14 [ERROR] mysqld got exception 0x80000003 ;
Sorry, we probably made a mistake, and this is a bug.

Your assistance in bug reporting will enable us to fix this for the next release.
To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.6.18-MariaDB-log source revision: 887bb3f73555ff8a50138a580ca8308b9b5c069c
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=173
max_threads=65537
thread_count=163
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1082300030 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
server.dll!my_parameter_handler()
ucrtbase.dll!raise()
ucrtbase.dll!abort()
server.dll!json_valid()
server.dll!json_valid()
server.dll!json_valid()
ntdll.dll!TpReleaseWork()
ntdll.dll!RtlInitializeResource()
KERNEL32.DLL!BaseThreadInitThunk()
ntdll.dll!RtlUserThreadStart()
The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mariadbd/ contains
information that should help you find out what is causing the crash.
Writing a core file at C:\DS\DS Server\mysql\data
Minidump written to C:\DS\DS Server\mysql\data\mysqld.dmp

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

all_threads_stack.txt
180 kB
2024-06-27 04:46
all_threads_stacks_with_parameters.txt
693 kB
2024-06-27 05:42
lQLPJwlcXeTbu1XNAQPNBJ-w5wT7buvv410GZs8gyy2rAQ_1183_259.png
36 kB
2024-06-27 01:17
lQLPKdPqIHHZLFXNAXDNAwawhbjO1HdK7gMGZs8gyy2rAA_774_368.png
21 kB
2024-06-27 01:17
mysqld.dmp
1.61 MB
2024-06-27 01:17

Issue Links

duplicates

MDEV-34265 Possible hang during IO burst with innodb_flush_sync enabled

Closed

relates to

MDEV-34442 InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch

Closed

Activity

Ascending order - Click to sort in descending order

Vladislav Vaintroub added a comment - 2024-06-27 04:51

resolved all threads stack using valerii excellent documentation in https://mysqlentomologist.blogspot.com/2018/03/windows-tools-for-mysql-dbas-basic.html

Vladislav Vaintroub added a comment - 2024-06-27 04:51 resolved all threads stack using valerii excellent documentation in https://mysqlentomologist.blogspot.com/2018/03/windows-tools-for-mysql-dbas-basic.html

Marko Mäkelä added a comment - 2024-06-27 05:05

The all_threads_stack.txt does not contain any function parameters, so we can’t know which page latches are being waited for. That said, this looks similar to ~~MDEV-34442~~, to which debarun posted an analysis. That bug was likely fixed in ~~MDEV-34265~~ in the upcoming 10.6.19 release. Do we have any packages of development snapshots for Microsoft Windows, for kennyliao to test?

Marko Mäkelä added a comment - 2024-06-27 05:05 The all_threads_stack.txt does not contain any function parameters, so we can’t know which page latches are being waited for. That said, this looks similar to MDEV-34442 , to which debarun posted an analysis. That bug was likely fixed in MDEV-34265 in the upcoming 10.6.19 release. Do we have any packages of development snapshots for Microsoft Windows, for kennyliao to test?

Vladislav Vaintroub added a comment - 2024-06-27 05:43

I added all_threads_stacks_with_parameters.txt , not sure it is better, but larger it is.
I do not know about development snapshots, that question is probably to vladbogo

Vladislav Vaintroub added a comment - 2024-06-27 05:43 I added all_threads_stacks_with_parameters.txt , not sure it is better, but larger it is. I do not know about development snapshots, that question is probably to vladbogo

Debarun Banerjee added a comment - 2024-07-16 06:38

I could analyzed the stack trace.
1. page cleaner is in furious flush stage.

000000f0`383ff8f0 00007ffd`d5bf2f8b ntdll!RtlEnterCriticalSection+0x13

(Inline Function) --------`-------- server!inline_mysql_mutex_lock+0x31 [D:\winx64-packages\build\src\include\mysql\psi\mysql_thread.h @ 752]

000000f0`383ff920 00007ffd`d5bf486d server!buf_flush_list_holding_mutex+0x3b [D:\winx64-packages\build\src\storage\innobase\buf\buf0flu.cc @ 1574]

(Inline Function) --------`-------- server!buf_flush_list+0x3b [D:\winx64-packages\build\src\storage\innobase\buf\buf0flu.cc @ 1612]

000000f0`383ff950 00007ffd`d5bf4108 server!buf_flush_sync_for_checkpoint+0x5d [D:\winx64-packages\build\src\storage\innobase\buf\buf0flu.cc @ 2023]

000000f0`383ff990 00007ffd`d5ba4f4b server!buf_flush_page_cleaner+0xc78 [D:\winx64-packages\build\src\storage\innobase\buf\buf0flu.cc @ 2295]

(Inline Function) --------`-------- server!std::invoke+0x2 [C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.39.33519\include\type_traits @ 1729]

2. Multiple threads waiting for free page in buf_LRU_get_free_block. The system is out of free pages.

2024-06-25  2:21:00 197 [Warning] InnoDB: Could not free any blocks in the buffer pool! 129856 blocks are in use and 0 free. Consider increasing innodb_buffer_pool_size.

One such DML thread is stuck while creating undo log segment holding rollback segment latch.

3. Stat background holds dictionary lock and blocked on the one of the rseg latch which is likely to be the one held by above DML.

(Inline Function) --------`-------- server!srw_lock_impl<1>::wr_lock(void)+0x36 [D:\winx64-packages\build\src\storage\innobase\include\srw_lock.h @ 528]

000000f0`3899e210 00007ffd`d5d04a1e server!trx_undo_assign_low<0>(

000000f0`3899e270 00007ffd`d5c5d3bc server!trx_undo_report_row_operation(

000000f0`3899e6f0 00007ffd`d5ca2e0e server!btr_cur_del_mark_set_clust_rec(

(Inline Function) --------`-------- server!row_upd_del_mark_clust_rec(void)+0x7a [D:\winx64-packages\build\src\storage\innobase\row\row0upd.cc @ 2515]

000000f0`3899e760 00007ffd`d5ca0b82 server!row_upd_clust_step(

000000f0`3899ee00 00007ffd`d5ca4503 server!row_upd(

000000f0`3899ee40 00007ffd`d5c83269 server!row_upd_step(

000000f0`3899ee80 00007ffd`d5c8277a server!que_thr_step(

(Inline Function) --------`-------- server!que_run_threads_low(void)+0x2a [D:\winx64-packages\build\src\storage\innobase\que\que0que.cc @ 644]

(Inline Function) --------`-------- server!que_run_threads(void)+0x68 [D:\winx64-packages\build\src\storage\innobase\que\que0que.cc @ 664]

000000f0`3899eec0 00007ffd`d5c4199e server!que_eval_sql(

000000f0`3899eef0 00007ffd`d5c430c9 server!dict_stats_save(

4. Other query and DML trying to open any innodb table is stuck acquiring dict_sys mutex resulting in

2024-06-25 2:38:44 0 [Warning] InnoDB: Long wait (450 seconds) for dict_sys.latch

2024-06-25 2:41:14 0 [ERROR] [FATAL] InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch.

So, it is showing the symptom fixed by ~~MDEV-34265~~ (10.6.19) and also seen in ~~MDEV-34442~~. It should be resolve in 10.6.19.

Copying update from @marko in ~~MDEV-34442~~ for possible early verification.
_
The 10.6.19 release is expected to occur in a couple of months. Meanwhile, it would be great if you can test if this problem goes away with a 10.6 development snapshot. You can find some installation packages at https://ci.mariadb.org/47867/ for the currently latest 10.6 build._

Debarun Banerjee added a comment - 2024-07-16 06:38 I could analyzed the stack trace. 1. page cleaner is in furious flush stage. 000000f0`383ff8f0 00007ffd`d5bf2f8b ntdll!RtlEnterCriticalSection+0x13 (Inline Function) --------`-------- server!inline_mysql_mutex_lock+0x31 [D:\winx64-packages\build\src\include\mysql\psi\mysql_thread.h @ 752] 000000f0`383ff920 00007ffd`d5bf486d server!buf_flush_list_holding_mutex+0x3b [D:\winx64-packages\build\src\storage\innobase\buf\buf0flu.cc @ 1574] (Inline Function) --------`-------- server!buf_flush_list+0x3b [D:\winx64-packages\build\src\storage\innobase\buf\buf0flu.cc @ 1612] 000000f0`383ff950 00007ffd`d5bf4108 server!buf_flush_sync_for_checkpoint+0x5d [D:\winx64-packages\build\src\storage\innobase\buf\buf0flu.cc @ 2023] 000000f0`383ff990 00007ffd`d5ba4f4b server!buf_flush_page_cleaner+0xc78 [D:\winx64-packages\build\src\storage\innobase\buf\buf0flu.cc @ 2295] (Inline Function) --------`-------- server!std::invoke+0x2 [C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.39.33519\include\type_traits @ 1729] 2. Multiple threads waiting for free page in buf_LRU_get_free_block. The system is out of free pages. 2024-06-25 2:21:00 197 [Warning] InnoDB: Could not free any blocks in the buffer pool! 129856 blocks are in use and 0 free. Consider increasing innodb_buffer_pool_size. One such DML thread is stuck while creating undo log segment holding rollback segment latch. 3. Stat background holds dictionary lock and blocked on the one of the rseg latch which is likely to be the one held by above DML. (Inline Function) --------`-------- server!srw_lock_impl<1>::wr_lock(void)+0x36 [D:\winx64-packages\build\src\storage\innobase\include\srw_lock.h @ 528] 000000f0`3899e210 00007ffd`d5d04a1e server!trx_undo_assign_low<0>( 000000f0`3899e270 00007ffd`d5c5d3bc server!trx_undo_report_row_operation( 000000f0`3899e6f0 00007ffd`d5ca2e0e server!btr_cur_del_mark_set_clust_rec( (Inline Function) --------`-------- server!row_upd_del_mark_clust_rec(void)+0x7a [D:\winx64-packages\build\src\storage\innobase\row\row0upd.cc @ 2515] 000000f0`3899e760 00007ffd`d5ca0b82 server!row_upd_clust_step( 000000f0`3899ee00 00007ffd`d5ca4503 server!row_upd( 000000f0`3899ee40 00007ffd`d5c83269 server!row_upd_step( 000000f0`3899ee80 00007ffd`d5c8277a server!que_thr_step( (Inline Function) --------`-------- server!que_run_threads_low(void)+0x2a [D:\winx64-packages\build\src\storage\innobase\que\que0que.cc @ 644] (Inline Function) --------`-------- server!que_run_threads(void)+0x68 [D:\winx64-packages\build\src\storage\innobase\que\que0que.cc @ 664] 000000f0`3899eec0 00007ffd`d5c4199e server!que_eval_sql( 000000f0`3899eef0 00007ffd`d5c430c9 server!dict_stats_save( 4. Other query and DML trying to open any innodb table is stuck acquiring dict_sys mutex resulting in 2024-06-25 2:38:44 0 [Warning] InnoDB: Long wait (450 seconds) for dict_sys.latch 2024-06-25 2:41:14 0 [ERROR] [FATAL] InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch. So, it is showing the symptom fixed by MDEV-34265 (10.6.19) and also seen in MDEV-34442 . It should be resolve in 10.6.19. Copying update from @marko in MDEV-34442 for possible early verification. _ The 10.6.19 release is expected to occur in a couple of months. Meanwhile, it would be great if you can test if this problem goes away with a 10.6 development snapshot. You can find some installation packages at https://ci.mariadb.org/47867/ for the currently latest 10.6 build._

Elena Stepanova added a comment - 2024-08-18 10:51

10.6.19 has been released now, so hopefully kennyliao can now check whether the upgrade helps using an official release.

Elena Stepanova added a comment - 2024-08-18 10:51 10.6.19 has been released now, so hopefully kennyliao can now check whether the upgrade helps using an official release.

Debarun Banerjee added a comment - 2024-09-12 07:21

Since the symptom is exactly same as ~~MDEV-34265~~, I am closing this issue as duplicate. It should be fixed in 10.6.19.

Debarun Banerjee added a comment - 2024-09-12 07:21 Since the symptom is exactly same as MDEV-34265 , I am closing this issue as duplicate. It should be fixed in 10.6.19.

Muhammad Haroon added a comment - 2025-01-28 08:46

The same issue i'm facing in 10.6.19 version .

Logs details:
2025-01-28 8:36:40 0 [ERROR] [FATAL] InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch. Please refer to https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/
250128 8:36:40 [ERROR] mysqld got signal 6 ;
Sorry, we probably made a mistake, and this is a bug.

Your assistance in bug reporting will enable us to fix this for the next release.
To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.6.19-MariaDB-ubu2204 source revision: 8f020508c87461bbeee551c3001acac56d506ae7
key_buffer_size=104857600
read_buffer_size=131072
max_used_connections=64
max_threads=302
thread_count=49
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 767495 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x49000
2025-01-28 8:36:46 0 [Note] /usr/sbin/mariadbd (initiated by: unknown): Normal shutdown
/usr/sbin/mariadbd(my_print_stacktrace+0x32)[0x5578f342db52]
/usr/sbin/mariadbd(handle_fatal_signal+0x478)[0x5578f2ecfa18]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7fc5324f7520]
/lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c)[0x7fc53254b9fc]
/lib/x86_64-linux-gnu/libc.so.6(raise+0x16)[0x7fc5324f7476]
/lib/x86_64-linux-gnu/libc.so.6(abort+0xd3)[0x7fc5324dd7f3]

Version :
mariadb.service - MariaDB 10.6.19 database server
Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)

Muhammad Haroon added a comment - 2025-01-28 08:46 The same issue i'm facing in 10.6.19 version . Logs details: 2025-01-28 8:36:40 0 [ERROR] [FATAL] InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch. Please refer to https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ 250128 8:36:40 [ERROR] mysqld got signal 6 ; Sorry, we probably made a mistake, and this is a bug. Your assistance in bug reporting will enable us to fix this for the next release. To report this bug, see https://mariadb.com/kb/en/reporting-bugs We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. Server version: 10.6.19-MariaDB-ubu2204 source revision: 8f020508c87461bbeee551c3001acac56d506ae7 key_buffer_size=104857600 read_buffer_size=131072 max_used_connections=64 max_threads=302 thread_count=49 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 767495 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. Thread pointer: 0x0 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... stack_bottom = 0x0 thread_stack 0x49000 2025-01-28 8:36:46 0 [Note] /usr/sbin/mariadbd (initiated by: unknown): Normal shutdown /usr/sbin/mariadbd(my_print_stacktrace+0x32) [0x5578f342db52] /usr/sbin/mariadbd(handle_fatal_signal+0x478) [0x5578f2ecfa18] /lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7fc5324f7520] /lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c) [0x7fc53254b9fc] /lib/x86_64-linux-gnu/libc.so.6(raise+0x16) [0x7fc5324f7476] /lib/x86_64-linux-gnu/libc.so.6(abort+0xd3) [0x7fc5324dd7f3] Version : mariadb.service - MariaDB 10.6.19 database server Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)

Debarun Banerjee added a comment - 2025-01-28 10:51

Hello haroonmasood
"innodb_fatal_semaphore_wait_threshold was exceeded" is rather generic error and dict_sys.latch wait is induced by another thread. The issue could actually be completely unrelated to the current bug and a different one. One definite way to verify this is to find out the stack trace for all other threads look for buf_flush_page_cleaner stack. For the current bug the page cleaner should be stuck in buf_flush_sync_for_checkpoint.

inline_mysql_mutex_lock

buf_flush_list_holding_mutex

buf_flush_list

buf_flush_sync_for_checkpoint

buf_flush_page_cleaner

Since you face this issue 10.6.19, I suspect it is a different issue and not related to page cleaner.
Can you please collect the stack trace for all the threads (similar to all_threads.txt uploaded for current bug) ? Please file another bug, if the page cleaner stack is different,

Debarun Banerjee added a comment - 2025-01-28 10:51 Hello haroonmasood "innodb_fatal_semaphore_wait_threshold was exceeded" is rather generic error and dict_sys.latch wait is induced by another thread. The issue could actually be completely unrelated to the current bug and a different one. One definite way to verify this is to find out the stack trace for all other threads look for buf_flush_page_cleaner stack. For the current bug the page cleaner should be stuck in buf_flush_sync_for_checkpoint. inline_mysql_mutex_lock buf_flush_list_holding_mutex buf_flush_list buf_flush_sync_for_checkpoint buf_flush_page_cleaner Since you face this issue 10.6.19, I suspect it is a different issue and not related to page cleaner. Can you please collect the stack trace for all the threads (similar to all_threads.txt uploaded for current bug) ? Please file another bug, if the page cleaner stack is different,

People

Assignee:: Debarun Banerjee

Reporter:: kennyliao

Votes:: 0 Vote for this issue

Watchers:: 7 Start watching this issue

Dates

Created:: 2024-06-27 01:21

Updated:: 2025-01-28 10:54

Resolved:: 2024-09-12 07:17

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server

Details

Description

Attachments

Attachments

Issue Links

Activity

People

Dates

Git Integration