Debian 11 Bullseye; MariaDB from the Maria repositories.
Description
I've had two different servers this week hit an issue with a semaphore wait lasting forever. This bug report isn't about what caused those wait-forever situations: it's about how the server handled them.
If I'm reading the code in storage/innobase/srv/srv0srv.cc correctly, there is supposed to be a warning at 1/4 of the innodb_fatal_semaphore_wait_threshold value, at 1/2 of that value, and at 3/4 of that value before killing the server once the threshold value is reached.
Examining the syslog, however, shows that this doesn't seem to be working. One of my servers gave no warnings at all: it waited the full 600 seconds and then logged the fatal error as it crashed itself. The "Long wait" message never appeared.
The other server did give the "Long wait" warning, starting at 150 seconds, which is the correct time to start (1/4 of 600). I would then expect warnings at 300 and 450 seconds. Instead, it warned again at 159 seconds, and then every 10 seconds after that until 289 seconds, at which point I killed the server myself rather than waiting the full 600.
I would like to change my innodb_fatal_semaphore_wait_threshold setting to a dramatically lower number. I'll be able to tell whether it's safe to do so by making incremental changes and observing the presence or absence of "Long wait" warnings. However, if these warnings aren't behaving the way they're supposed to, that won't work.
Apologies if I'm misunderstanding what the behavior is supposed to be. It doesn't seem like zero warnings in some cases and warnings every 10 seconds in other cases would be the intended behavior.
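For reference, here is a minimal sketch of the schedule I am expecting. This only illustrates the arithmetic with my 600-second threshold; it is not the actual srv0srv.cc code.
// Illustration only: expected warning schedule for
// innodb_fatal_semaphore_wait_threshold = 600, i.e. warnings after 150, 300
// and 450 seconds of waiting, and the fatal abort once 600 seconds is reached.
#include <cstdio>

int main()
{
  const unsigned threshold= 600;   // innodb_fatal_semaphore_wait_threshold
  for (unsigned waited= 1; waited <= threshold; waited++)
  {
    if (waited == threshold)
      std::printf("[ERROR] [FATAL] threshold of %u s exceeded, aborting\n",
                  threshold);
    else if (waited % (threshold / 4) == 0)
      std::printf("[Warning] long wait: %u s out of %u s\n", waited, threshold);
  }
  return 0;
}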
Issue Links
is duplicated by: MDEV-32909 Hang not caught by semaphore wait threshold (Closed)
relates to: MDEV-24911 Missing warning before [ERROR] [FATAL] InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.mutex
The "long semaphore wait" diagnostics and sync_array (MDEV-11972) used to cover all InnoDB rw-locks and mutexes until MariaDB Server 10.4. In 10.5, some mutexes were converted to normal ones. In 10.6, MDEV-21452, MDEV-24167, and MDEV-24142 finally got rid of the sync_array. I thought that most waits involve the data dictionary latch at some point, and therefore I implemented a watchdog on dict_sys.latch only. It looks like this watchdog finally kicked in also in your case, but with more delay.
Can you please provide stack traces of all threads during the hangs? That is the only reliable way to identify and to fix hangs. A known hang that affects the currently latest releases is MDEV-32049.
Marko Mäkelä
added a comment - The "long semaphore wait" diagnostics and sync_array (MDEV-11972) used to cover all InnoDB rw-locks and mutexes until MariaDB Server 10.4. In 10.5, some mutexes were converted to normal ones. In 10.6, MDEV-21452, MDEV-24167, and MDEV-24142 finally got rid of the sync_array. I thought that most waits involve the data dictionary latch at some point, and therefore I implemented a watchdog on dict_sys.latch only. It looks like this watchdog finally kicked in also in your case, but with more delay.
Can you please provide stack traces of all threads during the hangs? That is the only reliable way to identify and to fix hangs. A known hang that affects the currently latest releases is MDEV-32049.
Xan Charbonnet
added a comment - Marko,
I'm not sure I understand your explanation of what ended up causing my weird pattern of "long wait" warnings, but I certainly appreciate your help!
I'm afraid the only stack trace I have is the one where no warnings were given. For the other one, the one where warnings were appearing every 10 seconds, I killed the server manually rather than letting it time out and die. I've shrunk the timeout from 600 seconds to 152 seconds, so maybe something will be happening more frequently going forward.
Here's what went into the syslog:
2023-10-04 1:31:50 0 [ERROR] [FATAL] InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch. Please refer to https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/
231004 1:31:50 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Server version: 10.11.5-MariaDB-1:10.11.5+maria~deb11-log source revision: 7875294b6b74b53dd3aaa723e6cc103d2bb47b2c
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=188
max_threads=505
thread_count=197
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1243339 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x49000
2023-10-04 1:31:51 0 [Warning] WSREP: last inactive check more than PT1.5S ago (PT1.68234S), skipping check
/usr/sbin/mariadbd(my_print_stacktrace+0x2e) [0x55bb081c7eee]
/usr/sbin/mariadbd(handle_fatal_signal+0x485) [0x55bb07ca5365]
sigaction.c:0(__restore_rt) [0x7f553d5f1140]
2023-10-04 1:31:58 0 [Warning] WSREP: last inactive check more than PT1.5S ago (PT1.59027S), skipping check
linux/raise.c:51(__GI_raise) [0x7f553cd53ce1]
stdlib/abort.c:81(__GI_abort) [0x7f553cd3d537]
2023-10-04 1:32:02 0 [Warning] WSREP: last inactive check more than PT1.5S ago (PT1.56904S), skipping check
/usr/sbin/mariadbd(+0x69d8c7) [0x55bb078d38c7]
/usr/sbin/mariadbd(+0x694151) [0x55bb078ca151]
/usr/sbin/mariadbd(_ZN5tpool19thread_pool_generic13timer_generic7executeEPv+0x38) [0x55bb0815fc18]
/usr/sbin/mariadbd(_ZN5tpool4task7executeEv+0x32) [0x55bb081609c2]
/usr/sbin/mariadbd(_ZN5tpool19thread_pool_generic11worker_mainEPNS_11worker_dataE+0x4f) [0x55bb0815e73f]
2023-10-04 1:32:05 0 [Warning] WSREP: last inactive check more than PT1.5S ago (PT1.51115S), skipping check
/lib/x86_64-linux-gnu/libstdc++.so.6(+0xceed0) [0x7f553d101ed0]
2023-10-04 1:32:09 0 [Warning] WSREP: last inactive check more than PT1.5S ago (PT1.8615S), skipping check
nptl/pthread_create.c:478(start_thread) [0x7f553d5e5ea7]
2023-10-04 1:32:12 0 [Warning] WSREP: last inactive check more than PT1.5S ago (PT1.52984S), skipping check
x86_64/clone.S:97(__GI___clone) [0x7f553ce16a2f]
The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
information that should help you find out what is causing the crash.
Writing a core file...
Working directory at /var/lib/mysql
Resource Limits:
Limit Soft Limit Hard Limit Units
Max cpu time unlimited unlimited seconds
Max file size unlimited unlimited bytes
Max data size unlimited unlimited bytes
Max stack size 8388608 unlimited bytes
Max core file size 0 unlimited bytes
Max resident set unlimited unlimited bytes
Max processes 514437 514437 processes
Max open files 1000000 1000000 files
Max locked memory 2097152 2097152 bytes
Max address space unlimited unlimited bytes
Max file locks unlimited unlimited locks
Max pending signals 514437 514437 signals
Max msgqueue size 819200 819200 bytes
Max nice priority 0 0
Max realtime priority 0 0
Max realtime timeout unlimited unlimited us
Core pattern: core
Kernel version: Linux version 6.1.0-0.deb11.11-amd64 (debian-kernel@lists.debian.org) (gcc-10 (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP PREEMPT_DYNAMIC Debian 6.1.38-4~bpo11+1 (2023-08-08)
Marko Mäkelä
added a comment - xan@biblionix.com, the stack trace that was written to the log is basically useless, because it apparently is from the thread that intentionally kills the server after noticing a hang. (The InnoDB function names in the stack trace were not even resolved by the built-in stack trace generator; this is a known problem.)
To analyze the root cause of any hang, we really need the stack traces of all threads of the process.
In https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mariadbd/#enabling-core-dumps you should be able to find how to enable core dumps and how to extract stack traces from a core dump.
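For illustration only (this is not MariaDB's own code): the resource-limit dump in the log above shows "Max core file size" with a soft limit of 0, which by itself is enough to suppress core dumps even though the hard limit is unlimited. A process can raise its own RLIMIT_CORE soft limit up to the hard limit roughly as sketched below; in practice the same effect is usually achieved with "ulimit -c unlimited" or a systemd LimitCORE= setting before starting mariadbd, as described in the KB article.
// Illustrative sketch, not MariaDB code: raise the RLIMIT_CORE soft limit to
// the hard limit so that the kernel is allowed to write a core file at all.
#include <sys/resource.h>
#include <cstdio>

int main()
{
  struct rlimit rl;
  if (getrlimit(RLIMIT_CORE, &rl) != 0)
  {
    std::perror("getrlimit");
    return 1;
  }
  rl.rlim_cur= rl.rlim_max;   // a soft limit of 0 suppresses core dumps
  if (setrlimit(RLIMIT_CORE, &rl) != 0)
  {
    std::perror("setrlimit");
    return 1;
  }
  std::printf("RLIMIT_CORE soft limit raised to the hard limit\n");
  return 0;
}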
Xan Charbonnet
added a comment - Marko,
I enabled core dumps and now have a stack trace for two crashes which have occurred over the past few days. There is definitely a lot more information in these stack traces.
In fact there's so much information that I'm not sure I can guarantee it's appropriate for public display. Is there a more private way for me to get these stack traces to you?
Marko Mäkelä
added a comment - Thank you, I got the stack traces of two hung processes. I will need to analyze them some time later, once I am done with some urgent tasks.
This looks like a new type of hang. I see many threads invoking buf_pool_t::corrupted_evict(), trying to remove a corrupted page from the buffer pool. We did test this logic with some debug instrumentation during the development of MDEV-13542, but such a test might not be part of our regular stress test battery.
Marko Mäkelä
added a comment - In one of the provided sets of stack traces, I found that a thread is waiting in buf_page_get_zip() here:
  if (discard_attempted || !bpage->frame)
  {
    /* Even when we are holding a hash_lock, it should be
    acceptable to wait for a page S-latch here, because
    buf_page_t::read_complete() will not wait for buf_pool.mutex,
    and because S-latch would not conflict with a U-latch
    that would be protecting buf_page_t::write_complete(). */
    bpage->lock.s_lock();
    hash_lock.unlock_shared();
    break;
  }
The comment may have been truthful before MDEV-13542. Here, we are holding the buf_pool.page_hash latch for page 0x19b1a:0x17880 while several other threads are actually waiting for buf_pool.mutex in the read completion callback so that they could throw out corrupted pages. Possibly, one of the pages is exactly this page that is being requested here.
There is also one thread apparently in a busy loop, looking for a page of a ROW_FORMAT=COMPRESSED table:
mariadb-10.11.5
#3 buf_page_get_low (page_id={m_id = 105604655874051}, zip_size=8192, rw_latch=1, guess=<optimized out>, mode=10, mtr=0x7f44cddd9e90, err=0x7f44cddd93f0, allow_ibuf_merge=false) at ./storage/innobase/buf/buf0buf.cc:2611
I checked another set of stack traces and found the same pattern. In the most recent one, trace_2023-10-31_21-13-25, we have Thread 76 waiting for buf_pool.mutex:
mariadb-10.11.5
Thread 76 (Thread 0x7fc87dae5700 (LWP 3435174)):
…
#2 0x00005638ead3f85d in inline_mysql_mutex_lock (src_file=0x5638eb8d8228 "./storage/innobase/buf/buf0lru.cc", src_line=1231, that=0x5638ebf68b00 <buf_pool>) at ./include/mysql/psi/mysql_thread.h:752
No locals.
#3 buf_pool_t::corrupted_evict (this=0x5638ebf68b00 <buf_pool>, bpage=0x5639908a14e8, state=2147483648) at ./storage/innobase/buf/buf0lru.cc:1231
id = {m_id = <optimized out>}
chain = @0x5638f1ceca18: {first = 0x56396e8cc258}
hash_lock = @0x5638f1ceca00: {lk = {writer = {lock = {<std::__atomic_base<unsigned int>> = {static _S_alignment = 4, _M_i = 0}, <No data fields>}, static HOLDER = 2147483648}, readers = {<std::__atomic_base<unsigned int>> = {static _S_alignment = 4, _M_i = 0}, <No data fields>}, static WRITER = 2147483648}}
unfix = <optimized out>
s = <optimized out>
#4 0x00005638eb5271d5 in buf_page_t::read_complete (this=0x5639908a14e8, node=@0x5638f0b3e1d8: {space = 0x5638f0b3d8d8, name = 0x5638f0b1f4b8 "./atoz/c_log#P#p13.ibd", handle = {m_file = 262, m_psi = 0x0}, on_ssd = 1, punch_hole = 0, atomic_write = 0, is_raw_disk = 0, deferred = 0, size = 198656, init_size = 0, max_size = 4294967295, being_extended = {m = {_M_base = {static _S_alignment = 1, _M_i = false}}}, chain = {prev = 0x0, next = 0x0}, block_size = 4096}) at ./storage/innobase/buf/buf0buf.cc:3627
database_corrupted = <optimized out>
expected_id = {m_id = 452504869516550}
read_frame = <optimized out>
err = DB_FAIL
recovery = <optimized out>
ibuf_may_exist = <optimized out>
The page ID (452504869516550 == 0x19b8d0001b506) is almost identical with what we have in another thread, but it does not seem to share the buf_pool.page_hash latch with it (0x5638f1ceca00 is not 0x5638f1cec900):
mariadb-10.11.5
Thread 58 (Thread 0x7fc87f573700 (LWP 3738762)):
…
#6 sux_lock<ssux_lock_impl<true> >::s_lock (this=0x563973caed20) at ./storage/innobase/include/sux_lock.h:362
No locals.
#7 buf_page_get_zip (page_id={m_id = 452504869516544}, zip_size=zip_size@entry=8192) at ./storage/innobase/buf/buf0buf.cc:2233
discard_attempted = <optimized out>
chain = @0x5638f1cec928: {first = 0x563973caed08}
hash_lock = @0x5638f1cec900: {lk = {writer = {lock = {<std::__atomic_base<unsigned int>> = {static _S_alignment = 4, _M_i = 2147483649}, <No data fields>}, static HOLDER = 2147483648}, readers = {<std::__atomic_base<unsigned int>> = {static _S_alignment = 4, _M_i = 2147483649}, <No data fields>}, static WRITER = 2147483648}}
bpage = 0x563973caed08
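As an aside, the packed m_id values quoted in these traces are simply the tablespace ID in the upper 32 bits and the page number in the lower 32 bits. A quick stand-alone decode (for illustration; this is not InnoDB's page_id_t code):
// Illustration only: decode the m_id values from the traces above into
// (tablespace ID, page number) pairs.
#include <cstdint>
#include <cstdio>

int main()
{
  const std::uint64_t ids[]= {452504869516550ULL,   // Thread 76: expected_id
                              452504869516544ULL};  // Thread 58: buf_page_get_zip
  for (const std::uint64_t id : ids)
    std::printf("m_id=%llu -> space=0x%llx, page=0x%llx\n",
                (unsigned long long) id,
                (unsigned long long) (id >> 32),
                (unsigned long long) (id & 0xffffffff));
  return 0;
}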
By searching for 0x5638f1cec900 I found a thread that seems to be a culprit in this deadlock:
mariadb-10.11.5
Thread 139 (Thread 0x7fc8762d6700 (LWP 3435180)):
…
#4 page_hash_latch::lock (this=0x5638f1cec900) at ./storage/innobase/include/buf0types.h:214
No locals.
#5 buf_pool_t::corrupted_evict (this=0x5638ebf68b00 <buf_pool>, bpage=0x563969c1c648, state=2147483648) at ./storage/innobase/buf/buf0lru.cc:1232
id = {m_id = <optimized out>}
chain = @0x5638f1cec920: {first = 0x563969c1c648}
hash_lock = @0x5638f1cec900: {lk = {writer = {lock = {<std::__atomic_base<unsigned int>> = {static _S_alignment = 4, _M_i = 2147483649}, <No data fields>}, static HOLDER = 2147483648}, readers = {<std::__atomic_base<unsigned int>> = {static _S_alignment = 4, _M_i = 2147483649}, <No data fields>}, static WRITER = 2147483648}}
unfix = <optimized out>
s = <optimized out>
#6 0x00005638eb5271d5 in buf_page_t::read_complete (this=0x563969c1c648, node=@0x5638f0b3e1d8: {space = 0x5638f0b3d8d8, name = 0x5638f0b1f4b8 "./atoz/c_log#P#p13.ibd", handle = {m_file = 262, m_psi = 0x0}, on_ssd = 1, punch_hole = 0, atomic_write = 0, is_raw_disk = 0, deferred = 0, size = 198656, init_size = 0, max_size = 4294967295, being_extended = {m = {_M_base = {static _S_alignment = 1, _M_i = false}}}, chain = {prev = 0x0, next = 0x0}, block_size = 4096}) at ./storage/innobase/buf/buf0buf.cc:3627
database_corrupted = <optimized out>
expected_id = {m_id = 452504869516545}
read_frame = <optimized out>
err = DB_FAIL
recovery = <optimized out>
ibuf_may_exist = <optimized out>
Bingo, this thread is actually holding buf_pool.mutex (blocking a lot of threads), and waiting for the page hash latch that Thread 58 is holding while holding the page latch that Thread 58 is waiting for:
  mysql_mutex_lock(&mutex);
  hash_lock.lock();
  ut_ad(!bpage->oldest_modification());
  bpage->set_corrupt_id();
  auto unfix= state - buf_page_t::FREED;
  auto s= bpage->zip.fix.fetch_sub(unfix) - unfix;
  bpage->lock.x_unlock(true);
The function buf_page_get_zip() needs to be revised so that it will not wait for the page latch while holding the buf_pool.page_hash latch. This function is fetching BLOB pages of ROW_FORMAT=COMPRESSED tables.
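To make the ordering problem and the direction of the change concrete, here is a rough sketch using std::shared_mutex stand-ins for the buf_pool.page_hash latch and the page latch (bpage->lock). This is only an illustration of the latch ordering; the actual MDEV-32371 patch is more involved (buffer-fix counts, retry on eviction, freed and corrupted page states).
// Sketch only, with std::shared_mutex stand-ins for the real InnoDB latches.
#include <shared_mutex>

struct page_stub
{
  std::shared_mutex latch;     // stands in for bpage->lock (the page latch)
  bool still_valid= true;      // stands in for "not evicted as corrupted"
};

std::shared_mutex page_hash_latch;   // stands in for the buf_pool.page_hash latch

// Problematic pattern: block on the page latch while still holding the
// page_hash latch. If the thread that would release the page latch first
// needs this very page_hash latch (as corrupted_evict() does), both threads
// wait forever and every thread queued on buf_pool.mutex hangs behind them.
void lookup_old(page_stub &page)
{
  page_hash_latch.lock_shared();
  page.latch.lock_shared();          // may block while page_hash_latch is held
  page_hash_latch.unlock_shared();
  /* ... read the page ... */
  page.latch.unlock_shared();
}

// Revised pattern: pin the page while holding the page_hash latch, release
// that latch, only then block on the page latch, and re-validate afterwards.
bool lookup_revised(page_stub &page)
{
  page_hash_latch.lock_shared();
  /* ... look up the page in the hash table and buffer-fix (pin) it ... */
  page_hash_latch.unlock_shared();   // never wait for the page latch with this held
  page.latch.lock_shared();          // safe to block here now
  const bool ok= page.still_valid;   // the page may have been evicted meanwhile
  page.latch.unlock_shared();
  return ok;                         // the caller would retry the lookup if !ok
}

int main()
{
  page_stub page;
  return lookup_revised(page) ? 0 : 1;
}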
Matthias Leich
added a comment - origin/10.6-MDEV-32371 963a88be2e89100260e09e46e90e2b3271444899 2023-11-23T12:11:39+02:00
performed well in RQG testing.
Marko Mäkelä
added a comment - xan@biblionix.com, in case you are interested, at https://ci.mariadb.org/41154/ you can find packages of the current 10.11 development branch, which includes a fix for this bug.
Xan Charbonnet
added a comment - @marko, I ended up recompiling the Debian package after making your modification. I have not seen a hang since. Thanks!
Laurence Parry
added a comment - I think we got caught by this on Debian Bookworm during backups of databases with compressed pages - thanks for the fix. We installed the version from testing to get it.
People
Marko Mäkelä
Xan Charbonnet
The "long semaphore wait" diagnostics and sync_array (
MDEV-11972) used to cover all InnoDB rw-locks and mutexes until MariaDB Server 10.4. In 10.5, some mutexes were converted to normal ones. In 10.6,MDEV-21452,MDEV-24167, andMDEV-24142finally got rid of the sync_array. I thought that most waits involve the data dictionary latch at some point, and therefore I implemented a watchdog on dict_sys.latch only. It looks like this watchdog finally kicked in also in your case, but with more delay.Can you please provide stack traces of all threads during the hangs? That is the only reliable way to identify and to fix hangs. A known hang that affects the currently latest releases is
MDEV-32049.