[MDEV-32049] Deadlock due to log_free_check(), involving trx_purge_truncate_rseg_history() and trx_undo_assign_low() Created: 2023-08-31 Updated: 2023-11-08 Resolved: 2023-08-31 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.5.20, 10.6.13, 10.6.15 |
| Fix Version/s: | 10.5.23, 10.6.16, 10.10.7, 10.11.6, 11.0.4, 11.1.3, 11.2.2, 11.3.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Cuchac | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | crash, deadlock, innodb, regression, semaphore | ||
| Environment: |
K8s, amd64, occurs both in official mariadb:10.6.15 and quay.io/mariadb-foundation/mariadb-debug:10.6 |
||
| Description |
|
Hi. Finally, today we were able to get a decent coredump, using the quay.io/mariadb-foundation/mariadb-debug:10.6 image. The exact version is 10.6.16-MariaDB-1:10.6.16+maria~ubu2004-log, source revision 07494006dd0887ebfb31564a8fd4c59cf1b299e9; the exact image version is docker.io/library/mariadb@sha256:fcbe381e5fef20c7a2932b52a070f58987b770c651aedf705332e54d1dfd465f. SELECTs seem to be running OK, but DML queries are blocked, some in "opening table" and some in "sending data". I'm attaching both the server log and a full backtrace. I also have a coredump, but it is 700 MB bzipped, so I am not attaching it; it is available on request. |
| Comments |
| Comment by Marko Mäkelä [ 2023-08-31 ] | ||||||||||||||||||||||||||||||||||
|
Thank you for an excellent report, with a useful mariadbd_full_bt_all_threads.txt.
This is executing an INSERT, so it can’t be holding an exclusive dict_sys.latch. It is waiting for an exclusive rseg->latch so that it can assign the first undo log page for the transaction. We also have Thread 234 and Thread 230 waiting for the same rseg->latch == trx_sys+16648 at the same spot. This should still be normal; the interesting threads would be the ones that are waiting for something else while holding dict_sys.latch or the rseg->latch. In the end, I filtered out these less interesting waits from the output of
using replace-regexp in GNU Emacs and got the following:
We can ignore Thread 149, which is invoking the io_getevents system call; this image does not use the newer io_uring ( I will check mtr_t::m_memo in all threads more deeply, to check whether this is related to MDEV-31815. | ||||||||||||||||||||||||||||||||||
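As a rough illustration of this filtering step (this is not the actual Emacs replace-regexp that was used; the dump format and the `interesting_threads` helper below are hypothetical), one could drop the threads that are merely blocked on rseg->latch in trx_undo_assign_low() and keep the rest:

```python
import re

# Hypothetical sketch of the filtering described above: drop threads that
# are merely waiting for rseg->latch, keep everything else.  The dump
# format here is a simplified stand-in for real gdb "bt" output.
def interesting_threads(dump: str) -> list:
    # Split the dump into per-thread blocks on "Thread N" headers.
    blocks = re.split(r"(?m)^(?=Thread \d+)", dump)
    kept = []
    for block in blocks:
        if not block.strip():
            continue
        # A thread is "uninteresting" if its only reason for blocking is
        # the wait for rseg->latch in trx_undo_assign_low().
        if "trx_undo_assign_low" in block and "rseg->latch" in block:
            continue
        kept.append(block.strip())
    return kept

sample = """Thread 230
#0  wait rseg->latch
#1  trx_undo_assign_low
Thread 104
#0  log_free_check
#1  trx_purge_truncate_rseg_history
"""
print([b.splitlines()[0] for b in interesting_threads(sample)])  # → ['Thread 104']
```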
| Comment by Marko Mäkelä [ 2023-08-31 ] | ||||||||||||||||||||||||||||||||||
|
In 64-bit builds, the offset of buf_block_t::lock is 0x18. Thread 196 is waiting for a page latch on the block at 0x7feb2c090380. An exclusive latch on that page is being held by Thread 42, which is waiting for rseg->latch==trx_sys+16656. Thread 159 is waiting for a shared dict_index_t::lock on the index 0x7fea000302f0. An exclusive latch is held by Thread 82, which is waiting for rseg->latch==trx_sys+16656. Thread 44 is waiting for a latch on the block 0x7fea9c096640, which is exclusively latched by Thread 52, which is waiting for rseg->latch==trx_sys+16640. I will have to double-check whether I made a mistake in my filtering of the grep output. Currently, this does not look related to | ||||||||||||||||||||||||||||||||||
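The pointer arithmetic behind this mapping can be sketched as follows (Python here purely for illustration; `LOCK_OFFSET` and `block_of_lock` are hypothetical names, and only the 0x18 offset comes from the analysis above):

```python
# Sketch of the address arithmetic used in the analysis: in 64-bit builds
# the offset of buf_block_t::lock within buf_block_t is 0x18, so a latch
# address seen in a thread's wait can be mapped back to its block.
LOCK_OFFSET = 0x18  # offsetof(buf_block_t, lock), per the comment above

def block_of_lock(lock_addr: int) -> int:
    """Map the address of a buf_block_t::lock back to its buf_block_t."""
    return lock_addr - LOCK_OFFSET

# Thread 44 waits for a latch on the block at 0x7fea9c096640; the lock
# object itself therefore lives at block + 0x18:
print(hex(block_of_lock(0x7fea9c096640 + LOCK_OFFSET)))  # → 0x7fea9c096640
```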
| Comment by Marko Mäkelä [ 2023-08-31 ] | ||||||||||||||||||||||||||||||||||
|
I forgot that GNU libc does not call the syscall() wrapper function, but invokes system calls directly. Thread 104 should be blocking the progress of any thread that is trying to allocate the first undo log page for a modification that it is about to make:
This is waiting for a log checkpoint to be completed, so that the ib_logfile0 will not be overwritten while we try to free garbage undo log pages. We see that the page cleaner thread is trying to fulfill an earlier request to urgently advance the checkpoint LSN to an earlier LSN:
I do not think that this thread is blocked; it just happens to be waiting for buf_pool.flush_list_mutex at this very moment. But the checkpoint cannot be advanced, presumably because one of the active threads is holding an exclusive latch on a page whose oldest_modification is earlier than the desired checkpoint LSN. We can only allow shared page latches to be held by other threads while a page is being written out from the buffer pool. It will be tricky to fix this. Setting a larger innodb_log_file_size should reduce the probability of this hang. Thanks to | ||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-08-31 ] | ||||||||||||||||||||||||||||||||||
|
In | ||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-08-31 ] | ||||||||||||||||||||||||||||||||||
|
By the way, the exclusive dict_sys.latch is held by Thread 45, which invoked log_free_check() inside dict_stats_update(), while opening tables for executing an UPDATE statement. The culprit for the hang remains trx_purge_truncate_rseg_history(). The call to log_free_check() has to be moved to its caller trx_purge_truncate_history(), before rseg->mutex or rseg->latch is acquired. | ||||||||||||||||||||||||||||||||||
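The shape of that fix can be sketched as follows (a Python stand-in, not the actual C++ patch; the function names are hypothetical): the potentially long wait in log_free_check() must happen before the latch is taken, so that threads queued on the latch cannot end up blocked behind a checkpoint wait.

```python
import threading

rseg_latch = threading.Lock()

def log_free_check():
    """Stand-in for InnoDB's log_free_check(); may wait for a checkpoint."""
    pass  # pretend this can block for a long time

def truncate_history_buggy():
    with rseg_latch:     # BAD: latch held across a possibly long wait
        log_free_check()
        pass             # ... free garbage undo log pages ...

def truncate_history_fixed():
    log_free_check()     # wait (if needed) while holding no latch
    with rseg_latch:     # latch now covers only the real work
        pass             # ... free garbage undo log pages ...

truncate_history_fixed()
print("latch free afterwards:", rseg_latch.acquire(blocking=False))  # → True
rseg_latch.release()
```

The design point is that log_free_check() may wait for a checkpoint, and completing a checkpoint may in turn require latches held by threads that are themselves queued on this latch; moving the call outside the latched region breaks that cycle.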
| Comment by Cuchac [ 2023-08-31 ] | ||||||||||||||||||||||||||||||||||
|
Oh, you are so amazing! Fully analyzed, with a patch already written, in 2 hours; incredible, thank you! I will be happy to test it in production once this patch lands in some image. | ||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-08-31 ] | ||||||||||||||||||||||||||||||||||
|
I have now pushed a fix to 10.5 and merged it up to 10.6. Unfortunately, I think that it is very hard to reproduce this; we never encountered this hang in our internal tests. | ||||||||||||||||||||||||||||||||||
| Comment by Cuchac [ 2023-08-31 ] | ||||||||||||||||||||||||||||||||||
|
Hello, | ||||||||||||||||||||||||||||||||||
| Comment by Daniel Black [ 2023-08-31 ] | ||||||||||||||||||||||||||||||||||
|
something got stuck in the build pipeline. Both 10.5 and 10.6 images containing this fix will be on https://quay.io/repository/mariadb-foundation/mariadb-devel?tab=tags in a few hours. | ||||||||||||||||||||||||||||||||||
| Comment by Daniel Black [ 2023-09-01 ] | ||||||||||||||||||||||||||||||||||
|
Both are there now on mariadb-devel. Debug images are currently being pushed. | ||||||||||||||||||||||||||||||||||
| Comment by Cuchac [ 2023-09-01 ] | ||||||||||||||||||||||||||||||||||
|
Great, I was hoping mariadb-devel would get the update. I just deployed the fix to our cluster. Will let you know. Hopefully this patch will fix it and we will be happy again | ||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-09-01 ] | ||||||||||||||||||||||||||||||||||
|
Thank you cuchac, I am eager to know the outcome. | ||||||||||||||||||||||||||||||||||
| Comment by Cuchac [ 2023-09-03 ] | ||||||||||||||||||||||||||||||||||
|
Great work everybody! 2 days without deadlock, yay | ||||||||||||||||||||||||||||||||||
| Comment by Daniel Black [ 2023-09-04 ] | ||||||||||||||||||||||||||||||||||
|
Thanks for the feedback. Great to have something clear and actionable. As this is the first bug report using quay.io/mariadb-foundation/mariadb-debug, what were some of the good/bad points about it, and what would you (request me to) improve? | ||||||||||||||||||||||||||||||||||
| Comment by Cuchac [ 2023-09-05 ] | ||||||||||||||||||||||||||||||||||
|
Hello, it is great that such an image exists. I think we would not have been able to produce a working backtrace without it. I had some issues that could be fixed by improving the documentation: 2) We failed to produce a coredump several times. We did not realize that kernel.core_pattern is not namespaced, and our k8s cluster provider had to change the value from the default Ubuntu value (apport) to a plain file name. It is quite weird that kernel.core_pattern is interpreted inside the container when a file path is specified, but in the root namespace when the "pipe" format is used (as is the default in Ubuntu's apport). | ||||||||||||||||||||||||||||||||||
| Comment by Daniel Black [ 2023-09-22 ] | ||||||||||||||||||||||||||||||||||
|
cuchac Thanks for the feedback. Documented. | ||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-11-08 ] | ||||||||||||||||||||||||||||||||||
|
A high-level description of this bug is that InnoDB could hang when removing the history of committed transactions. As far as I can tell, this bug can occur independently of any configuration parameters. All that is needed is some transactions that modify non-temporary InnoDB tables, and sufficiently bad luck. A smaller setting of innodb_log_file_size could increase the probability of hitting this hang. |