Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Versions: 10.6, 10.11, 11.4, 11.8
- Can result in hang or crash
- In rare cases, shutdown might hang
Description
saahil reported a shutdown hang in https://github.com/MariaDB/server/pull/4264#discussion_r2372053903 while testing MDEV-37482.
As far as I can tell (please see the details in the link), the dedicated timer_handler() thread is holding LOCK_timer while waiting for a mutex that was acquired in tpool::thread_pool_generic::timer_generic::disarm(), which is invoked deep inside srv_thread_pool_end(). tpool::thread_pool_generic::timer_generic::disarm(), still holding that mutex, then invokes thr_timer_end(), which waits for LOCK_timer, which is held by the timer_handler() thread.
This is an obvious lock order inversion: one thread waits for LOCK_timer while holding the other mutex, and timer_handler() waits for the other mutex while holding LOCK_timer.
A lock order inversion does not always cause a deadlock, but it is a prerequisite for one. In this case, we have evidence of an actual hang caused by the resulting deadlock.
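For illustration only, here is a minimal C++ sketch of the pattern described above. It is not the MariaDB code and not necessarily how the fix was implemented: lock_timer and timer_mutex are hypothetical stand-ins for LOCK_timer and the mutex held in disarm(), and all function names are made up. The *_inverted functions show the two opposite acquisition orders (they are never called concurrently here, because that could hang). The *_safe functions show one common remedy, acquiring both mutexes through std::lock(), which applies a deadlock-avoidance algorithm regardless of argument order.

```cpp
#include <mutex>
#include <thread>

// Hypothetical stand-ins; these are NOT the real server objects.
std::mutex lock_timer;   // plays the role of LOCK_timer
std::mutex timer_mutex;  // plays the role of the mutex held in disarm()
int shared_events = 0;   // dummy state protected by both mutexes

// Inverted orders: if two threads run these concurrently, each can end up
// holding one mutex while waiting forever for the other.
void handler_inverted() {
  std::lock_guard<std::mutex> a(lock_timer);   // first LOCK_timer ...
  std::lock_guard<std::mutex> b(timer_mutex);  // ... then the other mutex
  ++shared_events;
}

void disarm_inverted() {
  std::lock_guard<std::mutex> b(timer_mutex);  // first the other mutex ...
  std::lock_guard<std::mutex> a(lock_timer);   // ... then LOCK_timer
  ++shared_events;
}

// One possible remedy (an assumption, not the actual fix): let std::lock()
// acquire both mutexes, then adopt them into guards for automatic release.
void handler_safe() {
  std::lock(lock_timer, timer_mutex);
  std::lock_guard<std::mutex> a(lock_timer, std::adopt_lock);
  std::lock_guard<std::mutex> b(timer_mutex, std::adopt_lock);
  ++shared_events;
}

void disarm_safe() {
  std::lock(timer_mutex, lock_timer);  // argument order does not matter here
  std::lock_guard<std::mutex> b(timer_mutex, std::adopt_lock);
  std::lock_guard<std::mutex> a(lock_timer, std::adopt_lock);
  ++shared_events;
}

// Runs both safe functions concurrently and returns the number of updates.
int run_safe(int iterations) {
  shared_events = 0;
  std::thread t1([iterations] {
    for (int i = 0; i < iterations; ++i) handler_safe();
  });
  std::thread t2([iterations] {
    for (int i = 0; i < iterations; ++i) disarm_safe();
  });
  t1.join();
  t2.join();
  return shared_events;
}
```

Another way to remove such an inversion is to release one mutex before acquiring the other, so that no thread ever holds both in conflicting orders; which approach suits the timer code depends on details in the linked discussion.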
Issue Links
- relates to
  - MDEV-16264 Implement a common work queue for InnoDB background tasks (Closed)
  - MDEV-37482 Contention on btr_sea::partition::latch (In Testing)