  1. MariaDB Server
  2. MDEV-37728

Shutdown hang due to deadlock between timer_handler and srv_thread_pool_end

Details

    • Can result in hang or crash
    • In rare cases, shutdown might hang

    Description

      saahil reported a shutdown hang in https://github.com/MariaDB/server/pull/4264#discussion_r2372053903 while testing MDEV-37482.

      As far as I can tell (please see the details in the link), the dedicated timer_handler() thread holds LOCK_timer while waiting for a mutex that was acquired in tpool::thread_pool_generic::timer_generic::disarm(), which is invoked deep inside srv_thread_pool_end(). Meanwhile, tpool::thread_pool_generic::timer_generic::disarm(), still holding that mutex, invokes thr_timer_end(), which waits for LOCK_timer, which is held by the timer_handler() thread.

      This is an obvious lock order inversion: one thread waits for LOCK_timer while holding the other mutex, and timer_handler() waits for that other mutex while holding LOCK_timer.

      A lock order inversion does not always cause a deadlock, but it is a prerequisite for one. In this case, we have evidence of an actual hang caused by this deadlock.
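
      The inversion described above can be illustrated with a minimal C++17 sketch. It is not the MariaDB code and not necessarily how this issue was or will be fixed. The names lock_timer, timer_mutex, handler_side, shutdown_side and run_without_inversion are hypothetical stand-ins for LOCK_timer, the mutex held in disarm(), the timer_handler() thread and the srv_thread_pool_end() caller. The comments show the acquisition order that would be deadlock-prone; the code itself uses std::scoped_lock, which acquires both mutexes with a deadlock-avoidance algorithm, as one common way to remove the hazard.

```cpp
#include <mutex>
#include <thread>

// Hypothetical stand-ins: lock_timer models LOCK_timer, timer_mutex models
// the mutex held inside disarm(). Neither is the real MariaDB object.
static std::mutex lock_timer;
static std::mutex timer_mutex;
static int completed = 0;  // modified only while both mutexes are held

// Models the timer_handler() side, which in the report takes LOCK_timer first.
static void handler_side(int rounds) {
  for (int i = 0; i < rounds; ++i) {
    // Deadlock-prone form: lock_timer.lock(); then timer_mutex.lock();
    std::scoped_lock guard(lock_timer, timer_mutex);
    ++completed;
  }
}

// Models the disarm() side, which in the report takes the other mutex first.
static void shutdown_side(int rounds) {
  for (int i = 0; i < rounds; ++i) {
    // Deadlock-prone form: timer_mutex.lock(); then lock_timer.lock();
    std::scoped_lock guard(timer_mutex, lock_timer);
    ++completed;
  }
}

// Runs both sides concurrently and returns the number of critical sections
// that completed. With the deadlock-prone forms above, this might never return.
int run_without_inversion(int rounds) {
  completed = 0;
  std::thread a(handler_side, rounds);
  std::thread b(shutdown_side, rounds);
  a.join();
  b.join();
  return completed;
}
```

      In a code base where the two mutexes are acquired in different functions, as in this report, acquiring both at once is often not possible. The usual alternatives are to agree on a single acquisition order or to release one mutex before waiting for the other.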

            People

              wlad Vladislav Vaintroub
              marko Marko Mäkelä
              Vladislav Vaintroub Vladislav Vaintroub
              Saahil Alam Saahil Alam