MariaDB Server / MDEV-34445

Rare Futex deadlocks

Details

    Description

      Roughly every 2 to 7 days, around midnight, one of our SQL servers deadlocks almost completely. The log, as expected, just stops abruptly with no indication of what's wrong.

      While this happens, I can still connect as 'root' over the Unix socket. The active queries (SHOW PROCESSLIST) seem independent.

      It is not the queries that are deadlocking; the program itself is. No query is processed or completes, because the program internally waits endlessly for mutexes.

      I researched possible causes; the most common one appears to be calling async-signal-unsafe functions from signal handlers.
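
      To make that suspected mechanism concrete, here is a minimal Python analogue. It is not mariadbd code and it is not evidence that this is what happens in this report. It assumes a POSIX system and must run in the main thread. The names `lock`, `handler` and `run_demo` are hypothetical. A signal handler tries to take a non-reentrant lock that the interrupted code already holds; the timeout exists only so the demo terminates, since a plain `acquire()` would wait forever.

```python
import signal
import threading

lock = threading.Lock()   # stands in for a non-reentrant internal mutex
handler_got_lock = []     # outcome recorded by the handler


def handler(signum, frame):
    # Unsafe pattern: the handler needs a lock the interrupted code may hold.
    # The timeout is only here so the demo ends; without it this would hang.
    got = lock.acquire(timeout=0.2)
    handler_got_lock.append(got)
    if got:
        lock.release()


def run_demo():
    """Return True if the handler obtained the lock, False if it was blocked."""
    handler_got_lock.clear()
    previous = signal.signal(signal.SIGUSR1, handler)
    try:
        with lock:
            # Deliver the signal while the lock is held by this very thread.
            signal.raise_signal(signal.SIGUSR1)
            # Give the interpreter a chance to run the Python-level handler.
            for _ in range(1000):
                if handler_got_lock:
                    break
    finally:
        signal.signal(signal.SIGUSR1, previous)
    return handler_got_lock[0]


if __name__ == "__main__":
    print("handler acquired lock:", run_demo())
```

      In a native program the same shape arises when a handler calls a function that takes an internal lock (for example an allocator lock) already held by the interrupted thread.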

      I'm not too well versed in gdb, and I don't know how to reproduce the problem (that's our entire issue), but it can and does occur periodically. The output of `info threads` in gdb looks roughly like this:

      • About 300 threads stuck at `0x7f7bf8fa86c0 (LWP 684969) "mariadbd" syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38`
      • 10-12 threads inside `__futex_abstimed_wait_common64`
      • 5-6 threads in `__GI___poll`
      • About twenty entries like this (unknown): `Thread 0x7f7bf838d6c0 (LWP 329146) "iou-wrk-298768" 0x0000000000000000 in ?? ()`
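
      The counts above were estimated by eye. A small script can tally them exactly. The sketch below assumes its input is the text of gdb's `info threads` output saved to a file, and that rows look like the two quoted above. The names `ROW`, `tally_frames` and `SAMPLE`, and the Id column values, are hypothetical.

```python
import re
from collections import Counter

# Matches the part of an `info threads` row after the LWP id:
# an optional quoted thread name, an optional "0x... in", then the function.
ROW = re.compile(
    r'\(LWP \d+\)\s+(?:"[^"]*"\s+)?'
    r'(?:0x[0-9a-fA-F]+\s+in\s+)?(?P<func>\S+)\s+\('
)


def tally_frames(text):
    """Count threads by the function shown in their innermost frame."""
    counts = Counter()
    for line in text.splitlines():
        match = ROW.search(line)
        if match:
            counts[match.group("func")] += 1
    return counts


# Two rows taken from this report; the leading Id numbers are made up.
SAMPLE = '''\
  2    Thread 0x7f7bf8fa86c0 (LWP 684969) "mariadbd" syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
  3    Thread 0x7f7bf838d6c0 (LWP 329146) "iou-wrk-298768" 0x0000000000000000 in ?? ()
'''

if __name__ == "__main__":
    for func, count in tally_frames(SAMPLE).most_common():
        print(count, func)
```

      Running `tally_frames` over the full saved output would give one count per innermost function, which makes it easier to compare one hang with the next.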

      Attachments

        1. backtrace.log
          2.65 MB
        2. keyQuery.txt
          112 kB

          People

            debarun Debarun Banerjee
            npr npr