Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-29293

MariaDB stuck on starting commit state (waiting on commit order critical section)

    XMLWordPrintable

Details

    • Bug
    • Status: In Progress (View Workflow)
    • Critical
    • Resolution: Unresolved
    • 10.5.15
    • 10.5
    • Galera
    • Galera version: 26.4.6-buster
      Debian Version: 10.6
      Kernel version: 4.19.0-12-amd64

    Description

      In an environment running Galera Cluster with 6 MariaDB nodes, 1 arbitrator node, some replicas and a ProxySQL, after a network issue that triggered a state transfer on two nodes,
      for some reason, almost all the transactions hang in:

      • “starting” state on the commit statement or on "".
      • "acquiring total order isolation" on the "KILL CONNECTION" statement (The "KILL CONNECTION" was requested by the ProxySQL)
        We tried to restart the service but it hangs on stopping, ProxySQL detected this node as down and switched the traffic to another node.

      By looking at the backtrace it seems that we have a kind of "pthread_cond_wait() deadlock" executed by lock.wait() on the enter() function on the commit monitor during the commit order critical section.

      Unfortunately, we didn't find a way to reproduce the problem

      Attachments

        1. backtraces.txt
          315 kB
        2. innodb_status.txt
          67 kB
        3. process_list.txt
          467 kB
        4. processlist.png
          processlist.png
          701 kB
        5. process-list-sample.txt
          2 kB

        Activity

          People

            seppo Seppo Jaakola
            williamwelter William Welter
            Votes:
            4 Vote for this issue
            Watchers:
            15 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.