MariaDB Server / MDEV-23328

Server hang due to Galera lock conflict resolution

Details

    Description

      When a SQL KILL statement requests that a transaction be aborted at the same moment that the same transaction is chosen as a victim by the Galera transaction certification process, the server can hang.

      There have been attempts to fix this problem earlier. A suggested fix for MDEV-18464 had been pushed and soon thereafter reverted because of issues. Another fix (which adds another field to THD, expanding the potential state space) was pushed to 10.4 and 10.5 in MDEV-21910, but it fails to prevent such hangs.

      It seems possible that something related to this caused MDEV-17092, which I had worked around by changing the code in InnoDB.
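
      One common way for two concurrent abort paths to hang each other is to take the same pair of locks in opposite order. The sketch below illustrates only that general pattern. It is an assumption about the shape of the problem, not the server's code, and `simulate`, `victim_state`, and `lock_manager` are hypothetical names. A timeout stands in for what would otherwise be an indefinite wait.

```python
import threading


def simulate(consistent_order: bool, timeout: float = 0.2) -> list:
    """Run two abort paths against one victim and report, per path,
    whether it managed to take its second lock."""
    victim_state = threading.Lock()   # hypothetical: guards the victim session
    lock_manager = threading.Lock()   # hypothetical: guards the engine's lock table
    rendezvous = threading.Barrier(2)
    got_second = [False, False]

    def abort_path(idx, first, second):
        with first:
            if not consistent_order:
                # Force the bad interleaving: both paths hold their first lock.
                rendezvous.wait()
            # A real server would block here forever; we give up after `timeout`.
            got_second[idx] = second.acquire(timeout=timeout)
            if not consistent_order:
                # Keep the first lock held until both attempts have finished.
                rendezvous.wait()
            if got_second[idx]:
                second.release()

    # Path 0 plays the KILL statement, path 1 the certification-driven abort.
    kill_order = (victim_state, lock_manager)
    abort_order = kill_order if consistent_order else (lock_manager, victim_state)

    threads = [
        threading.Thread(target=abort_path, args=(0, *kill_order)),
        threading.Thread(target=abort_path, args=(1, *abort_order)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got_second
```

      With opposite lock orders, `simulate(False)` returns `[False, False]`: each path waits for a lock the other one holds. With a single agreed order, `simulate(True)` returns `[True, True]`.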

      Attachments

        1. atomics.cc (3 kB)
        2. kill_test.diff (6 kB)
        3. mdev-23328.pl (2 kB)
        4. mdev-23328-spin.txt (9 kB)
        5. mdev-pre-21010-spin.txt (11 kB)
        6. mdev-pre-21910.pl (1 kB)

          Activity

            fbezdeka Florian Bezdeka added a comment - After updating from 10.4.17 to 10.4.18, I have run into the "complete frozen cluster" issue twice now. After reading the changelog, I assume the fix for this MDEV introduced a regression. The follow-up MDEV might be MDEV-24294, which has already been mentioned.

            fbezdeka Florian Bezdeka added a comment - Why was this issue closed? I can't find a fix for 10.4.x...

            marko Marko Mäkelä added a comment - In 10.6.0, this was fixed in a simpler way by MDEV-24915.

            stephanvos Stephan Vos added a comment - OK, so to check: has this issue been fixed correctly in 10.5.9, or is it still a problem? I'm planning to upgrade from 10.5.6 to 10.5.17 and want to make sure this fix won't cause issues.

            khaiping.loh Khai Ping added a comment - I am still seeing this in 10.6.9. Reported in MDEV-29346.

            People

              serg Sergei Golubchik
              marko Marko Mäkelä