Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-32371

Deadlock between buf_page_get_zip() and buf_pool_t::corrupted_evict() on InnoDB ROW_FORMAT=COMPRESSED table corruption

Details

    • Bug
    • Status: Closed (View Workflow)
    • Critical
    • Resolution: Fixed
    • 10.11.5, 10.6, 10.7(EOL), 10.8(EOL), 10.9(EOL), 10.10(EOL), 10.11, 11.0(EOL), 11.1(EOL), 11.2(EOL), 11.3(EOL)
    • 10.6.17, 10.11.7, 11.0.5, 11.1.4, 11.2.3, 11.3.2
    • Debian 11 Bullseye; MariaDB from the Maria repositories.

    Description

      I've had two different servers this week hit an issue with a semaphore wait lasting forever. This bug report isn't about what caused those wait-forever situations: it's about how the server handled them.

      If I'm reading the code in storage/innobase/srv/srv0srv.cc correctly, there is supposed to be a warning at 1/4 of the innodb_fatal_semaphore_wait_threshold value, at 1/2 of that value, and at 3/4 of that value before killing the server once the threshold value is reached.

      Examining the syslog, however, shows this doesn't seem to be working. One of my servers gave no warnings at all. It waited the full 600 seconds and then gave the fatal error as it crashed itself. The "Long wait" message never appeared at all.

      The other server did give the "Long wait" warning, starting at 150 seconds, which is the correct time to start (1/4 of 600). I would expect a warning at 300 seconds, then 450 seconds. However it instead warned again at 159 seconds, and then every 10 seconds after that until 289 at which time I killed the server myself rather than waiting the full 600.

      I would like to change my innodb_fatal_semaphore_wait_threshold setting to a dramatically lower number. I'll be able to tell whether it's safe to do so by making incremental changes and observing the presence or absence of "Long wait" warnings. However, if these warnings aren't behaving the way they're supposed to, that won't work.

      Apologies if I'm misunderstanding what the behavior is supposed to be. It doesn't seem like zero warnings in some cases and warnings every 10 seconds in other cases would be the intended behavior.

      Attachments

        Issue Links

          Activity

            xan@biblionix.com Xan Charbonnet created issue -
            marko Marko Mäkelä made changes -
            Field Original Value New Value
            Status Open [ 1 ] Needs Feedback [ 10501 ]
            marko Marko Mäkelä made changes -
            Assignee Marko Mäkelä [ marko ]
            Status Needs Feedback [ 10501 ] Open [ 1 ]
            marko Marko Mäkelä made changes -
            Summary innodb_fatal_semaphore_wait_threshold warnings not happening Deadlock between buf_page_get_zip() and buf_pool_t::corrupted_evict() on InnoDB ROW_FORMAT=COMPRESSED table corruption
            marko Marko Mäkelä made changes -
            Fix Version/s 10.6 [ 24028 ]
            Fix Version/s 10.11 [ 27614 ]
            Fix Version/s 11.0 [ 28320 ]
            Fix Version/s 11.1 [ 28549 ]
            Fix Version/s 11.2 [ 28603 ]
            Fix Version/s 11.3 [ 28565 ]
            Affects Version/s 10.6 [ 24028 ]
            Affects Version/s 10.7 [ 24805 ]
            Affects Version/s 10.8 [ 26121 ]
            Affects Version/s 10.9 [ 26905 ]
            Affects Version/s 10.10 [ 27530 ]
            Affects Version/s 10.11 [ 27614 ]
            Affects Version/s 11.0 [ 28320 ]
            Affects Version/s 11.1 [ 28549 ]
            Affects Version/s 11.2 [ 28603 ]
            Affects Version/s 11.3 [ 28565 ]
            Labels hang
            Priority Major [ 3 ] Critical [ 2 ]
            marko Marko Mäkelä made changes -
            Status Open [ 1 ] Confirmed [ 10101 ]
            marko Marko Mäkelä made changes -
            marko Marko Mäkelä made changes -
            Assignee Marko Mäkelä [ marko ] Thirunarayanan Balathandayuthapani [ thiru ]
            Status Confirmed [ 10101 ] In Review [ 10002 ]
            thiru Thirunarayanan Balathandayuthapani made changes -
            Assignee Thirunarayanan Balathandayuthapani [ thiru ] Marko Mäkelä [ marko ]
            Status In Review [ 10002 ] Stalled [ 10000 ]
            marko Marko Mäkelä made changes -
            marko Marko Mäkelä made changes -
            issue.field.resolutiondate 2023-11-30 10:01:43.0 2023-11-30 10:01:43.1
            marko Marko Mäkelä made changes -
            Fix Version/s 10.6.17 [ 29518 ]
            Fix Version/s 10.11.7 [ 29519 ]
            Fix Version/s 11.0.5 [ 29520 ]
            Fix Version/s 11.1.4 [ 29024 ]
            Fix Version/s 11.2.3 [ 29521 ]
            Fix Version/s 11.3.2 [ 29522 ]
            Fix Version/s 10.6 [ 24028 ]
            Fix Version/s 10.11 [ 27614 ]
            Fix Version/s 11.0 [ 28320 ]
            Fix Version/s 11.1 [ 28549 ]
            Fix Version/s 11.3 [ 28565 ]
            Fix Version/s 11.2 [ 28603 ]
            Resolution Fixed [ 1 ]
            Status Stalled [ 10000 ] Closed [ 6 ]

            People

              marko Marko Mäkelä
              xan@biblionix.com Xan Charbonnet
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.