MariaDB Server / MDEV-32371

Deadlock between buf_page_get_zip() and buf_pool_t::corrupted_evict() on InnoDB ROW_FORMAT=COMPRESSED table corruption

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 10.11.5, 10.6, 10.7(EOL), 10.8(EOL), 10.9(EOL), 10.10(EOL), 10.11, 11.0(EOL), 11.1(EOL), 11.2(EOL), 11.3(EOL)
    • Fix Version/s: 10.6.17, 10.11.7, 11.0.5, 11.1.4, 11.2.3, 11.3.2
    • Environment: Debian 11 Bullseye; MariaDB from the MariaDB repositories.

    Description

      I've had two different servers this week hit an issue with a semaphore wait lasting forever. This bug report isn't about what caused those wait-forever situations: it's about how the server handled them.

      If I'm reading the code in storage/innobase/srv/srv0srv.cc correctly, there is supposed to be a warning at 1/4 of the innodb_fatal_semaphore_wait_threshold value, at 1/2 of that value, and at 3/4 of that value before killing the server once the threshold value is reached.
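
      As a rough illustration of the schedule described above, the expected warning times for a given threshold can be computed as follows. This reflects only the reporter's reading of the code, not the actual logic in srv0srv.cc, and the function name is hypothetical.

```python
def expected_warning_times(threshold_s):
    """Return the elapsed times (in seconds) at which a "Long wait" warning
    would be expected under the 1/4, 1/2, 3/4 reading described above.

    Hypothetical helper for illustration; it does not mirror srv0srv.cc.
    """
    return [threshold_s * k // 4 for k in (1, 2, 3)]


if __name__ == "__main__":
    # With the threshold of 600 seconds used on the affected servers:
    print(expected_warning_times(600))  # [150, 300, 450]
```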

      Examining the syslog, however, shows this doesn't seem to be working. One of my servers gave no warnings at all. It waited the full 600 seconds and then gave the fatal error as it crashed itself. The "Long wait" message never appeared at all.

      The other server did give the "Long wait" warning, starting at 150 seconds, which is the correct time to start (1/4 of 600). I would then expect a warning at 300 seconds and another at 450 seconds. Instead, it warned again at 159 seconds, and then every 10 seconds after that until 289 seconds, at which point I killed the server myself rather than waiting the full 600.
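
      To make the mismatch concrete, the sketch below compares the warning times reported above with the quarter-point schedule the reporter expected. The observed list is reconstructed from the description (150, 159, then every 10 seconds through 289), and all names are illustrative only.

```python
THRESHOLD_S = 600  # threshold value mentioned in this report

# Quarter points at which the reporter expected warnings.
expected = [THRESHOLD_S * k // 4 for k in (1, 2, 3)]

# Warning times as described: 150, 159, then every 10 s up to 289.
observed = [150] + list(range(159, 290, 10))

# Warnings that do not fall on any expected quarter point.
unexpected = [t for t in observed if t not in expected]

# Expected warnings that were never reached because the server
# was killed manually at 289 seconds.
not_reached = [t for t in expected if t > observed[-1]]

# Spacing between consecutive warnings.
gaps = [later - earlier for earlier, later in zip(observed, observed[1:])]

print(f"{len(observed)} warnings observed, {len(unexpected)} unexpected")
print(f"first gap: {gaps[0]} s, later gaps: {sorted(set(gaps[1:]))} s")
print(f"expected but not reached: {not_reached}")
```

      Under these assumptions, only the first warning lands on a quarter point; the rest repeat at a roughly 10-second interval that does not depend on the threshold.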

      I would like to change my innodb_fatal_semaphore_wait_threshold setting to a dramatically lower number. I'll be able to tell whether it's safe to do so by making incremental changes and observing the presence or absence of "Long wait" warnings. However, if these warnings aren't behaving the way they're supposed to, that won't work.

      Apologies if I'm misunderstanding what the behavior is supposed to be. It doesn't seem like zero warnings in some cases and warnings every 10 seconds in other cases would be the intended behavior.

          Activity

            Made by                             Transition               Time in source status  Execution times
            Marko Mäkelä                        Open -> Needs Feedback   5h 19m                 1
            Marko Mäkelä                        Needs Feedback -> Open   12d 4h 42m             1
            Marko Mäkelä                        Open -> Confirmed        14d 2h 8m              1
            Marko Mäkelä                        Confirmed -> In Review   20d 19h 49m            1
            Thirunarayanan Balathandayuthapani  In Review -> Stalled     4d 22h 16m             1
            Marko Mäkelä                        Stalled -> Closed        2d 1h 24m              1

            People

              Assignee: Marko Mäkelä (marko)
              Reporter: Xan Charbonnet (xan@biblionix.com)
              Votes: 0
              Watchers: 5
