Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-32371

Deadlock between buf_page_get_zip() and buf_pool_t::corrupted_evict() on InnoDB ROW_FORMAT=COMPRESSED table corruption

    XMLWordPrintable

Details

    • Bug
    • Status: Closed (View Workflow)
    • Critical
    • Resolution: Fixed
    • 10.11.5, 10.6, 10.7(EOL), 10.8(EOL), 10.9(EOL), 10.10(EOL), 10.11, 11.0(EOL), 11.1(EOL), 11.2(EOL), 11.3(EOL)
    • 10.6.17, 10.11.7, 11.0.5, 11.1.4, 11.2.3, 11.3.2
    • Debian 11 Bullseye; MariaDB from the Maria repositories.

    Description

      I've had two different servers this week hit an issue with a semaphore wait lasting forever. This bug report isn't about what caused those wait-forever situations: it's about how the server handled them.

      If I'm reading the code in storage/innobase/srv/srv0srv.cc correctly, there is supposed to be a warning at 1/4 of the innodb_fatal_semaphore_wait_threshold value, at 1/2 of that value, and at 3/4 of that value before killing the server once the threshold value is reached.

      Examining the syslog, however, shows this doesn't seem to be working. One of my servers gave no warnings at all. It waited the full 600 seconds and then gave the fatal error as it crashed itself. The "Long wait" message never appeared at all.

      The other server did give the "Long wait" warning, starting at 150 seconds, which is the correct time to start (1/4 of 600). I would expect a warning at 300 seconds, then 450 seconds. However it instead warned again at 159 seconds, and then every 10 seconds after that until 289 at which time I killed the server myself rather than waiting the full 600.

      I would like to change my innodb_fatal_semaphore_wait_threshold setting to a dramatically lower number. I'll be able to tell whether it's safe to do so by making incremental changes and observing the presence or absence of "Long wait" warnings. However, if these warnings aren't behaving the way they're supposed to, that won't work.

      Apologies if I'm misunderstanding what the behavior is supposed to be. It doesn't seem like zero warnings in some cases and warnings every 10 seconds in other cases would be the intended behavior.

      Attachments

        Issue Links

          Activity

            People

              marko Marko Mäkelä
              xan@biblionix.com Xan Charbonnet
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.