MariaDB Server / MDEV-27416

InnoDB hang in buf_flush_wait_flushed(), on log checkpoint

      Description

      On our CI systems, on builders that run on real storage rather than a RAM disk, we see occasional failures of the IMPORT TABLESPACE tests, because a wait for a log checkpoint hangs with a stack trace like this:

      buf_flush_wait_flushed
      log_make_checkpoint
      row_import_cleanup
      row_import_for_mysql
      

      Actually, the checkpoint there should be unnecessary, but that is not the main point here.
      I was able to reproduce this on ext4fs on storage with 4096-byte physical block size using a RelWithDebInfo build. Earlier attempts to reproduce it on an NVMe drive with 512-byte physical block size did not succeed.
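
Since the physical block size of the storage seems to matter for reproducing the hang, it may help to check it on a builder first. The following is only a sketch for Linux: it assumes the kernel exposes the value in sysfs under block/<device>/queue/physical_block_size, and the function name and its optional root argument are hypothetical helpers, not part of any MariaDB tooling.

```shell
# Print "<device> <physical block size in bytes>" for each block device.
# The optional first argument is a hypothetical override of the sysfs root
# (default: /sys), which makes the function easy to try on a fake tree.
list_physical_block_sizes() {
  root=${1:-/sys}
  for f in "$root"/block/*/queue/physical_block_size; do
    # Skip the unexpanded pattern when no device matches.
    [ -r "$f" ] || continue
    dev=${f#"$root"/block/}   # strip the leading "<root>/block/"
    dev=${dev%%/*}            # keep only the device name
    printf '%s %s\n' "$dev" "$(cat "$f")"
  done
}

list_physical_block_sizes
```

On a machine with util-linux installed, `lsblk -o NAME,PHY-SEC,LOG-SEC` should report the same information in tabular form.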

      innodb.innodb-wl5522 'innodb,strict_crc32' w11 [ 7 fail ]  timeout after 900 seconds
      

      The test invocation was:

      ./mtr --parallel=100 --repeat=100 {innodb.innodb-wl5522{,-1},innodb_zip.wl5522_zip}{,,,,,,,,,,,,,,,,,,}
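
The invocation above relies on bash brace expansion: the first brace group names three tests, and the trailing group that contains only commas expands to empty strings, so each test name is repeated once per alternative (number of commas plus one). Presumably this is meant to give the parallel workers enough work items, but that is an inference. A shortened sketch with only three repetitions, run through bash explicitly because plain sh may not support brace expansion:

```shell
# Expand a shortened form of the test list; {,,} stands in for the longer
# comma-only group of the real invocation (3 repetitions is an assumption
# chosen for brevity).
expanded=$(bash -c 'printf "%s\n" {innodb.innodb-wl5522{,-1},innodb_zip.wl5522_zip}{,,}')

# Show the expanded test names, one per line.
printf '%s\n' "$expanded"

# Total number of test names passed to mtr, and number of distinct tests.
count=$(printf '%s\n' "$expanded" | wc -l | tr -d ' ')
unique=$(printf '%s\n' "$expanded" | sort -u | wc -l | tr -d ' ')
echo "total=$count unique=$unique"
```

With the full comma-only group from the real command, the total grows accordingly while the number of distinct tests stays the same.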
      

      I think that applying the first commit of MDEV-26827 (to invoke buf_flush_list() from fewer threads) might fix this. That commit also removes the log_make_checkpoint() call from row_import_cleanup(), but when testing the fix we must retain that call, because without it the test case cannot hit this hang.

              People

              Assignee:
              marko Marko Mäkelä
              Reporter:
              marko Marko Mäkelä