MariaDB Server / MDEV-27416

InnoDB hang in buf_flush_wait_flushed(), on log checkpoint

Details

    Description

      On our CI systems, on builders that run on real storage rather than a RAM disk, the IMPORT TABLESPACE tests occasionally fail because a wait for a log checkpoint hangs, with a stack trace like this:

      buf_flush_wait_flushed
      log_make_checkpoint
      row_import_cleanup
      row_import_for_mysql
      

      Actually, the checkpoint there should be unnecessary, but that is not the main point here.
      I was able to reproduce this with a RelWithDebInfo build on an ext4 file system, on storage with a 4096-byte physical block size. Earlier attempts to reproduce it on an NVMe drive with a 512-byte physical block size were unsuccessful.

      innodb.innodb-wl5522 'innodb,strict_crc32' w11 [ 7 fail ]  timeout after 900 seconds
      

      The test invocation was:

      ./mtr --parallel=100 --repeat=100 {innodb.innodb-wl5522{,-1},innodb_zip.wl5522_zip}{,,,,,,,,,,,,,,,,,,}
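
      The brace groups in that command are expanded by the shell before mtr sees them, so the three test names are each listed many times and can run concurrently on the parallel workers. A rough model of that expansion is sketched below, assuming bash-style brace expansion; the helper name expand and the constants are hypothetical and exist only for this illustration.

```python
from collections import Counter
from itertools import product

# Test names produced by the first brace group of the invocation:
# {innodb.innodb-wl5522{,-1},innodb_zip.wl5522_zip}
BASE_TESTS = [
    "innodb.innodb-wl5522",
    "innodb.innodb-wl5522-1",
    "innodb_zip.wl5522_zip",
]

# Second brace group, copied verbatim from the invocation. Every
# alternative is empty, so it only multiplies the preceding names.
REPEAT_GROUP = "{,,,,,,,,,,,,,,,,,,}"


def expand(base_tests, repeat_group):
    """Model NAME{,,...}: each name appears once per (empty) alternative."""
    copies = repeat_group.strip("{}").count(",") + 1
    # The leftmost group varies slowest, as in bash.
    return [name for name, _ in product(base_tests, range(copies))]


tests = expand(BASE_TESTS, REPEAT_GROUP)
for name, count in Counter(tests).items():
    print(f"{count:3d} x {name}")
print(f"{len(tests)} test names passed to mtr")
```

      With --repeat=100, each of those listed names would then be run up to 100 times, which is presumably why the failure above is reported on a repeat count ("7 fail") rather than on the first run.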
      

      I think that applying the first commit of MDEV-26827 (to invoke buf_flush_list() from fewer threads) might fix this. That commit also removes the log_make_checkpoint() call from row_import_cleanup(), but when testing the fix we must retain that call, because the test case cannot fail without it.

            People

              marko Marko Mäkelä