MariaDB Server: MDEV-27416

InnoDB hang in buf_flush_wait_flushed(), on log checkpoint

Description

      On our CI systems, on builders that run on real storage rather than a RAM disk, we see occasional failures of the IMPORT TABLESPACE tests, because a wait for a log checkpoint hangs with a stack trace like this:

      buf_flush_wait_flushed
      log_make_checkpoint
      row_import_cleanup
      row_import_for_mysql
      

      Actually, the checkpoint there should be unnecessary, but that is not the main point here.
      I was able to reproduce this on ext4 on storage with a 4096-byte physical block size, using a RelWithDebInfo build. Earlier attempts on an NVMe device with a 512-byte physical block size did not reproduce the hang.

      innodb.innodb-wl5522 'innodb,strict_crc32' w11 [ 7 fail ]  timeout after 900 seconds
      

      The test invocation was:

      ./mtr --parallel=100 --repeat=100 {innodb.innodb-wl5522{,-1},innodb_zip.wl5522_zip}{,,,,,,,,,,,,,,,,,,}
      

      I think that applying the first commit of MDEV-26827 (which invokes buf_flush_list() from fewer threads) might fix this. That commit also removes the log_make_checkpoint() call from row_import_cleanup(), but when testing the fix we must obviously retain that call, because our test case cannot fail if it is not present.

      Activity

            Unfortunately, the code cleanup did not fix this:

            innodb.innodb-wl5522 'innodb,strict_crc32' w52 [ 5 fail ]  timeout after 900 seconds
            #3  0x000055a52aed8b8b in buf_flush_wait (lsn=2592590) at /home/marko/server/storage/innobase/buf/buf0flu.cc:1825
            #4  0x000055a52a7f16b7 in buf_flush_wait_flushed (sync_lsn=2592590) at /home/marko/server/storage/innobase/buf/buf0flu.cc:1881
            #5  0x000055a52a7f171f in log_make_checkpoint () at /usr/include/c++/9/bits/atomic_base.h:413
            #6  0x000055a52a7d9853 in row_import_cleanup (prebuilt=0x7f174809d2a8, err=DB_ERROR) at /home/marko/server/storage/innobase/row/row0import.cc:2192
            

            The core dump excludes the buffer pool, so I will have to revert MDEV-10814 and try again (tomorrow).

            marko Marko Mäkelä added a comment

            MDEV-24278 introduced an unbounded wait in buf_flush_page_cleaner() when the page cleaner is supposed to be idle. When the shared atomic variable buf_flush_sync_lsn is set (to indicate that some threads are waiting for a log checkpoint to be completed), in a rare case the page cleaner thread could enter the unbounded wait. At the same time, the thread that requested the checkpoint is waiting for a completion signal from the page cleaner thread.

            The hang is prevented if we invoke buf_pool.page_cleaner_set_idle(false) before waking up the page cleaner. My test campaign of 100 rounds of 100 concurrent instances of the test innodb.innodb-wl5522 on a 104-core server has proceeded to the 93rd round without a hang, and I expect it to complete in a few minutes. Without this fix, one of the 100 concurrent instances used to hang on the 7th round.

            I suspect that this hang is only possible if the server is otherwise idle during the checkpoint request. If any other thread performed writes to persistent InnoDB tables, the two hung threads would eventually be ‘rescued’ when the page cleaner thread is woken up.
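            The race described above can be sketched outside InnoDB as a classic lost-wakeup between a waiter and a worker that sleeps unboundedly while "idle". The names below (page_cleaner, wait_flushed, cleaner_idle, flush_sync_lsn) are simplified stand-ins for illustration, not the actual server code; the key line is the one marked as the fix, which mirrors calling buf_pool.page_cleaner_set_idle(false) before waking the page cleaner:

            ```cpp
            #include <atomic>
            #include <condition_variable>
            #include <cstdint>
            #include <cstdio>
            #include <mutex>
            #include <thread>

            static std::mutex m;
            static std::condition_variable cv;       // wakes the "page cleaner"
            static std::condition_variable done_cv;  // signals checkpoint completion
            static std::atomic<uint64_t> flush_sync_lsn{0};  // stand-in for buf_flush_sync_lsn
            static bool cleaner_idle = true;   // stand-in for the page cleaner's idle state
            static bool checkpoint_done = false;

            // Stand-in for buf_flush_page_cleaner(): waits (unboundedly) while it is
            // considered idle, then "flushes" and reports completion to the waiter.
            static void page_cleaner()
            {
              std::unique_lock<std::mutex> lk(m);
              cv.wait(lk, [] { return !cleaner_idle; });  // the unbounded idle wait
              flush_sync_lsn = 0;
              checkpoint_done = true;
              done_cv.notify_all();
            }

            // Stand-in for buf_flush_wait_flushed(): requests a checkpoint and waits
            // for the page cleaner to report completion.
            static void wait_flushed(uint64_t lsn)
            {
              std::unique_lock<std::mutex> lk(m);
              flush_sync_lsn = lsn;
              cleaner_idle = false;  // the fix: clear the idle state before waking
              cv.notify_one();       // wake the page cleaner
              done_cv.wait(lk, [] { return checkpoint_done; });
            }

            int main()
            {
              std::thread cleaner(page_cleaner);
              wait_flushed(2592590);
              cleaner.join();
              std::puts("checkpoint completed");
              return 0;
            }
            ```

            Without clearing the idle state under the same mutex before signalling, the cleaner could observe the stale "idle" condition and re-enter its unbounded wait while the requester sleeps on the completion signal, which is exactly the two-thread hang seen in the stack traces.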

            marko Marko Mäkelä added a comment

            I forgot that there were two combinations of the test, for the two values of innodb_checksum_algorithm. The other combination was fine as well:

            innodb.innodb-wl5522 'innodb,strict_full_crc32' w41 [ 100 pass ]   8184
            innodb.innodb-wl5522 'innodb,strict_full_crc32' w61 [ 99 pass ]   8074
            innodb.innodb-wl5522 'innodb,strict_full_crc32' w65 [ 100 pass ]   8128
            innodb.innodb-wl5522 'innodb,strict_full_crc32' w59 [ 100 pass ]   8259
            innodb.innodb-wl5522 'innodb,strict_full_crc32' w63 [ 100 pass ]   8236
            innodb.innodb-wl5522 'innodb,strict_full_crc32' w22 [ 100 pass ]   7595
            innodb.innodb-wl5522 'innodb,strict_full_crc32' w77 [ 100 pass ]   5554
            innodb.innodb-wl5522 'innodb,strict_full_crc32' w61 [ 100 pass ]   5325
            --------------------------------------------------------------------------
            The servers were restarted 100 times
            Spent 641767.028 of 6539 seconds executing testcases
             
            Completed: All 20000 tests were successful.
            

            marko Marko Mäkelä added a comment

            People

              marko Marko Mäkelä
              marko Marko Mäkelä