Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-33363

CI failure: innodb.import_corrupted: Assertion failed: oldest_lsn > log_sys.last_checkpoint_lsn

Details

    Description

      The failure is observed in st-10.11-MDEV-33361. Since the assert looks unrelated to the change in the branch, the issue must be from the base branch.

      The assert is introduced in MDEV-14425 and likely to have introduced the issue. I am setting the affected version as 10.8.1 when MDEV-14425 was merged.

      https://buildbot.mariadb.org/#/builders/234/builds/26196/steps/9/logs/stdio
      Test: innodb.import_corrupted
      Branch: st-10.11-MDEV-33361
      Commit: 93189df44
      Platform: amd64-windows

      innodb.import_corrupted 'innodb'         w6 [ fail ]
              Test ended at 2024-02-02 10:01:21
       
      CURRENT_TEST: innodb.import_corrupted
      mysqltest: At line 65: query 'ALTER TABLE t2 IMPORT TABLESPACE' failed with wrong errno <Unknown> (2013): 'Lost connection to server during query', instead of ER_NOT_KEYFILE (1034)...
      

      Server [mysqld.1 - pid: 47680, winpid: 47680, exit: 0] failed during test run
      Server log from this test:
      ----------SERVER LOG START-----------
      2024-02-02 10:01:20 124 [Note] InnoDB: Sync to disk of `test`.`t1` started.
      2024-02-02 10:01:20 124 [Note] InnoDB: Stopping purge
      2024-02-02 10:01:21 124 [Note] InnoDB: Writing table metadata to './test/t1.cfg'
      2024-02-02 10:01:21 124 [Note] InnoDB: Table `test`.`t1` flushed to disk
      2024-02-02 10:01:21 124 [Note] InnoDB: Deleting the meta-data file './test/t1.cfg'
      2024-02-02 10:01:21 124 [Note] InnoDB: Resuming purge
      2024-02-02 10:01:21 124 [Note] InnoDB: Importing tablespace for table 'test/t1' that was exported from host 'WIN-5PHL1LR4M8G'
      2024-02-02 10:01:21 124 [Note] InnoDB: Phase I - Update all pages
      2024-02-02 10:01:21 124 [Note] InnoDB: Sync to disk
      2024-02-02 10:01:21 124 [Note] InnoDB: Sync to disk - done!
      2024-02-02 10:01:21 124 [Note] InnoDB: Phase II - Purge records from index `PRIMARY`
      Assertion failed: oldest_lsn > log_sys.last_checkpoint_lsn, file C:\Buildbot\amd64-windows\build\storage\innobase\buf\buf0flu.cc, line 1982
      

      server.dll!my_sigabrt_handler()[my_thr_init.c:465]
      ucrtbase.dll!raise()
      ucrtbase.dll!abort()
      ucrtbase.dll!_get_wpgmptr()
      ucrtbase.dll!_wassert()
      server.dll!log_checkpoint_low()[buf0flu.cc:1982]
      server.dll!log_checkpoint()[buf0flu.cc:2042]
      server.dll!buf_flush_page_cleaner()[buf0flu.cc:2509]
      

      Attachments

        Issue Links

          Activity

            Here is some code up to the failing assertion in log_checkpoint_low():

            10.11 93189df44eeb431110bf9ca759bafa856ae90c82

              if (oldest_lsn == log_sys.last_checkpoint_lsn /* || … */) {
                log_sys.latch.wr_unlock();
                return true;
              }
             
              ut_ad(!recv_no_log_write);
              ut_ad(oldest_lsn > log_sys.last_checkpoint_lsn);
            

            This would seem to guarantee that the equality cannot hold. So, we must have had an oldest_lsn that is smaller than the log_sys.last_checkpoint_lsn. That would seem to be a sign of a severe error in the checkpoint logic.

            The parameter oldest_lsn was computed by the following in log_checkpoint():

            10.11 93189df44eeb431110bf9ca759bafa856ae90c82

              log_sys.latch.wr_lock(SRW_LOCK_CALL);
              const lsn_t end_lsn= log_sys.get_lsn();
              mysql_mutex_lock(&buf_pool.flush_list_mutex);
              const lsn_t oldest_lsn= buf_pool.get_oldest_modification(end_lsn);
              mysql_mutex_unlock(&buf_pool.flush_list_mutex);
              return log_checkpoint_low(oldest_lsn, end_lsn);
            

            The field log_sys.last_checkpoint_lsn is protected by log_sys.latch (and write_lock and flush_lock) when log_t::write_checkpoint() is modifying it. Also the end_lsn=log_sys.lsn is protected by the exclusive log_sys.latch. Here are some local variables from the stack trace:

            10.11 93189df44eeb431110bf9ca759bafa856ae90c82

            ucrtbase!wassert
            Unable to enumerate locals, Win32 error 0n87
            Private symbols (symbols.pri) are required for locals.
            Type ".hh dbgerr005" for details.
            _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
            server!log_checkpoint_low [C:\Buildbot\amd64-windows\build\storage\innobase\buf\buf0flu.cc @ 1982]
            unsigned int64 oldest_lsn = 0xdbbfe9
            unsigned int64 end_lsn = 0xdc0a71
            unsigned int64 flush_lsn = 0x00000216`d35ab7b0
            _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
            server!log_checkpoint [C:\Buildbot\amd64-windows\build\storage\innobase\buf\buf0flu.cc @ 2042]
            unsigned int64 oldest_lsn = 0xdbbfe9
            unsigned int64 end_lsn = 0xdc0a71
            _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
            server!buf_flush_page_cleaner [C:\Buildbot\amd64-windows\build\storage\innobase\buf\buf0flu.cc @ 2509]
            unsigned int64 n = 0
            unsigned int64 dirty_blocks = 0x1b5
            unsigned int64 n_flushed = 0
            unsigned int64 oldest_lsn = 0xdbbfe9
            double pct_lwm = 0
            unsigned int64 soft_lsn_limit = 0
            double dirty_pct = 87.2255489021956123
            unsigned int64 lsn_limit = 0
            struct timespec abstime = struct timespec
            unsigned int64 last_pages = 0x71a
            unsigned int64 last_activity_count = 0
            _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
            

            The variable flush_lsn, whose scope had not started before the failing assertion was evaluated, is garbage, but the rest looks reasonable.

            The oldest modified block in the buf_pool.flush_list must have an buf_page_t::oldest_modification_ of 0xdbbfe9. For some reason, the latest checkpoint LSN (log_sys.last_checkpoint_lsn) is larger than this. This means that the log checkpoint had been advanced prematurely on a previous invocation of log_checkpoint_low(), while some older page modifications had not yet been written back from the buffer pool.

            marko Marko Mäkelä added a comment - Here is some code up to the failing assertion in log_checkpoint_low() : 10.11 93189df44eeb431110bf9ca759bafa856ae90c82 if (oldest_lsn == log_sys.last_checkpoint_lsn /* || … */ ) { log_sys.latch.wr_unlock(); return true ; }   ut_ad(!recv_no_log_write); ut_ad(oldest_lsn > log_sys.last_checkpoint_lsn); This would seem to guarantee that the equality cannot hold. So, we must have had an oldest_lsn that is smaller than the log_sys.last_checkpoint_lsn . That would seem to be a sign of a severe error in the checkpoint logic. The parameter oldest_lsn was computed by the following in log_checkpoint() : 10.11 93189df44eeb431110bf9ca759bafa856ae90c82 log_sys.latch.wr_lock(SRW_LOCK_CALL); const lsn_t end_lsn= log_sys.get_lsn(); mysql_mutex_lock(&buf_pool.flush_list_mutex); const lsn_t oldest_lsn= buf_pool.get_oldest_modification(end_lsn); mysql_mutex_unlock(&buf_pool.flush_list_mutex); return log_checkpoint_low(oldest_lsn, end_lsn); The field log_sys.last_checkpoint_lsn is protected by log_sys.latch (and write_lock and flush_lock ) when log_t::write_checkpoint() is modifying it. Also the end_lsn=log_sys.lsn is protected by the exclusive log_sys.latch . Here are some local variables from the stack trace: 10.11 93189df44eeb431110bf9ca759bafa856ae90c82 ucrtbase!wassert Unable to enumerate locals, Win32 error 0n87 Private symbols (symbols.pri) are required for locals. Type ".hh dbgerr005" for details. _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ server!log_checkpoint_low [C:\Buildbot\amd64-windows\build\storage\innobase\buf\buf0flu.cc @ 1982] unsigned int64 oldest_lsn = 0xdbbfe9 unsigned int64 end_lsn = 0xdc0a71 unsigned int64 flush_lsn = 0x00000216`d35ab7b0 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ server!log_checkpoint [C:\Buildbot\amd64-windows\build\storage\innobase\buf\buf0flu.cc @ 2042] unsigned int64 oldest_lsn = 0xdbbfe9 unsigned int64 end_lsn = 0xdc0a71 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ server!buf_flush_page_cleaner [C:\Buildbot\amd64-windows\build\storage\innobase\buf\buf0flu.cc @ 2509] unsigned int64 n = 0 unsigned int64 dirty_blocks = 0x1b5 unsigned int64 n_flushed = 0 unsigned int64 oldest_lsn = 0xdbbfe9 double pct_lwm = 0 unsigned int64 soft_lsn_limit = 0 double dirty_pct = 87.2255489021956123 unsigned int64 lsn_limit = 0 struct timespec abstime = struct timespec unsigned int64 last_pages = 0x71a unsigned int64 last_activity_count = 0 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ The variable flush_lsn , whose scope had not started before the failing assertion was evaluated, is garbage, but the rest looks reasonable. The oldest modified block in the buf_pool.flush_list must have an buf_page_t::oldest_modification_ of 0xdbbfe9 . For some reason, the latest checkpoint LSN ( log_sys.last_checkpoint_lsn ) is larger than this. This means that the log checkpoint had been advanced prematurely on a previous invocation of log_checkpoint_low() , while some older page modifications had not yet been written back from the buffer pool.

            MDEV-22027 was a bug like this.

            marko Marko Mäkelä added a comment - MDEV-22027 was a bug like this.

            I just got this failure in a 10.11 non debug build on Centos:

            innodb.import_corrupted
            Server log from this test:
            ---------SERVER LOG START----------
            2024-02-12 15:03:37 129 [Note] InnoDB: Sync to disk of `test`.`t1` started.
            2024-02-12 15:03:37 129 [Note] InnoDB: Stopping purge
            2024-02-12 15:03:37 129 [Note] InnoDB: Writing table metadata to './test/t1.cfg'
            2024-02-12 15:03:37 129 [Note] InnoDB: Table `test`.`t1` flushed to disk
            2024-02-12 15:03:37 129 [Note] InnoDB: Deleting the meta-data file './test/t1.cf
            g'
            2024-02-12 15:03:37 129 [Note] InnoDB: Resuming purge
            2024-02-12 15:03:37 129 [Note] InnoDB: Importing tablespace for table 'test/t1'
            that was exported from host 'localhost.localdomain'
            2024-02-12 15:03:37 129 [Note] InnoDB: Phase I - Update all pages
            2024-02-12 15:03:37 129 [Note] InnoDB: Sync to disk
            2024-02-12 15:03:37 129 [Note] InnoDB: Sync to disk - done!
            2024-02-12 15:03:37 129 [Note] InnoDB: Phase II - Purge records from index `PRIM
            ARY`
            mariadbd: /my/maria-10.11/storage/innobase/buf/buf0flu.cc:1982: bool log_checkpo
            int_low(lsn_t, lsn_t): Assertion `oldest_lsn > log_sys.last_checkpoint_lsn' fail
            ed.
            240212 15:03:37 [ERROR] mysqld got signal 6 ;
            Sorry, we probably made a mistake, and this is a bug.
            mysys/stacktrace.c:215(my_print_stacktrace)[0x113e1e5]
            sql/signal_handler.cc:238(handle_fatal_signal)[0xa2abe8]
            sigaction.c:0(__restore_rt)[0x7f54d630d5d0]
            /lib64/libc.so.6(gsignal+0x37)[0x7f54d4326207]
            /lib64/libc.so.6(abort+0x148)[0x7f54d43278f8]
            /lib64/libc.so.6(+0x2f026)[0x7f54d431f026]
            /lib64/libc.so.6(+0x2f0d2)[0x7f54d431f0d2]
            buf/buf0flu.cc:1994(log_checkpoint_low(unsigned long, unsigned long))[0x102a018]
            buf/buf0flu.cc:2042(log_checkpoint())[0x102a2f2]
            buf/buf0flu.cc:2511(buf_flush_page_cleaner())[0x102cdb9]
            4.8.2/thread:115(std::thread::_Impl<std::_Bind_simple<void (*())()> >::_M_run())
            [0x102e0f3]

            monty Michael Widenius added a comment - I just got this failure in a 10.11 non debug build on Centos: innodb.import_corrupted Server log from this test: --------- SERVER LOG START ---------- 2024-02-12 15:03:37 129 [Note] InnoDB: Sync to disk of `test`.`t1` started. 2024-02-12 15:03:37 129 [Note] InnoDB: Stopping purge 2024-02-12 15:03:37 129 [Note] InnoDB: Writing table metadata to './test/t1.cfg' 2024-02-12 15:03:37 129 [Note] InnoDB: Table `test`.`t1` flushed to disk 2024-02-12 15:03:37 129 [Note] InnoDB: Deleting the meta-data file './test/t1.cf g' 2024-02-12 15:03:37 129 [Note] InnoDB: Resuming purge 2024-02-12 15:03:37 129 [Note] InnoDB: Importing tablespace for table 'test/t1' that was exported from host 'localhost.localdomain' 2024-02-12 15:03:37 129 [Note] InnoDB: Phase I - Update all pages 2024-02-12 15:03:37 129 [Note] InnoDB: Sync to disk 2024-02-12 15:03:37 129 [Note] InnoDB: Sync to disk - done! 2024-02-12 15:03:37 129 [Note] InnoDB: Phase II - Purge records from index `PRIM ARY` mariadbd: /my/maria-10.11/storage/innobase/buf/buf0flu.cc:1982: bool log_checkpo int_low(lsn_t, lsn_t): Assertion `oldest_lsn > log_sys.last_checkpoint_lsn' fail ed. 240212 15:03:37 [ERROR] mysqld got signal 6 ; Sorry, we probably made a mistake, and this is a bug. mysys/stacktrace.c:215(my_print_stacktrace) [0x113e1e5] sql/signal_handler.cc:238(handle_fatal_signal) [0xa2abe8] sigaction.c:0(__restore_rt) [0x7f54d630d5d0] /lib64/libc.so.6(gsignal+0x37) [0x7f54d4326207] /lib64/libc.so.6(abort+0x148) [0x7f54d43278f8] /lib64/libc.so.6(+0x2f026) [0x7f54d431f026] /lib64/libc.so.6(+0x2f0d2) [0x7f54d431f0d2] buf/buf0flu.cc:1994(log_checkpoint_low(unsigned long, unsigned long)) [0x102a018] buf/buf0flu.cc:2042(log_checkpoint()) [0x102a2f2] buf/buf0flu.cc:2511(buf_flush_page_cleaner()) [0x102cdb9] 4.8.2/thread:115(std::thread::_Impl<std::_Bind_simple<void (*())()> >::_M_run()) [0x102e0f3]

            In absence of any repeatable test, I started my analysis from the design and code. It was quite informative going through the design changes done to improve the concurrency for mini-transaction commit and flush list ordering. Finally, I could find the exact reason and precise testcase to repeat the issue precisely. I will upload a possible solution for review and discuss within Innodb team. The testcase took sometime as it needs specific sequence of operations between background and session threads. The current test fails > 50% of the time with the assert consistently which should be good for verification.

            Some details of the analysis.

            1. The checkpoint path: The code looks good and there don't look to be any race across multiple concurrent checkpoint.

            2. The mtr commit: There is one vulnerable piece of code in log_t::append_prepare_wait which is somewhat hard to reach generally but should be fixed. This turned out to be different from the current issue and is not feasible to reach by the test. I will file a separate bug for this one with more details and a repeatable test.

            3. The import path: The issue is found here. In "MDEV-28708: increased congestion on buf_pool.flush_list_mutex", along with other concurrency enhancement, the LSN assignment for non-logging pages are changed from system LSN to last checkpoint LSN. This apparently harmless change has caused this issue. During checkpoint we release the log sys latch, while writing the checkpoint block log_t::write_checkpoint(). The state is set to "checkpoint_pending" which effectively synchronizing with other concurrent checkpoint calls. In this interval the IMPORT pages are added to the flush list with previous checkpoint LSN causing the next checkpoint LSN to get a lower value.

            One way is to use the system LSN for the non-redo logged pages. It seems right and is inline with previous design. If we do want to add the pages in flush list from other direction, it is possible to check the pending state and use the next checkpoint LSN.

            debarun Debarun Banerjee added a comment - In absence of any repeatable test, I started my analysis from the design and code. It was quite informative going through the design changes done to improve the concurrency for mini-transaction commit and flush list ordering. Finally, I could find the exact reason and precise testcase to repeat the issue precisely. I will upload a possible solution for review and discuss within Innodb team. The testcase took sometime as it needs specific sequence of operations between background and session threads. The current test fails > 50% of the time with the assert consistently which should be good for verification. Some details of the analysis. 1. The checkpoint path: The code looks good and there don't look to be any race across multiple concurrent checkpoint. 2. The mtr commit: There is one vulnerable piece of code in log_t::append_prepare_wait which is somewhat hard to reach generally but should be fixed. This turned out to be different from the current issue and is not feasible to reach by the test. I will file a separate bug for this one with more details and a repeatable test. 3. The import path: The issue is found here. In " MDEV-28708 : increased congestion on buf_pool.flush_list_mutex", along with other concurrency enhancement, the LSN assignment for non-logging pages are changed from system LSN to last checkpoint LSN. This apparently harmless change has caused this issue. During checkpoint we release the log sys latch, while writing the checkpoint block log_t::write_checkpoint(). The state is set to "checkpoint_pending" which effectively synchronizing with other concurrent checkpoint calls. In this interval the IMPORT pages are added to the flush list with previous checkpoint LSN causing the next checkpoint LSN to get a lower value. One way is to use the system LSN for the non-redo logged pages. It seems right and is inline with previous design. If we do want to add the pages in flush list from other direction, it is possible to check the pending state and use the next checkpoint LSN.

            Hi marko

            Please review the analysis and patch.
            https://github.com/MariaDB/server/pull/3061

            debarun Debarun Banerjee added a comment - Hi marko Please review the analysis and patch. https://github.com/MariaDB/server/pull/3061

            Great work! It would be nice if you can make the DEBUG_SYNC instrumented test more stable so that it can be included in the regression test suite.

            marko Marko Mäkelä added a comment - Great work! It would be nice if you can make the DEBUG_SYNC instrumented test more stable so that it can be included in the regression test suite.

            Attached the testcase that can repeat the issue fairly accurately. I am excluding the test from final patch as in some cases it results is debug sync timeout due to concurrent background operations. It could be hard to maintain such tests with stable output in CI.

            debarun Debarun Banerjee added a comment - Attached the testcase that can repeat the issue fairly accurately. I am excluding the test from final patch as in some cases it results is debug sync timeout due to concurrent background operations. It could be hard to maintain such tests with stable output in CI.

            People

              debarun Debarun Banerjee
              debarun Debarun Banerjee
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.