MariaDB Server / MDEV-33137

Assertion `end_lsn == page_lsn' failed in recv_recover_page

Details

    Description

      11.1 50799752dc6306c35a47405c8f556d0b9639a1a9

      2023-12-28 23:21:30 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=159699380
      2023-12-28 23:21:31 0 [Note] InnoDB: End of log at LSN=173644026
      2023-12-28 23:21:31 0 [Note] InnoDB: To recover: 1098 pages
      mysqld: /data/MDEV-33137/11.1/storage/innobase/log/log0recv.cc:3139: buf_block_t* recv_recover_page(buf_block_t*, mtr_t&, page_recv_t&, fil_space_t*, lsn_t): Assertion `end_lsn == page_lsn' failed.
      (rr) bt
      #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
      #1  0x000014fdbb011859 in __GI_abort () at abort.c:79
      #2  0x000014fdbb011729 in __assert_fail_base (fmt=0x14fdbb1a7588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x55fb4d1115df "end_lsn == page_lsn", 
          file=0x55fb4d10f8b8 "/data/MDEV-33137/11.1/storage/innobase/log/log0recv.cc", line=3139, function=<optimized out>) at assert.c:92
      #3  0x000014fdbb022fd6 in __GI___assert_fail (assertion=0x55fb4d1115df "end_lsn == page_lsn", file=0x55fb4d10f8b8 "/data/MDEV-33137/11.1/storage/innobase/log/log0recv.cc", line=3139, 
          function=0x55fb4d111410 "buf_block_t* recv_recover_page(buf_block_t*, mtr_t&, page_recv_t&, fil_space_t*, lsn_t)") at assert.c:101
      #4  0x000055fb4c86451c in recv_recover_page (block=0x68201ad0, mtr=..., recs=..., space=0x55fb4ff8be70, init_lsn=0) at /data/MDEV-33137/11.1/storage/innobase/log/log0recv.cc:3139
      #5  0x000055fb4c865308 in recv_recover_page (space=0x55fb4ff8be70, bpage=0x68201ad0) at /data/MDEV-33137/11.1/storage/innobase/log/log0recv.cc:3360
      #6  0x000055fb4ca72dd5 in buf_page_t::read_complete (this=0x68201ad0, node=...) at /data/MDEV-33137/11.1/storage/innobase/buf/buf0buf.cc:3532
      #7  0x000055fb4cafbccd in IORequest::read_complete (this=0x612b26e7c2b0, io_error=0) at /data/MDEV-33137/11.1/storage/innobase/fil/fil0fil.cc:2980
      #8  0x000055fb4c8a9521 in read_io_callback (c=0x612b26e7c208) at /data/MDEV-33137/11.1/storage/innobase/os/os0file.cc:3416
      #9  0x000055fb4cb8df7c in tpool::task_group::execute (this=0x55fb4ff5c6c0, t=0x612b26e7c290) at /data/MDEV-33137/11.1/tpool/task_group.cc:70
      #10 0x000055fb4cb8e2c8 in tpool::task::execute (this=0x612b26e7b228) at /data/MDEV-33137/11.1/tpool/task.cc:32
      #11 0x000055fb4cb877c1 in tpool::thread_pool_generic::worker_main (this=0x55fb4ff18370, thread_var=0x55fb4ff18730) at /data/MDEV-33137/11.1/tpool/tpool_generic.cc:583
      #12 0x000055fb4cb8dd70 in std::__invoke_impl<void, void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> (__f=
          @0x55fb4ffb1238: (void (tpool::thread_pool_generic::*)(class tpool::thread_pool_generic * const, struct tpool::worker_data *)) 0x55fb4cb876f2 <tpool::thread_pool_generic::worker_main(tpool::worker_data*)>, __t=@0x55fb4ffb1230: 0x55fb4ff18370) at /usr/include/c++/9/bits/invoke.h:73
      #13 0x000055fb4cb8dc52 in std::__invoke<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> (__fn=
          @0x55fb4ffb1238: (void (tpool::thread_pool_generic::*)(class tpool::thread_pool_generic * const, struct tpool::worker_data *)) 0x55fb4cb876f2 <tpool::thread_pool_generic::worker_main(tpool::worker_data*)>) at /usr/include/c++/9/bits/invoke.h:95
      #14 0x000055fb4cb8db71 in std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> >::_M_invoke<0ul, 1ul, 2ul> (
          this=0x55fb4ffb1228) at /usr/include/c++/9/thread:244
      #15 0x000055fb4cb8db0c in std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> >::operator() (this=0x55fb4ffb1228)
          at /usr/include/c++/9/thread:251
      #16 0x000055fb4cb8daec in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> > >::_M_run
          (this=0x55fb4ffb1220) at /usr/include/c++/9/thread:195
      #17 0x000055fb4e428de4 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
      #18 0x00000c5f66746609 in start_thread (arg=<optimized out>) at pthread_create.c:477
      #19 0x000014fdbb10e133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
      
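The failing assertion encodes an invariant of redo log application: when recovery replays the records buffered for a page, applying a record stamps the page with that record's end LSN, so after the last applicable record the page LSN must equal that record's end LSN. A minimal sketch of that invariant follows (hypothetical simplified types, not the real recv_recover_page() code):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using lsn_t = uint64_t;

// One buffered redo record for a page; it covers [start_lsn, end_lsn).
struct log_rec { lsn_t start_lsn, end_lsn; };

// Replay all records that are newer than the page's current LSN.
// Applying a record stamps the page LSN with the record's end LSN,
// which is the relationship the `end_lsn == page_lsn' assertion checks.
lsn_t apply_log(lsn_t page_lsn, const std::vector<log_rec>& recs)
{
  for (const log_rec& r : recs)
    if (r.end_lsn > page_lsn)   // skip records the page already contains
      page_lsn = r.end_lsn;     // "apply" = stamp FIL_PAGE_LSN
  return page_lsn;
}
```

If a page carries an LSN that no buffered record produced (for example, because it was marked modified without a matching log record), the equality cannot hold and the assertion fires.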

      It looks like the failure appeared after this merge in 10.11

      commit 583a7452992acbb3c9bd74122c4f0086b9e16997
      Merge: 45fadb64c1c 9c5600adde6
      Author: Marko Mäkelä
      Date:   Tue Nov 21 10:23:11 2023 +0200
       
          Merge 10.6 into 10.11
      

      However, I couldn't reproduce it on 10.6.

      A couple of test runs which currently reproduce the failure fairly well:

      randgen 3468ff84518decaa5752778562bfe79444b201c6

      # on 11.1, 11.2
      perl ./run.pl --compatibility=110299 --engine=InnoDB --mysqld=--innodb-encrypt-tables --mysqld=--innodb-encrypt-log --mysqld=--innodb-encryption-threads=4 --mysqld=--plugin-load-add=file_key_management --mysqld=--file-key-management-filename=`pwd`/util/file_key_management_keys.txt  --grammar=conf/yy/desc_indexes.yy --scenario=AtomicDDL --filter=conf/ff/restrict_dynamic_vars.ff  --gendata=conf/zz/trx_stress.zz  --queries=1000000   --threads=6 --duration=200 --mysqld=--max-statement-time=30 --mysqld=--lock-wait-timeout=15 --mysqld=--innodb-lock-wait-timeout=10  --base-port=14000 --basedir=/data/bld/10.11-asan --vardir=/dev/shm/var12 --seed=1703604571 --nometadata-reload
       
      # on 10.11
      perl ./run.pl --compatibility=101199 --duration=600 --mysqld=--max-statement-time=60 --mysqld=--lock-wait-timeout=30 --mysqld=--innodb-lock-wait-timeout=15 --threads=6 --mysqld=--in_predicate_conversion_threshold=0 --mysqld=--session_track_transaction_info=STATE --mysqld=--strict_password_validation=OFF --mysqld=--expensive_subquery_limit=1000  --queries=1000000 --reporters=Backtrace,Deadlock,MemoryUsage,FeatureUsage --mysqld=--plugin-maturity=experimental --mysqld=--loose-debug_assert_on_not_freed_memory=0 --gendata=advanced --gendata=simple --mysqld=--innodb-encrypt-tables --mysqld=--innodb-encrypt-log --mysqld=--innodb-encryption-threads=4 --mysqld=--plugin-load-add=file_key_management --mysqld=--file-key-management-filename=`pwd`/util/file_key_management_keys.txt --mysqld=--character-set-server=latin1 --mysqld=--collation-server=latin1_swedish_ci --engine=InnoDB  --gendata=simple --grammar=conf/yy/replication.yy --grammar=conf/yy/current_timestamp.yy --gendata=conf/zz/current_timestamp.zz   --grammar=conf/yy/alter_table.yy --grammar=conf/yy/all_selects.yy:0.0001 --scenario=AtomicDDL --filter=conf/ff/restrict_dynamic_vars.ff --variator=ExecuteAsUnion --base-port=14000 --basedir=/data/bld/10.11-bug-lsn --vardir=/dev/shm/var5 --seed=1701900615
      

      Attachments

        Issue Links

          Activity

            elenst Elena Stepanova added a comment - edited

            I couldn't reproduce the failure on 10.11 or 11.2 with the patch above in a reasonable amount of attempts, so it apparently fixes at least the reported failure.

            For the record, this is the test I used this time, both for 10.11 and 11.2:

            randgen ff3b5a8b62b5735821092c043e718930f8c116ca

            perl ./run.pl --compatibility=101199 --duration=600 --mysqld=--max-statement-time=60 --mysqld=--lock-wait-timeout=30 --mysqld=--innodb-lock-wait-timeout=15 --threads=6 --queries=1000000 --reporters=Backtrace  --mysqld=--loose-debug_assert_on_not_freed_memory=0  --gendata=simple --mysqld=--innodb-encrypt-tables --mysqld=--innodb-encrypt-log --mysqld=--innodb-encryption-threads=4 --mysqld=--plugin-load-add=file_key_management --mysqld=--file-key-management-filename=`pwd`/util/file_key_management_keys.txt  --engine=InnoDB  --gendata=simple --grammar=conf/yy/dml.yy --grammar=conf/yy/current_timestamp.yy --gendata=conf/zz/current_timestamp.zz    --grammar=conf/yy/all_selects.yy:0.0001 --scenario=AtomicDDL --filter=conf/ff/restrict_dynamic_vars.ff --variator=ExecuteAsUnion --base-port=10700 --basedir=/data/bld/11.2-patch-MDEV-33137/ --vardir=/dev/shm/var-MDEV33137-5 --seed=1701900615 --trials=15 --mysqld=--enforce-storage-engine=InnoDB
            

            Compared to the tests in the description, this one enforces the InnoDB storage engine and is somewhat simpler (less DDL), to avoid irrelevant failures. This way, without the patch I got the failure in 1-2 attempts on each branch, and with the patch got a clean run of 15 attempts on each branch.

            However, it only checks that there are no crashes, assertion failures or corruption-like errors upon/after crash recovery. It doesn't verify the functionality or performance.


            marko Marko Mäkelä added a comment

            Thank you, elenst. I think that the MDEV-18976 page checksum records should be rather good at catching recovery errors for DML workloads.

            debarun Debarun Banerjee added a comment

            1. Setting MTR_MEMO_PAGE_X_MODIFY without actually modifying the page could result in the problem. So I agree here, but we should work on an mtr test too.

            • It seems harmless for release builds, though.

            2. There is a second issue with setting MTR_MEMO_PAGE_X_MODIFY directly without the page being modified later.

            • This could result in the mtr commit adding the page to the flush list without taking the flush order mutex, which could in turn result in a wrong checkpoint calculation and eventual data loss or corruption.
            • So we should absolutely remove all direct setting of MTR_MEMO_PAGE_X_MODIFY other than through mtr_t::set_modified(). Perhaps somehow enforce it, or at least document it well.
            • Perhaps assert in mtr commit.
            • Possibly add a test.

            3. In the purge code it is perhaps better/cleaner to reacquire the pages as opposed to pinning them. Usually pinning a page to keep it alive is a special case and is better used only when absolutely required. I suspect it is not going to cause any harm during purge if we get the required pages again.
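The checkpoint hazard in point 2 can be illustrated with a minimal sketch (hypothetical types, not the real buf_pool code): the log checkpoint LSN may not advance past the oldest modification in the flush list, which is only safe if entries are appended in LSN order while holding the flush order mutex:

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

using lsn_t = uint64_t;

// Simplified flush list: dirty pages keyed by the LSN of their oldest
// unflushed modification, with newest insertions at the front.  A page
// that becomes dirty through mtr_t::set_modified() is appended under
// the flush order mutex, preserving the LSN ordering.
struct flush_list_t
{
  std::deque<lsn_t> oldest_modification;

  void append(lsn_t lsn)   // caller must hold the flush order mutex
  {
    // Ordering invariant: never older than the current front entry.
    assert(oldest_modification.empty()
           || lsn >= oldest_modification.front());
    oldest_modification.push_front(lsn);
  }

  // The checkpoint may not advance past the oldest dirty page.  If a
  // dirty page slipped in out of order (or was never added), this limit
  // could overshoot it, and a crash would lose its changes.
  lsn_t checkpoint_limit(lsn_t log_end) const
  {
    return oldest_modification.empty() ? log_end
                                       : oldest_modification.back();
  }
};
```

A page inserted without the mutex could land behind newer entries, so the computed limit would no longer be the true minimum over all dirty pages.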

            marko Marko Mäkelä added a comment

            debarun, I mostly agree with you. For test coverage, we occasionally had some crash recovery tests failing due to this:

            10.6 aff5ed3988a3add17d607e6184288f82b567dbee

            innodb_zip.bug56680 '4k,innodb'          w9 [ fail ]  Found warnings/errors in server log file!
                    Test ended at 2023-12-17 11:57:59
            line
            2023-12-17 11:57:58 0 [Warning] InnoDB: The last skipped log record LSN 1315593 is not equal to page LSN 1315626
            2023-12-17 11:57:58 0 [Warning] InnoDB: The last skipped log record LSN 1315984 is not equal to page LSN 1316017
            2023-12-17 11:57:58 0 [Warning] InnoDB: The last skipped log record LSN 1329658 is not equal to page LSN 1329707
            ^ Found warnings in /home/buildbot/amd64-ubuntu-2204/build/mysql-test/var/9/log/mysqld.1.err
            

            I do not have any ideas on how to improve the regression test suite with respect to this.

            I did think of adding an assertion to mtr_t::memo_push() that would disallow parameters that include the MODIFY flag, but I decided against it, because buf_page_free() would then have had to be changed to less optimal code.

            I prefer to buffer-fix some blocks across mini-transaction boundaries in operations that need to be split into multiple small mini-transactions. The buffer-fix will not block page writes or log checkpoints, but it will prevent page eviction. I agree that there are some dangers around buffer-fixing, as MDEV-31767 taught me. If a block may have just been read into the buffer pool, we must not only acquire a page latch, but also verify that the page was not evicted due to corruption (MDEV-13542). In this case, this validation was already performed when the page was first looked up by buf_page_get_gen(). The buffer-fix will ensure that an already validated block can be safely accessed in a subsequent mini-transaction, after reacquiring the page latch (which will block file I/O again).

            The benefit of the buffer-fix trick is twofold:

            • Correctness: preventing a hard-to-test scenario in which, across the mini-transaction boundary, the block was written out, evicted, and read back as corrupted.
            • Performance: being able to remove some code for attempting to handle the above scenario, avoiding expensive operations like Block_hint::buffer_fix_block_if_still_valid() or buf_pool.page_hash lookups.

            Note: In MDEV-13542, I tried to write the corruption handling of page-freeing operations in such a way that when corruption is detected, we stop the effort to free further linked pages but carry on with the higher-level operation. In other words, we prefer to 'leak' some pages (not marking them as free). This was extensively tested with error injection (--debug-dbug=+d,intermittent_read_failure in fil_space_t::io()).
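The buffer-fix pattern described above can be sketched as follows (hypothetical simplified types; the real buf_block_t API with page latches, page_hash lookups, and I/O handling is omitted): a fix count pins the block in the buffer pool across mini-transaction boundaries without holding its latch, so page writes and checkpoints proceed while eviction is blocked:

```cpp
#include <cassert>

// Minimal model of buffer-fixing a block across mini-transaction
// boundaries.  While fix_count > 0 the block cannot be evicted from the
// buffer pool, but no latch is held, so page writes and log checkpoints
// are not blocked.  A later mini-transaction re-latches the already
// validated block instead of repeating a page_hash lookup.
struct block_t
{
  int  fix_count = 0;
  bool x_latched = false;

  void fix()     { ++fix_count; }       // pin before committing the mtr
  void unfix()   { assert(fix_count > 0); --fix_count; }
  void latch()   { x_latched = true; }  // reacquire in the next mtr
  void unlatch() { x_latched = false; }

  // Eviction is only possible for an unfixed, unlatched block.
  bool evictable() const { return fix_count == 0 && !x_latched; }
};
```

The usage pattern is: latch and validate the block in the first mini-transaction, fix it, commit the mini-transaction (releasing the latch), then re-latch the still-pinned block in the next mini-transaction and unfix it when done.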

            vlad.lesin Vladislav Lesin added a comment

            marko, debarun, thank you for such an informative discussion. After all, I think that page fixing is a reasonable solution for this case. As for testing, we could use the same technique for setting debug sync points in purge threads as was used in MDEV-31355 for "cursor-restore-locking.test", after I find the reason for the test instability.

            People

              marko Marko Mäkelä
              elenst Elena Stepanova