MariaDB Server / MDEV-32681

Test case innodb.undo_truncate_recover frequent failure

Details

    Description

      innodb.undo_truncate_recover '16k,2,innodb,strict_full_crc32' w15 [ fail ]
              Test ended at 2023-11-03 15:59:19
       
      CURRENT_TEST: innodb.undo_truncate_recover
      --- /home/buildbot/amd64-debian-11-msan/build/mysql-test/suite/innodb/r/undo_truncate_recover.result	2023-11-03 15:04:02.000000000 +0000
      +++ /home/buildbot/amd64-debian-11-msan/build/mysql-test/suite/innodb/r/undo_truncate_recover.reject	2023-11-03 15:59:19.194596785 +0000
      @@ -11,5 +11,5 @@
       drop table t1;
       call mtr.add_suppression("InnoDB: innodb_undo_tablespaces=0 disables dedicated undo log tablespaces");
       SET GLOBAL innodb_fast_shutdown=0;
      -FOUND 1 /ib_undo_trunc/ in mysqld.1.err
      +NOT FOUND /ib_undo_trunc/ in mysqld.1.err
       # restart: with restart_parameters
      

      When it happens, the following appears in mysqld.1.err:

      2023-11-03 15:58:19 0 [Note] InnoDB: Truncating .//undo001
      2023-11-03 15:58:19 0 [Warning] InnoDB: Difficult to find free blocks in the buffer pool (21 search iterations)! 21 failed attempts to flush a page! Consider increasing innodb_buffer_pool_size. Pending flushes (fsync) log: 0; buffer pool: 0. 1491 OS file reads, 1420 OS file writes, 231 OS fsyncs.
      231103 15:59:17 [ERROR] mysqld got signal 6 ;
      

      I think the signal 6 might be because of a timeout waiting for the server to shut down (the server is manually restarted in the test case at this point).

      At this point the server is stuck during shutdown in trx_purge_truncate_history(), trying to re-create the rsegs for the truncated tablespace but unable to get a buffer pool page, as indicated by the warning in the error log:

      #12 0x00007f662f46d90c in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
      #13 0x0000559380bb0bab in tpool::waitable_task::wait (this=0x55938203ece0 <purge_coordinator_task>, lk=...) at /kvm/src/my/sn-mdev30255/tpool/task.cc:79
      #14 0x0000559380bb0bf9 in tpool::waitable_task::wait (this=0x55938203ece0 <purge_coordinator_task>) at /kvm/src/my/sn-mdev30255/tpool/task.cc:85
      #15 0x00005593809b6216 in srv_purge_shutdown () at /kvm/src/my/sn-mdev30255/storage/innobase/srv/srv0srv.cc:2048
      #16 0x00005593809be4a3 in innodb_preshutdown () at /kvm/src/my/sn-mdev30255/storage/innobase/srv/srv0start.cc:2003
      #17 0x000055938043c47c in plugin_pre_shutdown (plugin=0x559382a31d78) at /kvm/src/my/sn-mdev30255/sql/handler.cc:1000
      #18 0x00005593800ff526 in plugin_foreach_with_mask (thd=0x0, func=0x55938043c43b <plugin_pre_shutdown(THD*, plugin_ref, void*)>, type=1, state_mask=10, arg=0x0) at /kvm/src/my/sn-mdev30255/sql/sql_plugin.cc:2553
      #19 0x000055938043c4a8 in ha_pre_shutdown () at /kvm/src/my/sn-mdev30255/sql/handler.cc:1007
      #20 0x000055937ffa6c10 in mysqld_main (argc=170, argv=0x559382731150) at /kvm/src/my/sn-mdev30255/sql/mysqld.cc:5734
      #21 0x000055937ff9d465 in main (argc=29, argv=0x7fff57493358) at /kvm/src/my/sn-mdev30255/sql/main.cc:25
       
      Thread 2 (Thread 1407920.1407930 (mariadbd)):
      #0  buf_LRU_get_free_block (have_mutex=false) at /kvm/src/my/sn-mdev30255/storage/innobase/buf/buf0lru.cc:486
      #1  0x0000559380b2e530 in fsp_page_create (space=0x5593828ab850, offset=49, mtr=0x7f66248f55f0) at /kvm/src/my/sn-mdev30255/storage/innobase/fsp/fsp0fsp.cc:1080
      #2  0x0000559380b2eb05 in fsp_alloc_free_page (space=0x5593828ab850, hint=0, mtr=0x7f66248f55f0, init_mtr=0x7f66248f55f0) at /kvm/src/my/sn-mdev30255/storage/innobase/fsp/fsp0fsp.cc:1197
      #3  0x0000559380b323db in fseg_alloc_free_page_low (space=0x5593828ab850, seg_inode=0x7f662e0762b2 "", iblock=0x7f662dbd3020, hint=0, direction=111 'o', has_done_reservation=false, mtr=0x7f66248f55f0, init_mtr=0x7f66248f55f0) at /kvm/src/my/sn-mdev30255/storage/innobase/fsp/fsp0fsp.cc:2136
      #4  0x0000559380b30a27 in fseg_create (space=0x5593828ab850, byte_offset=62, mtr=0x7f66248f55f0, has_done_reservation=false, block=0x0) at /kvm/src/my/sn-mdev30255/storage/innobase/fsp/fsp0fsp.cc:1759
      #5  0x00005593809fda94 in trx_rseg_header_create (space=0x5593828ab850, rseg_id=93, max_trx_id=37, sys_header=0x7f662dbd2c00, mtr=0x7f66248f55f0) at /kvm/src/my/sn-mdev30255/storage/innobase/trx/trx0rseg.cc:321
      #6  0x00005593809debaa in trx_purge_truncate_history () at /kvm/src/my/sn-mdev30255/storage/innobase/trx/trx0purge.cc:737
      #7  0x00005593809b57f1 in srv_do_purge (n_total_purged=0x7f66248f5b18) at /kvm/src/my/sn-mdev30255/storage/innobase/srv/srv0srv.cc:1847
      #8  0x00005593809b5d92 in purge_coordinator_callback_low () at /kvm/src/my/sn-mdev30255/storage/innobase/srv/srv0srv.cc:1945
      #9  0x00005593809b5e8f in purge_coordinator_callback () at /kvm/src/my/sn-mdev30255/storage/innobase/srv/srv0srv.cc:1974
      #10 0x0000559380bb0678 in tpool::task_group::execute (this=0x55938203ec40 <purge_coordinator_task_group>, t=0x55938203ece0 <purge_coordinator_task>) at /kvm/src/my/sn-mdev30255/tpool/task_group.cc:55
      #11 0x0000559380bb09ce in tpool::task::execute (this=0x55938203ece0 <purge_coordinator_task>) at /kvm/src/my/sn-mdev30255/tpool/task.cc:47
      #12 0x0000559380baa527 in tpool::thread_pool_generic::worker_main (this=0x55938279d090, thread_var=0x55938279d5c0) at /kvm/src/my/sn-mdev30255/tpool/tpool_generic.cc:599
      #13 0x0000559380bb0418 in std::__invoke_impl<void, void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> (__f=@0x7f6614001798: (void (tpool::thread_pool_generic::*)(tpool::thread_pool_generic * const, tpool::worker_data *)) 0x559380baa492 <tpool::thread_pool_generic::worker_main(tpool::worker_data*)>, __t=@0x7f6614001790: 0x55938279d090) at /usr/include/c++/10/bits/invoke.h:73
      #14 0x0000559380bb0308 in std::__invoke<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> (__fn=@0x7f6614001798: (void (tpool::thread_pool_generic::*)(tpool::thread_pool_generic * const, tpool::worker_data *)) 0x559380baa492 <tpool::thread_pool_generic::worker_main(tpool::worker_data*)>) at /usr/include/c++/10/bits/invoke.h:95
      #15 0x0000559380bb023b in std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> >::_M_invoke<0ul, 1ul, 2ul> (this=0x7f6614001788) at /usr/include/c++/10/thread:264
      #16 0x0000559380bb01d8 in std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> >::operator() (this=0x7f6614001788) at /usr/include/c++/10/thread:271
      #17 0x0000559380bb01bc in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> > >::_M_run (this=0x7f6614001780) at /usr/include/c++/10/thread:215
      #18 0x00007f662f472ed0 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
      #19 0x00007f662f22bea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
      #20 0x00007f662f149a2f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
      

      I was able to reproduce this with --rr and currently have the recording saved.

      I see that the code locks all dirty buffer pool pages in the undo tablespace and then moves on to re-creating all rseg header pages. I wonder if it managed to lock all pages in the buffer pool, thereby locking itself out of creating new pages. Maybe there should be a check whether the code is trying to lock more than, say, 50% of the buffer pool pages, and if so, either flush some pages explicitly or abort and retry later. Better yet, the pages could be dropped from the buffer pool without flushing, but I am not sure whether that can be done in a crash-recovery-safe way.

      In 10.5, the test failure is easy to trigger with 16k pages; I can see that it hangs once it has locked and created around 500 pages (with an 8M buffer pool). It cannot be reproduced with a 16M buffer pool.

      In 10.6, the failure no longer occurs. I am not sure whether this is accidental or whether the code has been fixed. In 10.6 the code still locks all modified pages in the to-be-truncated undo tablespace, but the number of pages is smaller, so it does not reach the roughly 500 locked and created pages that triggered the hang in 10.5.
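      The suspected self-starvation can be illustrated with a toy model (this is not InnoDB code; ToyPool and can_get_free_block are invented names, and the frame count of 512 simply corresponds to an 8M pool with 16k pages): if the truncation mini-transaction x-latches every frame in the pool, the very thread asking for a free block has made eviction impossible.

      ```cpp
      // Toy model of buffer-pool self-starvation during undo truncation.
      // Hypothetical types; not part of the InnoDB source tree.
      #include <cassert>

      struct ToyPool {
        int frames;   // total buffer pool frames (8M / 16k ~ 512)
        int latched;  // frames x-latched by the truncation mini-transaction

        // A free frame can only be produced while some frame is still evictable.
        bool can_get_free_block() const { return latched < frames; }
      };

      int main() {
        ToyPool pool{512, 500};             // ~500 pages locked + created
        assert(pool.can_get_free_block());  // a few frames remain evictable

        pool.latched = pool.frames;         // every frame latched by ourselves
        assert(!pool.can_get_free_block()); // buf_LRU_get_free_block() hangs
      }
      ```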

      Activity

            Transaction 31 was actually committed before the server was killed due to the shutdown hang. It had written 60,000 undo log records, corresponding to the following SQL statements:

            let SEARCH_PATTERN = ib_undo_trunc;
            begin;
            update t1 set c = 'MariaDB';
            update t1 set c = 'InnoDB';
            eval set global debug_dbug = '+d,$SEARCH_PATTERN';
            commit;
            

            The LSN of the transaction commit was 20365073, and the log was durably written up to that point and ultimately to 20379464, which is also the recv_sys.recovered_lsn of the recovery. So, this transaction should have been recovered as committed, and no rollback should have been attempted in the first place.
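            The argument is a simple LSN comparison; the sketch below restates it (recovered_as_committed is a hypothetical helper, not an InnoDB function — the real decision involves the undo log state, but the LSN ordering is the point here):

            ```cpp
            // The commit record of transaction 31 lies below the LSN up to
            // which the log was recovered, so the commit must be honoured.
            #include <cassert>
            #include <stdint.h>

            // Hypothetical helper: a commit is recovered iff its log record
            // falls within the durably written, recovered portion of the log.
            bool recovered_as_committed(uint64_t commit_lsn,
                                        uint64_t recovered_lsn) {
              return commit_lsn <= recovered_lsn;
            }

            int main() {
              const uint64_t commit_lsn    = 20365073;  // commit of trx 31
              const uint64_t recovered_lsn = 20379464;  // recv_sys.recovered_lsn
              assert(recovered_as_committed(commit_lsn, recovered_lsn));
            }
            ```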

            marko Marko Mäkelä added a comment

            The undo log page 68 state was changed from TRX_UNDO_ACTIVE to TRX_UNDO_TO_PURGE during the commit of transaction 31:

            10.5 c68620df48cd1b0e7f5f1cb5ff8b905a5d9c80a5

            #1  0x000055774eaa9de7 in trx_undo_set_state_at_finish (undo=0x557750a0adc0, mtr=mtr@entry=0x7fc789daef00) at /mariadb/10.5/storage/innobase/trx/trx0undo.cc:1321
            #2  0x000055774ea8897c in trx_purge_add_undo_to_history (trx=trx@entry=0x7fc7a0170188, undo=@0x7fc7a0171228: 0x557750a0adc0, mtr=mtr@entry=0x7fc789daef00) at /mariadb/10.5/storage/innobase/trx/trx0purge.cc:219
            #3  0x000055774eaa2fdd in trx_write_serialisation_history (trx=trx@entry=0x7fc7a0170188, mtr=mtr@entry=0x7fc789daef00) at /mariadb/10.5/storage/innobase/trx/trx0trx.cc:1089
            #4  0x000055774eaa3414 in trx_t::commit_low (this=this@entry=0x7fc7a0170188, mtr=0x7fc789daef00) at /mariadb/10.5/storage/innobase/trx/trx0trx.cc:1455
            #5  0x000055774eaa35f3 in trx_t::commit (this=this@entry=0x7fc7a0170188) at /mariadb/10.5/storage/innobase/trx/trx0trx.cc:1502
            

            The FIL_PAGE_LSN of this page at the time the SIGABRT is received is 20379428, and the oldest_modification is 0, that is, any changes had supposedly been written back. But on recovery, we read an older version of the page, with the FIL_PAGE_LSN being only 19400054. This looks like a "write side" error. The page had been wrongly marked as clean here:

            10.5 c68620df48cd1b0e7f5f1cb5ff8b905a5d9c80a5

            #0  buf_page_t::clear_oldest_modification (this=0x7fc79b8082a0, temporary=false) at /mariadb/10.5/storage/innobase/include/buf0buf.h:2250
            #1  0x000055774ea8aa46 in trx_purge_truncate_history () at /mariadb/10.5/storage/innobase/trx/trx0purge.cc:681
            #2  0x000055774ea6ef4b in srv_do_purge (n_total_purged=n_total_purged@entry=0x7fc799ef2bb8) at /mariadb/10.5/storage/innobase/srv/srv0srv.cc:1847
            #3  0x000055774ea6f15f in purge_coordinator_callback_low () at /mariadb/10.5/storage/innobase/srv/srv0srv.cc:1945
            #4  0x000055774ea71706 in purge_coordinator_callback () at /mariadb/10.5/storage/innobase/srv/srv0srv.cc:1974
            

            At this point, the log_sys.flushed_to_disk_lsn as well as log_sys.lsn had already reached 20379464, the value at the time of the SIGABRT. But, we had not yet durably written any log record for truncating the undo tablespace. The error occurs when the log checkpoint is advanced from 19417270 to 20379452 by the buf_flush_page_cleaner() thread a little later. At that point, our undo tablespace truncation thread was busy allocating buffer pool pages:

            10.5 c68620df48cd1b0e7f5f1cb5ff8b905a5d9c80a5

            #11 0x000055774e99f57c in binary_semaphore::wait (this=0x7fc799ef36a0) at /mariadb/10.5/storage/innobase/log/log0sync.cc:126
            #12 0x000055774e99f8f2 in group_commit_lock::acquire (this=this@entry=0x55774f607b00 <flush_lock>, num=num@entry=20379464) at /mariadb/10.5/storage/innobase/log/log0sync.cc:189
            #13 0x000055774e98bf6d in log_write_up_to (lsn=20379464, flush_to_disk=flush_to_disk@entry=true, rotate_key=rotate_key@entry=false) at /mariadb/10.5/storage/innobase/log/log0log.cc:822
            #14 0x000055774e98c14b in log_buffer_flush_to_disk (sync=sync@entry=true) at /mariadb/10.5/storage/innobase/log/log0log.cc:858
            #15 0x000055774e8ea008 in buf_flush_LRU (max_n=32) at /mariadb/10.5/storage/innobase/buf/buf0flu.cc:1742
            #16 0x000055774e8e01d5 in buf_LRU_get_free_block (have_mutex=have_mutex@entry=false) at /mariadb/10.5/storage/innobase/buf/buf0lru.cc:504
            #17 0x000055774e8d4d56 in fsp_page_create (space=space@entry=0x5577509eba60, offset=51, mtr=mtr@entry=0x7fc799ef2680) at /mariadb/10.5/storage/innobase/fsp/fsp0fsp.cc:1080
            #18 0x000055774e8d5344 in fsp_alloc_free_page (space=space@entry=0x5577509eba60, hint=<optimized out>, hint@entry=0, mtr=mtr@entry=0x7fc799ef2680, init_mtr=init_mtr@entry=0x7fc799ef2680) at /mariadb/10.5/storage/innobase/fsp/fsp0fsp.cc:1197
            #19 0x000055774e8d61e4 in fseg_alloc_free_page_low (space=space@entry=0x5577509eba60, seg_inode=seg_inode@entry=0x7fc79bcb6432 "", iblock=0x7fc79b818ec0, hint=hint@entry=0, direction=direction@entry=111 'o', has_done_reservation=has_done_reservation@entry=false, mtr=0x7fc799ef2680, init_mtr=0x7fc799ef2680) at /mariadb/10.5/storage/innobase/fsp/fsp0fsp.cc:2136
            #20 0x000055774e8d6b48 in fseg_create (space=space@entry=0x5577509eba60, byte_offset=byte_offset@entry=62, mtr=mtr@entry=0x7fc799ef2680, has_done_reservation=has_done_reservation@entry=false, block=block@entry=0x0) at /mariadb/10.5/storage/innobase/fsp/fsp0fsp.cc:1759
            #21 0x000055774ea9ccea in trx_rseg_header_create (space=space@entry=0x5577509eba60, rseg_id=rseg_id@entry=97, max_trx_id=37, sys_header=sys_header@entry=0x7fc79b825620, mtr=mtr@entry=0x7fc799ef2680) at /mariadb/10.5/storage/innobase/trx/trx0rseg.cc:321
            #22 0x000055774ea8ad7e in trx_purge_truncate_history () at /mariadb/10.5/storage/innobase/trx/trx0purge.cc:737
            #23 0x000055774ea6ef4b in srv_do_purge (n_total_purged=n_total_purged@entry=0x7fc799ef2bb8) at /mariadb/10.5/storage/innobase/srv/srv0srv.cc:1847
            #24 0x000055774ea6f15f in purge_coordinator_callback_low () at /mariadb/10.5/storage/innobase/srv/srv0srv.cc:1945
            #25 0x000055774ea71706 in purge_coordinator_callback () at /mariadb/10.5/storage/innobase/srv/srv0srv.cc:1974
            

            In summary, the extremely slow buffer pool eviction in 10.5 let us find a recovery bug in innodb_undo_log_truncate=ON. It turns out that the following patch will fix this:

            diff --git a/storage/innobase/trx/trx0purge.cc b/storage/innobase/trx/trx0purge.cc
            index a297ca4921e..28a58c00772 100644
            --- a/storage/innobase/trx/trx0purge.cc
            +++ b/storage/innobase/trx/trx0purge.cc
            @@ -677,10 +677,7 @@ void trx_purge_truncate_history()
                     ut_ad(bpage->io_fix() == BUF_IO_NONE);
             
                     if (bpage->oldest_modification() > 1)
            -        {
            -          bpage->clear_oldest_modification(false);
                       mtr.memo_push(block, MTR_MEMO_PAGE_X_FIX);
            -        }
                     else
                     {
                       rw_lock_x_unlock(&block->lock);
            

            That call is not only harmful but also redundant, because mtr_t::commit_shrink() would execute the following in Shrink::operator()(mtr_memo_slot_t*):

                  if (bpage.oldest_modification() > 1)
                    bpage.clear_oldest_modification(false);
                  slot->type= static_cast<mtr_memo_type_t>(slot->type & ~MTR_MEMO_MODIFY);
            

            However, the above patch is not enough. We would need to adjust the following condition in fsp_page_create():

              if (UNIV_UNLIKELY(space->is_being_truncated))
              {
                // …
                if (block && block->page.oldest_modification() <= 1)
                  block= nullptr;
            

            Before the call to clear_oldest_modification() was removed, this condition would hold and we would end up allocating another block for the same page identifier. This is actually related to running out of the buffer pool! Without the assignment block=nullptr we would hit an assertion failure page.state() != BUF_BLOCK_FILE_PAGE in buf_block_t::initialise() later in fsp_page_create():

                  free_block= block;
                  goto got_free_block;
              }
            // …
            got_free_block:
              block= buf_page_create(space, static_cast<uint32_t>(offset),
                                     space->zip_size(), mtr, free_block);
            

            I think that we must revise fsp_page_create() as well as Shrink::operator()(mtr_memo_slot_t*) so that the x-latched blocks of the mini-transaction will only be marked clean if they were not modified by the mini-transaction. In that way, we should be able to reuse the already x-latched blocks in the buffer pool.
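            The invariant that was violated can be stated as a toy model (ToyPage and change_recoverable are invented names, and the oldest_mod value in the "fixed" case is illustrative, taken to be the commit LSN of transaction 31): a page may be marked clean only if the on-disk copy already contains its latest change; a dirty page is safe because the checkpoint, computed as the minimum oldest_modification over dirty pages, cannot advance past it.

            ```cpp
            // Toy model of the write-side invariant, using LSN values from
            // the rr trace above. Not InnoDB code.
            #include <cassert>
            #include <stdint.h>

            struct ToyPage {
              uint64_t mem_lsn;     // FIL_PAGE_LSN of the buffered copy
              uint64_t disk_lsn;    // FIL_PAGE_LSN of the copy on disk
              uint64_t oldest_mod;  // 0 == clean
            };

            // Recovery redoes the log from the checkpoint onwards and trusts
            // clean pages as-is; a change survives a crash iff one of the two
            // conditions below holds.
            bool change_recoverable(const ToyPage &p, uint64_t checkpoint_lsn) {
              if (p.oldest_mod == 0)
                return p.disk_lsn >= p.mem_lsn;    // clean: must be on disk
              return checkpoint_lsn <= p.oldest_mod; // dirty: redo reapplies it
            }

            int main() {
              // The bug: page 68 marked clean without a write-back, then the
              // checkpoint advanced to 20379452, past the on-disk LSN 19400054.
              ToyPage bad{20379428, 19400054, 0};
              assert(!change_recoverable(bad, 20379452));

              // With the patch the page stays dirty (oldest_mod illustrative),
              // so the checkpoint cannot pass it and redo covers the change.
              ToyPage good{20379428, 19400054, 20365073};
              assert(change_recoverable(good, 19417270));
            }
            ```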

            marko Marko Mäkelä added a comment

            I filed MDEV-32757 for the recovery bug. My current work-in-progress fix of that does not prevent the test from occasionally running out of buffer pool. It might have reduced the failure rate somewhat.

            marko Marko Mäkelä added a comment

            I ported MDEV-26055 to 10.5. It seems to fix this test failure. I do not think it makes sense to port MDEV-26827. It could depend more on MDEV-24142 and other refactoring that is only present in 10.6.

            marko Marko Mäkelä added a comment

            origin/10.5-MDEV-32681 4ffeaec2dfe591d4961b80c95a6863a3100cbba6 2023-11-14T09:13:39+02:00
            performed well in RQG testing.

            mleich Matthias Leich added a comment

            People

              marko Marko Mäkelä
              knielsen Kristian Nielsen