MariaDB Server / MDEV-37447

Race condition between shrinking innodb_buffer_pool_size and buf_pool_t::page_guess()

Details

    • Can result in hang or crash
    • There was a very small chance of InnoDB crashing or misbehaving after an attempt to reduce innodb_buffer_pool_size.

    Description

      An assertion failure was found on a debug build. Executing

      SET GLOBAL innodb_buffer_pool_size = 8388608;

      concurrently with the purge of committed transaction history leads to:

      GIT_SHOW: HEAD -> 10.11, origin/bb-10.11-MDEV-26115, origin/10.11 852e4510fa662c571a42f550278d4abd09e3c5cf 2025-07-23T09:34:47+07:00

      # 2025-08-14T16:21:43 [2316517] | mariadbd: /data/Server/10.11_new/storage/innobase/include/buf0buf.h:1528: buf_page_t* buf_pool_t::LRU_remove(buf_page_t*): Assertion `bpage->in_LRU_list' failed.
      

      Stack trace:

      Thread 3 received signal SIGABRT, Aborted.
      [Switching to Thread 2332164.2480081]
      __pthread_kill_implementation (no_tid=0, signo=6, threadid=80655625631296) at ./nptl/pthread_kill.c:44
      44      ./nptl/pthread_kill.c: No such file or directory.
      (rr) set print addr off
      (rr) bt
      #0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=80655625631296) at ./nptl/pthread_kill.c:44
      #1  __pthread_kill_internal (signo=6, threadid=80655625631296) at ./nptl/pthread_kill.c:78
      #2  __GI___pthread_kill (threadid=80655625631296, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
      #3  __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
      #4  __GI_abort () at ./stdlib/abort.c:79
      #5  __assert_fail_base (fmt="%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion="bpage->in_LRU_list", file="/data/Server/10.11_new/storage/innobase/include/buf0buf.h", line=1528,
          function=<optimized out>) at ./assert/assert.c:92
      #6  __GI___assert_fail (assertion="bpage->in_LRU_list", file="/data/Server/10.11_new/storage/innobase/include/buf0buf.h", line=1528,
          function="buf_page_t* buf_pool_t::LRU_remove(buf_page_t*)") at ./assert/assert.c:101
      #7  buf_pool_t::LRU_remove (this=this@entry=<buf_pool>, bpage=bpage@entry=) at /data/Server/10.11_new/storage/innobase/include/buf0buf.h:1528
      #8  buf_LRU_remove_block (bpage=<optimized out>) at /data/Server/10.11_new/storage/innobase/buf/buf0lru.cc:554
      #9  buf_page_make_young (bpage=bpage@entry=) at /data/Server/10.11_new/storage/innobase/buf/buf0lru.cc:711
      #10 buf_page_make_young_if_needed (bpage=bpage@entry=) at /data/Server/10.11_new/storage/innobase/buf/buf0lru.cc:721
      #11 btr_cur_t::search_leaf (this=this@entry=, tuple=tuple@entry=, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>, latch_mode@entry=BTR_MODIFY_LEAF, mtr=mtr@entry=)
          at /data/Server/10.11_new/storage/innobase/btr/btr0cur.cc:1346
      #12 btr_pcur_open (mtr=, cursor=, latch_mode=BTR_MODIFY_LEAF, mode=PAGE_CUR_LE, tuple=) at /data/Server/10.11_new/storage/innobase/include/btr0pcur.h:430
      #13 row_search_on_row_ref (pcur=pcur@entry=, mode=mode@entry=BTR_MODIFY_LEAF, table=<optimized out>, ref=, mtr=mtr@entry=)
          at /data/Server/10.11_new/storage/innobase/row/row0row.cc:1227
      #14 row_purge_reposition_pcur (mode=mode@entry=BTR_MODIFY_LEAF, node=node@entry=, mtr=mtr@entry=) at /data/Server/10.11_new/storage/innobase/row/row0purge.cc:83
      #15 row_purge_reset_trx_id (node=node@entry=, mtr=mtr@entry=) at /data/Server/10.11_new/storage/innobase/row/row0purge.cc:1092
      #16 row_purge_record_func (node=node@entry=, undo_rec=undo_rec@entry="\001\036\v", thr=thr@entry=, updated_extern=<optimized out>)
          at /data/Server/10.11_new/storage/innobase/row/row0purge.cc:1582
      #17 row_purge (node=node@entry=, undo_rec=undo_rec@entry="\001\036\v", thr=thr@entry=) at /data/Server/10.11_new/storage/innobase/row/row0purge.cc:1626
      #18 row_purge_step (thr=thr@entry=) at /data/Server/10.11_new/storage/innobase/row/row0purge.cc:1689
      #19 que_thr_step (thr=thr@entry=) at /data/Server/10.11_new/storage/innobase/que/que0que.cc:553
      #20 que_run_threads_low (thr=thr@entry=) at /data/Server/10.11_new/storage/innobase/que/que0que.cc:609
      #21 que_run_threads (thr=thr@entry=) at /data/Server/10.11_new/storage/innobase/que/que0que.cc:629
      #22 srv_task_execute () at /data/Server/10.11_new/storage/innobase/srv/srv0srv.cc:1440
      #23 srv_purge_worker_task_low () at /data/Server/10.11_new/storage/innobase/srv/srv0srv.cc:1571
      #24 purge_worker_callback () at /data/Server/10.11_new/storage/innobase/srv/srv0srv.cc:1582
      #25 tpool::task_group::execute (this=<purge_task_group>, t=t@entry=<purge_worker_task>) at /data/Server/10.11_new/tpool/task_group.cc:70
      #26 tpool::task::execute (this=<purge_worker_task>) at /data/Server/10.11_new/tpool/task.cc:32
      #27 tpool::thread_pool_generic::worker_main (this=, thread_var=) at /data/Server/10.11_new/tpool/tpool_generic.cc:566
      #28 std::__invoke_impl<void, void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> (__t=<optimized out>, __f=<optimized out>)
          at /usr/include/c++/11/bits/invoke.h:74
      #29 std::__invoke<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> (__fn=<optimized out>)
          at /usr/include/c++/11/bits/invoke.h:96
      #30 std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> >::_M_invoke<0ul, 1ul, 2ul> (
          this=<optimized out>) at /usr/include/c++/11/bits/std_thread.h:259
      #31 std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> >::operator() (this=<optimized out>)
          at /usr/include/c++/11/bits/std_thread.h:266
      #32 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> > >::_M_run (this=<optimized out>) at /usr/include/c++/11/bits/std_thread.h:211
      #33 ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
      #34 start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
      #35 clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100
      

      An rr trace is available on pluto:
      /data/results/1755170219/TBR-1221
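
      The failing assertion is consistent with a thread acting on a cached page pointer (a "guess") after a concurrent shrink has taken that page off the LRU list. The sketch below is a toy model of that pattern only, not InnoDB code and not the actual fix: ToyPage, ToyPool, shrink() and make_young_if_valid() are hypothetical names invented for illustration. It assumes that re-checking the guess and moving the page happen under one mutex acquisition, so that a shrink cannot slip in between the check and the list update.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <mutex>
#include <vector>

// Toy model only: these are NOT InnoDB's real data structures.
struct ToyPage {
  bool in_lru = false;  // plays the role of bpage->in_LRU_list
};

class ToyPool {
public:
  explicit ToyPool(std::size_t n) : pages_(n), usable_(n) {
    for (auto &p : pages_) {
      p.in_lru = true;
      lru_.push_back(&p);
    }
  }

  // Shrink the pool: every page at index >= keep leaves the LRU list
  // and is no longer considered usable.
  void shrink(std::size_t keep) {
    std::lock_guard<std::mutex> g(mutex_);
    for (std::size_t i = keep; i < usable_; i++) {
      if (pages_[i].in_lru) {
        lru_.remove(&pages_[i]);
        pages_[i].in_lru = false;
      }
    }
    if (keep < usable_) usable_ = keep;
  }

  // Re-validate a cached pointer and, only if it is still a usable page
  // on the LRU list, move it to the front. Check and update share one
  // lock acquisition, so no shrink can run between them.
  bool make_young_if_valid(ToyPage *guess) {
    std::lock_guard<std::mutex> g(mutex_);
    std::less<ToyPage *> lt;  // well-defined ordering for any pointers
    ToyPage *begin = pages_.data();
    ToyPage *end = begin + usable_;
    if (lt(guess, begin) || !lt(guess, end) || !guess->in_lru)
      return false;  // stale guess: the caller must look the page up again
    lru_.remove(guess);
    lru_.push_front(guess);
    return true;
  }

  ToyPage *page(std::size_t i) { return &pages_[i]; }

  std::size_t lru_size() {
    std::lock_guard<std::mutex> g(mutex_);
    return lru_.size();
  }

private:
  std::vector<ToyPage> pages_;
  std::list<ToyPage *> lru_;
  std::size_t usable_;
  std::mutex mutex_;
};
```

      In this model, a caller that remembered pool.page(3) before pool.shrink(2) gets false from make_young_if_valid() afterwards, instead of tripping an in-list assertion on a page that has been withdrawn.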

            People

              marko Marko Mäkelä
              saahil Saahil Alam