MariaDB Server: MDEV-26200

buf_pool.flush_list corruption in buffer pool resizing or with ROW_FORMAT=COMPRESSED




      MDEV-25113 introduced a race condition that we attempted to fix in MDEV-26010. It turns out that this fix was not sufficient. The test innodb_zip.wl5522_debug_zip as well as the buffer pool resizing tests would still occasionally fail in debug builds due to corruption of buf_pool.flush_list. The symptom is that buf_pool.flush_list.count disagrees with the number of elements actually linked in the list.

      The race condition might be unobservable on single-socket IA-32 and AMD64 setups. I observed it on a dual Intel® Xeon® E5-2630. Adding more calls to buf_flush_validate_low() would seem to reduce the probability of failure.
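      To illustrate the kind of check that fails here, the following is a minimal sketch of a list that caches its length and a validator that walks the chain and compares the two. The names Node, List, push_front and validate are hypothetical stand-ins chosen for this example; this is not the actual ut0lst.h or buf_flush_validate_low() code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for an intrusive doubly linked list with a
// cached element count. Not the real InnoDB ut_list_base/ut_list_node.
struct Node {
  Node *prev = nullptr;
  Node *next = nullptr;
};

struct List {
  Node *head = nullptr;
  Node *tail = nullptr;
  std::size_t count = 0;  // cached length, updated on every insert
};

// Insert a node at the head of the list and bump the cached count.
void push_front(List &list, Node &n) {
  assert(n.prev == nullptr && n.next == nullptr);  // must not be linked yet
  n.next = list.head;
  if (list.head != nullptr) {
    list.head->prev = &n;
  } else {
    list.tail = &n;
  }
  list.head = &n;
  ++list.count;
}

// Walk the chain from head to tail. Returns true only if the back-links
// are consistent and the walked length equals the cached count.
bool validate(const List &list) {
  std::size_t walked = 0;
  const Node *prev = nullptr;
  for (const Node *p = list.head; p != nullptr; prev = p, p = p->next) {
    if (p->prev != prev) {
      return false;  // broken back-link
    }
    if (++walked > list.count) {
      return false;  // chain longer than cached count (also stops on cycles)
    }
  }
  return walked == list.count && prev == list.tail;
}
```

      A writer that unlinks or relinks a node without the protection that readers and other writers rely on can leave the chain and the cached count out of step, which is the condition that such a validator reports.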

      The safe procedure for relocating a block in buf_pool.flush_list seems to be the following:

      1. Acquire buf_pool.mutex.
      2. Acquire the exclusive buf_pool.page_hash.latch.
      3. Acquire buf_pool.flush_list_mutex.
      4. Copy the block descriptor.
      5. Invoke buf_flush_relocate_on_flush_list().
      6. Release buf_pool.flush_list_mutex.

      In this way, the relocated block descriptor should be guaranteed to be in a consistent state. At least the test innodb_zip.wl5522_debug_zip,16k no longer triggered the debug assertion on my system.
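      The lock ordering in the steps above can be sketched as follows. This is only an illustration under stated assumptions: Pool, PageDesc and relocate are hypothetical names, the standard library mutex types stand in for InnoDB's own latches, and a std::list of pointers stands in for the intrusive flush list. The source lists only the release of buf_pool.flush_list_mutex; releasing the other two locks afterwards in reverse order is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <list>
#include <mutex>
#include <shared_mutex>

// Hypothetical block descriptor; the real buf_page_t has many more fields.
struct PageDesc {
  std::uint32_t id;
  std::uint64_t oldest_modification;
};

// Hypothetical stand-in for the buffer pool and its three locks.
struct Pool {
  std::mutex mutex;                   // stands in for buf_pool.mutex
  std::shared_mutex page_hash_latch;  // stands in for buf_pool.page_hash.latch
  std::mutex flush_list_mutex;        // stands in for buf_pool.flush_list_mutex
  std::list<PageDesc *> flush_list;   // stands in for buf_pool.flush_list
};

// Relocate src to dst in the flush list, following the six steps.
// Returns true if src was found in the flush list and replaced by dst.
bool relocate(Pool &pool, PageDesc *src, PageDesc *dst) {
  assert(src != dst);
  std::lock_guard<std::mutex> pool_guard(pool.mutex);                    // step 1
  std::unique_lock<std::shared_mutex> hash_guard(pool.page_hash_latch);  // step 2 (exclusive)
  bool found = false;
  {
    std::lock_guard<std::mutex> flush_guard(pool.flush_list_mutex);      // step 3
    *dst = *src;                                                         // step 4: copy the descriptor
    auto it = std::find(pool.flush_list.begin(), pool.flush_list.end(), src);
    if (it != pool.flush_list.end()) {
      *it = dst;                                                         // step 5: swap the list entry
      found = true;
    }
  }                                                                      // step 6: flush_list_mutex released
  return found;
  // Assumption: the page hash latch and the pool mutex are released here,
  // in reverse order of acquisition, when the guards go out of scope.
}
```

      The point of the sketch is that the descriptor copy happens while all three locks are held, so no other thread can observe or modify the descriptor between the copy and the list update.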

      For the record, the debug assertion looks like this:

      10.6 61fcbed920c0ed1373725c4122af5a483ae7ffb2

      innodb_zip.wl5522_debug_zip '16k,innodb' w26 [ fail ]
              Test ended at 2021-07-21 14:08:52
      CURRENT_TEST: innodb_zip.wl5522_debug_zip
      mysqltest: At line 320: query 'UPDATE t1 SET c2 = c2 + c1' failed: <Unknown> (2013): Lost connection to server during query
      #5  0x000055a7aed68614 in ut_dbg_assertion_failed (expr=0x55a7af241007 "count == list.count", file=<optimized out>, line=<optimized out>, line@entry=467) at /mariadb/10.6/storage/innobase/ut/ut0dbg.cc:60
      #6  0x000055a7aedeba16 in ut_list_map<ut_list_base<buf_page_t, ut_list_node<buf_page_t> buf_page_t::*>, Check> (list=<optimized out>, functor=@0x7fc07cfef930: {<No data fields>}) at /mariadb/10.6/storage/innobase/include/ut0lst.h:467
      #7  ut_list_validate<ut_list_base<buf_page_t, ut_list_node<buf_page_t> buf_page_t::*>, Check> (list=<optimized out>, functor=@0x7fc07cfef930: {<No data fields>}) at /mariadb/10.6/storage/innobase/include/ut0lst.h:496
      #8  buf_flush_validate_low () at /mariadb/10.6/storage/innobase/buf/buf0flu.cc:2475
      #9  0x000055a7aedeab03 in buf_flush_validate_skip () at /mariadb/10.6/storage/innobase/buf/buf0flu.cc:124
      #10 buf_pool_t::insert_into_flush_list (this=0x55a7af831280 <buf_pool>, block=0x7fc07d36f578, lsn=158319367) at /mariadb/10.6/storage/innobase/buf/buf0flu.cc:204
      #11 0x000055a7aec166ad in buf_flush_note_modification (block=0x7fc07d36f578, start_lsn=158319367, end_lsn=158323907) at /mariadb/10.6/storage/innobase/include/buf0flu.ic:62
      #12 ReleaseBlocks::operator() (this=<optimized out>, this@entry=0x7fc07cfefa30, slot=slot@entry=0x7fc07cff01a8) at /mariadb/10.6/storage/innobase/mtr/mtr0mtr.cc:348
      #13 0x000055a7aec13481 in CIterate<ReleaseBlocks const>::operator() (this=0x7fc07cfefa30, block=<optimized out>) at /mariadb/10.6/storage/innobase/mtr/mtr0mtr.cc:61
      #14 mtr_buf_t::for_each_block_in_reverse<CIterate<ReleaseBlocks const> > (this=<optimized out>, this@entry=0x7fc07cff0160, functor=@0x7fc07cfefa30: {functor = {start = 158319367, end = 158323907, memo = @0x7fc07cff0160}}) at /mariadb/10.6/storage/innobase/include/dyn0buf.h:379
      #15 0x000055a7aec0ff3c in mtr_t::commit (this=<optimized out>) at /mariadb/10.6/storage/innobase/mtr/mtr0mtr.cc:444


              marko Marko Mäkelä