  1. MariaDB Server
  2. MDEV-32811

Potentially broken crash recovery if a mini-transaction frees a page, not modifying previously clean pages

Details

    Description

      The MDEV-14795 test innodb.sys_truncate_debug in 11.2 would occasionally fail in its setup phase, which executes INSERT with a debug setting that limits the number of records per page. The failure looks like this:

      bb-11.2-release 0427c4739e7de5cd52d5e83479acf11ddb033ce6

      innodb.sys_truncate_debug 'innodb'       w1 [ fail ]
              Test ended at 2023-11-14 20:04:26
       
      CURRENT_TEST: innodb.sys_truncate_debug
      mysqltest: At line 20: query 'INSERT INTO t1 SELECT seq, seq, seq FROM seq_1_to_16384' failed: <Unknown> (2013): Lost connection to server during query
      buf/buf0flu.cc:766(buf_page_t::flush(bool, fil_space_t*))[0x1406b5d80]
      buf/buf0flu.cc:1103(buf_flush_try_neighbors(fil_space_t*, page_id_t, buf_page_t*, bool, bool, unsigned long, unsigned long))[0x1406b7cd4]
      buf/buf0flu.cc:1350(buf_flush_LRU_list_batch(unsigned long, bool, flush_counters_t*))[0x1406b9120]
      buf/buf0flu.cc:1390(buf_do_LRU_batch(unsigned long, bool, flush_counters_t*))[0x1406b93d8]
      buf/buf0flu.cc:1734(buf_flush_LRU(unsigned long, bool))[0x1406bae34]
      buf/buf0lru.cc:506(buf_LRU_get_free_block(buf_LRU_get))[0x1406c657c]
      fsp/fsp0fsp.cc:1063(fsp_page_create(fil_space_t*, unsigned int, mtr_t*))[0x140769ae4]
      fsp/fsp0fsp.cc:2230(fseg_alloc_free_page_low(fil_space_t*, unsigned char*, buf_block_t*, unsigned int, unsigned char, bool, mtr_t*, mtr_t*, dberr_t*))[0x14076efe8]
      fsp/fsp0fsp.cc:2284(fseg_alloc_free_page_general(unsigned char*, unsigned int, unsigned char, bool, mtr_t*, mtr_t*, dberr_t*))[0x14076f1a8]
      btr/btr0btr.cc:567(btr_page_alloc(dict_index_t*, unsigned int, unsigned char, unsigned long, mtr_t*, mtr_t*, dberr_t*))[0x140624dd8]
      btr/btr0btr.cc:2833(btr_page_split_and_insert(unsigned long, btr_cur_t*, unsigned short**, mem_block_info_t**, dtuple_t const*, unsigned long, mtr_t*, dberr_t*))[0x1406310b0]
      btr/btr0cur.cc:2534(btr_cur_pessimistic_insert(unsigned long, btr_cur_t*, unsigned short**, mem_block_info_t**, dtuple_t*, unsigned char**, big_rec_t**, unsigned long, que_thr_t*, mtr_t*))[0x140660a5c]
      row/row0ins.cc:3147(row_ins_sec_index_entry_low(unsigned long, btr_latch_mode, dict_index_t*, mem_block_info_t*, mem_block_info_t*, dtuple_t*, unsigned long, que_thr_t*))[0x1404ee318]
      row/row0ins.cc:3332(row_ins_sec_index_entry(dict_index_t*, dtuple_t*, que_thr_t*, bool))[0x1404eebdc]
      row/row0ins.cc:3377(row_ins_index_entry(dict_index_t*, dtuple_t*, que_thr_t*))[0x1404eef04]
      row/row0ins.cc:3543(row_ins_index_entry_step(ins_node_t*, que_thr_t*))[0x1404efa68]
      row/row0ins.cc:3660(row_ins(ins_node_t*, que_thr_t*))[0x1404eff9c]
      row/row0ins.cc:3789(row_ins_step(que_thr_t*))[0x1404f0898]
      row/row0mysql.cc:1314(row_insert_for_mysql(unsigned char const*, row_prebuilt_t*, ins_mode_t))[0x14051c328]
      handler/ha_innodb.cc:7835(ha_innobase::write_row(unsigned char const*))[0x1402daa68]
      sql/handler.cc:7852(handler::ha_write_row(unsigned char const*))[0x13fdb86e0]
      

      thiru was able to reproduce this under rr on his system.

      The reason for this failure is that a single mini-transaction frees, allocates, and then frees the same index page again. The page initialization during the allocation clears the FIL_PAGE_LSN field on the page.

      Before MDEV-27774 in 10.8 refactored some code, there was the function buf_flush_note_modification() that would set the FIL_PAGE_LSN on all blocks that a mini-transaction modified. Starting with 10.8, this logic is within mtr_t::commit() itself, in two copies. The second copy of the logic, in the !m_made_dirty code path, wrongly skips the setting of FIL_PAGE_LSN on blocks that are marked as freed:

                  if (s >= buf_page_t::UNFIXED)
                  {
                    mach_write_to_8(bpage->frame + FIL_PAGE_LSN, m_commit_lsn);
                    if (UNIV_LIKELY_NULL(bpage->zip.data))
                      memcpy_aligned<8>(FIL_PAGE_LSN + bpage->zip.data,
                                        FIL_PAGE_LSN + bpage->frame, 8);
                  }
      

      The if condition must be removed here, so that the write-ahead logging protocol will be followed.
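      The effect of that condition can be illustrated with a small stand-alone model. This is only a sketch, not InnoDB code: ToyPage, the toy_* helpers, the 64-byte page size, and the state values are hypothetical simplifications, and only the FIL_PAGE_LSN offset and the shape of the state check are taken from the snippet above. The skip_freed flag stands in for the presence of the if condition.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Header offset of the page LSN field. The remaining constants are
// simplified stand-ins (assumptions), not the real buf_page_t encoding.
constexpr std::size_t FIL_PAGE_LSN = 16;
constexpr std::size_t TOY_PAGE_SIZE = 64;
constexpr uint32_t TOY_FREED = 3;
constexpr uint32_t TOY_UNFIXED = 1U << 29;

struct ToyPage {
  uint32_t state = TOY_UNFIXED;
  unsigned char frame[TOY_PAGE_SIZE] = {};
};

// Big-endian 8-byte store, in the spirit of mach_write_to_8().
inline void toy_write_8(unsigned char *dst, uint64_t v) {
  for (int i = 7; i >= 0; i--) {
    dst[i] = static_cast<unsigned char>(v & 0xff);
    v >>= 8;
  }
}

// Big-endian 8-byte load, the inverse of toy_write_8().
inline uint64_t toy_read_8(const unsigned char *src) {
  uint64_t v = 0;
  for (int i = 0; i < 8; i++) v = (v << 8) | src[i];
  return v;
}

// Allocating a page reinitializes it, which zeroes the LSN field.
inline void toy_page_init(ToyPage &p) {
  std::memset(p.frame, 0, sizeof p.frame);
  p.state = TOY_UNFIXED;
}

inline void toy_page_free(ToyPage &p) { p.state = TOY_FREED; }

// Stamp the commit LSN on every modified page. With skip_freed set,
// pages below TOY_UNFIXED are skipped, mimicking the reported condition.
inline void toy_commit(const std::vector<ToyPage*> &modified,
                       uint64_t commit_lsn, bool skip_freed) {
  for (ToyPage *p : modified) {
    if (skip_freed && p->state < TOY_UNFIXED) continue;
    toy_write_8(p->frame + FIL_PAGE_LSN, commit_lsn);
  }
}

// One mini-transaction that frees, allocates, and frees the same page,
// then commits at LSN 1000. Returns the LSN left on the page.
inline uint64_t lsn_after_free_alloc_free(bool skip_freed) {
  ToyPage page;
  toy_write_8(page.frame + FIL_PAGE_LSN, 500);  // LSN from an earlier change
  toy_page_free(page);
  toy_page_init(page);  // clears FIL_PAGE_LSN
  toy_page_free(page);
  std::vector<ToyPage*> modified{&page};
  toy_commit(modified, 1000, skip_freed);
  return toy_read_8(page.frame + FIL_PAGE_LSN);
}
```

      In this model, lsn_after_free_alloc_free(true) leaves the cleared value 0 on the page, while lsn_after_free_alloc_free(false) leaves the commit LSN 1000, which corresponds to stamping every modified block unconditionally.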

      As far as I can tell, the user impact of this on non-debug builds is that the log checkpoint may be wrongly advanced too early, similar to MDEV-32757. This could cause crash recovery and backup to produce corrupted data.

      The bug was introduced in this merge of MDEV-29383 and many other changes.

          Activity

            marko Marko Mäkelä added a comment - thiru, please push the if removal to 10.11. In the commit message, mention the 11.2 test case that caught it. Because the failure was sporadic (depending on when the buf_flush_page_cleaner() thread is scheduled), I do not think that it makes sense to try to write a regression test for this.

            marko Marko Mäkelä added a comment - The debug assertion failure that this test triggers may occur if the same mini-transaction frees, allocates, and frees a page. However, I think that invalid log checkpoint advances are possible whenever a page is freed just once.

            People

              thiru Thirunarayanan Balathandayuthapani
              marko Marko Mäkelä