[MDEV-32811] Potentially broken crash recovery if a mini-transaction frees a page, not modifying previously clean pages - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Critical
Resolution: Fixed
Affects Version/s: 10.11.2, 11.0.1, 10.8.7, 10.9.5, 10.10.3, 10.8(EOL), 10.9(EOL), 10.10(EOL), 10.11, 11.0(EOL), 11.1(EOL), 11.2(EOL), 11.3(EOL)
Fix Version/s: 10.11.7, 11.0.5, 11.1.4, 11.2.3
Component/s: Backup, Storage Engine - InnoDB
Labels:

Description

The ~~MDEV-14795~~ test innodb.sys_truncate_debug in 11.2 would occasionally fail like this in the setup phase that is executing INSERT and using a debug setting to limit the number of records per page:

bb-11.2-release 0427c4739e7de5cd52d5e83479acf11ddb033ce6
innodb.sys_truncate_debug 'innodb' w1 [ fail ]
Test ended at 2023-11-14 20:04:26

CURRENT_TEST: innodb.sys_truncate_debug
mysqltest: At line 20: query 'INSERT INTO t1 SELECT seq, seq, seq FROM seq_1_to_16384' failed: <Unknown> (2013): Lost connection to server during query
…
buf/buf0flu.cc:766(buf_page_t::flush(bool, fil_space_t*))[0x1406b5d80]
buf/buf0flu.cc:1103(buf_flush_try_neighbors(fil_space_t, page_id_t, buf_page_t, bool, bool, unsigned long, unsigned long))[0x1406b7cd4]
buf/buf0flu.cc:1350(buf_flush_LRU_list_batch(unsigned long, bool, flush_counters_t*))[0x1406b9120]
buf/buf0flu.cc:1390(buf_do_LRU_batch(unsigned long, bool, flush_counters_t*))[0x1406b93d8]
buf/buf0flu.cc:1734(buf_flush_LRU(unsigned long, bool))[0x1406bae34]
buf/buf0lru.cc:506(buf_LRU_get_free_block(buf_LRU_get))[0x1406c657c]
fsp/fsp0fsp.cc:1063(fsp_page_create(fil_space_t, unsigned int, mtr_t))[0x140769ae4]
fsp/fsp0fsp.cc:2230(fseg_alloc_free_page_low(fil_space_t, unsigned char, buf_block_t, unsigned int, unsigned char, bool, mtr_t, mtr_t, dberr_t))[0x14076efe8]
fsp/fsp0fsp.cc:2284(fseg_alloc_free_page_general(unsigned char, unsigned int, unsigned char, bool, mtr_t, mtr_t, dberr_t))[0x14076f1a8]
btr/btr0btr.cc:567(btr_page_alloc(dict_index_t, unsigned int, unsigned char, unsigned long, mtr_t, mtr_t, dberr_t))[0x140624dd8]
btr/btr0btr.cc:2833(btr_page_split_and_insert(unsigned long, btr_cur_t, unsigned short, mem_block_info_t, dtuple_t const, unsigned long, mtr_t, dberr_t))[0x1406310b0]
btr/btr0cur.cc:2534(btr_cur_pessimistic_insert(unsigned long, btr_cur_t, unsigned short, mem_block_info_t, dtuple_t, unsigned char, big_rec_t, unsigned long, que_thr_t, mtr_t))[0x140660a5c]
row/row0ins.cc:3147(row_ins_sec_index_entry_low(unsigned long, btr_latch_mode, dict_index_t, mem_block_info_t, mem_block_info_t, dtuple_t, unsigned long, que_thr_t*))[0x1404ee318]
row/row0ins.cc:3332(row_ins_sec_index_entry(dict_index_t, dtuple_t, que_thr_t*, bool))[0x1404eebdc]
row/row0ins.cc:3377(row_ins_index_entry(dict_index_t, dtuple_t, que_thr_t*))[0x1404eef04]
row/row0ins.cc:3543(row_ins_index_entry_step(ins_node_t, que_thr_t))[0x1404efa68]
row/row0ins.cc:3660(row_ins(ins_node_t, que_thr_t))[0x1404eff9c]
row/row0ins.cc:3789(row_ins_step(que_thr_t*))[0x1404f0898]
row/row0mysql.cc:1314(row_insert_for_mysql(unsigned char const, row_prebuilt_t, ins_mode_t))[0x14051c328]
handler/ha_innodb.cc:7835(ha_innobase::write_row(unsigned char const*))[0x1402daa68]
sql/handler.cc:7852(handler::ha_write_row(unsigned char const*))[0x13fdb86e0]

thiru was able to reproduce this under rr on his system.

The reason for this failure is that the same mini-transaction is freeing, allocating, and freeing the same index page. The page initialization would clear the FIL_PAGE_LSN field on the page.

Before ~~MDEV-27774~~ in 10.8 refactored some code, there was the function buf_flush_note_modification() that would set the FIL_PAGE_LSN on all blocks that a mini-transaction modified. Starting with 10.8, this logic is within mtr_t::commit() itself, in two copies. The second copy of the logic, in the !m_made_dirty code path, wrongly skips the setting of FIL_PAGE_LSN on blocks that are marked as freed:

            if (s >= buf_page_t::UNFIXED)

              mach_write_to_8(bpage->frame + FIL_PAGE_LSN, m_commit_lsn);

              if (UNIV_LIKELY_NULL(bpage->zip.data))

                memcpy_aligned<8>(FIL_PAGE_LSN + bpage->zip.data,

                                  FIL_PAGE_LSN + bpage->frame, 8);

The if condition must be removed here, so that the write-ahead logging protocol will be followed.

As far as I can tell, the user impact of this on non-debug builds is that the log checkpoint may be wrongly advanced too early, similar to ~~MDEV-32757~~. This could cause crash recovery and backup to produce corrupted data.

The bug was introduced in this merge of ~~MDEV-29383~~ and many other changes.

Attachments

Issue Links

relates to

MDEV-32817 在最近将版本升级到10.11.5后，针对表进行频繁的读写操作不久后，出现index for table xxxx is corrupt，随后此表tablespace xxxxxx corrupted，最后Tablespace is missing for a table，此表已完全不可用

Closed

Activity

Ascending order - Click to sort in descending order

Marko Mäkelä added a comment - 2023-11-15 13:58

thiru, please push the if removal to 10.11. In the commit message, mention the 11.2 test case that caught it. Because the failure was sporadic (depending on when the buf_flush_page_cleaner() thread is being scheduled, I do not think that it makes sense to try to write a regression test for this.

Marko Mäkelä added a comment - 2023-11-15 13:58 thiru , please push the if removal to 10.11. In the commit message, mention the 11.2 test case that caught it. Because the failure was sporadic (depending on when the buf_flush_page_cleaner() thread is being scheduled, I do not think that it makes sense to try to write a regression test for this.

Marko Mäkelä added a comment - 2023-11-15 18:20

The debug assertion failure that this test triggers may occur if the same mini-transaction frees, allocates, and frees a page. But, I think that the problem of invalid log checkpoint advances is possible whenever a page is freed just once.

Marko Mäkelä added a comment - 2023-11-15 18:20 The debug assertion failure that this test triggers may occur if the same mini-transaction frees, allocates, and frees a page. But, I think that the problem of invalid log checkpoint advances is possible whenever a page is freed just once.

People

Assignee:: Thirunarayanan Balathandayuthapani

Reporter:: Marko Mäkelä

Votes:: 0 Vote for this issue

Watchers:: 1 Start watching this issue

Dates

Created:: 2023-11-15 13:53

Updated:: 2023-11-16 12:17

Resolved:: 2023-11-16 12:16

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server