[MDEV-32163] Crash recovery fails after DROP TABLE in system tablespace - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Critical
Resolution: Fixed
Affects Version/s: N/A
Fix Version/s: 10.5.23, 10.6.16, 10.10.7, 10.11.6, 11.0.4, 11.1.3
Component/s: Backup, Storage Engine - InnoDB, Tests
Labels:
None

Description

Fix for ~~MDEV-32028~~ cherry-picked into preview-11.3-preview breaks mariabackup.sys_truncate test sporadically.

There are 5 InnoDB commits in 11.3 preview, after the first one the test already starts to fail.

Attachments

Issue Links

relates to

MDEV-28699 Shrink temporary tablespaces without restart

Closed

MDEV-32028 InnoDB scrubbing doesn't write zero while freeing the extent

Closed

Activity

Ascending order - Click to sort in descending order

Marko Mäkelä added a comment - 2023-09-14 11:30

Using the mtr test case that thiru provided, I was able to produce an rr replay trace of this. I got an OPT_PAGE_CHECKSUM failure (~~MDEV-18976~~); these records are something that is being written in debug builds. The failure occurred for the following OPT_PAGE_CHECKSUM record:

10.6 0f9acce3f2fd8d8f6334e10b08f9b5a031d40fdf
(rr) p *this
$2 = {<log_rec_t> = {next = 0x7f1871570dc8, lsn = 8448794}, start_lsn = 8448769}

In that page, we had FIL_PAGE_LSN of 6281382, which is too old. The recovery had started like this:

10.6 0f9acce3f2fd8d8f6334e10b08f9b5a031d40fdf
2023-09-14 11:08:03 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=8415203,9398815
2023-09-14 11:08:03 0 [Note] InnoDB: To recover: 260 pages
2023-09-14 11:08:04 0 [ERROR] InnoDB: OPT_PAGE_CHECKSUM mismatch on [page id: space=0, page number=494]

I found the corresponding mini-transaction in the rr replay trace of the server that was intentionally killed by the test. The page was being marked as freed by the following:

10.6 0f9acce3f2fd8d8f6334e10b08f9b5a031d40fdf
#3 0x00005598517b44a2 in buf_page_free (space=space@entry=0x5598545749f8, page=page@entry=494, mtr=mtr@entry=0x7e813d744a70) at /mariadb/10.6m/storage/innobase/buf/buf0buf.cc:2157
#4 0x00005598517eb685 in fseg_free_extent (seg_inode=seg_inode@entry=0x19806e8cf872 "", iblock=0x19806e4535f0, space=space@entry=0x5598545749f8, page=448, mtr=mtr@entry=0x7e813d744a70, ahi=ahi@entry=false)
at /mariadb/10.6m/storage/innobase/fsp/fsp0fsp.cc:2767
#5 0x00005598517ed72c in fseg_free_step (header=0x19806e90c04a "", mtr=mtr@entry=0x7e813d744a70, ahi=ahi@entry=false) at /mariadb/10.6m/storage/innobase/fsp/fsp0fsp.cc:2831
#6 0x00005598517f450c in btr_free_but_not_root (block=block@entry=0x19806e4542f0, log_mode=MTR_LOG_ALL, ahi=ahi@entry=false) at /mariadb/10.6m/storage/innobase/btr/btr0btr.cc:1142
#7 0x00005598517f6fb2 in btr_free_if_exists (space=space@entry=0x5598545749f8, page=page@entry=313, index_id=<optimized out>, mtr=mtr@entry=0x7e813d745150)
at /mariadb/10.6m/storage/innobase/btr/btr0btr.cc:1229
#8 0x0000559851870d44 in dict_drop_index_tree (pcur=pcur@entry=0x5598545d15e0, trx=trx@entry=0x0, mtr=mtr@entry=0x7e813d745150) at /mariadb/10.6m/storage/innobase/dict/dict0crea.cc:905
#9 0x000055985198a455 in row_purge_remove_clust_if_poss_low (node=node@entry=0x5598545d1540, mode=mode@entry=BTR_MODIFY_LEAF) at /mariadb/10.6m/storage/innobase/row/row0purge.cc:166
#10 0x000055985198aea9 in row_purge_remove_clust_if_poss (node=node@entry=0x5598545d1540) at /mariadb/10.6m/storage/innobase/row/row0purge.cc:259
#11 0x000055985198b043 in row_purge_del_mark (node=node@entry=0x5598545d1540) at /mariadb/10.6m/storage/innobase/row/row0purge.cc:635
#12 0x000055985198b799 in row_purge_record_func (node=node@entry=0x5598545d1540, undo_rec=undo_rec@entry=0x5c8478084070 "", thr=thr@entry=0x5598545d14a0, updated_extern=false)
at /mariadb/10.6m/storage/innobase/row/row0purge.cc:1215
#13 0x000055985198b90e in row_purge (node=node@entry=0x5598545d1540, undo_rec=undo_rec@entry=0x5c8478084070 "", thr=thr@entry=0x5598545d14a0) at /mariadb/10.6m/storage/innobase/row/row0purge.cc:1276

This is the purge of delete-marking a SYS_INDEXES record corresponding to DROP TABLE t1 in the system tablespace.

This mini-transaction (at LSN=9388592) failed to write any FREE_PAGE record. At mtr_t::commit(), m_memo would comprise 69 latches, and 80 bytes of log would have been written: WRITE, MEMSET and MEMMOVE records and two OPT_PAGE_CHECKSUM for other pages than this freed page.

The oldest_modification of the page was 8407829 (which is before the recovery checkpoint LSN) at the time when it was cleared by the page cleaner.

I think that the problem is that there was no FREE_PAGE record written for this page 494 (0x1ee). We do have it for some neighbours, in mlog_init at the time the recovery catches the error:

…

[{m_id = 446}] = {lsn = 9388941, created = false},

[{m_id = 447}] = {lsn = 9388900, created = false},

[{m_id = 512}] = {lsn = 9390540, created = false},

[{m_id = 513}] = {lsn = 9392084, created = false},

…

The test case that I used is as follows:

--source include/have_innodb.inc

--source include/have_sequence.inc

SET GLOBAL INNODB_FILE_PER_TABLE= 0;

CREATE TABLE t1(f1 INT NOT NULL, f2 INT NOT NULL,

                f3 INT NOT NULL, INDEX(f1),

                INDEX(f2), INDEX(f3))ENGINE=InnoDB;

INSERT INTO t1 SELECT seq, seq, seq FROM seq_1_to_16384;

INSERT INTO t1 SELECT seq, seq, seq FROM seq_1_to_16384;

INSERT INTO t1 SELECT seq, seq, seq FROM seq_1_to_16384;

DROP TABLE t1;

SET GLOBAL innodb_max_purge_lag_wait=0;

let $shutdown_timeout=0;

--source include/restart_mysqld.inc

This would seem to fail reliably when not using rr. I produced my rr trace as follows:

while ./mtr --rr=-h --parallel=auto name_of_test{,,,,,,,,,,,,,,,,}; do :; done

Marko Mäkelä added a comment - 2023-09-14 11:30 Using the mtr test case that thiru provided, I was able to produce an rr replay trace of this. I got an OPT_PAGE_CHECKSUM failure ( MDEV-18976 ); these records are something that is being written in debug builds. The failure occurred for the following OPT_PAGE_CHECKSUM record: 10.6 0f9acce3f2fd8d8f6334e10b08f9b5a031d40fdf (rr) p *this $2 = {<log_rec_t> = {next = 0x7f1871570dc8, lsn = 8448794}, start_lsn = 8448769} In that page, we had FIL_PAGE_LSN of 6281382, which is too old. The recovery had started like this: 10.6 0f9acce3f2fd8d8f6334e10b08f9b5a031d40fdf 2023-09-14 11:08:03 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=8415203,9398815 2023-09-14 11:08:03 0 [Note] InnoDB: To recover: 260 pages 2023-09-14 11:08:04 0 [ERROR] InnoDB: OPT_PAGE_CHECKSUM mismatch on [page id: space=0, page number=494] I found the corresponding mini-transaction in the rr replay trace of the server that was intentionally killed by the test. The page was being marked as freed by the following: 10.6 0f9acce3f2fd8d8f6334e10b08f9b5a031d40fdf #3 0x00005598517b44a2 in buf_page_free (space=space@entry=0x5598545749f8, page=page@entry=494, mtr=mtr@entry=0x7e813d744a70) at /mariadb/10.6m/storage/innobase/buf/buf0buf.cc:2157 #4 0x00005598517eb685 in fseg_free_extent (seg_inode=seg_inode@entry=0x19806e8cf872 "", iblock=0x19806e4535f0, space=space@entry=0x5598545749f8, page=448, mtr=mtr@entry=0x7e813d744a70, ahi=ahi@entry=false) at /mariadb/10.6m/storage/innobase/fsp/fsp0fsp.cc:2767 #5 0x00005598517ed72c in fseg_free_step (header=0x19806e90c04a "", mtr=mtr@entry=0x7e813d744a70, ahi=ahi@entry=false) at /mariadb/10.6m/storage/innobase/fsp/fsp0fsp.cc:2831 #6 0x00005598517f450c in btr_free_but_not_root (block=block@entry=0x19806e4542f0, log_mode=MTR_LOG_ALL, ahi=ahi@entry=false) at /mariadb/10.6m/storage/innobase/btr/btr0btr.cc:1142 #7 0x00005598517f6fb2 in btr_free_if_exists (space=space@entry=0x5598545749f8, page=page@entry=313, index_id=<optimized out>, mtr=mtr@entry=0x7e813d745150) at /mariadb/10.6m/storage/innobase/btr/btr0btr.cc:1229 #8 0x0000559851870d44 in dict_drop_index_tree (pcur=pcur@entry=0x5598545d15e0, trx=trx@entry=0x0, mtr=mtr@entry=0x7e813d745150) at /mariadb/10.6m/storage/innobase/dict/dict0crea.cc:905 #9 0x000055985198a455 in row_purge_remove_clust_if_poss_low (node=node@entry=0x5598545d1540, mode=mode@entry=BTR_MODIFY_LEAF) at /mariadb/10.6m/storage/innobase/row/row0purge.cc:166 #10 0x000055985198aea9 in row_purge_remove_clust_if_poss (node=node@entry=0x5598545d1540) at /mariadb/10.6m/storage/innobase/row/row0purge.cc:259 #11 0x000055985198b043 in row_purge_del_mark (node=node@entry=0x5598545d1540) at /mariadb/10.6m/storage/innobase/row/row0purge.cc:635 #12 0x000055985198b799 in row_purge_record_func (node=node@entry=0x5598545d1540, undo_rec=undo_rec@entry=0x5c8478084070 "", thr=thr@entry=0x5598545d14a0, updated_extern=false) at /mariadb/10.6m/storage/innobase/row/row0purge.cc:1215 #13 0x000055985198b90e in row_purge (node=node@entry=0x5598545d1540, undo_rec=undo_rec@entry=0x5c8478084070 "", thr=thr@entry=0x5598545d14a0) at /mariadb/10.6m/storage/innobase/row/row0purge.cc:1276 This is the purge of delete-marking a SYS_INDEXES record corresponding to DROP TABLE t1 in the system tablespace. This mini-transaction (at LSN=9388592) failed to write any FREE_PAGE record. At mtr_t::commit() , m_memo would comprise 69 latches, and 80 bytes of log would have been written: WRITE , MEMSET and MEMMOVE records and two OPT_PAGE_CHECKSUM for other pages than this freed page. The oldest_modification of the page was 8407829 (which is before the recovery checkpoint LSN) at the time when it was cleared by the page cleaner. I think that the problem is that there was no FREE_PAGE record written for this page 494 (0x1ee). We do have it for some neighbours, in mlog_init at the time the recovery catches the error: … [{m_id = 446}] = {lsn = 9388941, created = false}, [{m_id = 447}] = {lsn = 9388900, created = false}, [{m_id = 512}] = {lsn = 9390540, created = false}, [{m_id = 513}] = {lsn = 9392084, created = false}, … The test case that I used is as follows: --source include/have_innodb.inc --source include/have_sequence.inc SET GLOBAL INNODB_FILE_PER_TABLE= 0; CREATE TABLE t1(f1 INT NOT NULL , f2 INT NOT NULL , f3 INT NOT NULL , INDEX (f1), INDEX (f2), INDEX (f3))ENGINE=InnoDB; INSERT INTO t1 SELECT seq, seq, seq FROM seq_1_to_16384; INSERT INTO t1 SELECT seq, seq, seq FROM seq_1_to_16384; INSERT INTO t1 SELECT seq, seq, seq FROM seq_1_to_16384; DROP TABLE t1; SET GLOBAL innodb_max_purge_lag_wait=0; let $shutdown_timeout=0; --source include/restart_mysqld.inc This would seem to fail reliably when not using rr . I produced my rr trace as follows: while . /mtr --rr=-h --parallel=auto name_of_test{,,,,,,,,,,,,,,,,}; do :; done

Marko Mäkelä added a comment - 2023-09-14 12:24

I was able to reproduce this on 10.5 as well:

10.5 cb1965bd9d840665b2ea52216985d88c39a8a9f9

mariadbd: /mariadb/10.5/storage/innobase/log/log0recv.cc:273: log_phys_t::apply_status log_phys_t::apply(const buf_block_t&, uint16_t&) const: Assertion `mach_read_from_4(frame + 4U) == block.page.id().page_no()' failed.

mariadbd: /mariadb/10.5/storage/innobase/log/log0recv.cc:273: log_phys_t::apply_status log_phys_t::apply(const buf_block_t&, uint16_t&) const: Assertion `mach_read_from_4(frame + 4U) == block.page.id().page_no()' failed.

This was with the test case but not the code change of of my 10.5 fix.

Marko Mäkelä added a comment - 2023-09-14 12:24 I was able to reproduce this on 10.5 as well: 10.5 cb1965bd9d840665b2ea52216985d88c39a8a9f9 mariadbd: /mariadb/10.5/storage/innobase/log/log0recv.cc:273: log_phys_t::apply_status log_phys_t::apply(const buf_block_t&, uint16_t&) const: Assertion `mach_read_from_4(frame + 4U) == block.page.id().page_no()' failed. mariadbd: /mariadb/10.5/storage/innobase/log/log0recv.cc:273: log_phys_t::apply_status log_phys_t::apply(const buf_block_t&, uint16_t&) const: Assertion `mach_read_from_4(frame + 4U) == block.page.id().page_no()' failed. This was with the test case but not the code change of of my 10.5 fix .

People

Assignee:: Marko Mäkelä

Reporter:: Sergei Golubchik

Votes:: 0 Vote for this issue

Watchers:: 4 Start watching this issue

Dates

Created:: 2023-09-13 19:06

Updated:: 2023-10-21 19:22

Resolved:: 2023-09-14 13:50

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server