[MDEV-32163] Crash recovery fails after DROP TABLE in system tablespace Created: 2023-09-13  Updated: 2023-10-21  Resolved: 2023-09-14

Status: Closed
Project: MariaDB Server
Component/s: Backup, Storage Engine - InnoDB, Tests
Affects Version/s: N/A
Fix Version/s: 10.5.23, 10.6.16, 10.10.7, 10.11.6, 11.0.4, 11.1.3

Type: Bug Priority: Critical
Reporter: Sergei Golubchik Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-28699 Shrink temporary tablespaces without ... Closed
relates to MDEV-32028 InnoDB scrubbing doesn't write zero w... Closed

 Description   

Fix for MDEV-32028 cherry-picked into preview-11.3-preview breaks mariabackup.sys_truncate test sporadically.

There are 5 InnoDB commits in 11.3 preview, after the first one the test already starts to fail.



 Comments   
Comment by Marko Mäkelä [ 2023-09-14 ]

Using the mtr test case that thiru provided, I was able to produce an rr replay trace of this. I got an OPT_PAGE_CHECKSUM failure (MDEV-18976); these records are something that is being written in debug builds. The failure occurred for the following OPT_PAGE_CHECKSUM record:

10.6 0f9acce3f2fd8d8f6334e10b08f9b5a031d40fdf

(rr) p *this
$2 = {<log_rec_t> = {next = 0x7f1871570dc8, lsn = 8448794}, start_lsn = 8448769}

In that page, we had FIL_PAGE_LSN of 6281382, which is too old. The recovery had started like this:

10.6 0f9acce3f2fd8d8f6334e10b08f9b5a031d40fdf

2023-09-14 11:08:03 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=8415203,9398815
2023-09-14 11:08:03 0 [Note] InnoDB: To recover: 260 pages
2023-09-14 11:08:04 0 [ERROR] InnoDB: OPT_PAGE_CHECKSUM mismatch on [page id: space=0, page number=494]

I found the corresponding mini-transaction in the rr replay trace of the server that was intentionally killed by the test. The page was being marked as freed by the following:

10.6 0f9acce3f2fd8d8f6334e10b08f9b5a031d40fdf

#3  0x00005598517b44a2 in buf_page_free (space=space@entry=0x5598545749f8, page=page@entry=494, mtr=mtr@entry=0x7e813d744a70) at /mariadb/10.6m/storage/innobase/buf/buf0buf.cc:2157
#4  0x00005598517eb685 in fseg_free_extent (seg_inode=seg_inode@entry=0x19806e8cf872 "", iblock=0x19806e4535f0, space=space@entry=0x5598545749f8, page=448, mtr=mtr@entry=0x7e813d744a70, ahi=ahi@entry=false)
    at /mariadb/10.6m/storage/innobase/fsp/fsp0fsp.cc:2767
#5  0x00005598517ed72c in fseg_free_step (header=0x19806e90c04a "", mtr=mtr@entry=0x7e813d744a70, ahi=ahi@entry=false) at /mariadb/10.6m/storage/innobase/fsp/fsp0fsp.cc:2831
#6  0x00005598517f450c in btr_free_but_not_root (block=block@entry=0x19806e4542f0, log_mode=MTR_LOG_ALL, ahi=ahi@entry=false) at /mariadb/10.6m/storage/innobase/btr/btr0btr.cc:1142
#7  0x00005598517f6fb2 in btr_free_if_exists (space=space@entry=0x5598545749f8, page=page@entry=313, index_id=<optimized out>, mtr=mtr@entry=0x7e813d745150)
    at /mariadb/10.6m/storage/innobase/btr/btr0btr.cc:1229
#8  0x0000559851870d44 in dict_drop_index_tree (pcur=pcur@entry=0x5598545d15e0, trx=trx@entry=0x0, mtr=mtr@entry=0x7e813d745150) at /mariadb/10.6m/storage/innobase/dict/dict0crea.cc:905
#9  0x000055985198a455 in row_purge_remove_clust_if_poss_low (node=node@entry=0x5598545d1540, mode=mode@entry=BTR_MODIFY_LEAF) at /mariadb/10.6m/storage/innobase/row/row0purge.cc:166
#10 0x000055985198aea9 in row_purge_remove_clust_if_poss (node=node@entry=0x5598545d1540) at /mariadb/10.6m/storage/innobase/row/row0purge.cc:259
#11 0x000055985198b043 in row_purge_del_mark (node=node@entry=0x5598545d1540) at /mariadb/10.6m/storage/innobase/row/row0purge.cc:635
#12 0x000055985198b799 in row_purge_record_func (node=node@entry=0x5598545d1540, undo_rec=undo_rec@entry=0x5c8478084070 "", thr=thr@entry=0x5598545d14a0, updated_extern=false)
    at /mariadb/10.6m/storage/innobase/row/row0purge.cc:1215
#13 0x000055985198b90e in row_purge (node=node@entry=0x5598545d1540, undo_rec=undo_rec@entry=0x5c8478084070 "", thr=thr@entry=0x5598545d14a0) at /mariadb/10.6m/storage/innobase/row/row0purge.cc:1276

This is the purge of delete-marking a SYS_INDEXES record corresponding to DROP TABLE t1 in the system tablespace.

This mini-transaction (at LSN=9388592) failed to write any FREE_PAGE record. At mtr_t::commit(), m_memo would comprise 69 latches, and 80 bytes of log would have been written: WRITE, MEMSET and MEMMOVE records and two OPT_PAGE_CHECKSUM for other pages than this freed page.

The oldest_modification of the page was 8407829 (which is before the recovery checkpoint LSN) at the time when it was cleared by the page cleaner.

I think that the problem is that there was no FREE_PAGE record written for this page 494 (0x1ee). We do have it for some neighbours, in mlog_init at the time the recovery catches the error:

[{m_id = 446}] = {lsn = 9388941, created = false},
[{m_id = 447}] = {lsn = 9388900, created = false},
[{m_id = 512}] = {lsn = 9390540, created = false},
[{m_id = 513}] = {lsn = 9392084, created = false},

The test case that I used is as follows:

--source include/have_innodb.inc
--source include/have_sequence.inc
SET GLOBAL INNODB_FILE_PER_TABLE= 0;
CREATE TABLE t1(f1 INT NOT NULL, f2 INT NOT NULL,
                f3 INT NOT NULL, INDEX(f1),
                INDEX(f2), INDEX(f3))ENGINE=InnoDB;
INSERT INTO t1 SELECT seq, seq, seq FROM seq_1_to_16384;
INSERT INTO t1 SELECT seq, seq, seq FROM seq_1_to_16384;
INSERT INTO t1 SELECT seq, seq, seq FROM seq_1_to_16384;
DROP TABLE t1;
SET GLOBAL innodb_max_purge_lag_wait=0;
let $shutdown_timeout=0;
--source include/restart_mysqld.inc

This would seem to fail reliably when not using rr. I produced my rr trace as follows:

while ./mtr --rr=-h --parallel=auto name_of_test{,,,,,,,,,,,,,,,,}; do :; done

Comment by Marko Mäkelä [ 2023-09-14 ]

I was able to reproduce this on 10.5 as well:

10.5 cb1965bd9d840665b2ea52216985d88c39a8a9f9

mariadbd: /mariadb/10.5/storage/innobase/log/log0recv.cc:273: log_phys_t::apply_status log_phys_t::apply(const buf_block_t&, uint16_t&) const: Assertion `mach_read_from_4(frame + 4U) == block.page.id().page_no()' failed.
mariadbd: /mariadb/10.5/storage/innobase/log/log0recv.cc:273: log_phys_t::apply_status log_phys_t::apply(const buf_block_t&, uint16_t&) const: Assertion `mach_read_from_4(frame + 4U) == block.page.id().page_no()' failed.

This was with the test case but not the code change of of my 10.5 fix.

Generated at Thu Feb 08 10:29:18 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.