Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
10.11, 11.4, 11.8
-
None
Description
While executing the innodb.log_file_size_online test case, it fails with
CURRENT_TEST: innodb.log_file_size_online
mysqltest: At line 59: query 'SELECT * FROM t WHERE a<10' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'InnoDB'
2024-12-05 16:19:56 0 [ERROR] InnoDB: OPT_PAGE_CHECKSUM mismatch on [page id: space=5, page number=345]
|
2024-12-05 16:19:56 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
|
2024-12-05 16:19:56 0 [Note] InnoDB: Set innodb_force_recovery=1 to ignore corrupted pages.
|
Debugged the test case failure, while startup fails with
Thread 2 hit Breakpoint 1, log_phys_t::page_checksum (block=...,
|
l=0x78eeebcf335f "\267H\251N")
|
at /home/thiru/source_file/server/10.11/storage/innobase/log/log0recv.cc:197
|
197 ib::error() << "OPT_PAGE_CHECKSUM mismatch on " << block.page.id();
|
(rr) f 2
|
p#2 0x000063c80564908d in recv_recover_page (block=0x78eee99011e0,
|
mtr=..., recs=..., space=0x63c807c73c68, init=0x0)
|
at /home/thiru/source_file/server/10.11/storage/innobase/log/log0recv.cc:3248
|
3248 log_phys_t::apply_status a= l->apply(*block, recs.last_offset);
|
(rr) p *l
|
$2 = {<log_rec_t> = {next = 0x78eeebcf3368, lsn = 7415422},
|
start_lsn = 7415393}
|
Checksum fails to match with page id is the following
(rr) p mach_read_from_4(l)
|
$1 = 3074992462
|
During recovery, these are the things to be noticed.
(rr) p checkpoint_lsn
|
$5 = 7294363
|
(rr) p end_lsn
|
$6 = 7415362
|
(rr) p first_lsn
|
$7 = 7292882
|
Let's visit before restart:
=====================
page_checksum calculated for page_id 21474836825 is 3074992462 during m_commit_lsn: 7323774
mtr::page_checksum calculated 7323774 for the page_id
|
(rr) p bpage.id_.m_id
|
$1 = 21474836825
|
(rr) p checksum
|
$2 = 3074992462
|
Thread 2 hit Hardware watchpoint 3: -location m_commit_lsn
|
Old value = 0
|
New value = 7323774
|
mtr_t::finish_writer<false, false> (mtr=0x7673079ff4e0, len=29)
|
at /home/thiru/source_file/server/10.11/storage/innobase/mtr/mtr0mtr.cc:1391
|
1391 return {start.first, log_close(mtr->m_commit_lsn)};
|
|
(rr) f 0
|
#0 log_t::append (
|
d=@0x7673079ff0b8: 0x76730c80348f "G\005\200َ\022\006",
|
s=0x7673079ff658, size=24)
|
at /home/thiru/source_file/server/10.11/storage/innobase/mtr/mtr0m
|
tr.cc:1308
|
(rr) x/24b d
|
0x76730c80348f: 0x47 0x05 0x80 0xd9 0x8e 0x12 0x06 0x00
|
0x76730c803497: 0xb2 0x00 0x80 0xc3 0x00 0x06 0x00 0x78
|
0x76730c80349f: 0x05 0x80 0xd9 0x00 0xb7 0x48 0xa9 0x4e
|
Above is the redo log written for checksum for the page
After that, InnoDB does flushes the log thrice. There are 3 log_t::write_buf() calls
At the end of the first log_t::write_buf()
(rr) p write_lsn
|
$2 = 7353588
|
checksum redo log was flushed to disk during first write_buf() call itself.
Note: Every write_buf() swaps resize_buf with resize_flush_buf
At the end of the 2nd log_t::write_buf():
(rr) p write_lsn
|
$4 = 7401982
|
let's have detailed look at the last log_t::write_buf():
At the start
|
|
(rr) p write_lsn
|
$16 = 7401982
|
(rr) p lsn
|
$17 = 7415393
|
Initially, length is 13455. We copy the last non-aligned
block to resize_flush_buf. Round off the length to 13824 (multiple of 512 bytes)
InnoDB writes the old resize_buffer garbage value to the log file for the remaining
349 bytes. Since InnoDb fails to mark the end of the log in resize_buf. During
recovery, InnoDB reads the old mtr record and fails.
|
(rr) x/24b resize_buf + 13455
|
0x76730420348f: 0x47 0x05 0x80 0xd9 0x8e 0x12 0x06 0x00
|
0x767304203497: 0xb2 0x00 0x80 0xc3 0x00 0x06 0x00 0x78
|
0x76730420349f: 0x05 0x80 0xd9 0x00 0xb7 0x48 0xa9 0x4e
|
|
Above bytes matches with mtr checksum value which was written during 7323774
Attachments
Issue Links
- relates to
-
MDEV-35608 Fake PMEM on /dev/shm no longer works
- Closed
-
MDEV-35411 innodb.log_file_size_online occasionally fails
- Closed