[MDEV-29559] Recovery of INSERT_HEAP_DYNAMIC into secondary index fails Created: 2022-09-16  Updated: 2022-09-19  Resolved: 2022-09-19

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.5, 10.6, 10.7, 10.8, 10.9, 10.10
Fix Version/s: 10.5.18, 10.6.11, 10.7.7, 10.8.6, 10.9.4, 10.10.2

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Marko Mäkelä
Resolution: Fixed Votes: 1
Labels: None

Issue Links:
Problem/Incident
is caused by MDEV-21724 Optimize page_cur_insert_rec_low() re... Closed

 Description   

--source include/have_innodb.inc
 
CREATE TABLE t (a YEAR PRIMARY KEY, KEY(a)) ENGINE=InnoDB;
INSERT INTO t VALUES (2005),(2030);
 
--let $shutdown_timeout=0
--source include/restart_mysqld.inc
 
# Cleanup
DROP TABLE t;

10.5 fb70bb44

2022-09-16 23:50:59 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=47807,47807
2022-09-16 23:50:59 0 [Note] InnoDB: Starting final batch to recover 11 pages from redo log.
2022-09-16 23:50:59 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=5, page number=4]
2022-09-16 23:50:59 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
2022-09-16 23:50:59 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1501] with error Data structure corruption

10.6+ produces a bit more output

10.6 fd0bdd31

2022-09-17  0:17:10 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=42790,42790
2022-09-17  0:17:10 0 [Note] InnoDB: Starting final batch to recover 7 pages from redo log.
2022-09-17  0:17:10 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=5, page number=4]
2022-09-17  0:17:10 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
2022-09-17  0:17:10 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=5, page number=4]; set innodb_force_recovery to ignore
2022-09-17  0:17:10 0 [Note] InnoDB: Set innodb_force_recovery=1 to ignore corrupted pages.
2022-09-17  0:17:10 0 [ERROR] InnoDB: Plugin initialization aborted with error Data structure corruption
2022-09-17  0:17:10 0 [Note] InnoDB: Starting shutdown...
2022-09-17  0:17:10 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=0, page number=9]; set innodb_force_recovery to ignore
2022-09-17  0:17:10 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=1, page number=3]; set innodb_force_recovery to ignore
2022-09-17  0:17:10 0 [ERROR] InnoDB: Failed to read page 3 from file './mysql/innodb_table_stats.ibd': Page read from tablespace is corrupted.
2022-09-17  0:17:10 0 [ERROR] InnoDB: Failed to read page 9 from file './ibdata1': Page read from tablespace is corrupted.
2022-09-17  0:17:10 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=2, page number=3]; set innodb_force_recovery to ignore
2022-09-17  0:17:10 0 [ERROR] InnoDB: Failed to read page 3 from file './mysql/innodb_index_stats.ibd': Page read from tablespace is corrupted.
2022-09-17  0:17:10 0 [ERROR] Plugin 'InnoDB' init function returned error.

Reproducible on 10.5-10.10, including older minor releases.
Not reproducible on 10.4.



 Comments   
Comment by Marko Mäkelä [ 2022-09-19 ]

Here is a slightly simpler test case. The important part is that the first byte of the record must be 0x69, matching the first byte of the infimum pseudo-record:

--source include/have_innodb.inc
 
CREATE TABLE t (a TINYINT UNSIGNED PRIMARY KEY, KEY(a)) ENGINE=InnoDB;
INSERT INTO t VALUES (0x69);
 
--let $shutdown_timeout=0
--source include/restart_mysqld.inc
 
DROP TABLE t;

Comment by Marko Mäkelä [ 2022-09-19 ]

The recovery fails because the log record parser throws an error. The problematic mini-transaction log is as follows:

bytes explanation
0x34, 0x5, 0x4, 0x3f, 0x16, WRITE, tablespace=5, page=4, offset=0x3f, bytes=0x16 (setting PAGE_MAX_TRX_ID=0x16)
0xa5, 0x6, 0x0, 0x0, 0x0, 0x1 EXTENDED, INSERT_HEAP_DYNAMIC
0x77, 0x5, 0x4, 0x0, 0xf7, 0xfd, 0x7a, 0xcd, OPTION, PAGE_CHECKSUM
0x0 end of mini-transaction

Parsing the INSERT_HEAP_DYNAMIC record fails because of an off-by-one error:

diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
index 7db0fdbefa5..2e7ad1f33a6 100644
--- a/storage/innobase/log/log0recv.cc
+++ b/storage/innobase/log/log0recv.cc
@@ -393,7 +393,7 @@ struct log_phys_t : public log_rec_t
             rlen-= ll;
             l+= ll;
             ll= mlog_decode_varint_length(*l);
-            if (UNIV_UNLIKELY(ll > 3 || ll >= rlen))
+            if (UNIV_UNLIKELY(ll > 3 || ll > rlen))
               goto record_corrupted;
             size_t data_c= mlog_decode_varint(l);
             ut_ad(data_c != MLOG_DECODE_ERROR);

In this case, the secondary index record is (a)=(0x69), and all of the data will have been copied from the infimum record. The data_len=rlen-ll that will be passed to page_apply_insert_dynamic() will be 0.

Comment by Marko Mäkelä [ 2022-09-19 ]

The reason why a variant of this test case with ROW_FORMAT=REDUNDANT does not fail is that the INSERT_HEAP_REDUNDANT record would include 1 literal byte of record header. I even tried a variant where the length of the field would match the infimum pseudo-record:

--source include/have_innodb.inc
 
CREATE TABLE t (a BINARY(8) PRIMARY KEY, KEY(a)) ENGINE=InnoDB
ROW_FORMAT=REDUNDANT;
INSERT INTO t VALUES (0x696e66696d756d00);
 
--let $shutdown_timeout=0
--source include/restart_mysqld.inc
 
CHECK TABLE t;
DROP TABLE t;

The reason why the header is not being copied from the infimum pseudo-record is that the following condition in page_cur_insert_rec_low() will not hold:

    if (c_start >=
        &block->page.frame[PAGE_OLD_SUPREMUM_END + REC_N_OLD_EXTRA_BYTES])
    {
      /* Find common header bytes with the preceding record. */

In other words, the following adjustment to the ROW_FORMAT=REDUNDANT parsing code is not strictly necessary, because some literal bytes will always be present in the log record:

@@ -420,7 +420,7 @@ struct log_phys_t : public log_rec_t
             rlen-= ll;
             l+= ll;
             ll= mlog_decode_varint_length(*l);
-            if (UNIV_UNLIKELY(ll > 2 || ll >= rlen))
+            if (UNIV_UNLIKELY(ll > 2 || ll > rlen))
               goto record_corrupted;
             size_t data_c= mlog_decode_varint(l);
             rlen-= ll;

Comment by Marko Mäkelä [ 2022-09-19 ]

An INSERT of a single record (not 2 records as in the test case) could fail like this:

10.5 3c8674edcc99605aa4634145771b2195c3f4f007

2022-09-19 11:03:24 0 [Note] InnoDB: Starting final batch to recover 61 pages from redo log.
2022-09-19 11:03:25 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1501] with error Data structure corruption
2022-09-19 11:03:25 0 [Note] InnoDB: Starting shutdown...
2022-09-19 11:03:25 0 [ERROR] Plugin 'InnoDB' init function returned error.
2022-09-19 11:03:25 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.

Comment by Marko Mäkelä [ 2022-09-19 ]

I believe that this only affects secondary indexes where the keys are fixed length and the data encodes to a prefix of the string infimum and a NUL byte. This could affect the insert of the first record into any secondary index leaf page.

Generated at Thu Feb 08 10:09:32 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.