Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-29559

Recovery of INSERT_HEAP_DYNAMIC into secondary index fails

Details

    Description

      --source include/have_innodb.inc
       
      CREATE TABLE t (a YEAR PRIMARY KEY, KEY(a)) ENGINE=InnoDB;
      INSERT INTO t VALUES (2005),(2030);
       
      --let $shutdown_timeout=0
      --source include/restart_mysqld.inc
       
      # Cleanup
      DROP TABLE t;
      

      10.5 fb70bb44

      2022-09-16 23:50:59 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=47807,47807
      2022-09-16 23:50:59 0 [Note] InnoDB: Starting final batch to recover 11 pages from redo log.
      2022-09-16 23:50:59 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=5, page number=4]
      2022-09-16 23:50:59 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
      2022-09-16 23:50:59 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1501] with error Data structure corruption
      

      10.6+ produces a bit more output

      10.6 fd0bdd31

      2022-09-17  0:17:10 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=42790,42790
      2022-09-17  0:17:10 0 [Note] InnoDB: Starting final batch to recover 7 pages from redo log.
      2022-09-17  0:17:10 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=5, page number=4]
      2022-09-17  0:17:10 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
      2022-09-17  0:17:10 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=5, page number=4]; set innodb_force_recovery to ignore
      2022-09-17  0:17:10 0 [Note] InnoDB: Set innodb_force_recovery=1 to ignore corrupted pages.
      2022-09-17  0:17:10 0 [ERROR] InnoDB: Plugin initialization aborted with error Data structure corruption
      2022-09-17  0:17:10 0 [Note] InnoDB: Starting shutdown...
      2022-09-17  0:17:10 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=0, page number=9]; set innodb_force_recovery to ignore
      2022-09-17  0:17:10 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=1, page number=3]; set innodb_force_recovery to ignore
      2022-09-17  0:17:10 0 [ERROR] InnoDB: Failed to read page 3 from file './mysql/innodb_table_stats.ibd': Page read from tablespace is corrupted.
      2022-09-17  0:17:10 0 [ERROR] InnoDB: Failed to read page 9 from file './ibdata1': Page read from tablespace is corrupted.
      2022-09-17  0:17:10 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=2, page number=3]; set innodb_force_recovery to ignore
      2022-09-17  0:17:10 0 [ERROR] InnoDB: Failed to read page 3 from file './mysql/innodb_index_stats.ibd': Page read from tablespace is corrupted.
      2022-09-17  0:17:10 0 [ERROR] Plugin 'InnoDB' init function returned error.
      

      Reproducible on 10.5-10.10, including older minor releases.
      Not reproducible on 10.4.

      Attachments

        Issue Links

          Activity

            Here is a slightly simpler test case. The important part is that the first byte of the record must be 0x69, matching the first byte of the infimum pseudo-record:

            --source include/have_innodb.inc
             
            CREATE TABLE t (a TINYINT UNSIGNED PRIMARY KEY, KEY(a)) ENGINE=InnoDB;
            INSERT INTO t VALUES (0x69);
             
            --let $shutdown_timeout=0
            --source include/restart_mysqld.inc
             
            DROP TABLE t;
            

            marko Marko Mäkelä added a comment - Here is a slightly simpler test case. The important part is that the first byte of the record must be 0x69, matching the first byte of the infimum pseudo-record: --source include/have_innodb.inc   CREATE TABLE t (a TINYINT UNSIGNED PRIMARY KEY , KEY (a)) ENGINE=InnoDB; INSERT INTO t VALUES (0x69);   --let $shutdown_timeout=0 --source include/restart_mysqld.inc   DROP TABLE t;

            The recovery fails because the log record parser throws an error. The problematic mini-transaction log is as follows:

            bytes explanation
            0x34, 0x5, 0x4, 0x3f, 0x16, WRITE, tablespace=5, page=4, offset=0x3f, bytes=0x16 (setting PAGE_MAX_TRX_ID=0x16)
            0xa5, 0x6, 0x0, 0x0, 0x0, 0x1 EXTENDED, INSERT_HEAP_DYNAMIC
            0x77, 0x5, 0x4, 0x0, 0xf7, 0xfd, 0x7a, 0xcd, OPTION, PAGE_CHECKSUM
            0x0 end of mini-transaction

            Parsing the INSERT_HEAP_DYNAMIC record fails because of an off-by-one error:

            diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
            index 7db0fdbefa5..2e7ad1f33a6 100644
            --- a/storage/innobase/log/log0recv.cc
            +++ b/storage/innobase/log/log0recv.cc
            @@ -393,7 +393,7 @@ struct log_phys_t : public log_rec_t
                         rlen-= ll;
                         l+= ll;
                         ll= mlog_decode_varint_length(*l);
            -            if (UNIV_UNLIKELY(ll > 3 || ll >= rlen))
            +            if (UNIV_UNLIKELY(ll > 3 || ll > rlen))
                           goto record_corrupted;
                         size_t data_c= mlog_decode_varint(l);
                         ut_ad(data_c != MLOG_DECODE_ERROR);
            

            In this case, the secondary index record is (a)=(0x69), and all of the data will have been copied from the infimum record. The data_len=rlen-ll that will be passed to page_apply_insert_dynamic() will be 0.

            marko Marko Mäkelä added a comment - The recovery fails because the log record parser throws an error. The problematic mini-transaction log is as follows: bytes explanation 0x34, 0x5, 0x4, 0x3f, 0x16, WRITE , tablespace=5, page=4, offset=0x3f, bytes=0x16 (setting PAGE_MAX_TRX_ID =0x16) 0xa5, 0x6, 0x0, 0x0, 0x0, 0x1 EXTENDED , INSERT_HEAP_DYNAMIC 0x77, 0x5, 0x4, 0x0, 0xf7, 0xfd, 0x7a, 0xcd, OPTION , PAGE_CHECKSUM 0x0 end of mini-transaction Parsing the INSERT_HEAP_DYNAMIC record fails because of an off-by-one error: diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc index 7db0fdbefa5..2e7ad1f33a6 100644 --- a/storage/innobase/log/log0recv.cc +++ b/storage/innobase/log/log0recv.cc @@ -393,7 +393,7 @@ struct log_phys_t : public log_rec_t rlen-= ll; l+= ll; ll= mlog_decode_varint_length(*l); - if (UNIV_UNLIKELY(ll > 3 || ll >= rlen)) + if (UNIV_UNLIKELY(ll > 3 || ll > rlen)) goto record_corrupted; size_t data_c= mlog_decode_varint(l); ut_ad(data_c != MLOG_DECODE_ERROR); In this case, the secondary index record is (a)=(0x69), and all of the data will have been copied from the infimum record. The data_len=rlen-ll that will be passed to page_apply_insert_dynamic() will be 0.

            The reason why a variant of this test case with ROW_FORMAT=REDUNDANT does not fail is that the INSERT_HEAP_REDUNDANT record would include 1 literal byte of record header. I even tried a variant where the length of the field would match the infimum pseudo-record:

            --source include/have_innodb.inc
             
            CREATE TABLE t (a BINARY(8) PRIMARY KEY, KEY(a)) ENGINE=InnoDB
            ROW_FORMAT=REDUNDANT;
            INSERT INTO t VALUES (0x696e66696d756d00);
             
            --let $shutdown_timeout=0
            --source include/restart_mysqld.inc
             
            CHECK TABLE t;
            DROP TABLE t;
            

            The reason why the header is not being copied from the infimum pseudo-record is that the following condition in page_cur_insert_rec_low() will not hold:

                if (c_start >=
                    &block->page.frame[PAGE_OLD_SUPREMUM_END + REC_N_OLD_EXTRA_BYTES])
                {
                  /* Find common header bytes with the preceding record. */
            

            In other words, the following adjustment to the ROW_FORMAT=REDUNDANT parsing code is not strictly necessary, because some literal bytes will always be present in the log record:

            @@ -420,7 +420,7 @@ struct log_phys_t : public log_rec_t
                         rlen-= ll;
                         l+= ll;
                         ll= mlog_decode_varint_length(*l);
            -            if (UNIV_UNLIKELY(ll > 2 || ll >= rlen))
            +            if (UNIV_UNLIKELY(ll > 2 || ll > rlen))
                           goto record_corrupted;
                         size_t data_c= mlog_decode_varint(l);
                         rlen-= ll;
            

            marko Marko Mäkelä added a comment - The reason why a variant of this test case with ROW_FORMAT=REDUNDANT does not fail is that the INSERT_HEAP_REDUNDANT record would include 1 literal byte of record header. I even tried a variant where the length of the field would match the infimum pseudo-record: --source include/have_innodb.inc   CREATE TABLE t (a BINARY (8) PRIMARY KEY , KEY (a)) ENGINE=InnoDB ROW_FORMAT=REDUNDANT; INSERT INTO t VALUES (0x696e66696d756d00);   --let $shutdown_timeout=0 --source include/restart_mysqld.inc   CHECK TABLE t; DROP TABLE t; The reason why the header is not being copied from the infimum pseudo-record is that the following condition in page_cur_insert_rec_low() will not hold: if (c_start >= &block->page.frame[PAGE_OLD_SUPREMUM_END + REC_N_OLD_EXTRA_BYTES]) { /* Find common header bytes with the preceding record. */ In other words, the following adjustment to the ROW_FORMAT=REDUNDANT parsing code is not strictly necessary, because some literal bytes will always be present in the log record: @@ -420,7 +420,7 @@ struct log_phys_t : public log_rec_t rlen-= ll; l+= ll; ll= mlog_decode_varint_length(*l); - if (UNIV_UNLIKELY(ll > 2 || ll >= rlen)) + if (UNIV_UNLIKELY(ll > 2 || ll > rlen)) goto record_corrupted; size_t data_c= mlog_decode_varint(l); rlen-= ll;

            An INSERT of a single record (not 2 records as in the test case) could fail like this:

            10.5 3c8674edcc99605aa4634145771b2195c3f4f007

            2022-09-19 11:03:24 0 [Note] InnoDB: Starting final batch to recover 61 pages from redo log.
            2022-09-19 11:03:25 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1501] with error Data structure corruption
            2022-09-19 11:03:25 0 [Note] InnoDB: Starting shutdown...
            2022-09-19 11:03:25 0 [ERROR] Plugin 'InnoDB' init function returned error.
            2022-09-19 11:03:25 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
            

            marko Marko Mäkelä added a comment - An INSERT of a single record (not 2 records as in the test case) could fail like this: 10.5 3c8674edcc99605aa4634145771b2195c3f4f007 2022-09-19 11:03:24 0 [Note] InnoDB: Starting final batch to recover 61 pages from redo log. 2022-09-19 11:03:25 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1501] with error Data structure corruption 2022-09-19 11:03:25 0 [Note] InnoDB: Starting shutdown... 2022-09-19 11:03:25 0 [ERROR] Plugin 'InnoDB' init function returned error. 2022-09-19 11:03:25 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.

            I believe that this only affects secondary indexes where the keys are fixed length and the data encodes to a prefix of the string infimum and a NUL byte. This could affect the insert of the first record into any secondary index leaf page.

            marko Marko Mäkelä added a comment - I believe that this only affects secondary indexes where the keys are fixed length and the data encodes to a prefix of the string infimum and a NUL byte. This could affect the insert of the first record into any secondary index leaf page.

            People

              marko Marko Mäkelä
              elenst Elena Stepanova
              Votes:
              1 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.