Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-29438

Recovery or backup of instant ALTER TABLE is incorrect

Details

    Description

      mleich provided rr replay traces where mariadb-backup --prepare fails like this:

      10.6 92032499874259bae7455130958ea7f38c4d53a3

      2022-08-31  9:06:55 0 [Note] InnoDB: Starting final batch to recover 241 pages from redo log.
      2022-08-31  9:06:58 0 [ERROR] InnoDB: OPT_PAGE_CHECKSUM mismatch on [page id: space=0, page number=514]
      2022-08-31  9:06:58 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
      2022-08-31  9:06:58 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=0, page number=514]; set innodb_force_recovery to ignore
      2022-08-31  9:06:58 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1490] with error Data structure corruption
      

      I extracted a copy of the page both from the backup and from the server at the logical point of time when the OPT_PAGE_CHECKSUM record was written. Apart from FIL_PAGE_LSN which is excluded from the checksum, the pages differ as follows:

      @@ -38,7 +38,7 @@
       0004e0 20 20 20 20 20 20 20 20 00 00 00 00 00 00 00 00
       0004f0 01 01 00 00 00 00 00 00 00 00 00 f0 3f 00 00 00
       000500 00 00 00 f0 3f 03 01 30 00 3c fb 73 00 00 00 00
      -000510 00 00 00 00 07 0a 8b 00 00 01 46 0d 17 00 00 00
      +000510 00 00 00 00 06 0a 8b 00 00 01 46 0d 17 00 00 00
       000520 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       000530 00 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
       000540 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
      

      The difference is for a clustered index leaf page record that starts at 0x50c. The DB_TRX_ID=0x70a would be incorrectly recovered as 0x60a. This difference was caught thanks to MDEV-18976.

      The columns of the index seem to be something like the following:
      (id INT UNSIGNED NOT NULL, DB_TRX_ID, DB_ROLL_PTR, pad CHAR(60), c CHAR(120), k INT, geocol2 GEOMETRY, tcol CHAR(3)).

      On recovery, the incorrect byte was recovered by the following:

      10.6 92032499874259bae7455130958ea7f38c4d53a3

      (rr) frame 2
      #2  0x000055f5b96c8359 in page_apply_insert_dynamic (block=..., reuse=false, 
          prev=0, shift=0, enc_hdr_l=23, hdr_c=0, data_c=9, data=0x4e0c52e6e807, 
          data_len=212)
          at /data/Server/bb-10.6-MDEV-29374/storage/innobase/page/page0cur.cc:2838
      2838	  memcpy(buf, prev_rec, data_c);
      

      The buf starts at 0x50c (the start of the first column), and the prev_rec at offset 0x63 contains the following:

      (rr) p/x *prev_rec@data_c
      $12 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x6}
      

      It is worth noting that right after the incorrect data had been copied, that prev_rec field would have been adjusted to the correct value:

      #0  mach_write_to_1 (b=0x4e0c5298806b "\a", n=7)
          at /data/Server/bb-10.6-MDEV-29374/storage/innobase/include/mach0data.inl:45
      #1  0x000055f5b96b4204 in rec_set_bit_field_1 (rec=0x4e0c52988070 "", val=7, 
          offs=5, mask=15, shift=0)
          at /data/Server/bb-10.6-MDEV-29374/storage/innobase/include/rem0rec.inl:159
      #2  0x000055f5b96c83ab in page_apply_insert_dynamic (block=..., reuse=false, 
          prev=0, shift=0, enc_hdr_l=23, hdr_c=0, data_c=9, data=0x4e0c52e6e807, 
          data_len=212)
          at /data/Server/bb-10.6-MDEV-29374/storage/innobase/page/page0cur.cc:2842
      …
      (rr) frame 2
      #2  0x000055f5b96c83ab in page_apply_insert_dynamic (block=..., reuse=false, 
          prev=0, shift=0, enc_hdr_l=23, hdr_c=0, data_c=9, data=0x4e0c52e6e807, 
          data_len=212)
          at /data/Server/bb-10.6-MDEV-29374/storage/innobase/page/page0cur.cc:2842
      2842	  rec_set_bit_field_1(owner_rec, n_owned + 1, REC_NEW_N_OWNED,
      

      Offset 0x63 is where the page infimum pseudo-record is stored. Because an instant ADD/DROP COLUMN has been executed on this table, the record will not contain the string infimum but something else (NUL bytes followed by the header of the supremum record).

      I think that we must ‘pessimize’ the implementation of MDEV-21724 and never write log that would copy something from the infimum record to the first actual user record in the table.

      Attachments

        Issue Links

          Activity

            The infimum and supremum strings were repurposed in MDEV-15562.

            The insert of the record that is logged incorrectly is actually for the hidden metadata record of a MDEV-15562 operation:

            10.6 92032499874259bae7455130958ea7f38c4d53a3

            #2  0x00005572faaadca1 in page_cur_insert_rec_low (cur=0x58551c4f5048, index=0x616006e1a608, rec=0x6140000254cf "", offsets=0x58551c4f5170, mtr=0x58551c4f5450)
                at /data/Server/bb-10.6-MDEV-29374/storage/innobase/page/page0cur.cc:1525
            #3  0x00005572fad90d7d in page_cur_tuple_insert (cursor=0x58551c4f5048, tuple=0x61d0009f1e88, index=0x616006e1a608, offsets=0x58551c4f5000, heap=0x58551c4f4fe0, n_ext=1, mtr=0x58551c4f5450)
                at /data/Server/bb-10.6-MDEV-29374/storage/innobase/include/page0cur.inl:199
            #4  0x00005572fada7631 in btr_cur_optimistic_insert (flags=2, cursor=0x58551c4f5040, offsets=0x58551c4f5000, heap=0x58551c4f4fe0, entry=0x61d0009f1e88, rec=0x58551c4f5020, big_rec=0x58551c4f4fc0, n_ext=1, 
                thr=0x61d0009f2070, mtr=0x58551c4f5450) at /data/Server/bb-10.6-MDEV-29374/storage/innobase/btr/btr0cur.cc:3500
            #5  0x00005572fab90ae2 in row_ins_clust_index_entry_low (flags=2, mode=10, index=0x616006e1a608, n_uniq=1, entry=0x61d0009f1e88, n_ext=0, thr=0x61d0009f2070)
                at /data/Server/bb-10.6-MDEV-29374/storage/innobase/row/row0ins.cc:2764
            #6  0x00005572fa8dcfcd in innobase_instant_try (ha_alter_info=0x58551c4f7a90, ctx=0x62b00011aad0, altered_table=0x58551c4f8190, table=0x619001cea398, trx=0x75cd11101440)
                at /data/Server/bb-10.6-MDEV-29374/storage/innobase/handler/handler0alter.cc:6177
            #7  0x00005572fa92d529 in commit_try_norebuild (ha_alter_info=0x58551c4f7a90, ctx=0x62b00011aad0, altered_table=0x58551c4f8190, old_table=0x619001cea398, trx=0x75cd11101440, table_name=0x61b000092155 "t18")
                at /data/Server/bb-10.6-MDEV-29374/storage/innobase/handler/handler0alter.cc:10559
            #8  0x00005572fa905753 in ha_innobase::commit_inplace_alter_table (this=0x61d000885eb8, altered_table=0x58551c4f8190, ha_alter_info=0x58551c4f7a90, commit=true)
                at /data/Server/bb-10.6-MDEV-29374/storage/innobase/handler/handler0alter.cc:11308
            #9  0x00005572f9f5c304 in handler::ha_commit_inplace_alter_table (this=0x61d000885eb8, altered_table=0x58551c4f8190, ha_alter_info=0x58551c4f7a90, commit=true)
                at /data/Server/bb-10.6-MDEV-29374/sql/handler.cc:5234
            #10 0x00005572f99d192a in mysql_inplace_alter_table (thd=0x62b000149218, table_list=0x62b000118400, table=0x619001cea398, altered_table=0x58551c4f8190, ha_alter_info=0x58551c4f7a90, 
                target_mdl_request=0x58551c4f7b90, ddl_log_state=0x58551c4f78b0, trigger_param=0x58551c4f8600, alter_ctx=0x58551c4f90a0) at /data/Server/bb-10.6-MDEV-29374/sql/sql_table.cc:7471
            #11 0x00005572f99e55b6 in mysql_alter_table (thd=0x62b000149218, new_db=0x62b00014dc18, new_name=0x62b00014e030, create_info=0x58551c4fa550, table_list=0x62b000118400, alter_info=0x58551c4fa420, order_num=0, 
                order=0x0, ignore=false, if_exists=false) at /data/Server/bb-10.6-MDEV-29374/sql/sql_table.cc:10278
            #12 0x00005572f9b74c76 in Sql_cmd_alter_table::execute (this=0x62b000118ce8, thd=0x62b000149218) at /data/Server/bb-10.6-MDEV-29374/sql/sql_alter.cc:542
            #13 0x00005572f9774c79 in mysql_execute_command (thd=0x62b000149218, is_called_from_prepared_stmt=false) at /data/Server/bb-10.6-MDEV-29374/sql/sql_parse.cc:5997
            #14 0x00005572f9781046 in mysql_parse (thd=0x62b000149218, rawbuf=0x62b000118238 "ALTER TABLE t18 ADD COLUMN IF NOT EXISTS tcol7 TIME DEFAULT NULL FIRST, LOCK=SHARED /* E_R Thread2 QNO 2036 CON_ID 19 */", 
                length=120, parser_state=0x58551c4fba20) at /data/Server/bb-10.6-MDEV-29374/sql/sql_parse.cc:8030
            

            The ADD COLUMN…FIRST will cause columns to be reordered in the table.

            For reproducing this bug, I think that it is important that the table contains INT UNSIGNED PRIMARY KEY. Execute something like this in a loop, and concurrently run backup and restore:

            CREATE TABLE t(a INT UNSIGNED PRIMARY KEY) ENGINE=InnoDB;
            INSERT INTO t VALUES …;
            ALTER TABLE t ADD COLUMN b INT NULL FIRST;
            DROP TABLE t;
            

            marko Marko Mäkelä added a comment - The infimum and supremum strings were repurposed in MDEV-15562 . The insert of the record that is logged incorrectly is actually for the hidden metadata record of a MDEV-15562 operation: 10.6 92032499874259bae7455130958ea7f38c4d53a3 #2 0x00005572faaadca1 in page_cur_insert_rec_low (cur=0x58551c4f5048, index=0x616006e1a608, rec=0x6140000254cf "", offsets=0x58551c4f5170, mtr=0x58551c4f5450) at /data/Server/bb-10.6-MDEV-29374/storage/innobase/page/page0cur.cc:1525 #3 0x00005572fad90d7d in page_cur_tuple_insert (cursor=0x58551c4f5048, tuple=0x61d0009f1e88, index=0x616006e1a608, offsets=0x58551c4f5000, heap=0x58551c4f4fe0, n_ext=1, mtr=0x58551c4f5450) at /data/Server/bb-10.6-MDEV-29374/storage/innobase/include/page0cur.inl:199 #4 0x00005572fada7631 in btr_cur_optimistic_insert (flags=2, cursor=0x58551c4f5040, offsets=0x58551c4f5000, heap=0x58551c4f4fe0, entry=0x61d0009f1e88, rec=0x58551c4f5020, big_rec=0x58551c4f4fc0, n_ext=1, thr=0x61d0009f2070, mtr=0x58551c4f5450) at /data/Server/bb-10.6-MDEV-29374/storage/innobase/btr/btr0cur.cc:3500 #5 0x00005572fab90ae2 in row_ins_clust_index_entry_low (flags=2, mode=10, index=0x616006e1a608, n_uniq=1, entry=0x61d0009f1e88, n_ext=0, thr=0x61d0009f2070) at /data/Server/bb-10.6-MDEV-29374/storage/innobase/row/row0ins.cc:2764 #6 0x00005572fa8dcfcd in innobase_instant_try (ha_alter_info=0x58551c4f7a90, ctx=0x62b00011aad0, altered_table=0x58551c4f8190, table=0x619001cea398, trx=0x75cd11101440) at /data/Server/bb-10.6-MDEV-29374/storage/innobase/handler/handler0alter.cc:6177 #7 0x00005572fa92d529 in commit_try_norebuild (ha_alter_info=0x58551c4f7a90, ctx=0x62b00011aad0, altered_table=0x58551c4f8190, old_table=0x619001cea398, trx=0x75cd11101440, table_name=0x61b000092155 "t18") at /data/Server/bb-10.6-MDEV-29374/storage/innobase/handler/handler0alter.cc:10559 #8 0x00005572fa905753 in ha_innobase::commit_inplace_alter_table (this=0x61d000885eb8, altered_table=0x58551c4f8190, ha_alter_info=0x58551c4f7a90, commit=true) at /data/Server/bb-10.6-MDEV-29374/storage/innobase/handler/handler0alter.cc:11308 #9 0x00005572f9f5c304 in handler::ha_commit_inplace_alter_table (this=0x61d000885eb8, altered_table=0x58551c4f8190, ha_alter_info=0x58551c4f7a90, commit=true) at /data/Server/bb-10.6-MDEV-29374/sql/handler.cc:5234 #10 0x00005572f99d192a in mysql_inplace_alter_table (thd=0x62b000149218, table_list=0x62b000118400, table=0x619001cea398, altered_table=0x58551c4f8190, ha_alter_info=0x58551c4f7a90, target_mdl_request=0x58551c4f7b90, ddl_log_state=0x58551c4f78b0, trigger_param=0x58551c4f8600, alter_ctx=0x58551c4f90a0) at /data/Server/bb-10.6-MDEV-29374/sql/sql_table.cc:7471 #11 0x00005572f99e55b6 in mysql_alter_table (thd=0x62b000149218, new_db=0x62b00014dc18, new_name=0x62b00014e030, create_info=0x58551c4fa550, table_list=0x62b000118400, alter_info=0x58551c4fa420, order_num=0, order=0x0, ignore=false, if_exists=false) at /data/Server/bb-10.6-MDEV-29374/sql/sql_table.cc:10278 #12 0x00005572f9b74c76 in Sql_cmd_alter_table::execute (this=0x62b000118ce8, thd=0x62b000149218) at /data/Server/bb-10.6-MDEV-29374/sql/sql_alter.cc:542 #13 0x00005572f9774c79 in mysql_execute_command (thd=0x62b000149218, is_called_from_prepared_stmt=false) at /data/Server/bb-10.6-MDEV-29374/sql/sql_parse.cc:5997 #14 0x00005572f9781046 in mysql_parse (thd=0x62b000149218, rawbuf=0x62b000118238 "ALTER TABLE t18 ADD COLUMN IF NOT EXISTS tcol7 TIME DEFAULT NULL FIRST, LOCK=SHARED /* E_R Thread2 QNO 2036 CON_ID 19 */", length=120, parser_state=0x58551c4fba20) at /data/Server/bb-10.6-MDEV-29374/sql/sql_parse.cc:8030 The ADD COLUMN…FIRST will cause columns to be reordered in the table. For reproducing this bug, I think that it is important that the table contains INT UNSIGNED PRIMARY KEY . Execute something like this in a loop, and concurrently run backup and restore: CREATE TABLE t(a INT UNSIGNED PRIMARY KEY ) ENGINE=InnoDB; INSERT INTO t VALUES …; ALTER TABLE t ADD COLUMN b INT NULL FIRST ; DROP TABLE t;

            My attempt at reproducing this reproduced something else:

            diff --git a/mysql-test/suite/innodb/t/instant_alter_crash.test b/mysql-test/suite/innodb/t/instant_alter_crash.test
            index ddeaa6885e6..8c08ad0f6d9 100644
            --- a/mysql-test/suite/innodb/t/instant_alter_crash.test
            +++ b/mysql-test/suite/innodb/t/instant_alter_crash.test
            @@ -198,6 +198,29 @@ disconnect ddl;
             SHOW CREATE TABLE t1;
             SHOW CREATE TABLE t2;
             SHOW CREATE TABLE t3;
            -DROP TABLE t1,t2,t3;
            +DROP TABLE t2,t3;
            +
            +--echo #
            +--echo # MDEV-29438 Recovery or backup of instant ALTER TABLE is incorrect
            +--echo #
            +CREATE TABLE t2(a INT UNSIGNED PRIMARY KEY) ENGINE=InnoDB;
            +INSERT INTO t2 VALUES (1),(2),(3),(4),(5),(6);
            +
            +connect ddl, localhost, root;
            +SET DEBUG_SYNC='innodb_alter_inplace_before_commit SIGNAL ddl WAIT_FOR ever';
            +--send
            +ALTER TABLE t2 ADD COLUMN b TINYINT UNSIGNED NOT NULL DEFAULT 42 FIRST;
            +
            +connection default;
            +SET DEBUG_SYNC='now WAIT_FOR ddl';
            +SET GLOBAL innodb_flush_log_at_trx_commit=1;
            +DELETE FROM t1;
            +
            +--source include/kill_mysqld.inc
            +disconnect ddl;
            +--source include/start_mysqld.inc
            +
            +CHECK TABLE t2;
            +DROP TABLE t1,t2;
             
             --list_files $MYSQLD_DATADIR/test
            

            10.6 40aa94df356cfa000fc43c92ff0061212d1b161d

            2022-09-01 13:19:14 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=54849,54849
            mariadbd: /mariadb/10.6/storage/innobase/rem/rem0rec.cc:304: void rec_init_offsets_comp_ordinary(const rec_t*, const dict_index_t*, rec_offs*, ulint, const dict_col_t::def_t*, rec_leaf_format) [with bool mblob = true; bool redundant_temp = false; rec_t = unsigned char; rec_offs = short unsigned int; ulint = long unsigned int]: Assertion `n_fields <= ulint(index->n_fields) + 1' failed.
            

            The index->n_fields=3 corresponds to (a,DB_TRX_ID,DB_ROLL_PTR) and the n_fields=5 corresponds to the metadata record (a,DB_TRX_ID,DB_ROLL_PTR,blob,b).

            marko Marko Mäkelä added a comment - My attempt at reproducing this reproduced something else: diff --git a/mysql-test/suite/innodb/t/instant_alter_crash.test b/mysql-test/suite/innodb/t/instant_alter_crash.test index ddeaa6885e6..8c08ad0f6d9 100644 --- a/mysql-test/suite/innodb/t/instant_alter_crash.test +++ b/mysql-test/suite/innodb/t/instant_alter_crash.test @@ -198,6 +198,29 @@ disconnect ddl; SHOW CREATE TABLE t1; SHOW CREATE TABLE t2; SHOW CREATE TABLE t3; -DROP TABLE t1,t2,t3; +DROP TABLE t2,t3; + +--echo # +--echo # MDEV-29438 Recovery or backup of instant ALTER TABLE is incorrect +--echo # +CREATE TABLE t2(a INT UNSIGNED PRIMARY KEY) ENGINE=InnoDB; +INSERT INTO t2 VALUES (1),(2),(3),(4),(5),(6); + +connect ddl, localhost, root; +SET DEBUG_SYNC='innodb_alter_inplace_before_commit SIGNAL ddl WAIT_FOR ever'; +--send +ALTER TABLE t2 ADD COLUMN b TINYINT UNSIGNED NOT NULL DEFAULT 42 FIRST; + +connection default; +SET DEBUG_SYNC='now WAIT_FOR ddl'; +SET GLOBAL innodb_flush_log_at_trx_commit=1; +DELETE FROM t1; + +--source include/kill_mysqld.inc +disconnect ddl; +--source include/start_mysqld.inc + +CHECK TABLE t2; +DROP TABLE t1,t2; --list_files $MYSQLD_DATADIR/test 10.6 40aa94df356cfa000fc43c92ff0061212d1b161d 2022-09-01 13:19:14 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=54849,54849 mariadbd: /mariadb/10.6/storage/innobase/rem/rem0rec.cc:304: void rec_init_offsets_comp_ordinary(const rec_t*, const dict_index_t*, rec_offs*, ulint, const dict_col_t::def_t*, rec_leaf_format) [with bool mblob = true; bool redundant_temp = false; rec_t = unsigned char; rec_offs = short unsigned int; ulint = long unsigned int]: Assertion `n_fields <= ulint(index->n_fields) + 1' failed. The index->n_fields=3 corresponds to (a,DB_TRX_ID,DB_ROLL_PTR) and the n_fields=5 corresponds to the metadata record (a,DB_TRX_ID,DB_ROLL_PTR,blob,b).

            That something else is now MDEV-29440.

            marko Marko Mäkelä added a comment - That something else is now MDEV-29440 .

            Finally, I reproduced this without hitting MDEV-29440. The UPDATE loop is for incrementing the DB_TRX_ID to a suitable range.

            --source include/have_innodb.inc
            --source include/not_embedded.inc
            --source include/have_debug.inc
            --source include/have_debug_sync.inc
             
            CREATE TABLE t1(a INT PRIMARY KEY) ENGINE=InnoDB;
            INSERT INTO t1 SET a=1;
            --disable_query_log
            let $N=768;
            while ($N) {
            UPDATE t1 SET a=a+1;
            dec $N;
            }
            --enable_query_log
            CREATE TABLE t2(a INT UNSIGNED PRIMARY KEY, b INT NOT NULL DEFAULT 42)
            ENGINE=InnoDB;
            INSERT INTO t2 (a) VALUES (1),(2),(3),(4);
             
            connect ddl, localhost, root;
            SET DEBUG_SYNC='innodb_alter_inplace_before_commit SIGNAL ddl WAIT_FOR ever';
            --send
            ALTER TABLE t2 MODIFY COLUMN b INT NOT NULL DEFAULT 42 FIRST;
             
            connection default;
            SET DEBUG_SYNC='now WAIT_FOR ddl';
            SET GLOBAL innodb_flush_log_at_trx_commit=1;
            DELETE FROM t1;
             
            --source include/kill_mysqld.inc
            disconnect ddl;
            --source include/start_mysqld.inc
             
            CHECK TABLE t2;
            DROP TABLE t1,t2;
            

            10.6 40aa94df356cfa000fc43c92ff0061212d1b161d

            2022-09-01 15:50:35 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=54849,54849
            2022-09-01 15:50:35 0 [ERROR] InnoDB: OPT_PAGE_CHECKSUM mismatch on [page id: space=6, page number=3]
            2022-09-01 15:50:35 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
            2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=6, page number=3]; set innodb_force_recovery to ignore
            2022-09-01 15:50:35 0 [Note] InnoDB: 1 transaction(s) which must be rolled back or cleaned up in total 6 row operations to undo
            2022-09-01 15:50:35 0 [Note] InnoDB: Trx id counter is 1574
            2022-09-01 15:50:35 0 [Note] InnoDB: Starting final batch to recover 12 pages from redo log.
            2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=0, page number=324]; set innodb_force_recovery to ignore
            2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=0, page number=410]; set innodb_force_recovery to ignore
            2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=0, page number=0]; set innodb_force_recovery to ignore
            2022-09-01 15:50:35 0 [ERROR] InnoDB: Failed to read page 0 from file './ibdata1': Page read from tablespace is corrupted.
            2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=5, page number=1]; set innodb_force_recovery to ignore
            2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=5, page number=2]; set innodb_force_recovery to ignore
            2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=5, page number=3]; set innodb_force_recovery to ignore
            2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=6, page number=1]; set innodb_force_recovery to ignore
            2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=6, page number=2]; set innodb_force_recovery to ignore
            2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=6, page number=4]; set innodb_force_recovery to ignore
            2022-09-01 15:50:35 0 [Note] InnoDB: Set innodb_force_recovery=1 to ignore corrupted pages.
            2022-09-01 15:50:35 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1490] with error Data structure corruption
            2022-09-01 15:50:35 0 [Note] InnoDB: Starting shutdown...
            2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=0, page number=243]; set innodb_force_recovery to ignore
            2022-09-01 15:50:35 0 [ERROR] InnoDB: Failed to read page 243 from file './ibdata1': Page read from tablespace is corrupted.
            2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=1, page number=3]; set innodb_force_recovery to ignore
            2022-09-01 15:50:35 0 [ERROR] InnoDB: Failed to read page 3 from file './mysql/innodb_table_stats.ibd': Page read from tablespace is corrupted.
            2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=2, page number=3]; set innodb_force_recovery to ignore
            2022-09-01 15:50:35 0 [ERROR] InnoDB: Failed to read page 3 from file './mysql/innodb_index_stats.ibd': Page read from tablespace is corrupted.
            2022-09-01 15:50:35 0 [ERROR] Plugin 'InnoDB' init function returned error.
            2022-09-01 15:50:35 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
            

            I may have accidentally reproduced another bug in multi-batch recovery, because I only expected that single page to be corrupted.

            marko Marko Mäkelä added a comment - Finally, I reproduced this without hitting MDEV-29440 . The UPDATE loop is for incrementing the DB_TRX_ID to a suitable range. --source include/have_innodb.inc --source include/not_embedded.inc --source include/have_debug.inc --source include/have_debug_sync.inc   CREATE TABLE t1(a INT PRIMARY KEY ) ENGINE=InnoDB; INSERT INTO t1 SET a=1; --disable_query_log let $N=768; while ($N) { UPDATE t1 SET a=a+1; dec $N; } --enable_query_log CREATE TABLE t2(a INT UNSIGNED PRIMARY KEY , b INT NOT NULL DEFAULT 42) ENGINE=InnoDB; INSERT INTO t2 (a) VALUES (1),(2),(3),(4);   connect ddl, localhost, root; SET DEBUG_SYNC= 'innodb_alter_inplace_before_commit SIGNAL ddl WAIT_FOR ever' ; --send ALTER TABLE t2 MODIFY COLUMN b INT NOT NULL DEFAULT 42 FIRST ;   connection default ; SET DEBUG_SYNC= 'now WAIT_FOR ddl' ; SET GLOBAL innodb_flush_log_at_trx_commit=1; DELETE FROM t1;   --source include/kill_mysqld.inc disconnect ddl; --source include/start_mysqld.inc   CHECK TABLE t2; DROP TABLE t1,t2; 10.6 40aa94df356cfa000fc43c92ff0061212d1b161d 2022-09-01 15:50:35 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=54849,54849 2022-09-01 15:50:35 0 [ERROR] InnoDB: OPT_PAGE_CHECKSUM mismatch on [page id: space=6, page number=3] 2022-09-01 15:50:35 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption. 2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=6, page number=3]; set innodb_force_recovery to ignore 2022-09-01 15:50:35 0 [Note] InnoDB: 1 transaction(s) which must be rolled back or cleaned up in total 6 row operations to undo 2022-09-01 15:50:35 0 [Note] InnoDB: Trx id counter is 1574 2022-09-01 15:50:35 0 [Note] InnoDB: Starting final batch to recover 12 pages from redo log. 2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=0, page number=324]; set innodb_force_recovery to ignore 2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=0, page number=410]; set innodb_force_recovery to ignore 2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=0, page number=0]; set innodb_force_recovery to ignore 2022-09-01 15:50:35 0 [ERROR] InnoDB: Failed to read page 0 from file './ibdata1': Page read from tablespace is corrupted. 2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=5, page number=1]; set innodb_force_recovery to ignore 2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=5, page number=2]; set innodb_force_recovery to ignore 2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=5, page number=3]; set innodb_force_recovery to ignore 2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=6, page number=1]; set innodb_force_recovery to ignore 2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=6, page number=2]; set innodb_force_recovery to ignore 2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=6, page number=4]; set innodb_force_recovery to ignore 2022-09-01 15:50:35 0 [Note] InnoDB: Set innodb_force_recovery=1 to ignore corrupted pages. 2022-09-01 15:50:35 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1490] with error Data structure corruption 2022-09-01 15:50:35 0 [Note] InnoDB: Starting shutdown... 2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=0, page number=243]; set innodb_force_recovery to ignore 2022-09-01 15:50:35 0 [ERROR] InnoDB: Failed to read page 243 from file './ibdata1': Page read from tablespace is corrupted. 2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=1, page number=3]; set innodb_force_recovery to ignore 2022-09-01 15:50:35 0 [ERROR] InnoDB: Failed to read page 3 from file './mysql/innodb_table_stats.ibd': Page read from tablespace is corrupted. 2022-09-01 15:50:35 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=2, page number=3]; set innodb_force_recovery to ignore 2022-09-01 15:50:35 0 [ERROR] InnoDB: Failed to read page 3 from file './mysql/innodb_index_stats.ibd': Page read from tablespace is corrupted. 2022-09-01 15:50:35 0 [ERROR] Plugin 'InnoDB' init function returned error. 2022-09-01 15:50:35 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed. I may have accidentally reproduced another bug in multi-batch recovery, because I only expected that single page to be corrupted.

            With the following fix, there will be no log apply failure messages:

            diff --git a/storage/innobase/page/page0cur.cc b/storage/innobase/page/page0cur.cc
            index 5f7192675dc..3f863d36a3a 100644
            --- a/storage/innobase/page/page0cur.cc
            +++ b/storage/innobase/page/page0cur.cc
            @@ -1648,7 +1648,7 @@ page_cur_insert_rec_low(
               {
                 const byte *r= rec;
                 const byte *c= cur->rec;
            -    const byte *c_end= cur->rec + data_size;
            +    const byte *c_end= c + (page_rec_is_infimum(c) ? 8 : data_size);
                 static_assert(REC_N_OLD_EXTRA_BYTES == REC_N_NEW_EXTRA_BYTES + 1, "");
                 if (c <= insert_buf && c_end > insert_buf)
                   c_end= insert_buf;
            

            10.6 40aa94df356cfa000fc43c92ff0061212d1b161d with the patch

            2022-09-01 16:25:13 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=54849,54849
            2022-09-01 16:25:13 0 [ERROR] InnoDB: Table `test`.`t2` contains unrecognizable instant ALTER metadata
            2022-09-01 16:25:13 0 [Note] InnoDB: 1 transaction(s) which must be rolled back or cleaned up in total 6 row operations to undo
            2022-09-01 16:25:13 0 [Note] InnoDB: Trx id counter is 1574
            2022-09-01 16:25:13 0 [Note] InnoDB: Starting final batch to recover 13 pages from redo log.
            2022-09-01 16:25:13 0 [Note] InnoDB: 128 rollback segments are active.
            2022-09-01 16:25:13 0 [Note] InnoDB: Rolled back recovered transaction 1571
            …
            2022-09-01 16:25:13 0 [ERROR] InnoDB: Table `test`.`t2` contains unrecognizable instant ALTER metadata
            mariadbd: /mariadb/10.6/storage/innobase/dict/dict0load.cc:2464: dict_table_t* dict_load_table_one(const st_::span<const char>&, dict_err_ignore_t, dict_names_t&): Assertion `!table || (ignore_err & ~DICT_ERR_IGNORE_FK_NOKEY) || !table->is_readable() || !table->corrupted' failed.
            

            That is, we would hit MDEV-29440 a little later (this time during innodb_check_version(), invoked by ddl_log_execute_recovery()). I double-checked that without the fix, we will get many ‘extra’ messages for the unrelated pages.

            marko Marko Mäkelä added a comment - With the following fix, there will be no log apply failure messages: diff --git a/storage/innobase/page/page0cur.cc b/storage/innobase/page/page0cur.cc index 5f7192675dc..3f863d36a3a 100644 --- a/storage/innobase/page/page0cur.cc +++ b/storage/innobase/page/page0cur.cc @@ -1648,7 +1648,7 @@ page_cur_insert_rec_low( { const byte *r= rec; const byte *c= cur->rec; - const byte *c_end= cur->rec + data_size; + const byte *c_end= c + (page_rec_is_infimum(c) ? 8 : data_size); static_assert(REC_N_OLD_EXTRA_BYTES == REC_N_NEW_EXTRA_BYTES + 1, ""); if (c <= insert_buf && c_end > insert_buf) c_end= insert_buf; 10.6 40aa94df356cfa000fc43c92ff0061212d1b161d with the patch 2022-09-01 16:25:13 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=54849,54849 2022-09-01 16:25:13 0 [ERROR] InnoDB: Table `test`.`t2` contains unrecognizable instant ALTER metadata 2022-09-01 16:25:13 0 [Note] InnoDB: 1 transaction(s) which must be rolled back or cleaned up in total 6 row operations to undo 2022-09-01 16:25:13 0 [Note] InnoDB: Trx id counter is 1574 2022-09-01 16:25:13 0 [Note] InnoDB: Starting final batch to recover 13 pages from redo log. 2022-09-01 16:25:13 0 [Note] InnoDB: 128 rollback segments are active. 2022-09-01 16:25:13 0 [Note] InnoDB: Rolled back recovered transaction 1571 … 2022-09-01 16:25:13 0 [ERROR] InnoDB: Table `test`.`t2` contains unrecognizable instant ALTER metadata mariadbd: /mariadb/10.6/storage/innobase/dict/dict0load.cc:2464: dict_table_t* dict_load_table_one(const st_::span<const char>&, dict_err_ignore_t, dict_names_t&): Assertion `!table || (ignore_err & ~DICT_ERR_IGNORE_FK_NOKEY) || !table->is_readable() || !table->corrupted' failed. That is, we would hit MDEV-29440 a little later (this time during innodb_check_version() , invoked by ddl_log_execute_recovery() ). I double-checked that without the fix, we will get many ‘extra’ messages for the unrelated pages.

            The reason for the flood of messages for unrelated pages is the following in log_phys_t::apply():

                  case OPTION:
                    ut_ad(rlen == 5);
                    ut_ad(*l == OPT_PAGE_CHECKSUM);
                    if (page_checksum(block, l + 1))
                    {
                      applied= APPLIED_YES;
            page_corrupted:
                      sql_print_error("InnoDB: Set innodb_force_recovery=1"
                                      " to ignore corruption.");
                      recv_sys.set_corrupt_log();
                      return applied;
                    }
            

            We are flagging the log corrupted only because a single page is corrupted. This will cause all subsequent invocations of recv_recover_page() to evict pages as corrupted:

            set_start_lsn:
            		if (recv_sys.is_corrupt_log() && !srv_force_recovery) {
            …
            			buf_pool.corrupted_evict(&block->page,
            						 block->page.state() &
            						 buf_page_t::LRU_MASK);
            

            It would be better to introduce another return value, say, log_phys_t::APPLIED_CORRUPTED, to indicate that the page was corrupted. Then there would be no ‘collateral damage’ of claiming unrelated pages as corrupted. The recv_sys.set_corrupt_log() must be reserved to cases where the entire log is corrupted.

            marko Marko Mäkelä added a comment - The reason for the flood of messages for unrelated pages is the following in log_phys_t::apply() : case OPTION: ut_ad(rlen == 5); ut_ad(*l == OPT_PAGE_CHECKSUM); if (page_checksum(block, l + 1)) { applied= APPLIED_YES; page_corrupted: sql_print_error( "InnoDB: Set innodb_force_recovery=1" " to ignore corruption." ); recv_sys.set_corrupt_log(); return applied; } We are flagging the log corrupted only because a single page is corrupted. This will cause all subsequent invocations of recv_recover_page() to evict pages as corrupted: set_start_lsn: if (recv_sys.is_corrupt_log() && !srv_force_recovery) { … buf_pool.corrupted_evict(&block->page, block->page.state() & buf_page_t::LRU_MASK); It would be better to introduce another return value, say, log_phys_t::APPLIED_CORRUPTED , to indicate that the page was corrupted. Then there would be no ‘collateral damage’ of claiming unrelated pages as corrupted. The recv_sys.set_corrupt_log() must be reserved to cases where the entire log is corrupted.

            I think that this bug should also affect normal tables where the string infimum is being inserted in the first PRIMARY KEY column as the first record of the table, but I was unable to repeat an OPT_PAGE_CHECKSUM mismatch in such a case.

            In 10.5, recovery would not invoke buf_pool.corrupted_evict() if it fails to apply some log to a page. That code was added to 10.6.9 in MDEV-13542.

            marko Marko Mäkelä added a comment - I think that this bug should also affect normal tables where the string infimum is being inserted in the first PRIMARY KEY column as the first record of the table, but I was unable to repeat an OPT_PAGE_CHECKSUM mismatch in such a case. In 10.5, recovery would not invoke buf_pool.corrupted_evict() if it fails to apply some log to a page. That code was added to 10.6.9 in MDEV-13542 .

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.