A wrong condition in the function btr_lift_page_up() causes a table to be wrongly reset to the "no instant ALTER was performed" state by any operation that reduces the B-tree height (such as UPDATE, DELETE, or purge of history).
The problem can be reproduced with a table whose clustered index has at least 2 levels of node pointer pages above the leaf level. The corruption would be noticed after the table is loaded into the InnoDB data dictionary cache (such as after a server restart). Here is a test case with debug injection:
#10 0x000055ba28171ae8 in btr_estimate_number_of_different_key_vals (index=0x7f49d4046838) at /mariadb/10.3/storage/innobase/btr/btr0cur.cc:6704
#11 0x000055ba28212ee8 in dict_stats_update_transient_for_index (index=0x7f49d4046838) at /mariadb/10.3/storage/innobase/dict/dict0stats.cc:890
#12 0x000055ba28213261 in dict_stats_update_transient (table=<optimized out>) at /mariadb/10.3/storage/innobase/dict/dict0stats.cc:948
#13 dict_stats_update (table=0x7f49d4045628, stats_upd_option=<optimized out>) at /mariadb/10.3/storage/innobase/dict/dict0stats.cc:3359
#14 0x000055ba27f3fae3 in ha_innobase::open (this=0x7f49d41becb0, name=0x7f49d409edf8 "./test/t1") at /mariadb/10.3/storage/innobase/handler/ha_innodb.cc:6154
#15 0x000055ba27d8180e in handler::ha_open (this=0x7f49d41becb0, table_arg=0x7f49d41d4438, name=0x7f49d409edf8 "./test/t1", mode=2, test_if_locked=18, mem_root=0x0, partitions_to_open=0x0) at /mariadb/10.3/sql/handler.cc:2760
#16 0x000055ba27c22d92 in open_table_from_share (thd=0x7f49d4000cf8, share=0x7f49d409e8c0, alias=<optimized out>, db_stat=33, prgflag=<optimized out>, ha_open_flags=0, outparam=0x7f49d41d4438, is_create_table=<optimized out>, partitions_to_open=0x0) at /mariadb/10.3/sql/table.cc:3506
#17 0x000055ba27ae0d04 in open_table (thd=<optimized out>, table_list=<optimized out>, ot_ctx=0x7f4a2f105250) at /mariadb/10.3/sql/sql_base.cc:1979
#18 0x000055ba27bfe223 in mysql_inplace_alter_table (thd=<optimized out>, table_list=0x7f49d4011de0, table=0x0, altered_table=<optimized out>, ha_alter_info=<optimized out>, inplace_supported=HA_ALTER_INPLACE_INSTANT, target_mdl_request=<optimized out>, alter_ctx=<optimized out>) at /mariadb/10.3/sql/sql_table.cc:7631
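The failure mode described at the top of this report can be illustrated with a toy model. The exact condition in btr_lift_page_up() is not quoted here, so the sketch below uses an assumed one; every type and function name in it is hypothetical and none of it is InnoDB code. It only shows why a condition tied to the level of the lifted page would need at least 2 levels of node pointer pages to misbehave.

```cpp
#include <cassert>

// Toy model only: hypothetical types, not InnoDB structures.
struct Page {
    int  level;           // 0 = leaf, >0 = node pointer page
    bool is_root;
    bool instant_marker;  // stands in for the "instant ALTER was performed" state
};

struct Index {
    bool is_instant;      // what the dictionary cache currently believes
};

// Lift the contents of `child` into `father`, reducing the tree height by one.
// `fixed` selects between an ASSUMED buggy condition and an ASSUMED corrected one.
void lift_page_up(Page& father, const Page& child, const Index& index, bool fixed) {
    // Emptying the father page also clears its instant marker.
    father.level = child.level;
    father.instant_marker = false;

    const bool restore = fixed
        ? (index.is_instant && father.is_root)     // restore whenever the root was rewritten
        : (index.is_instant && child.level == 0);  // assumed bug: only when a leaf is lifted
    if (restore) {
        father.instant_marker = true;
    }
}

// Shrink a tree whose root is at `root_level` and report whether the marker survived.
bool marker_survives_shrink(int root_level, bool fixed) {
    Index index{true};
    Page root{root_level, true, true};
    Page child{root_level - 1, false, false};
    lift_page_up(root, child, index, fixed);
    return root.instant_marker;
}
```

In this model, a root at level 1 keeps its marker under either condition, while a root at level 2 loses it under the assumed buggy condition. The stale state would only show up once the table is loaded again, which is consistent with the corruption being noticed after a restart.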
Marko Mäkelä added a comment:
For the record, I wanted to see what exactly happens if a table that was corrupted due to this bug is accessed in a newer version. In order to catch MDEV-19783 (before realizing that it is a duplicate of this bug), we added some consistency checks.
I built a debug version of MariaDB 10.3.16 (the last release before this fix) and ran the following (adapted from innodb.instant_alter_debug):
--source include/have_innodb.inc
--source include/have_debug.inc
--echo #
--echo # MDEV-19916 Corruption after instant ADD/DROP and shrinking the tree
--echo #
CREATE TABLE t1 (a INT PRIMARY KEY) ENGINE=InnoDB;
# Create an index tree with 2 levels of node pointer pages.
SET @old_limit = @@innodb_limit_optimistic_insert_debug;
SET GLOBAL innodb_limit_optimistic_insert_debug = 2;
INSERT INTO t1 VALUES (1),(5),(4),(3),(2);
SET GLOBAL innodb_limit_optimistic_insert_debug = @old_limit;
ALTER TABLE t1 ADD COLUMN b INT, ALGORITHM=INSTANT;
SET GLOBAL innodb_defragment = 1;
OPTIMIZE TABLE t1;
--source include/restart_mysqld.inc
# restart with a newer MariaDB 10.3
CHECK TABLE t1;
DROP TABLE t1;
After the mariadb-10.3.16 server was shut down, I started up a different one, and immediately got a debug assertion failure in purge, before the CHECK TABLE statement could be executed:
10.3 e5e5877740f248de848219ee3a1d2881cd5c5b82
mysqld: /mariadb/10.3/storage/innobase/include/rem0rec.h:791: bool rec_is_metadata(const rec_t *, const dict_index_t *): Assertion `!is || index->is_instant()' failed.
…
#4 0x0000555555f9c9f0 in rec_is_metadata (rec=0x7ffff0bb00ab "\200",
index=0x7fffc40219b8)
at /mariadb/10.3/storage/innobase/include/rem0rec.h:791
#5 0x000055555602a21e in page_cur_search_with_match_bytes (
block=<optimized out>, index=<optimized out>, tuple=<optimized out>,
mode=<optimized out>, iup_matched_fields=<optimized out>,
iup_matched_bytes=<optimized out>, ilow_matched_fields=<optimized out>,
ilow_matched_bytes=<optimized out>, cursor=<optimized out>)
at /mariadb/10.3/storage/innobase/page/page0cur.cc:735
#6 0x00005555561e3729 in btr_cur_search_to_nth_level_func (
index=<optimized out>, level=<optimized out>, tuple=<optimized out>,
mode=<optimized out>, latch_mode=<optimized out>, cursor=<optimized out>,
ahi_latch=0x0,
file=0x5555567279b5 "/mariadb/10.3/storage/innobase/row/row0row.cc",
line=1052, mtr=0x7fffdcff8200, autoinc=0)
at /mariadb/10.3/storage/innobase/btr/btr0cur.cc:1837
#7 0x00005555561051b7 in btr_pcur_open_low (index=0x7fffc40219b8, level=0,
tuple=0x7fffc4015140, mode=PAGE_CUR_LE, latch_mode=<optimized out>,
cursor=0x555557b3eea0, file=0x0, line=1052, autoinc=0, mtr=0x7fffdcff8200)
at /mariadb/10.3/storage/innobase/include/btr0pcur.ic:441
#8 0x0000555556103b83 in row_search_on_row_ref (pcur=0x555557b3eea0, mode=2,
table=0x7fffc401f9c8, ref=0x7fffc4015140, mtr=0x7fffdcff8200)
at /mariadb/10.3/storage/innobase/row/row0row.cc:1052
#9 0x00005555560f3bdb in row_purge_reposition_pcur (mode=2,
node=0x555557b3edf8, mtr=0x7fffdcff8200)
at /mariadb/10.3/storage/innobase/row/row0purge.cc:79
#10 0x00005555560f708c in row_purge_reset_trx_id (node=0x555557b3edf8,
mtr=0x7fffdcff8200) at /mariadb/10.3/storage/innobase/row/row0purge.cc:796
#11 0x00005555560f4f14 in row_purge_record_func (node=<optimized out>,
undo_rec=<optimized out>, thr=0x555557b3ed38,
updated_extern=<optimized out>)
at /mariadb/10.3/storage/innobase/row/row0purge.cc:1210
#12 row_purge (node=<optimized out>, undo_rec=<optimized out>,
thr=<optimized out>)
at /mariadb/10.3/storage/innobase/row/row0purge.cc:1259
#13 row_purge_step (thr=<optimized out>)
at /mariadb/10.3/storage/innobase/row/row0purge.cc:1338
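The assertion text at the top of this trace, `!is || index->is_instant()`, is a logical implication: if a record is a metadata record, the index must be in the instant state. A minimal sketch of that shape follows, using hypothetical stand-in types rather than the real rec_t and dict_index_t:

```cpp
#include <cassert>

// Hypothetical stand-ins, not the InnoDB types that appear in the trace.
struct IndexState { bool is_instant; };
struct RecordInfo { bool is_metadata; };

// Same shape as the failed assertion `!is || index->is_instant()`:
// a metadata record implies an instant index.
bool metadata_invariant_holds(const RecordInfo& rec, const IndexState& index) {
    return !rec.is_metadata || index.is_instant;
}
```

The only combination that violates the implication is a metadata record in an index that is not considered instant, which is the state this crash reports for the corrupted table.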
Let us try with innodb_force_recovery=2 instead, so that the purge will be disabled:
10.3 e5e5877740f248de848219ee3a1d2881cd5c5b82
mysqld: /mariadb/10.3/storage/innobase/rem/rem0rec.cc:603: void rec_init_offsets(const rec_t *, const dict_index_t *, bool, ulint *): Assertion `index->is_instant()' failed.
…
#4 0x000055555607956a in rec_init_offsets (rec=0x7ffff0bb00ab "\200",
index=0x7fffbc13e0d8, leaf=true, offsets=0x7fffbc13dbf0)
at /mariadb/10.3/storage/innobase/rem/rem0rec.cc:603
#5 rec_get_offsets_func (rec=0x7ffff0bb00ab "\200", index=<optimized out>,
offsets=0x7fffbc13dbf0, leaf=true, n_fields=<optimized out>,
file=<optimized out>, line=6701, heap=0x7ffff03b10d8)
at /mariadb/10.3/storage/innobase/rem/rem0rec.cc:869
#6 0x0000555556203a39 in btr_estimate_number_of_different_key_vals (
index=0x7fffbc13e0d8) at /mariadb/10.3/storage/innobase/btr/btr0cur.cc:6699
#7 0x00005555562c3ec2 in dict_stats_update_transient_for_index (
index=0x7fffbc13e0d8)
at /mariadb/10.3/storage/innobase/dict/dict0stats.cc:888
#8 0x00005555562c4241 in dict_stats_update_transient (table=0x7fffbc13ccc8)
at /mariadb/10.3/storage/innobase/dict/dict0stats.cc:946
#9 dict_stats_update (table=0x7fffbc13ccc8, stats_upd_option=<optimized out>)
at /mariadb/10.3/storage/innobase/dict/dict0stats.cc:3352
#10 0x0000555555f4971f in dict_stats_init (table=0x7fffbc13ccc8)
at /mariadb/10.3/storage/innobase/include/dict0stats.ic:165
#11 ha_innobase::open (this=0x7fffbc13c520, name=0x7fffbc13f948 "./test/t1")
at /mariadb/10.3/storage/innobase/handler/ha_innodb.cc:6145
#12 0x0000555555d734db in handler::ha_open (this=0x7fffbc13c520,
table_arg=0x7fffbc13b948, name=0x7fffbc13f948 "./test/t1", mode=2,
test_if_locked=50, mem_root=0x0, partitions_to_open=0x0)
at /mariadb/10.3/sql/handler.cc:2760
This crash occurs at the very start of the CHECK TABLE statement, when attempting to open the table for executing any SQL statement on it. Let us retry once more with a non-debug version of the server, without innodb_force_recovery. It will note the corruption without crashing:
10.3 e5e5877740f248de848219ee3a1d2881cd5c5b82
test.t1 check Warning InnoDB: The B-tree of index PRIMARY is corrupted.
test.t1 check error Corrupt
In the server error log, there is some more detail:
10.3 e5e5877740f248de848219ee3a1d2881cd5c5b82
2019-12-09 8:13:03 4 [ERROR] InnoDB: Record overlaps another: 107+12
2019-12-09 8:13:03 4 [ERROR] InnoDB: Summed data size 14, returned by func 13
2019-12-09 8:13:03 4 [ERROR] InnoDB: Apparent corruption in space 5 page 3 of index `PRIMARY` of table `test`.`t1`
2019-12-09 8:13:03 4 [ERROR] InnoDB: In page 3 of index `PRIMARY` of table `test`.`t1`, index tree level 1
Unfortunately, there is no message about the existence of the metadata record at the start of the index. I think that the proper place for that would be in btr_validate_level(). I filed MDEV-21251 for that omission.