[MDEV-24730] Corruption in online ALTER TABLE with ROW_FORMAT=REDUNDANT Created: 2021-01-29  Updated: 2021-11-17  Resolved: 2021-03-12

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.3.2
Fix Version/s: 10.3.29, 10.4.19, 10.5.10, 10.6.0

Type: Bug Priority: Major
Reporter: Thirunarayanan Balathandayuthapani Assignee: Thirunarayanan Balathandayuthapani
Resolution: Fixed Votes: 0
Labels: rr-profile-analyzed

Issue Links:
Problem/Incident
is caused by MDEV-11369 Instant add column for InnoDB Closed
Relates
relates to MDEV-16131 Assertion `is_instant() || id == DICT... Closed
relates to MDEV-23805 Make Online DDL to Instant DDL when t... Closed

 Description   

InnoDB fails with following stack trace:

/Server/bb-10.5-MDEV-24693/storage/innobase/rem/rem0rec.cc:1502: void rec_convert_dtuple_to_rec_comp(rec_t*, const dict_index_t*, const dfield_t*, ulint, rec_comp_status_t, bool) [with bool mblob = false; rec_t = unsigned char; ulint = long unsigned int]: Assertion `index->is_instant()' failed.
 
(rr) bt
 
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00003ad76f402859 in __GI_abort () at abort.c:79
#2  0x00003ad76f402729 in __assert_fail_base (fmt=0x3ad76f598588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x564984cd5900 "index->is_instant()", file=0x564984cd4ce0 "/Server/bb-10.5-MDEV-24693/storage/innobase/rem/rem0rec.cc", line=1502, 
    function=<optimized out>) at assert.c:92
#3  0x00003ad76f413f36 in __GI___assert_fail (assertion=0x564984cd5900 "index->is_instant()", file=0x564984cd4ce0 "/Server/bb-10.5-MDEV-24693/storage/innobase/rem/rem0rec.cc", line=1502, 
    function=0x564984cda100 "void rec_convert_dtuple_to_rec_comp(rec_t*, const dict_index_t*, const dfield_t*, ulint, rec_comp_status_t, bool) [with bool mblob = false; rec_t = unsigned char; ulint = long unsigned int]") at assert.c:101
#4  0x00005649834eec31 in rec_convert_dtuple_to_rec_comp<>(rec_t *, const dict_index_t *, const dfield_t *, ulint, rec_comp_status_t, bool) (rec=0x52ac59bff008 "", index=0x6170006537a0, field=0x61a001399958, n_fields=17, status=REC_STATUS_INSTANT, temp=true)
    at /Server/bb-10.5-MDEV-24693/storage/innobase/rem/rem0rec.cc:1502
#5  0x00005649834dde1b in rec_convert_dtuple_to_temp (rec=0x52ac59bff008 "", index=0x6170006537a0, fields=0x61a001399958, n_fields=17, status=REC_STATUS_INSTANT) at /Server/bb-10.5-MDEV-24693/storage/innobase/rem/rem0rec.cc:1822
#6  0x00005649835b90c6 in row_log_table_low_redundant (rec=0x44dd4ec28094 "", index=0x6170006537a0, insert=true, old_pk=0x0, new_index=0x6170000da9a0) at /Server/bb-10.5-MDEV-24693/storage/innobase/row/row0log.cc:918
#7  0x00005649835b9f96 in row_log_table_low (rec=0x44dd4ec28094 "", index=0x6170006537a0, offsets=0x19b12b551ae0, insert=true, old_pk=0x0) at /Server/bb-10.5-MDEV-24693/storage/innobase/row/row0log.cc:992
#8  0x00005649835bd471 in row_log_table_insert (rec=0x44dd4ec28094 "", index=0x6170006537a0, offsets=0x19b12b551ae0) at /Server/bb-10.5-MDEV-24693/storage/innobase/row/row0log.cc:1437
#9  0x000056498353f977 in row_ins_clust_index_entry_low (flags=0, mode=16386, index=0x6170006537a0, n_uniq=0, entry=0x61700058d4a0, n_ext=0, thr=0x6250000c3440) at /Server/bb-10.5-MDEV-24693/storage/innobase/row/row0ins.cc:2781
#10 0x0000564983542090 in row_ins_clust_index_entry (index=0x6170006537a0, entry=0x61700058d4a0, thr=0x6250000c3440, n_ext=0) at /Server/bb-10.5-MDEV-24693/storage/innobase/row/row0ins.cc:3208
#11 0x0000564983542a9a in row_ins_index_entry (index=0x6170006537a0, entry=0x61700058d4a0, thr=0x6250000c3440) at /Server/bb-10.5-MDEV-24693/storage/innobase/row/row0ins.cc:3333
#12 0x0000564983543c0b in row_ins_index_entry_step (node=0x622000177930, thr=0x6250000c3440) at /Server/bb-10.5-MDEV-24693/storage/innobase/row/row0ins.cc:3502
#13 0x00005649835446e4 in row_ins (node=0x622000177930, thr=0x6250000c3440) at /Server/bb-10.5-MDEV-24693/storage/innobase/row/row0ins.cc:3661
#14 0x00005649835458ff in row_ins_step (thr=0x6250000c3440) at /Server/bb-10.5-MDEV-24693/storage/innobase/row/row0ins.cc:3807
#15 0x000056498358f403 in row_insert_for_mysql (mysql_rec=0x6320000f0838 "\312\377\200", prebuilt=0x6220001771a0, ins_mode=ROW_INS_NORMAL) at /Server/bb-10.5-MDEV-24693/storage/innobase/row/row0mysql.cc:1421
#16 0x00005649831ed830 in ha_innobase::write_row (this=0x61d000759eb8, record=0x6320000f0838 "\312\377\200") at /Server/bb-10.5-MDEV-24693/storage/innobase/handler/ha_innodb.cc:7544
#17 0x00005649826ea3b7 in handler::ha_write_row (this=0x61d000759eb8, buf=0x6320000f0838 "\312\377\200") at /Server/bb-10.5-MDEV-24693/sql/handler.cc:7151
#18 0x0000564981df5476 in write_record (thd=0x62b00027d218, table=0x61900054ce98, info=0x19b12b553820, sink=0x0) at /Server/bb-10.5-MDEV-24693/sql/sql_insert.cc:2106
#19 0x0000564981decf33 in mysql_insert (thd=0x62b00027d218, table_list=0x62b0002843b0, fields=..., values_list=..., update_fields=..., update_values=..., duplic=DUP_ERROR, ignore=false, result=0x0) at /Server/bb-10.5-MDEV-24693/sql/sql_insert.cc:1099
#20 0x0000564981eb5868 in mysql_execute_command (thd=0x62b00027d218) at /Server/bb-10.5-MDEV-24693/sql/sql_parse.cc:4596
#21 0x0000564981ecdd70 in mysql_parse (thd=0x62b00027d218, rawbuf=0x62b000284238 "INSERT INTO t1 (tscol7) VALUES (NULL),(NULL) /* E_R Thread29 QNO 144 CON_ID 45 */", length=81, parser_state=0x19b12b5547c0, is_com_multi=false, is_next_command=false)
at /Server/bb-10.5-MDEV-24693/sql/sql_parse.cc:8062
#22 0x0000564981ea4069 in dispatch_command (command=COM_QUERY, thd=0x62b00027d218, packet=0x62900070d219 "INSERT INTO t1 (tscol7) VALUES (NULL),(NULL) /* E_R Thread29 QNO 144 CON_ID 45 */ ", packet_length=82, is_com_multi=false, is_next_command=false)
at /Server/bb-10.5-MDEV-24693/sql/sql_parse.cc:1889
#23 0x0000564981ea0992 in do_command (thd=0x62b00027d218) at /Server/bb-10.5-MDEV-24693/sql/sql_parse.cc:1370
#24 0x00005649822e31d7 in do_handle_one_connection (connect=0x60800008c138, put_in_cache=true) at /Server/bb-10.5-MDEV-24693/sql/sql_connect.cc:1410
#25 0x00005649822e2b40 in handle_one_connection (arg=0x60800008c138) at /Server/bb-10.5-MDEV-24693/sql/sql_connect.cc:1312
#26 0x0000564982fed983 in pfs_spawn_thread (arg=0x615000081198) at /Server/bb-10.5-MDEV-24693/storage/perfschema/pfs.cc:2201
#27 0x00004a6168c80609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#28 0x00003ad76f4ff293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Purge can reset the n_core_fields when alter is waiting to upgrade MDL for commit phase of DDL. In the meantime, any insert DML tries to log the operation fails with index is not being instant. This issue is happening for redundant row format tables.



 Comments   
Comment by Matthias Leich [ 2021-01-29 ]

RQG
====
git clone https://github.com/mleich1/rqg --branch experimental RQG
 
perl rqg.pl \                   
--gendata \
--vcols \
--views \
--grammar=conf/mariadb/instant_add.yy \
--mysqld=--innodb_use_native_aio=1 \
--mysqld=--innodb_lock_schedule_algorithm=fcfs \
--mysqld=--loose-idle_write_transaction_timeout=0 \
--mysqld=--loose-idle_transaction_timeout=0 \
--mysqld=--loose-idle_readonly_transaction_timeout=0 \
--mysqld=--connect_timeout=60 \
--mysqld=--interactive_timeout=28800 \
--mysqld=--slave_net_timeout=60 \
--mysqld=--net_read_timeout=30 \
--mysqld=--net_write_timeout=60 \
--mysqld=--loose-table_lock_wait_timeout=50 \
--mysqld=--wait_timeout=28800 \
--mysqld=--lock-wait-timeout=86400 \
--mysqld=--innodb-lock-wait-timeout=50 \
--no-mask \
--queries=10000000 \
--seed=random \
--reporters=Backtrace \
--reporters=ErrorLog \
--reporters=Deadlock1 \
--validators=None \
--mysqld=--log_output=none \
--mysqld=--log-bin \
--mysqld=--log_bin_trust_function_creators=1 \
--mysqld=--loose-debug_assert_on_not_freed_memory=0 \
--engine=InnoDB \
--restart_timeout=240 \
--mysqld=--plugin-load-add=file_key_management.so \
--mysqld=--loose-file-key-management-filename=$RQG_HOME/conf/mariadb/encryption_keys.txt \
--duration=300 \
--mysqld=--loose-innodb_fatal_semaphore_wait_threshold=300 \
--mysqld=--innodb_stats_persistent=on \
--mysqld=--innodb_adaptive_hash_index=off \
--mysqld=--loose-max-statement-time=30 \
--threads=33 \
--mysqld=--innodb_page_size=16K \
--mysqld=--innodb-buffer-pool-size=8M \
--duration=300 \
--no_mask \
--workdir=<local settings> \
--vardir=<local settings> \
--mtr-build-thread=<local settings> \
--basedir1=<local settings> \
--script_debug=_nix_ \
--rr=Extended \
--rr_options=--chaos

Comment by Marko Mäkelä [ 2021-01-29 ]

The fix of bug MDEV-16131 was incomplete and did not cover ROW_FORMAT=REDUNDANT tables, which were the default (and only) format before MySQL 5.0.3 (March 2005). The scenario is as follows:

  1. Instant ADD COLUMN (MDEV-11369) is executed.
  2. An online table rebuild (OPTIMIZE TABLE or some ALTER TABLE) is started.
  3. The table was emptied by purge.
  4. Some concurrent INSERT is buffered.
  5. When applying the buffered INSERT record, we will write a record to a temporary file in the wrong format, as if the table never had existed in the instant ADD format.

Thank you, thiru, for developing a fix and also testing that unlike I initially feared, there is no problem even if more than 128 columns had been instantly added.

Because we are close to release, and because the fix (adding a function parameter) might cause a performance regression, I think that we have to let this slip into the next releases.

Generated at Thu Feb 08 09:32:14 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.