[MDEV-14332] Corruption during online table-rebuilding ALTER when VIRTUAL columns exist, causes assertion failure Created: 2017-11-09  Updated: 2023-04-19  Resolved: 2017-11-10

Status: Closed
Project: MariaDB Server
Component/s: Data Definition - Alter Table, Storage Engine - InnoDB
Affects Version/s: 10.2
Fix Version/s: 10.2.11, 10.3.3

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Blocks
blocks MDEV-31087 Merge new release of InnoDB 5.7.42 to... Closed
PartOf
includes MDEV-13795 ALTER TABLE…DROP PRIMARY KEY, ADD PRI... Closed
Problem/Incident
causes MDEV-14341 Allow LOCK=NONE in table-rebuilding A... Open
Relates
relates to MDEV-5800 indexes on virtual (not materialized)... Closed
relates to MDEV-13795 ALTER TABLE…DROP PRIMARY KEY, ADD PRI... Closed
relates to MDEV-16329 Engine-independent online ALTER TABLE Closed

 Description   

10.2 debug 7c85a8d936b1a

mysqld: /data/src/10.2/storage/innobase/row/row0log.cc:2432: const mrec_t* row_log_table_apply_op(que_thr_t*, ulint, ulint, row_merge_dup_t*, dberr_t*, mem_heap_t*, mem_heap_t*, const mrec_t*, const mrec_t*, ulint*): Assertion `0' failed.
171109  2:35:13 [ERROR] mysqld got signal 6 ;
 
#7  0x00007f9135c14ee2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#8  0x000055c43656a131 in row_log_table_apply_op (thr=0x7f90d4024510, trx_id_col=1, new_trx_id_col=1, dup=0x7f913009e2e0, error=0x7f913009e084, offsets_heap=0x7f90d404ce70, heap=0x7f90d4028540, mrec=0x7f912869206d "\002B\001", mrec_end=0x7f9128692094 "", offsets=0x7f90d4025900) at /data/src/10.2/storage/innobase/row/row0log.cc:2432
#9  0x000055c43656c1b3 in row_log_table_apply_ops (thr=0x7f90d4024510, dup=0x7f913009e2e0, stage=0x7f90d4026120) at /data/src/10.2/storage/innobase/row/row0log.cc:3085
#10 0x000055c43656c6be in row_log_table_apply (thr=0x7f90d4024510, old_table=0x7f90e0091918, table=0x7f90d4019800, stage=0x7f90d4026120) at /data/src/10.2/storage/innobase/row/row0log.cc:3184
#11 0x000055c436450912 in ha_innobase::inplace_alter_table (this=0x7f90e014ed58, altered_table=0x7f90d4019800, ha_alter_info=0x7f913009e610) at /data/src/10.2/storage/innobase/handler/handler0alter.cc:6464
#12 0x000055c435f6948b in handler::ha_inplace_alter_table (this=0x7f90e014ed58, altered_table=0x7f90d4019800, ha_alter_info=0x7f913009e610) at /data/src/10.2/sql/handler.h:3768
#13 0x000055c435f60705 in mysql_inplace_alter_table (thd=0x7f90d4000b00, table_list=0x7f90d4011170, table=0x7f90e014e150, altered_table=0x7f90d4019800, ha_alter_info=0x7f913009e610, inplace_supported=HA_ALTER_INPLACE_NO_LOCK_AFTER_PREPARE, target_mdl_request=0x7f913009e680, alter_ctx=0x7f913009f240) at /data/src/10.2/sql/sql_table.cc:7332
#14 0x000055c435f65a47 in mysql_alter_table (thd=0x7f90d4000b00, new_db=0x7f90d4011780 "test", new_name=0x0, create_info=0x7f913009fe50, table_list=0x7f90d4011170, alter_info=0x7f913009fda0, order_num=0, order=0x0, ignore=false) at /data/src/10.2/sql/sql_table.cc:9324
#15 0x000055c435fdeb38 in Sql_cmd_alter_table::execute (this=0x7f90d4011790, thd=0x7f90d4000b00) at /data/src/10.2/sql/sql_alter.cc:324
#16 0x000055c435e9b139 in mysql_execute_command (thd=0x7f90d4000b00) at /data/src/10.2/sql/sql_parse.cc:6189
#17 0x000055c435e9f9ea in mysql_parse (thd=0x7f90d4000b00, rawbuf=0x7f90d4011088 "ALTER TABLE t1 ENGINE=InnoDB", length=28, parser_state=0x7f91300a1200, is_com_multi=false, is_next_command=false) at /data/src/10.2/sql/sql_parse.cc:7861
#18 0x000055c435e8d9a6 in dispatch_command (command=COM_QUERY, thd=0x7f90d4000b00, packet=0x7f90d40088a1 "ALTER TABLE t1 ENGINE=InnoDB", packet_length=28, is_com_multi=false, is_next_command=false) at /data/src/10.2/sql/sql_parse.cc:1805
#19 0x000055c435e8c304 in do_command (thd=0x7f90d4000b00) at /data/src/10.2/sql/sql_parse.cc:1360
#20 0x000055c435fd9829 in do_handle_one_connection (connect=0x55c439414780) at /data/src/10.2/sql/sql_connect.cc:1354
#21 0x000055c435fd95b6 in handle_one_connection (arg=0x55c439414780) at /data/src/10.2/sql/sql_connect.cc:1260
#22 0x000055c4363f6aac in pfs_spawn_thread (arg=0x55c4394af1d0) at /data/src/10.2/storage/perfschema/pfs.cc:1863
#23 0x00007f91378eb494 in start_thread (arg=0x7f91300a2700) at pthread_create.c:333
#24 0x00007f9135cd193f in clone () from /lib/x86_64-linux-gnu/libc.so.6

10.2 non-debug 7c85a8d936b1a

CREATE TABLE t1 (c CHAR(8), i INT, v INT AS (i) VIRTUAL) ENGINE=InnoDB;
INSERT INTO t1 (c,i) VALUES ('foo',1),('bar',2);
connect  con1,localhost,root,,test;
ALTER TABLE t1 ENGINE=InnoDB;
connect  con2,localhost,root,,test;
ALTER TABLE t1 ADD f INT;
connection default;
BEGIN;
UPDATE t1 SET i = 0;
ROLLBACK;
connection con1;
ERROR HY000: Index for table 't1' is corrupt; try to repair it
disconnect con1;
connection con2;
disconnect con2;
connection default;
DROP TABLE t1;

10.3 is also affected.
No failure on 10.1.

The test case is concurrent, but it fails quite reliably for me. Still, if it doesn't fail right away, try running with --repeat=N.

--source include/have_innodb.inc
 
CREATE TABLE t1 (c CHAR(8), i INT, v INT AS (i) VIRTUAL) ENGINE=InnoDB;
INSERT INTO t1 (c,i) VALUES ('foo',1),('bar',2);
 
--connect (con1,localhost,root,,test)
--send
  ALTER TABLE t1 ENGINE=InnoDB;
 
--connect (con2,localhost,root,,test)
--send
  ALTER TABLE t1 ADD f INT;
 
--connection default
BEGIN;
UPDATE t1 SET i = 0;
ROLLBACK;
 
# Cleanup
 
--connection con1
--reap
--disconnect con1
--connection con2
--reap
--disconnect con2
--connection default
DROP TABLE t1;



 Comments   
Comment by Marko Mäkelä [ 2017-11-09 ]

Similar to MDEV-13795, this is a crash in online table-rebuilding ALTER TABLE when virtual columns pre-exist in the table.

The bug likely exists in MySQL 5.7 as well. MariaDB enabled indexed virtual columns for InnoDB tables in MDEV-5800. MariaDB 10.1 introduced support for virtual columns in InnoDB, but MySQL did not support that before MySQL 5.7, which also was the first release to support indexed virtual columns.

Comment by Marko Mäkelä [ 2017-11-09 ]

Here is a deterministic test:

--source include/have_innodb.inc
--source include/have_debug.inc
--source include/have_debug_sync.inc
 
CREATE TABLE t1 (c CHAR(8), i INT, v INT AS (i) VIRTUAL) ENGINE=InnoDB;
INSERT INTO t1 (c,i) VALUES ('foo',1),('bar',2);
 
--connect (con1,localhost,root,,test)
SET DEBUG_SYNC = 'row_log_table_apply1_before SIGNAL copied WAIT_FOR dml';
--send
ALTER TABLE t1 FORCE, ENGINE=InnoDB;
 
--connection default
SET DEBUG_SYNC = 'now WAIT_FOR copied';
BEGIN;
UPDATE t1 SET i = 0;
ROLLBACK;
SET DEBUG_SYNC = 'now SIGNAL dml';
 
# Cleanup
 
--connection con1
--reap
--disconnect con1
--connection default
SET DEBUG_SYNC = 'RESET';
DROP TABLE t1;

Comment by Marko Mäkelä [ 2017-11-09 ]

There are 4 ROW_T_UPDATE records in clust_index->online_log.tail.block, total 148 bytes.
The parser consumed 1 byte too little from the 3rd record, or the record was too short by 1 byte. So, the last byte of the penultimate record was misinterpreted as record discriminator, which triggered the assertion failure (for reporting corruption).
The records 1 and 2 (written by the UPDATE) are 36 bytes each, but the records 3 and 4 (written by the ROLLBACK) are 38 bytes.

The bug appears to be that ROLLBACK is writing too long ROW_T_UPDATE records. There is no need to write virtual column information, because the virtual columns is not indexed.

Comment by Marko Mäkelä [ 2017-11-09 ]

The following change to the start of the test triggers a slightly different assertion (unexpected EOF). Again, the ROW_T_UPDATE record written by the ROLLBACK is 2 bytes longer than the one written by the UPDATE.

CREATE TABLE t1 (pk INT PRIMARY KEY, i INT NOT NULL, v INT AS (i) VIRTUAL) ENGINE=InnoDB;
INSERT INTO t1 SET pk=1, i=1;

The record contents is:
(ROW_T_UPDATE,extra_size=0,pk=1,DB_TRX_ID=0x509,DB_ROLL_PTR=...,i=0,(2 bytes telling that there are no indexed virtual columns))
For the ROLLBACK, the record contents is equivalent, except that the 2 bytes at the end are duplicated.
The duplication occurs because of this in row_log_table_low():

		if (ventry && ventry->n_v_fields > 0) {
			rec_convert_dtuple_to_temp_v(b, new_index, ventry);
			b += mach_read_from_2(b);
 
			if (o_ventry) {
				rec_convert_dtuple_to_temp_v(
					b, new_index, o_ventry);
				b += mach_read_from_2(b);
			}

Both rec_convert_dtuple_to_temp_v() calls write 2 bytes. We should really write no bytes when there are no indexed virtual columns.
And ideally we should write no data for indexed virtual columns in a table-rebuilding online ALTER. Instead, we should evaluate the virtual column values from the logged row values.

The simplest solution is to never write virtual column values into the online_log in table-rebuilding ALTER, and to refuse table-rebuilding ALTER with LOCK=NONE when indexed virtual columns exist. This would also address MDEV-13795.

Comment by Marko Mäkelä [ 2017-11-09 ]

MDEV-13795 is a special case of this problem.

Generated at Thu Feb 08 08:12:44 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.