Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-14332

Corruption during online table-rebuilding ALTER when VIRTUAL columns exist, causes assertion failure

Details

    Description

      10.2 debug 7c85a8d936b1a

      mysqld: /data/src/10.2/storage/innobase/row/row0log.cc:2432: const mrec_t* row_log_table_apply_op(que_thr_t*, ulint, ulint, row_merge_dup_t*, dberr_t*, mem_heap_t*, mem_heap_t*, const mrec_t*, const mrec_t*, ulint*): Assertion `0' failed.
      171109  2:35:13 [ERROR] mysqld got signal 6 ;
       
      #7  0x00007f9135c14ee2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
      #8  0x000055c43656a131 in row_log_table_apply_op (thr=0x7f90d4024510, trx_id_col=1, new_trx_id_col=1, dup=0x7f913009e2e0, error=0x7f913009e084, offsets_heap=0x7f90d404ce70, heap=0x7f90d4028540, mrec=0x7f912869206d "\002B\001", mrec_end=0x7f9128692094 "", offsets=0x7f90d4025900) at /data/src/10.2/storage/innobase/row/row0log.cc:2432
      #9  0x000055c43656c1b3 in row_log_table_apply_ops (thr=0x7f90d4024510, dup=0x7f913009e2e0, stage=0x7f90d4026120) at /data/src/10.2/storage/innobase/row/row0log.cc:3085
      #10 0x000055c43656c6be in row_log_table_apply (thr=0x7f90d4024510, old_table=0x7f90e0091918, table=0x7f90d4019800, stage=0x7f90d4026120) at /data/src/10.2/storage/innobase/row/row0log.cc:3184
      #11 0x000055c436450912 in ha_innobase::inplace_alter_table (this=0x7f90e014ed58, altered_table=0x7f90d4019800, ha_alter_info=0x7f913009e610) at /data/src/10.2/storage/innobase/handler/handler0alter.cc:6464
      #12 0x000055c435f6948b in handler::ha_inplace_alter_table (this=0x7f90e014ed58, altered_table=0x7f90d4019800, ha_alter_info=0x7f913009e610) at /data/src/10.2/sql/handler.h:3768
      #13 0x000055c435f60705 in mysql_inplace_alter_table (thd=0x7f90d4000b00, table_list=0x7f90d4011170, table=0x7f90e014e150, altered_table=0x7f90d4019800, ha_alter_info=0x7f913009e610, inplace_supported=HA_ALTER_INPLACE_NO_LOCK_AFTER_PREPARE, target_mdl_request=0x7f913009e680, alter_ctx=0x7f913009f240) at /data/src/10.2/sql/sql_table.cc:7332
      #14 0x000055c435f65a47 in mysql_alter_table (thd=0x7f90d4000b00, new_db=0x7f90d4011780 "test", new_name=0x0, create_info=0x7f913009fe50, table_list=0x7f90d4011170, alter_info=0x7f913009fda0, order_num=0, order=0x0, ignore=false) at /data/src/10.2/sql/sql_table.cc:9324
      #15 0x000055c435fdeb38 in Sql_cmd_alter_table::execute (this=0x7f90d4011790, thd=0x7f90d4000b00) at /data/src/10.2/sql/sql_alter.cc:324
      #16 0x000055c435e9b139 in mysql_execute_command (thd=0x7f90d4000b00) at /data/src/10.2/sql/sql_parse.cc:6189
      #17 0x000055c435e9f9ea in mysql_parse (thd=0x7f90d4000b00, rawbuf=0x7f90d4011088 "ALTER TABLE t1 ENGINE=InnoDB", length=28, parser_state=0x7f91300a1200, is_com_multi=false, is_next_command=false) at /data/src/10.2/sql/sql_parse.cc:7861
      #18 0x000055c435e8d9a6 in dispatch_command (command=COM_QUERY, thd=0x7f90d4000b00, packet=0x7f90d40088a1 "ALTER TABLE t1 ENGINE=InnoDB", packet_length=28, is_com_multi=false, is_next_command=false) at /data/src/10.2/sql/sql_parse.cc:1805
      #19 0x000055c435e8c304 in do_command (thd=0x7f90d4000b00) at /data/src/10.2/sql/sql_parse.cc:1360
      #20 0x000055c435fd9829 in do_handle_one_connection (connect=0x55c439414780) at /data/src/10.2/sql/sql_connect.cc:1354
      #21 0x000055c435fd95b6 in handle_one_connection (arg=0x55c439414780) at /data/src/10.2/sql/sql_connect.cc:1260
      #22 0x000055c4363f6aac in pfs_spawn_thread (arg=0x55c4394af1d0) at /data/src/10.2/storage/perfschema/pfs.cc:1863
      #23 0x00007f91378eb494 in start_thread (arg=0x7f91300a2700) at pthread_create.c:333
      #24 0x00007f9135cd193f in clone () from /lib/x86_64-linux-gnu/libc.so.6
      

      10.2 non-debug 7c85a8d936b1a

      CREATE TABLE t1 (c CHAR(8), i INT, v INT AS (i) VIRTUAL) ENGINE=InnoDB;
      INSERT INTO t1 (c,i) VALUES ('foo',1),('bar',2);
      connect  con1,localhost,root,,test;
      ALTER TABLE t1 ENGINE=InnoDB;
      connect  con2,localhost,root,,test;
      ALTER TABLE t1 ADD f INT;
      connection default;
      BEGIN;
      UPDATE t1 SET i = 0;
      ROLLBACK;
      connection con1;
      ERROR HY000: Index for table 't1' is corrupt; try to repair it
      disconnect con1;
      connection con2;
      disconnect con2;
      connection default;
      DROP TABLE t1;
      

      10.3 is also affected.
      No failure on 10.1.

      The test case is concurrent, but it fails quite reliably for me. Still, if it doesn't fail right away, try running with --repeat=N.

      --source include/have_innodb.inc
       
      CREATE TABLE t1 (c CHAR(8), i INT, v INT AS (i) VIRTUAL) ENGINE=InnoDB;
      INSERT INTO t1 (c,i) VALUES ('foo',1),('bar',2);
       
      --connect (con1,localhost,root,,test)
      --send
        ALTER TABLE t1 ENGINE=InnoDB;
       
      --connect (con2,localhost,root,,test)
      --send
        ALTER TABLE t1 ADD f INT;
       
      --connection default
      BEGIN;
      UPDATE t1 SET i = 0;
      ROLLBACK;
       
      # Cleanup
       
      --connection con1
      --reap
      --disconnect con1
      --connection con2
      --reap
      --disconnect con2
      --connection default
      DROP TABLE t1;
      

      Attachments

        Issue Links

          Activity

            Similar to MDEV-13795, this is a crash in online table-rebuilding ALTER TABLE when virtual columns pre-exist in the table.

            The bug likely exists in MySQL 5.7 as well. MariaDB enabled indexed virtual columns for InnoDB tables in MDEV-5800. MariaDB 10.1 introduced support for virtual columns in InnoDB, but MySQL did not support that before MySQL 5.7, which also was the first release to support indexed virtual columns.

            marko Marko Mäkelä added a comment - Similar to MDEV-13795 , this is a crash in online table-rebuilding ALTER TABLE when virtual columns pre-exist in the table. The bug likely exists in MySQL 5.7 as well. MariaDB enabled indexed virtual columns for InnoDB tables in MDEV-5800 . MariaDB 10.1 introduced support for virtual columns in InnoDB, but MySQL did not support that before MySQL 5.7, which also was the first release to support indexed virtual columns.

            Here is a deterministic test:

            --source include/have_innodb.inc
            --source include/have_debug.inc
            --source include/have_debug_sync.inc
             
            CREATE TABLE t1 (c CHAR(8), i INT, v INT AS (i) VIRTUAL) ENGINE=InnoDB;
            INSERT INTO t1 (c,i) VALUES ('foo',1),('bar',2);
             
            --connect (con1,localhost,root,,test)
            SET DEBUG_SYNC = 'row_log_table_apply1_before SIGNAL copied WAIT_FOR dml';
            --send
            ALTER TABLE t1 FORCE, ENGINE=InnoDB;
             
            --connection default
            SET DEBUG_SYNC = 'now WAIT_FOR copied';
            BEGIN;
            UPDATE t1 SET i = 0;
            ROLLBACK;
            SET DEBUG_SYNC = 'now SIGNAL dml';
             
            # Cleanup
             
            --connection con1
            --reap
            --disconnect con1
            --connection default
            SET DEBUG_SYNC = 'RESET';
            DROP TABLE t1;
            

            marko Marko Mäkelä added a comment - Here is a deterministic test: --source include/have_innodb.inc --source include/have_debug.inc --source include/have_debug_sync.inc   CREATE TABLE t1 (c CHAR (8), i INT , v INT AS (i) VIRTUAL) ENGINE=InnoDB; INSERT INTO t1 (c,i) VALUES ( 'foo' ,1),( 'bar' ,2);   --connect (con1,localhost,root,,test) SET DEBUG_SYNC = 'row_log_table_apply1_before SIGNAL copied WAIT_FOR dml' ; --send ALTER TABLE t1 FORCE , ENGINE=InnoDB;   --connection default SET DEBUG_SYNC = 'now WAIT_FOR copied' ; BEGIN ; UPDATE t1 SET i = 0; ROLLBACK ; SET DEBUG_SYNC = 'now SIGNAL dml' ;   # Cleanup   --connection con1 --reap --disconnect con1 --connection default SET DEBUG_SYNC = 'RESET' ; DROP TABLE t1;

            There are 4 ROW_T_UPDATE records in clust_index->online_log.tail.block, total 148 bytes.
            The parser consumed 1 byte too little from the 3rd record, or the record was too short by 1 byte. So, the last byte of the penultimate record was misinterpreted as record discriminator, which triggered the assertion failure (for reporting corruption).
            The records 1 and 2 (written by the UPDATE) are 36 bytes each, but the records 3 and 4 (written by the ROLLBACK) are 38 bytes.

            The bug appears to be that ROLLBACK is writing too long ROW_T_UPDATE records. There is no need to write virtual column information, because the virtual columns is not indexed.

            marko Marko Mäkelä added a comment - There are 4 ROW_T_UPDATE records in clust_index->online_log.tail.block, total 148 bytes. The parser consumed 1 byte too little from the 3rd record, or the record was too short by 1 byte. So, the last byte of the penultimate record was misinterpreted as record discriminator, which triggered the assertion failure (for reporting corruption). The records 1 and 2 (written by the UPDATE) are 36 bytes each, but the records 3 and 4 (written by the ROLLBACK) are 38 bytes. The bug appears to be that ROLLBACK is writing too long ROW_T_UPDATE records. There is no need to write virtual column information, because the virtual columns is not indexed.

            The following change to the start of the test triggers a slightly different assertion (unexpected EOF). Again, the ROW_T_UPDATE record written by the ROLLBACK is 2 bytes longer than the one written by the UPDATE.

            CREATE TABLE t1 (pk INT PRIMARY KEY, i INT NOT NULL, v INT AS (i) VIRTUAL) ENGINE=InnoDB;
            INSERT INTO t1 SET pk=1, i=1;
            

            The record contents is:
            (ROW_T_UPDATE,extra_size=0,pk=1,DB_TRX_ID=0x509,DB_ROLL_PTR=...,i=0,(2 bytes telling that there are no indexed virtual columns))
            For the ROLLBACK, the record contents is equivalent, except that the 2 bytes at the end are duplicated.
            The duplication occurs because of this in row_log_table_low():

            		if (ventry && ventry->n_v_fields > 0) {
            			rec_convert_dtuple_to_temp_v(b, new_index, ventry);
            			b += mach_read_from_2(b);
             
            			if (o_ventry) {
            				rec_convert_dtuple_to_temp_v(
            					b, new_index, o_ventry);
            				b += mach_read_from_2(b);
            			}
            

            Both rec_convert_dtuple_to_temp_v() calls write 2 bytes. We should really write no bytes when there are no indexed virtual columns.
            And ideally we should write no data for indexed virtual columns in a table-rebuilding online ALTER. Instead, we should evaluate the virtual column values from the logged row values.

            The simplest solution is to never write virtual column values into the online_log in table-rebuilding ALTER, and to refuse table-rebuilding ALTER with LOCK=NONE when indexed virtual columns exist. This would also address MDEV-13795.

            marko Marko Mäkelä added a comment - The following change to the start of the test triggers a slightly different assertion (unexpected EOF). Again, the ROW_T_UPDATE record written by the ROLLBACK is 2 bytes longer than the one written by the UPDATE. CREATE TABLE t1 (pk INT PRIMARY KEY , i INT NOT NULL , v INT AS (i) VIRTUAL) ENGINE=InnoDB; INSERT INTO t1 SET pk=1, i=1; The record contents is: (ROW_T_UPDATE,extra_size=0,pk=1,DB_TRX_ID=0x509,DB_ROLL_PTR=...,i=0,(2 bytes telling that there are no indexed virtual columns)) For the ROLLBACK, the record contents is equivalent, except that the 2 bytes at the end are duplicated. The duplication occurs because of this in row_log_table_low(): if (ventry && ventry->n_v_fields > 0) { rec_convert_dtuple_to_temp_v(b, new_index, ventry); b += mach_read_from_2(b);   if (o_ventry) { rec_convert_dtuple_to_temp_v( b, new_index, o_ventry); b += mach_read_from_2(b); } Both rec_convert_dtuple_to_temp_v() calls write 2 bytes. We should really write no bytes when there are no indexed virtual columns. And ideally we should write no data for indexed virtual columns in a table-rebuilding online ALTER. Instead, we should evaluate the virtual column values from the logged row values. The simplest solution is to never write virtual column values into the online_log in table-rebuilding ALTER, and to refuse table-rebuilding ALTER with LOCK=NONE when indexed virtual columns exist. This would also address MDEV-13795 .

            MDEV-13795 is a special case of this problem.

            marko Marko Mäkelä added a comment - MDEV-13795 is a special case of this problem.

            People

              marko Marko Mäkelä
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.