Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-19385

InnoDB: Failing assertion: !cursor->index->is_committed() with virtual columns / indexes, WITHOUT foreign keys and temporary tables

Details

    Description

      Important note: Only reproducible on a non-debug build.

      --source include/have_innodb.inc
       
      CREATE TABLE t1 ( 
        pk INTEGER AUTO_INCREMENT,
        a INT DEFAULT 0,
        b INT DEFAULT 0,
        c INT,
        d INT AS (b),
        e INT AS (a),
        PRIMARY KEY(pk),
        UNIQUE(c,d),
        UNIQUE(e)
      ) ENGINE=InnoDB;
       
      INSERT INTO t1 (a,b,c) VALUES (1,1,1) ;
       
      ALTER TABLE t1 ADD f INT;
      INSERT IGNORE INTO t1 () VALUES (),(),();
       
      # Cleanup
      DROP TABLE t1;
      

      10.3 cb9fa1a0 RelWithDebInfo

      2019-05-03 03:27:38 0x7eff1e4b5700  InnoDB: Assertion failure in file /data/src/10.3/storage/innobase/row/row0ins.cc line 270
      InnoDB: Failing assertion: !cursor->index->is_committed()
       
      #4  0x00007eff23fe23fa in abort () from /lib/x86_64-linux-gnu/libc.so.6
      #5  0x000055ac7976d918 in ut_dbg_assertion_failed (expr=expr@entry=0x55ac7a073048 "!cursor->index->is_committed()", file=file@entry=0x55ac7a072ba0 "/data/src/10.3/storage/innobase/row/row0ins.cc", line=line@entry=270) at /data/src/10.3/storage/innobase/ut/ut0dbg.cc:60
      #6  0x000055ac79c52883 in row_ins_sec_index_entry_by_modify (mtr=0x7eff1e4b2180, thr=0x7efecc074ee0, entry=0x7efecc070608, heap=0x7efecc067e40, offsets_heap=0x7efecc074050, offsets=0x7eff1e4b1270, cursor=0x7eff1e4b12f0, mode=2, flags=0) at /data/src/10.3/storage/innobase/row/row0ins.cc:270
      #7  row_ins_sec_index_entry_low (flags=flags@entry=0, mode=mode@entry=2, index=index@entry=0x7efecc0716a0, offsets_heap=offsets_heap@entry=0x7efecc074050, heap=heap@entry=0x7efecc067e40, entry=entry@entry=0x7efecc070608, trx_id=0, thr=0x7efecc074ee0, dup_chk_only=false) at /data/src/10.3/storage/innobase/row/row0ins.cc:3155
      #8  0x000055ac79c52d7c in row_ins_sec_index_entry (index=index@entry=0x7efecc0716a0, entry=0x7efecc070608, thr=thr@entry=0x7efecc074ee0, dup_chk_only=dup_chk_only@entry=false) at /data/src/10.3/storage/innobase/row/row0ins.cc:3356
      #9  0x000055ac79c53151 in row_ins_index_entry (thr=0x7efecc074ee0, entry=<optimized out>, index=0x7efecc0716a0) at /data/src/10.3/storage/innobase/row/row0ins.cc:3402
      #10 row_ins_index_entry_step (thr=0x7efecc074ee0, node=<optimized out>) at /data/src/10.3/storage/innobase/row/row0ins.cc:3550
      #11 row_ins (thr=0x7efecc074ee0, node=<optimized out>) at /data/src/10.3/storage/innobase/row/row0ins.cc:3693
      #12 row_ins_step (thr=thr@entry=0x7efecc074ee0) at /data/src/10.3/storage/innobase/row/row0ins.cc:3938
      #13 0x000055ac79c622a5 in row_insert_for_mysql (mysql_rec=mysql_rec@entry=0x7efecc06d748 "\311\003", prebuilt=0x7efecc0745c0, ins_mode=<optimized out>) at /data/src/10.3/storage/innobase/row/row0mysql.cc:1460
      #14 0x000055ac79bb0bd0 in ha_innobase::write_row (this=0x7efecc06c7e0, record=0x7efecc06d748 "\311\003") at /data/src/10.3/storage/innobase/handler/ha_innodb.cc:8073
      #15 0x000055ac79a1b6f7 in handler::ha_write_row (this=0x7efecc06c7e0, buf=0x7efecc06d748 "\311\003") at /data/src/10.3/sql/handler.cc:6331
      #16 0x000055ac798105f6 in write_record (thd=thd@entry=0x7efecc0009a8, table=table@entry=0x7efecc072798, info=info@entry=0x7eff1e4b2c80) at /data/src/10.3/sql/sql_insert.cc:2030
      #17 0x000055ac7981879c in mysql_insert (thd=0x7efecc0009a8, table_list=<optimized out>, fields=..., values_list=..., update_fields=..., update_values=..., duplic=DUP_ERROR, ignore=true) at /data/src/10.3/sql/sql_insert.cc:1071
      #18 0x000055ac798420ce in mysql_execute_command (thd=0x7efecc0009a8) at /data/src/10.3/sql/sql_parse.cc:4730
      #19 0x000055ac79843bc1 in mysql_parse (thd=0x7efecc0009a8, rawbuf=<optimized out>, length=40, parser_state=0x7eff1e4b4630, is_com_multi=<optimized out>, is_next_command=<optimized out>) at /data/src/10.3/sql/sql_parse.cc:8091
      #20 0x000055ac79846268 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x7efecc0009a8, packet=packet@entry=0x7efecc009329 "INSERT IGNORE INTO t1 () VALUES (),(),()", packet_length=packet_length@entry=40, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /data/src/10.3/sql/sql_parse.cc:1858
      #21 0x000055ac79846c20 in do_command (thd=0x7efecc0009a8) at /data/src/10.3/sql/sql_parse.cc:1404
      #22 0x000055ac79919e64 in do_handle_one_connection (connect=connect@entry=0x55ac7b5e20f8) at /data/src/10.3/sql/sql_connect.cc:1402
      #23 0x000055ac7991a004 in handle_one_connection (arg=arg@entry=0x55ac7b5e20f8) at /data/src/10.3/sql/sql_connect.cc:1308
      #24 0x000055ac79b8e514 in pfs_spawn_thread (arg=0x55ac7b57d168) at /data/src/10.3/storage/perfschema/pfs.cc:1862
      #25 0x00007eff260ce494 in start_thread (arg=0x7eff1e4b5700) at pthread_create.c:333
      #26 0x00007eff2409693f in clone () from /lib/x86_64-linux-gnu/libc.so.6
      

      Also reproducible on 10.3.14, I didn't try earlier 10.3 builds because there were other bugs like that.
      Couldn't reproduce on 10.2, 10.4.

      Attachments

        Issue Links

          Activity

            This bug seems to depend on the ALTER TABLE executing an instant ADD COLUMN. With the following statement, the test will not crash:

            ALTER TABLE t1 ADD f INT, FORCE;
            

            I tried to add wait_all_purged.inc, but it did not help repeat this in a debug build. Also, WITH_VALGRIND did not flag any use of uninitialized memory in either debug or RelWithDebInfo build.

            A non-optimized RelWithDebInfo build will repeat this, so that is what I will be debugging.

            marko Marko Mäkelä added a comment - This bug seems to depend on the ALTER TABLE executing an instant ADD COLUMN . With the following statement, the test will not crash: ALTER TABLE t1 ADD f INT , FORCE ; I tried to add wait_all_purged.inc , but it did not help repeat this in a debug build. Also, WITH_VALGRIND did not flag any use of uninitialized memory in either debug or RelWithDebInfo build. A non-optimized RelWithDebInfo build will repeat this, so that is what I will be debugging.
            marko Marko Mäkelä added a comment - - edited

            I simplified the test case a little further, and added ROW_FORMAT=REDUNDANT to ease the analysis of page dumps:

            CREATE TABLE t1 (
              pk TINYINT AUTO_INCREMENT PRIMARY KEY,
              b TINYINT DEFAULT 0,
              c TINYINT,
              d TINYINT AS (b),
              e TINYINT AS (b),
              UNIQUE(c,d),
              UNIQUE(e)
            ) ENGINE=InnoDB ROW_FORMAT=REDUNDANT;
            INSERT INTO t1 (b,c) VALUES (1,1);
            ALTER TABLE t1 ADD f INT, ALGORITHM=INSTANT;
            INSERT IGNORE INTO t1 () VALUES (),(),();
            DROP TABLE t1;
            

            Both secondary indexes must be UNIQUE. Column c must be DEFAULT NULL. Column b could also be declared NOT NULL. The column pk is necessary; with an internally created DB_ROW_ID the test would pass. I was not able to replace the INSERT IGNORE with something that would use explicit ROLLBACK.

            For me, the crash occurs on the unique index (c,d) after trx_rollback_to_savepoint_low had been invoked for the second row that the INSERT IGNORE tried to insert.

            Upon entering trx_rollback_to_savepoint_low, the index (c,d) contains the following entries: (c,d,pk)=(NULL,0,2),(NULL,0,3),(1,1,1). Initially it may feel strange that no duplicate was flagged for the two (NULL,0), but we must keep in mind that because NULL is not equal to NULL, UNIQUE indexes can allow 'duplicates' where at least one of the unique columns is NULL. The unique index (e) contains (e,pk)=(0,2),(1,1), and it caused the rollback to be initiated due to the attempt to insert the duplicate value 0, in the tuple (e,pk)=(0,3).

            The PRIMARY KEY index contains the following: (pk,DB_TRX_ID,DB_ROLL_PTR,b,c)=metadata(0,0,1<<55,NULL,NULL,f=NULL),(1,0,1<<55,1,1),(2,DB_TRX_ID,insert,0,NULL),(3,DB_TRX_ID,insert,0,NULL).

            The first record pk=0 the MDEV-11369 hidden metadata record. Unlike other rows, it explicitly stores a value for the instantly added column f.

            After rollback and at the time of the assertion failure: the index pages contain the following:
            (pk,DB_TRX_ID,DB_ROLL_PTR,b,c)=metadata(0,0,1<<55,NULL,NULL,f=NULL),(1,0,1<<55,1,1),(2,DB_TRX_ID,insert,0,NULL)
            (c,d,pk)=(NULL,0,2),(NULL,0,3),(1,1,1)
            (e,pk)=(0,2),(1,1)

            Bug: The record (c,d,pk)=(NULL,0,3) was not rolled back! This will be caught by the assertion failure later.

            At the time of the assertion failure, only the record (3,DB_TRX_ID,insert,0,NULL) had been inserted again; the secondary indexes had not been changed. The assertion would fail, because the re-insert of the record (c,d,pk)=(NULL,0,3) would hit the previously inserted record that we had failed to roll back.

            marko Marko Mäkelä added a comment - - edited I simplified the test case a little further, and added ROW_FORMAT=REDUNDANT to ease the analysis of page dumps: CREATE TABLE t1 ( pk TINYINT AUTO_INCREMENT PRIMARY KEY , b TINYINT DEFAULT 0, c TINYINT, d TINYINT AS (b), e TINYINT AS (b), UNIQUE (c,d), UNIQUE (e) ) ENGINE=InnoDB ROW_FORMAT=REDUNDANT; INSERT INTO t1 (b,c) VALUES (1,1); ALTER TABLE t1 ADD f INT , ALGORITHM=INSTANT; INSERT IGNORE INTO t1 () VALUES (),(),(); DROP TABLE t1; Both secondary indexes must be UNIQUE . Column c must be DEFAULT NULL . Column b could also be declared NOT NULL . The column pk is necessary; with an internally created DB_ROW_ID  the test would pass. I was not able to replace the INSERT IGNORE with something that would use explicit ROLLBACK . For me, the crash occurs on the unique index (c,d) after trx_rollback_to_savepoint_low had been invoked for the second row that the INSERT IGNORE tried to insert. Upon entering trx_rollback_to_savepoint_low , the index (c,d) contains the following entries: (c,d,pk)=(NULL,0,2),(NULL,0,3),(1,1,1). Initially it may feel strange that no duplicate was flagged for the two (NULL,0), but we must keep in mind that because NULL is not equal to NULL, UNIQUE indexes can allow 'duplicates' where at least one of the unique columns is NULL. The unique index (e) contains (e,pk)=(0,2),(1,1), and it caused the rollback to be initiated due to the attempt to insert the duplicate value 0, in the tuple (e,pk)=(0,3). The PRIMARY KEY index contains the following: (pk,DB_TRX_ID,DB_ROLL_PTR,b,c)=metadata(0,0,1<<55,NULL,NULL,f=NULL),(1,0,1<<55,1,1),(2,DB_TRX_ID,insert,0,NULL),(3,DB_TRX_ID,insert,0,NULL). The first record pk=0 the MDEV-11369 hidden metadata record. Unlike other rows, it explicitly stores a value for the instantly added column f . After rollback and at the time of the assertion failure: the index pages contain the following: (pk,DB_TRX_ID,DB_ROLL_PTR,b,c)=metadata(0,0,1<<55,NULL,NULL,f=NULL),(1,0,1<<55,1,1),(2,DB_TRX_ID,insert,0,NULL) (c,d,pk)=(NULL,0,2),(NULL,0,3),(1,1,1) (e,pk)=(0,2),(1,1) Bug: The record (c,d,pk)=(NULL,0,3) was not rolled back! This will be caught by the assertion failure later. At the time of the assertion failure, only the record (3,DB_TRX_ID,insert,0,NULL) had been inserted again; the secondary indexes had not been changed. The assertion would fail, because the re-insert of the record (c,d,pk)=(NULL,0,3) would hit the previously inserted record that we had failed to roll back.

            I can reproduce the crash with lower-level operations (no AUTO_INCREMENT):

            CREATE TABLE t1 (
              pk TINYINT PRIMARY KEY,
              b TINYINT NOT NULL,
              c TINYINT,
              d TINYINT AS (b),
              e TINYINT AS (b),
              UNIQUE(c,d),
              UNIQUE(e)
            ) ENGINE=InnoDB ROW_FORMAT=REDUNDANT;
            INSERT INTO t1 (pk,b,c) VALUES (1,1,1);
            ALTER TABLE t1 ADD f TINYINT, ALGORITHM=INSTANT;
            BEGIN;
            INSERT INTO t1 SET pk=2, b=0;
            --error ER_DUP_ENTRY
            INSERT INTO t1 SET pk=3, b=0;
            --error ER_DUP_ENTRY
            INSERT INTO t1 SET pk=3, b=0;
            COMMIT;
            DROP TABLE t1;
            

            It will crash on the last INSERT.

            marko Marko Mäkelä added a comment - I can reproduce the crash with lower-level operations (no AUTO_INCREMENT ): CREATE TABLE t1 ( pk TINYINT PRIMARY KEY , b TINYINT NOT NULL , c TINYINT, d TINYINT AS (b), e TINYINT AS (b), UNIQUE (c,d), UNIQUE (e) ) ENGINE=InnoDB ROW_FORMAT=REDUNDANT; INSERT INTO t1 (pk,b,c) VALUES (1,1,1); ALTER TABLE t1 ADD f TINYINT, ALGORITHM=INSTANT; BEGIN ; INSERT INTO t1 SET pk=2, b=0; --error ER_DUP_ENTRY INSERT INTO t1 SET pk=3, b=0; --error ER_DUP_ENTRY INSERT INTO t1 SET pk=3, b=0; COMMIT ; DROP TABLE t1; It will crash on the last INSERT .

            This could be the minimal test case, with explicit ROLLBACK and with a non-unique secondary index:

            CREATE TABLE t1 (
              pk TINYINT PRIMARY KEY, b TINYINT NOT NULL, c TINYINT, d TINYINT AS (b),
              INDEX(c,d)
            ) ENGINE=InnoDB ROW_FORMAT=REDUNDANT;
            INSERT INTO t1 (pk,b,c) VALUES (1,1,1);
            ALTER TABLE t1 ADD f TINYINT, ALGORITHM=INSTANT;
            INSERT INTO t1 SET pk=2, b=0;
            BEGIN; INSERT INTO t1 SET pk=3, b=0; ROLLBACK;
            INSERT INTO t1 SET pk=3, b=0;
            DROP TABLE t1;
            

            marko Marko Mäkelä added a comment - This could be the minimal test case, with explicit ROLLBACK and with a non-unique secondary index: CREATE TABLE t1 ( pk TINYINT PRIMARY KEY , b TINYINT NOT NULL , c TINYINT, d TINYINT AS (b), INDEX (c,d) ) ENGINE=InnoDB ROW_FORMAT=REDUNDANT; INSERT INTO t1 (pk,b,c) VALUES (1,1,1); ALTER TABLE t1 ADD f TINYINT, ALGORITHM=INSTANT; INSERT INTO t1 SET pk=2, b=0; BEGIN ; INSERT INTO t1 SET pk=3, b=0; ROLLBACK ; INSERT INTO t1 SET pk=3, b=0; DROP TABLE t1;

            The root cause is that dtuple_get_nth_v_field() is being implemented differently between debug and release builds, and that the debug build was missing

            ut_ad(tuple->v_fields == &tuple->fields[tuple->n_fields]);
            

            to document and enforce the assumption.
            With the corrected predicate, the non-debug build will write the correct value for the virtual column d to the undo log record:

            diff --git a/storage/innobase/include/data0data.h b/storage/innobase/include/data0data.h
            --- a/storage/innobase/include/data0data.h
            +++ b/storage/innobase/include/data0data.h
            @@ -259,7 +259,7 @@ dtuple_get_nth_v_field(
             	ulint		n);
             #else /* UNIV_DEBUG */
             # define dtuple_get_nth_field(tuple, n) ((tuple)->fields + (n))
            -# define dtuple_get_nth_v_field(tuple, n) ((tuple)->fields + (tuple)->n_fields + (n))
            +# define dtuple_get_nth_v_field(tuple, n) ((tuple)->v_fields + (n))
             #endif /* UNIV_DEBUG */
             /*********************************************************************//**
             Gets info bits in a data tuple.
            

            I think that the #define might have made some sense in the old days when InnoDB was C code.

            Motivated by this finding, I reimplemented the accessors as const-preserving inline functions and remove duplication between the .h and .ic files. While doing that, I noticed many places where InnoDB was throwing away const-ness, mostly related to virtual columns and spatial indexes.

            marko Marko Mäkelä added a comment - The root cause is that dtuple_get_nth_v_field() is being implemented differently between debug and release builds, and that the debug build was missing ut_ad(tuple->v_fields == &tuple->fields[tuple->n_fields]); to document and enforce the assumption. With the corrected predicate, the non-debug build will write the correct value for the virtual column d to the undo log record: diff --git a/storage/innobase/include/data0data.h b/storage/innobase/include/data0data.h --- a/storage/innobase/include/data0data.h +++ b/storage/innobase/include/data0data.h @@ -259,7 +259,7 @@ dtuple_get_nth_v_field( ulint n); #else /* UNIV_DEBUG */ # define dtuple_get_nth_field(tuple, n) ((tuple)->fields + (n)) -# define dtuple_get_nth_v_field(tuple, n) ((tuple)->fields + (tuple)->n_fields + (n)) +# define dtuple_get_nth_v_field(tuple, n) ((tuple)->v_fields + (n)) #endif /* UNIV_DEBUG */ /*********************************************************************//** Gets info bits in a data tuple. I think that the #define might have made some sense in the old days when InnoDB was C code. Motivated by this finding, I reimplemented the accessors as const -preserving inline functions and remove duplication between the .h and .ic files. While doing that, I noticed many places where InnoDB was throwing away const -ness, mostly related to virtual columns and spatial indexes.

            10.2 might be unaffected by this bug, but 10.3 definitely was affected if instant ADD COLUMN was being used on a table that already contained virtual columns.

            I have an explanation why elenst failed to reprocude a crash in 10.4. In MDEV-17468 I wrote:

            To make the situation worse, in MDEV-15562 I had to disable instant ADD/DROP/reorder of stored columns if indexed virtual columns exist.

            marko Marko Mäkelä added a comment - 10.2 might be unaffected by this bug, but 10.3 definitely was affected if instant ADD COLUMN was being used on a table that already contained virtual columns. I have an explanation why elenst failed to reprocude a crash in 10.4. In MDEV-17468 I wrote: To make the situation worse, in MDEV-15562 I had to disable instant ADD/DROP/reorder of stored columns if indexed virtual columns exist.

            This is still persisting in 10.3.15 with slightly different conditions. I've been tracking them in MDEV-19338

            darkain Vincent Milum Jr added a comment - This is still persisting in 10.3.15 with slightly different conditions. I've been tracking them in MDEV-19338

            People

              marko Marko Mäkelä
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.