Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-15885

Wrong error message to ALTER TABLE ... ADD KEY ...

Details

    Description

      Source:
      ---------
      MariaDB 10.2 commit 87af52d7dd733e71fc7a9e39b882a4fd44f41fec
      2018-04-16 compiled with debug

      My test

      SET GLOBAL SQL_MODE = 'traditional';
      SET GLOBAL wait_timeout = 45;
      SET GLOBAL lock_wait_timeout = 2;
      SET GLOBAL innodb_lock_wait_timeout = 1;
      CREATE TABLE t1 (col1 INT, col4 CHAR(100)) ENGINE = InnoDB ;
      

      Now four concurrent sessions start to run one SQL in a loop:
      Session 1 --- INSERT INTO t1 (col1,col4) VALUES (1,'a') , (1,'a') ;
      Session 2 --- ALTER TABLE t1 ADD UNIQUE KEY uidx ( col4 );
      Session 3 --- ALTER TABLE t1 ADD               KEY  idx ( col1 );
                           The session aborts immediate if hitting error 1702.
      Session 4 --- ALTER TABLE t1 DROP KEY idx;
      

      After some time (you might need several test runs)

      • the session 3 which runs ALTER TABLE t1 ADD KEY idx ( col1 )
        could harvest the server response ERROR HY000: Index uidx is corrupted
      • but the final checks after the four concurrent sessions have disconnected
        show that there is no Index uidx and no corruption at all.

         SHOW CREATE TABLE t1;
          Table   Create Table
           t1      CREATE TABLE `t1` (
           `col1` int(11) DEFAULT NULL,
           `col4` char(100) DEFAULT NULL
           ) ENGINE=InnoDB DEFAULT CHARSET=latin1
           SELECT * FROM t1 LIMIT 3;
           col1    col4
           1       a
           1       a
           1       a
           CHECK TABLE t1 EXTENDED;
           Table   Op      Msg_type        Msg_text
           test.t1 check   status  OK
           ANALYZE TABLE t1;
           Table   Op      Msg_type        Msg_text
            test.t1 analyze status  OK
        

      Protocols of the concurrent sessions:

      1 --- All INSERTS passed
      2 --- masses of ERROR 23000: Duplicate entry 'a' for key 'uidx'
      3 --- One ERROR HY000: Index uidx is corrupted
      4 --- masses of ERROR 42000: Can't DROP INDEX `idx`; check that it exists
      

      IMHO the bug itself isn't that important because

      • there is no obvious damage to the table or indexes
      • running concurrent DDL on the same table is extreme unlikely
        but the error message from the server is in two points ugly
        1. A user might assume that something around his table is rotten, get in panic
        and either waste time with DDL or restore backups in order to fix something
        or try the same, but make an error and introduce by that damages
        2. ALTER TABLE t1 ADD KEY idx ( col1 ) harvesting an error message
        referring to an index with different name (uidx) looks very strange.

      Attachments

        Activity

          How to use the archive?
          ------------------------------

          cd <tree>/mysql-test
          tar xvzf <archive>
          Feel free to inspect the files run_mysqltest_prt.* which belong
          to my test run.

          ./mtr --mem ml001

          Please just ignore the majority of test output.
          .....
          CREATE TABLE t1 (col1 INT, col4 CHAR(100)) ENGINE = InnoDB ;
          ./mysqltest_background.sh 1523893362 1523893422 /mariadb/Source/10.2/mysql-test/var/tmp/my.sql1 1
          ./mysqltest_background.sh 1523893362 1523893422 /mariadb/Source/10.2/mysql-test/var/tmp/my.sql2 2
          ./mysqltest_background.sh 1523893362 1523893422 /mariadb/Source/10.2/mysql-test/var/tmp/my.sql3 3
          ./mysqltest_background.sh 1523893362 1523893422 /mariadb/Source/10.2/mysql-test/var/tmp/my.sql4 4
          SHOW CREATE TABLE t1;
          Table Create Table
          t1 CREATE TABLE `t1` (
          `col1` int(11) DEFAULT NULL,
          `col4` char(100) DEFAULT NULL
          ) ENGINE=InnoDB DEFAULT CHARSET=latin1
          SELECT * FROM t1 LIMIT 3;
          col1 col4
          1 a
          1 a
          1 a
          CHECK TABLE t1 EXTENDED;
          Table Op Msg_type Msg_text
          test.t1 check status OK
          ANALYZE TABLE t1;
          Table Op Msg_type Msg_text
          test.t1 analyze status OK
          DROP TABLE t1;
          "Was the error pattern found ---- Begin"
          ERROR HY000: Index uidx is corrupted <== In case you see that than session replayed
          and run_mysqltest_prt.3 shows that error message.
          "Was the error pattern found ---- End"
          main.ml001 'innodb' [ pass ] 70179

          MTR's internal check of the test case 'main.ml001' failed.
          ....

          mleich Matthias Leich added a comment - How to use the archive? ------------------------------ cd <tree>/mysql-test tar xvzf <archive> Feel free to inspect the files run_mysqltest_prt.* which belong to my test run. ./mtr --mem ml001 Please just ignore the majority of test output. ..... CREATE TABLE t1 (col1 INT, col4 CHAR(100)) ENGINE = InnoDB ; ./mysqltest_background.sh 1523893362 1523893422 /mariadb/Source/10.2/mysql-test/var/tmp/my.sql1 1 ./mysqltest_background.sh 1523893362 1523893422 /mariadb/Source/10.2/mysql-test/var/tmp/my.sql2 2 ./mysqltest_background.sh 1523893362 1523893422 /mariadb/Source/10.2/mysql-test/var/tmp/my.sql3 3 ./mysqltest_background.sh 1523893362 1523893422 /mariadb/Source/10.2/mysql-test/var/tmp/my.sql4 4 SHOW CREATE TABLE t1; Table Create Table t1 CREATE TABLE `t1` ( `col1` int(11) DEFAULT NULL, `col4` char(100) DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARSET=latin1 SELECT * FROM t1 LIMIT 3; col1 col4 1 a 1 a 1 a CHECK TABLE t1 EXTENDED; Table Op Msg_type Msg_text test.t1 check status OK ANALYZE TABLE t1; Table Op Msg_type Msg_text test.t1 analyze status OK DROP TABLE t1; "Was the error pattern found ---- Begin" ERROR HY000: Index uidx is corrupted <== In case you see that than session replayed and run_mysqltest_prt.3 shows that error message. "Was the error pattern found ---- End" main.ml001 'innodb' [ pass ] 70179 MTR's internal check of the test case 'main.ml001' failed. ....

          The error 'index is corrupted' is technically correct, but it would be nicer to report 'duplicate key error' in this particular case. The following scenario is possible:

          1. ADD UNIQUE INDEX copies the data and creates the index.
          2. Optionally, any log from concurrent DML is fully replayed, and the unique index is completed.
          3. Subsequent DML operations start to update the completed but not yet committed index. If the uniqueness is violated, the ALTER TABLE must return an error, not the DML. The uncommitted index will be corrupted.
          4. ALTER TABLE notices that the index was corrupted as a result of DML that completed after index creation.

          Note that secondary indexes will be created one at a time. If multiple indexes are created, the first ones would be directly updated by concurrent DML while some indexes are still being created. Also, uncommitted added indexes would be directly updated by DML while the MDL upgrade is in progress, so that the ALTER TABLE would be committed.

          At least in some cases, the corrupted index would be reported as a duplicate key. Here is an excerpt from the 10.0 innodb.innodb-index-online test:

          # Now, create the index without any concurrent DML, while no duplicate exists.
          SET DEBUG_SYNC = 'row_log_apply_after SIGNAL created WAIT_FOR dml_done';
          --send
          ALTER TABLE t1 ADD UNIQUE INDEX(c2);
          connection default;
          SET DEBUG_SYNC = 'now WAIT_FOR created';
          # At this point, the index has been created inside InnoDB but not yet
          # in the MySQL data dictionary.
          eval $innodb_metrics_select;
          # A duplicate key error should now be triggered by InnoDB, but reported
          # by the ALTER TABLE because the index does not 'officially' exist yet.
          INSERT INTO t1 VALUES(6,3,1);
          SET DEBUG_SYNC = 'now SIGNAL dml_done';
          connection con1;
          # This is due to the duplicate entry (6,3,1).
          --error ER_DUP_UNKNOWN_IN_INDEX
          reap;
          DELETE FROM t1 WHERE c1=6;
          ALTER TABLE t1 ADD UNIQUE INDEX(c2);
          

          marko Marko Mäkelä added a comment - The error 'index is corrupted' is technically correct, but it would be nicer to report 'duplicate key error' in this particular case. The following scenario is possible: ADD UNIQUE INDEX copies the data and creates the index. Optionally, any log from concurrent DML is fully replayed, and the unique index is completed. Subsequent DML operations start to update the completed but not yet committed index. If the uniqueness is violated, the ALTER TABLE must return an error, not the DML. The uncommitted index will be corrupted. ALTER TABLE notices that the index was corrupted as a result of DML that completed after index creation. Note that secondary indexes will be created one at a time. If multiple indexes are created, the first ones would be directly updated by concurrent DML while some indexes are still being created. Also, uncommitted added indexes would be directly updated by DML while the MDL upgrade is in progress, so that the ALTER TABLE would be committed. At least in some cases, the corrupted index would be reported as a duplicate key. Here is an excerpt from the 10.0 innodb.innodb-index-online test: # Now, create the index without any concurrent DML, while no duplicate exists. SET DEBUG_SYNC = 'row_log_apply_after SIGNAL created WAIT_FOR dml_done' ; --send ALTER TABLE t1 ADD UNIQUE INDEX (c2); connection default ; SET DEBUG_SYNC = 'now WAIT_FOR created' ; # At this point, the index has been created inside InnoDB but not yet # in the MySQL data dictionary. eval $innodb_metrics_select; # A duplicate key error should now be triggered by InnoDB, but reported # by the ALTER TABLE because the index does not 'officially' exist yet. INSERT INTO t1 VALUES (6,3,1); SET DEBUG_SYNC = 'now SIGNAL dml_done' ; connection con1; # This is due to the duplicate entry (6,3,1). --error ER_DUP_UNKNOWN_IN_INDEX reap; DELETE FROM t1 WHERE c1=6; ALTER TABLE t1 ADD UNIQUE INDEX (c2);

          I am aware of the complexity but have also problems to agree with the report 'duplicate key error'.
          IMHO the error message should mostly explain to the user why his ALTER TABLE t1 ADD KEY idx ( col1 ) could not pass.
          The "ALTER TABLE t1 ADD KEY idx ( col1 )" operation itself

          • should be never capable to
          • cause a "duplicate key error" (idx is not defined UNIQUE)
          • affect a key with a name != "idx" like the "uidx" here so that this key "uidx" is than corrupted
            So a user should rather never get just such an error message if his DDL could not be served.
          • might be denied because of various reasons like
          • (arbitrary concurrency): the table and/or other indexes are permanent damaged/unusable
            Than please report:
            We cannot perform the DDL because this or that needs to be fixed but avoid any
            maybe upcoming impression that the shape of the "ALTER TABLE t1 ADD KEY idx ( col1 )"
            might be the reason.
          • (concurrency required):
          • MDL lock or similar not got, Deadlock etc.: Ok report the locking problem
          • the table and/or other indexes are caused by some concurrent already ongoing operation
            temporary in some "unusable" state or similar.
            Than please:
            Report that we cannot perform the DDL because some parallel ongoing SQL prevents that
            but please do not mention 'duplicate key error' or 'index corrupted' at all.
            What if the "ADD UNIQUE INDEX" sets a MDL lock preventing concurrent DDL on the same table?
          mleich Matthias Leich added a comment - I am aware of the complexity but have also problems to agree with the report 'duplicate key error'. IMHO the error message should mostly explain to the user why his ALTER TABLE t1 ADD KEY idx ( col1 ) could not pass. The "ALTER TABLE t1 ADD KEY idx ( col1 )" operation itself should be never capable to cause a "duplicate key error" (idx is not defined UNIQUE) affect a key with a name != "idx" like the "uidx" here so that this key "uidx" is than corrupted So a user should rather never get just such an error message if his DDL could not be served. might be denied because of various reasons like (arbitrary concurrency): the table and/or other indexes are permanent damaged/unusable Than please report: We cannot perform the DDL because this or that needs to be fixed but avoid any maybe upcoming impression that the shape of the "ALTER TABLE t1 ADD KEY idx ( col1 )" might be the reason. (concurrency required): MDL lock or similar not got, Deadlock etc.: Ok report the locking problem the table and/or other indexes are caused by some concurrent already ongoing operation temporary in some "unusable" state or similar. Than please: Report that we cannot perform the DDL because some parallel ongoing SQL prevents that but please do not mention 'duplicate key error' or 'index corrupted' at all. What if the "ADD UNIQUE INDEX" sets a MDL lock preventing concurrent DDL on the same table?

          I did not realize that there are multiple concurrent DDL on the table. I was under the impression that multiple DDL would be prevented by metadata lock (MDL), also during the time when online ALTER has downgraded the metadata lock. In that case, this would not be an InnoDB bug, but a bug in MDL.

          marko Marko Mäkelä added a comment - I did not realize that there are multiple concurrent DDL on the table. I was under the impression that multiple DDL would be prevented by metadata lock (MDL), also during the time when online ALTER has downgraded the metadata lock. In that case, this would not be an InnoDB bug, but a bug in MDL.

          The problem is that a running ALTER TABLE is not preventing another ALTER TABLE from being attempted on the same table.

          marko Marko Mäkelä added a comment - The problem is that a running ALTER TABLE is not preventing another ALTER TABLE from being attempted on the same table.

          People

            wlad Vladislav Vaintroub
            mleich Matthias Leich
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.