Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-21602

CREATE TABLE…PRIMARY KEY…SELECT workaround causes DROP TABLE to ignore locks

Details

    Description

      A table is allowed to be dropped despite being in use:

       
      --connection crash_me
      CREATE TABLE t1 (pk INT PRIMARY KEY) ENGINE=InnoDB;
      xa start '1'; insert into t1 set pk=1; xa end '1'; xa prepare '1';
       
      # KILL server_process & restart
      xa recover; # =>
      +----------+--------------+--------------+------+
      | formatID | gtrid_length | bqual_length | data |
      +----------+--------------+--------------+------+
      |        1 |            1 |            0 | 1    |
      +----------+--------------+--------------+------+
      1 row in set (0.00 sec)
       
      --connection default
      --error 0
      DROP TABLE t1;
      

      Notice that use cases of survived prepared user XA have involved MDEV-742 disconnected prepared so instead of the crash a simple --diconnect would leave the trx prepared and offended by the following DROP.

      Attachments

        Issue Links

          Activity

            MDEV-22733 (which was caused by MDEV-742 in 10.5.2) was closed, claiming to be a duplicate of this bug, even though this bug affects much earlier versions than 10.5.

            marko Marko Mäkelä added a comment - MDEV-22733 (which was caused by MDEV-742 in 10.5.2) was closed, claiming to be a duplicate of this bug, even though this bug affects much earlier versions than 10.5.

            In MariaDB Server 10.6, this should be fixed on the InnoDB side by the DROP rewrite (and removal of the ‘background drop table’ queue) that is the third fix related to MDEV-25506. The CREATE TABLE…SELECT will consist of two transactions, mandated by the SQL layer:

            1. CREATE TABLE
            2. INSERT…SELECT: the error handling will invoke ha_innobase::delete_table() while the transaction is still active

            ha_innobase::delete_table() can avoid starting a new transaction, and instead reuse the second transaction that was started by the SQL layer. We can simply drop the table and commit the transaction. Such hack even seems to work for partitioned tables, even though I expect that only one of the ha_innobase::delete_table() invocations will use the hijacked transaction. I tested it with the following:

            --source include/have_innodb.inc
            --source include/have_partition.inc
             
            CREATE TABLE t(a INT) ENGINE=InnoDB;
            INSERT INTO t VALUES(1),(1);
             
            --error ER_DUP_ENTRY
            CREATE TABLE t1(PRIMARY KEY(a)) ENGINE=InnoDB
            SELECT * FROM t;
            --error ER_DUP_ENTRY
            CREATE TABLE t1(PRIMARY KEY(a)) ENGINE=InnoDB
            PARTITION BY HASH(a) PARTITIONS 5 SELECT * FROM t;
            --error ER_DUP_ENTRY
            CREATE TEMPORARY TABLE t0(PRIMARY KEY(a)) ENGINE=InnoDB SELECT * FROM t;
             
            DROP TABLE t;
            

            marko Marko Mäkelä added a comment - In MariaDB Server 10.6, this should be fixed on the InnoDB side by the DROP rewrite (and removal of the ‘background drop table’ queue) that is the third fix related to MDEV-25506 . The CREATE TABLE…SELECT will consist of two transactions, mandated by the SQL layer: CREATE TABLE INSERT…SELECT : the error handling will invoke ha_innobase::delete_table() while the transaction is still active ha_innobase::delete_table() can avoid starting a new transaction, and instead reuse the second transaction that was started by the SQL layer. We can simply drop the table and commit the transaction. Such hack even seems to work for partitioned tables, even though I expect that only one of the ha_innobase::delete_table() invocations will use the hijacked transaction. I tested it with the following: --source include/have_innodb.inc --source include/have_partition.inc   CREATE TABLE t(a INT ) ENGINE=InnoDB; INSERT INTO t VALUES (1),(1);   --error ER_DUP_ENTRY CREATE TABLE t1( PRIMARY KEY (a)) ENGINE=InnoDB SELECT * FROM t; --error ER_DUP_ENTRY CREATE TABLE t1( PRIMARY KEY (a)) ENGINE=InnoDB PARTITION BY HASH(a) PARTITIONS 5 SELECT * FROM t; --error ER_DUP_ENTRY CREATE TEMPORARY TABLE t0( PRIMARY KEY (a)) ENGINE=InnoDB SELECT * FROM t;   DROP TABLE t;

            For 10.6 as part of the work-in-progress MDEV-25506 part 3 of 3, I implemented this CREATE…SELECT detection in ha_innobase::delete_table(). The first condition is for avoiding CREATE…REPLACE.

              trx_t *trx= parent_trx;
              if (!trx->lock.table_locks.empty() &&
                  thd_sql_command(trx->mysql_thd) == SQLCOM_CREATE_TABLE)
              {
            #if 0 // MDEV-21602 FIXME: this fails for innodb.innodb and some others
                for (const lock_t *l : trx->lock.table_locks)
                  if (l && l->type_mode == (LOCK_IX | LOCK_TABLE) &&
                      l->un_member.tab_lock.table == table)
                    goto create_select;
                sql_print_warning("InnoDB: CREATE...SELECT did not hold expected locks");
            create_select:
            #endif
                /* CREATE TABLE...PRIMARY KEY...SELECT ought to be dropping the
                table because a duplicate key was detected. We shall hijack the
                existing transaction to drop the table and commit the transaction.
                If this is a partitioned table, one partition will use this hijacked
                transaction; others will use a separate transaction, one per partition. */
            

            The warning message above would also be issued for the above partitioning test, if the duplicates are not being reported for the first partition. (We would hijack the transaction anyway for dropping the first partition.) I think that such logic is fragile, and it would be better to have something that reliably distinguishes CREATE TABLE…SELECT from other statements, such as CREATE OR REPLACE TABLE.

            marko Marko Mäkelä added a comment - For 10.6 as part of the work-in-progress MDEV-25506 part 3 of 3, I implemented this CREATE…SELECT detection in ha_innobase::delete_table() . The first condition is for avoiding CREATE…REPLACE . trx_t *trx= parent_trx; if (!trx->lock.table_locks.empty() && thd_sql_command(trx->mysql_thd) == SQLCOM_CREATE_TABLE) { #if 0 // MDEV-21602 FIXME: this fails for innodb.innodb and some others for ( const lock_t *l : trx->lock.table_locks) if (l && l->type_mode == (LOCK_IX | LOCK_TABLE) && l->un_member.tab_lock.table == table) goto create_select; sql_print_warning( "InnoDB: CREATE...SELECT did not hold expected locks" ); create_select: #endif /* CREATE TABLE...PRIMARY KEY...SELECT ought to be dropping the table because a duplicate key was detected. We shall hijack the existing transaction to drop the table and commit the transaction. If this is a partitioned table, one partition will use this hijacked transaction; others will use a separate transaction, one per partition. */ The warning message above would also be issued for the above partitioning test, if the duplicates are not being reported for the first partition. (We would hijack the transaction anyway for dropping the first partition.) I think that such logic is fragile, and it would be better to have something that reliably distinguishes CREATE TABLE…SELECT from other statements, such as CREATE OR REPLACE TABLE .

            MDEV-25506 in MariaDB Server 10.6.2 essentially fixed this. A less hacky solution could still be preferred.

            marko Marko Mäkelä added a comment - MDEV-25506 in MariaDB Server 10.6.2 essentially fixed this. A less hacky solution could still be preferred.

            My concern regarding detecting CREATE…SELECT in 10.6 was addressed.

            marko Marko Mäkelä added a comment - My concern regarding detecting CREATE…SELECT in 10.6 was addressed .

            People

              marko Marko Mäkelä
              Elkin Andrei Elkin
              Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.