Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-33993

Possible server hang on DROP INDEX or RENAME INDEX

Details

    Description

      Related to MDEV-33955, I asked mleich to test if there are any problems if a TRUNCATE TABLE on the InnoDB persistent statistics table is being executed concurrently with DDL operations. He produced an rr replay trace that would hit an assertion failure when a FULLTEXT INDEX is being dropped:

      10.6 0ccdf54b644352f42e1768bc660be7ab50c1e9d2

      mariadbd: /data/Server/10.6C/storage/innobase/lock/lock0lock.cc:3484: ib_lock_t* lock_table_create(dict_table_t*, unsigned int, trx_t*, ib_lock_t*): Assertion `!trx->dict_operation_lock_mode || (strstr(table->name.m_name, "/FTS_") && strstr(table->name.m_name, "_CONFIG") + sizeof("_CONFIG") == table->name.m_name + strlen(table->name.m_name) + 1)' failed.
      ...
      #7  0x0000556b211ff54c in lock_table_create (table=table@entry=0x64013650, type_mode=type_mode@entry=1, trx=trx@entry=0x388167814180, c_lock=c_lock@entry=0x0) at /data/Server/10.6C/storage/innobase/lock/lock0lock.cc:3484
      #8  0x0000556b211ffd9d in lock_table_low (table=table@entry=0x64013650, mode=mode@entry=LOCK_IX, thr=thr@entry=0x49be3ba983d8, trx=trx@entry=0x388167814180) at /data/Server/10.6C/storage/innobase/lock/lock0lock.cc:3778
      #9  0x0000556b212000de in lock_table (table=0x64013650, fktable=fktable@entry=0x0, mode=mode@entry=LOCK_IX, thr=thr@entry=0x49be3ba983d8) at /data/Server/10.6C/storage/innobase/lock/lock0lock.cc:3837
      #10 0x0000556b2131c785 in row_sel_step (thr=thr@entry=0x49be3ba983d8) at /data/Server/10.6C/storage/innobase/row/row0sel.cc:2396
      #11 0x0000556b21277dea in que_thr_step (thr=thr@entry=0x49be3ba983d8) at /data/Server/10.6C/storage/innobase/que/que0que.cc:564
      #12 0x0000556b2127838c in que_run_threads_low (thr=thr@entry=0x49be3ba983d8) at /data/Server/10.6C/storage/innobase/que/que0que.cc:644
      #13 0x0000556b21278453 in que_run_threads (thr=0x49be3ba983d8) at /data/Server/10.6C/storage/innobase/que/que0que.cc:664
      #14 0x0000556b21278b54 in que_eval_sql (info=info@entry=0x49be3801c750, sql=sql@entry=0x556b219bb1e8 "PROCEDURE DELETE_FROM_INDEX_STATS () IS\nBEGIN\nDELETE FROM \"mysql/innodb_index_stats\" WHERE\ndatabase_name = :database_name AND\ntable_name = :table_name AND\nindex_name = :index_name;\nEND;\n", 
          trx=trx@entry=0x388167814180) at /data/Server/10.6C/storage/innobase/que/que0que.cc:703
      #15 0x0000556b214695da in dict_stats_exec_sql (pinfo=pinfo@entry=0x49be3801c750, sql=sql@entry=0x556b219bb1e8 "PROCEDURE DELETE_FROM_INDEX_STATS () IS\nBEGIN\nDELETE FROM \"mysql/innodb_index_stats\" WHERE\ndatabase_name = :database_name AND\ntable_name = :table_name AND\nindex_name = :index_name;\nEND;\n", 
          trx=trx@entry=0x388167814180) at /data/Server/10.6C/storage/innobase/dict/dict0stats.cc:550
      #16 0x0000556b2146986b in dict_stats_delete_from_index_stats (database_name=database_name@entry=0x3ac06a0752a0 "test", table_name=table_name@entry=0x3ac06a075440 "t1", index_name=0x49be39ece890 "ftidx", trx=trx@entry=0x388167814180) at /data/Server/10.6C/storage/innobase/dict/dict0stats.cc:4332
      #17 0x0000556b211b4248 in commit_try_norebuild (ha_alter_info=ha_alter_info@entry=0x3ac06a076090, ctx=ctx@entry=0x49be38014ca0, altered_table=altered_table@entry=0x3ac06a076150, old_table=0x49be38049318, trx=trx@entry=0x388167814180, table_name=0x49be38075ced "t1")
          at /data/Server/10.6C/storage/innobase/handler/handler0alter.cc:10631
      #18 0x0000556b2119e655 in ha_innobase::commit_inplace_alter_table (this=<optimized out>, altered_table=0x3ac06a076150, ha_alter_info=0x3ac06a076090, commit=true) at /data/Server/10.6C/storage/innobase/handler/handler0alter.cc:11471
      

      The reduced state of SQL statements related to this table looks like the following:

      CREATE TABLE IF NOT EXISTS t1 (col1 INT, col2 INT, col3 INT, col4 TEXT) ENGINE = InnoDB;
      ALTER TABLE t1 ADD FULLTEXT KEY ftidx ( col4 );
      ALTER TABLE t1 DROP KEY ftidx;
      

      As far as I can tell, the problem is that commit_try_norebuild() is invoking dict_stats_delete_from_index_stats(), even though earlier in ha_innobase::commit_inplace_alter_table() we failed to acquire exclusive locks on both statistics tables:

      		if (table_stats && index_stats
      		    && !strcmp(table_stats->name.m_name, TABLE_STATS_NAME)
      		    && !strcmp(index_stats->name.m_name, INDEX_STATS_NAME)
      		    && !(error = lock_table_for_trx(table_stats,
      						    trx, LOCK_X))) {
      			error = lock_table_for_trx(index_stats, trx, LOCK_X);
      		}
      

      In the rr replay trace that I analyzed, we had table_stats=nullptr. The function commit_try_rebuild() takes a parameter statistics_exist, which will prevent access to statistics tables in such cases, but the corresponding parameter is missing from commit_try_norebuild().

      Attachments

        Issue Links

          Activity

            Writing a DEBUG_SYNC test case for this was tricky. First, I managed to get an infinite loop in the following, for acquiring a metadata lock on mysql.innodb_table_stats:

            #0  dict_sys_t::freeze (this=0x56051754c440 <dict_sys>) at /mariadb/10.6m/storage/innobase/include/dict0dict.h:1539
            #1  dict_acquire_mdl_shared<false> (table=table@entry=0x7fde70008130, mdl_context=mdl_context@entry=0x7fde680010f0, mdl=mdl@entry=0x7fde98dfa670, table_op=table_op@entry=DICT_TABLE_OP_NORMAL)
                at /mariadb/10.6m/storage/innobase/dict/dict0dict.cc:726
            #2  0x00005605170805b1 in dict_acquire_mdl_shared<false> (table=table@entry=0x7fde70008130, thd=<optimized out>, mdl=mdl@entry=0x7fde98dfa670, table_op=table_op@entry=DICT_TABLE_OP_NORMAL)
                at /mariadb/10.6m/storage/innobase/dict/dict0dict.cc:831
            

            This happens before and after my attempted fix and the following change to a test:

            diff --git a/mysql-test/suite/innodb/t/innodb-alter-debug.test b/mysql-test/suite/innodb/t/innodb-alter-debug.test
            index c4a68ac71b7..e61b2fa7657 100644
            --- a/mysql-test/suite/innodb/t/innodb-alter-debug.test
            +++ b/mysql-test/suite/innodb/t/innodb-alter-debug.test
            @@ -144,15 +144,14 @@ DROP TABLE t;
             --echo #
             
             create table t1(f1 int not null primary key,
            -
            -                f2 int not null, index idx(f2))engine=innodb;
            +                f2 int not null, index idx(f2), index i(f2,f1))engine=innodb;
             
             insert into t1 values(1, 1);
             
             connect(con1,localhost,root,,,);
             SET DEBUG_SYNC='before_delete_table_stats SIGNAL blocked WAIT_FOR go';
            -SET innodb_lock_wait_timeout=0;
            -send ALTER TABLE t1 FORCE, ALGORITHM=COPY;
            +send SET STATEMENT innodb_lock_wait_timeout=0 FOR
            +ALTER TABLE t1 FORCE, ALGORITHM=COPY;
             
             connection default;
             SET DEBUG_SYNC='now WAIT_FOR blocked';
            @@ -166,6 +165,25 @@ reap;
             connection default;
             COMMIT;
             SET DEBUG_SYNC=RESET;
            +CHECK TABLE t1;
            +
            +connection con1;
            +SET DEBUG_SYNC='mdl_upgrade_lock SIGNAL blocked WAIT_FOR go';
            +send TRUNCATE TABLE mysql.innodb_table_stats;
            +
            +connection default;
            +SET DEBUG_SYNC='now WAIT_FOR blocked';
            +SET @save_timeout=@@GLOBAL.lock_wait_timeout;
            +SET GLOBAL lock_wait_timeout=0;
            +ALTER TABLE t1 DROP INDEX i;
            +SET GLOBAL lock_wait_timeout=@save_timeout;
            +SET DEBUG_SYNC='now SIGNAL go';
            +
            +connection con1;
            +reap;
            +
            +connection default;
            +SET DEBUG_SYNC=RESET;
             
             connection con1;
             ALTER TABLE t1 RENAME KEY idx TO idx1, ALGORITHM=COPY;
            

            As far as I understand, we would need more delicate timing, so that the TRUNCATE TABLE would be right in the middle of renaming or dropping the statistics table. So, let us try a simple RENAME to temporarily hide the statistics table:

            diff --git a/mysql-test/suite/innodb/t/innodb-alter-debug.test b/mysql-test/suite/innodb/t/innodb-alter-debug.test
            index c4a68ac71b7..6b94bfd214f 100644
            --- a/mysql-test/suite/innodb/t/innodb-alter-debug.test
            +++ b/mysql-test/suite/innodb/t/innodb-alter-debug.test
            @@ -144,15 +144,14 @@ DROP TABLE t;
             --echo #
             
             create table t1(f1 int not null primary key,
            -
            -                f2 int not null, index idx(f2))engine=innodb;
            +                f2 int not null, index idx(f2), index i(f2,f1))engine=innodb;
             
             insert into t1 values(1, 1);
             
             connect(con1,localhost,root,,,);
             SET DEBUG_SYNC='before_delete_table_stats SIGNAL blocked WAIT_FOR go';
            -SET innodb_lock_wait_timeout=0;
            -send ALTER TABLE t1 FORCE, ALGORITHM=COPY;
            +send SET STATEMENT innodb_lock_wait_timeout=0 FOR
            +ALTER TABLE t1 FORCE, ALGORITHM=COPY;
             
             connection default;
             SET DEBUG_SYNC='now WAIT_FOR blocked';
            @@ -167,6 +166,21 @@ connection default;
             COMMIT;
             SET DEBUG_SYNC=RESET;
             
            +RENAME TABLE mysql.innodb_table_stats TO mysql.innodb_table_stats_hidden;
            +connection con1;
            +SET DEBUG_SYNC='innodb_commit_inplace_before_lock SIGNAL blocked WAIT_FOR go';
            +send ALTER TABLE t1 DROP INDEX i;
            +
            +connection default;
            +SET DEBUG_SYNC='now WAIT_FOR blocked';
            +RENAME TABLE mysql.innodb_table_stats_hidden TO mysql.innodb_table_stats;
            +SET DEBUG_SYNC='now SIGNAL go';
            +
            +connection con1;
            +reap;
            +connection default;
            +SET DEBUG_SYNC=RESET;
            +
             connection con1;
             ALTER TABLE t1 RENAME KEY idx TO idx1, ALGORITHM=COPY;
             disconnect con1;
            diff --git a/storage/innobase/handler/handler0alter.cc b/storage/innobase/handler/handler0alter.cc
            index 087a0c32319..74797011475 100644
            --- a/storage/innobase/handler/handler0alter.cc
            +++ b/storage/innobase/handler/handler0alter.cc
            @@ -11386,6 +11386,8 @@ ha_innobase::commit_inplace_alter_table(
             		}
             	}
             
            +	DEBUG_SYNC(m_user_thd, "innodb_commit_inplace_before_lock");
            +
             	DBUG_EXECUTE_IF("stats_lock_fail",
             			error = DB_LOCK_WAIT_TIMEOUT;
             			trx_rollback_for_mysql(trx););
            

            This finally did the trick:

            10.6 0ccdf54b644352f42e1768bc660be7ab50c1e9d2 with the above patch

            2024-04-25 12:39:28 7 [ERROR] InnoDB: Table mysql.innodb_table_stats not found.
            mariadbd: /mariadb/10.6m/storage/innobase/lock/lock0lock.cc:3484: ib_lock_t* lock_table_create(dict_table_t*, unsigned int, trx_t*, ib_lock_t*): Assertion `!trx->dict_operation_lock_mode || (strstr(table->name.m_name, "/FTS_") && strstr(table->name.m_name, "_CONFIG") + sizeof("_CONFIG") == table->name.m_name + strlen(table->name.m_name) + 1)' failed.
            

            marko Marko Mäkelä added a comment - Writing a DEBUG_SYNC test case for this was tricky. First, I managed to get an infinite loop in the following, for acquiring a metadata lock on mysql.innodb_table_stats : #0 dict_sys_t::freeze (this=0x56051754c440 <dict_sys>) at /mariadb/10.6m/storage/innobase/include/dict0dict.h:1539 #1 dict_acquire_mdl_shared<false> (table=table@entry=0x7fde70008130, mdl_context=mdl_context@entry=0x7fde680010f0, mdl=mdl@entry=0x7fde98dfa670, table_op=table_op@entry=DICT_TABLE_OP_NORMAL) at /mariadb/10.6m/storage/innobase/dict/dict0dict.cc:726 #2 0x00005605170805b1 in dict_acquire_mdl_shared<false> (table=table@entry=0x7fde70008130, thd=<optimized out>, mdl=mdl@entry=0x7fde98dfa670, table_op=table_op@entry=DICT_TABLE_OP_NORMAL) at /mariadb/10.6m/storage/innobase/dict/dict0dict.cc:831 This happens before and after my attempted fix and the following change to a test: diff --git a/mysql-test/suite/innodb/t/innodb-alter-debug.test b/mysql-test/suite/innodb/t/innodb-alter-debug.test index c4a68ac71b7..e61b2fa7657 100644 --- a/mysql-test/suite/innodb/t/innodb-alter-debug.test +++ b/mysql-test/suite/innodb/t/innodb-alter-debug.test @@ -144,15 +144,14 @@ DROP TABLE t; --echo # create table t1(f1 int not null primary key, - - f2 int not null, index idx(f2))engine=innodb; + f2 int not null, index idx(f2), index i(f2,f1))engine=innodb; insert into t1 values(1, 1); connect(con1,localhost,root,,,); SET DEBUG_SYNC='before_delete_table_stats SIGNAL blocked WAIT_FOR go'; -SET innodb_lock_wait_timeout=0; -send ALTER TABLE t1 FORCE, ALGORITHM=COPY; +send SET STATEMENT innodb_lock_wait_timeout=0 FOR +ALTER TABLE t1 FORCE, ALGORITHM=COPY; connection default; SET DEBUG_SYNC='now WAIT_FOR blocked'; @@ -166,6 +165,25 @@ reap; connection default; COMMIT; SET DEBUG_SYNC=RESET; +CHECK TABLE t1; + +connection con1; +SET DEBUG_SYNC='mdl_upgrade_lock SIGNAL blocked WAIT_FOR go'; +send TRUNCATE TABLE mysql.innodb_table_stats; + +connection default; +SET DEBUG_SYNC='now WAIT_FOR blocked'; +SET @save_timeout=@@GLOBAL.lock_wait_timeout; +SET GLOBAL lock_wait_timeout=0; +ALTER TABLE t1 DROP INDEX i; +SET GLOBAL lock_wait_timeout=@save_timeout; +SET DEBUG_SYNC='now SIGNAL go'; + +connection con1; +reap; + +connection default; +SET DEBUG_SYNC=RESET; connection con1; ALTER TABLE t1 RENAME KEY idx TO idx1, ALGORITHM=COPY; As far as I understand, we would need more delicate timing, so that the TRUNCATE TABLE would be right in the middle of renaming or dropping the statistics table. So, let us try a simple RENAME to temporarily hide the statistics table: diff --git a/mysql-test/suite/innodb/t/innodb-alter-debug.test b/mysql-test/suite/innodb/t/innodb-alter-debug.test index c4a68ac71b7..6b94bfd214f 100644 --- a/mysql-test/suite/innodb/t/innodb-alter-debug.test +++ b/mysql-test/suite/innodb/t/innodb-alter-debug.test @@ -144,15 +144,14 @@ DROP TABLE t; --echo # create table t1(f1 int not null primary key, - - f2 int not null, index idx(f2))engine=innodb; + f2 int not null, index idx(f2), index i(f2,f1))engine=innodb; insert into t1 values(1, 1); connect(con1,localhost,root,,,); SET DEBUG_SYNC='before_delete_table_stats SIGNAL blocked WAIT_FOR go'; -SET innodb_lock_wait_timeout=0; -send ALTER TABLE t1 FORCE, ALGORITHM=COPY; +send SET STATEMENT innodb_lock_wait_timeout=0 FOR +ALTER TABLE t1 FORCE, ALGORITHM=COPY; connection default; SET DEBUG_SYNC='now WAIT_FOR blocked'; @@ -167,6 +166,21 @@ connection default; COMMIT; SET DEBUG_SYNC=RESET; +RENAME TABLE mysql.innodb_table_stats TO mysql.innodb_table_stats_hidden; +connection con1; +SET DEBUG_SYNC='innodb_commit_inplace_before_lock SIGNAL blocked WAIT_FOR go'; +send ALTER TABLE t1 DROP INDEX i; + +connection default; +SET DEBUG_SYNC='now WAIT_FOR blocked'; +RENAME TABLE mysql.innodb_table_stats_hidden TO mysql.innodb_table_stats; +SET DEBUG_SYNC='now SIGNAL go'; + +connection con1; +reap; +connection default; +SET DEBUG_SYNC=RESET; + connection con1; ALTER TABLE t1 RENAME KEY idx TO idx1, ALGORITHM=COPY; disconnect con1; diff --git a/storage/innobase/handler/handler0alter.cc b/storage/innobase/handler/handler0alter.cc index 087a0c32319..74797011475 100644 --- a/storage/innobase/handler/handler0alter.cc +++ b/storage/innobase/handler/handler0alter.cc @@ -11386,6 +11386,8 @@ ha_innobase::commit_inplace_alter_table( } } + DEBUG_SYNC(m_user_thd, "innodb_commit_inplace_before_lock"); + DBUG_EXECUTE_IF("stats_lock_fail", error = DB_LOCK_WAIT_TIMEOUT; trx_rollback_for_mysql(trx);); This finally did the trick: 10.6 0ccdf54b644352f42e1768bc660be7ab50c1e9d2 with the above patch 2024-04-25 12:39:28 7 [ERROR] InnoDB: Table mysql.innodb_table_stats not found. mariadbd: /mariadb/10.6m/storage/innobase/lock/lock0lock.cc:3484: ib_lock_t* lock_table_create(dict_table_t*, unsigned int, trx_t*, ib_lock_t*): Assertion `!trx->dict_operation_lock_mode || (strstr(table->name.m_name, "/FTS_") && strstr(table->name.m_name, "_CONFIG") + sizeof("_CONFIG") == table->name.m_name + strlen(table->name.m_name) + 1)' failed.

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              1 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.