MariaDB Server / MDEV-26936

Recovery crash on rolling back DELETE FROM SYS_INDEXES


Details

    Description

      Yesterday’s stress tests by mleich produced a crash in a kill+restart test. When I started a server on the saved data directory, it crashed as follows:

      10.6 dbd6c6dc01228fe6e63f3f7dc695eb56ca8cd28d

      2021-10-29 13:52:47 0 [Note] InnoDB: 128 rollback segments are active.
      mariadbd: /mariadb/10.6m/storage/innobase/trx/trx0trx.cc:1221: void trx_t::evict_table(table_id_t, bool): Assertion `!locked || (table->locks).start->trx == this' failed.
      

      This occurred while we were rolling back a DELETE of a SYS_INDEXES record for which NAME='\xffuidx'. The special byte '\xff' indicates that the record is a stub for ADD INDEX uidx. At the same time, we had recovered a DML transaction that is holding a lock on the same user table (t1). The assertion fails because a table must never be evicted while other transactions are holding locks on it.
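      The condition behind the assertion can be illustrated with a toy model. This is only a sketch: `Trx`, `Table`, `lock_owners` and `may_evict` are hypothetical stand-ins, not InnoDB's `trx_t`, `dict_table_t` or lock list, and it assumes that `locked` in the assertion means "the table has at least one lock", which is a guess from the assertion text.

```cpp
// Toy model only: hypothetical types, not InnoDB code.
#include <vector>

struct Trx;  // forward declaration

struct Table {
  // Owners of the table locks, oldest first (stand-in for table->locks).
  std::vector<const Trx*> lock_owners;
};

struct Trx {
  // Mirrors the shape of the failed assertion:
  //   !locked || (table->locks).start->trx == this
  // ASSUMPTION: "locked" means the table has at least one lock.
  bool may_evict(const Table& table) const {
    const bool locked = !table.lock_owners.empty();
    return !locked || table.lock_owners.front() == this;
  }
};

// Replays the situation from the report: a recovered DML transaction holds
// a lock on t1 while the recovered DDL transaction is being rolled back.
inline bool report_scenario_allows_eviction() {
  Trx ddl, dml;
  Table t1;
  t1.lock_owners.push_back(&dml);  // lock held by the other transaction
  return ddl.may_evict(t1);        // the condition does not hold here
}
```

      In this model the eviction is acceptable only when the table has no locks or the first lock belongs to the transaction that requests the eviction, which is exactly what the recovered DML transaction's lock on t1 violates.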

      With some effort, I created a repeatable test case for this:

      --source include/have_innodb.inc
      # The embedded server tests do not support restarting.
      --source include/not_embedded.inc
      --source include/have_debug.inc
      --source include/have_debug_sync.inc
       
      connection default;
      CREATE TABLE t1(a INT PRIMARY KEY, b INT) ENGINE=InnoDB;
      INSERT INTO t1 VALUES(1,1);
       
      connect ddl, localhost, root;
      # Pause the ALTER after the table scan, and again just before the rollback is committed.
      SET DEBUG_SYNC = 'row_merge_after_scan SIGNAL scanned WAIT_FOR commit';
      SET DEBUG_SYNC = 'before_commit_rollback_inplace SIGNAL c WAIT_FOR ever';
      send ALTER TABLE t1 ADD UNIQUE INDEX(b), ALGORITHM=INPLACE;
       
      connection default;
      SET DEBUG_SYNC = 'now WAIT_FOR scanned';
      # While the ALTER is paused, insert a duplicate value b=1 in a transaction that stays open.
      BEGIN;
      INSERT INTO t1 VALUES(2,1);
      SET DEBUG_SYNC = 'now SIGNAL commit';
      SET DEBUG_SYNC = 'now WAIT_FOR c';
      SET GLOBAL innodb_flush_log_at_trx_commit=1;
      INSERT INTO t1 VALUES(3,3);
      sleep 1;
       
      # Kill the server while the ALTER is still waiting at before_commit_rollback_inplace.
      --source include/kill_mysqld.inc
      disconnect ddl;
      --source include/start_mysqld.inc
       
      CHECK TABLE t1;
      SHOW CREATE TABLE t1;
      SELECT * FROM t1;
      DROP TABLE t1;
      

      Note: I do not know why the sleep 1 is needed. I suspect MDEV-26789 or a related change. Is our durability broken now?
      The test requires the following synchronization point:

      diff --git a/storage/innobase/handler/handler0alter.cc b/storage/innobase/handler/handler0alter.cc
      index adeaf87f7fe..79a308a20a6 100644
      --- a/storage/innobase/handler/handler0alter.cc
      +++ b/storage/innobase/handler/handler0alter.cc
      @@ -8764,6 +8764,7 @@ inline bool rollback_inplace_alter_table(Alter_inplace_info *ha_alter_info,
             ut_d(dict_table_check_for_dup_indexes(ctx->old_table, CHECK_ABORTED_OK));
           }
       
      +    DEBUG_SYNC(ctx->trx->mysql_thd, "before_commit_rollback_inplace");
           commit_unlock_and_unlink(ctx->trx);
           if (fts_exist)
             purge_sys.resume_FTS();
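
      The ordering that the DEBUG_SYNC statements in the test enforce between the two connections can be modelled roughly as follows. This is a simplified sketch, not the server's debug_sync implementation: `SignalBoard` and `run_handshake` are hypothetical names, signals are assumed to stay raised once sent, and the final `WAIT_FOR ever` (where the real test kills the server) is simply left out.

```cpp
// Toy model of the test's DEBUG_SYNC handshake; hypothetical helper types.
#include <condition_variable>
#include <mutex>
#include <set>
#include <string>
#include <thread>
#include <vector>

class SignalBoard {
 public:
  // Models "SIGNAL name": remember the signal and wake all waiters.
  void signal(const std::string& name) {
    {
      std::lock_guard<std::mutex> guard(mutex_);
      raised_.insert(name);
    }
    cond_.notify_all();
  }

  // Models "WAIT_FOR name": block until the signal has been raised.
  void wait_for(const std::string& name) {
    std::unique_lock<std::mutex> guard(mutex_);
    cond_.wait(guard, [&] { return raised_.count(name) != 0; });
  }

  // Appends one step to the shared event log.
  void record(const std::string& event) {
    std::lock_guard<std::mutex> guard(mutex_);
    events_.push_back(event);
  }

  // Returns a copy of the event log.
  std::vector<std::string> events() {
    std::lock_guard<std::mutex> guard(mutex_);
    return events_;
  }

 private:
  std::mutex mutex_;
  std::condition_variable cond_;
  std::set<std::string> raised_;
  std::vector<std::string> events_;
};

// Runs both "connections" and returns the order in which the steps happened.
inline std::vector<std::string> run_handshake() {
  SignalBoard board;

  // Connection "ddl": the ALTER TABLE with its two synchronization points.
  std::thread ddl([&board] {
    board.record("ddl: table scan done");
    board.signal("scanned");   // row_merge_after_scan SIGNAL scanned
    board.wait_for("commit");  // ... WAIT_FOR commit
    board.record("ddl: before commit of rollback");
    board.signal("c");         // before_commit_rollback_inplace SIGNAL c
    // The real test continues with WAIT_FOR ever; the model ends here.
  });

  // Connection "default": the DML transaction.
  board.wait_for("scanned");
  board.record("default: INSERT (2,1)");
  board.signal("commit");
  board.wait_for("c");
  board.record("default: INSERT (3,3)");

  ddl.join();
  return board.events();
}
```

      Each step is recorded before the signal that releases the other side, so the four steps always come out in the same order; in the real test the server is killed right after the last one.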
      

            People

              marko Marko Mäkelä