  1. MariaDB Server
  2. MDEV-27234

InnoDB dictionary recovery wrongly uses READ UNCOMMITTED isolation level instead of READ COMMITTED

Details

    Description

      The following patch adds an extra checkpoint in fil_rename_tablespace(). As a result,
      a few tablespace file names are not written during fil_names_clear().

      diff --git a/storage/innobase/fil/fil0fil.cc b/storage/innobase/fil/fil0fil.cc
      index 681cee32fd5..0b6e4764d98 100644
      --- a/storage/innobase/fil/fil0fil.cc
      +++ b/storage/innobase/fil/fil0fil.cc
      @@ -1918,6 +1918,7 @@ fil_rename_tablespace(
              ut_ad(strchr(new_file_name, '/'));
       
              if (!recv_recovery_is_on()) {
      +               log_make_checkpoint();
                      mysql_mutex_lock(&log_sys.mutex);
              }
      

      The following test cases were failing in 10.6:

      innodb.innodb-alter-tempfile
      CURRENT_TEST: innodb.innodb-alter-tempfile
      mysqltest: At line 45: query 'show create table t1' failed: ER_GET_ERRNO (1030): Got error 194 "Tablespace is missing for a table" from storage engine InnoDB
       
      innodb.alter_crash 'innodb'              w4 [ fail ]
              Test ended at 2021-12-12 10:22:42
       
      CURRENT_TEST: innodb.alter_crash
      mysqltest: At line 125: query 'INSERT INTO t2 VALUES (5,6),(7,8)' failed: ER_GET_ERRNO (1030): Got error 194 "Tablespace is missing for a table" from storage engine InnoDB
       
      innodb.instant_alter_crash 'innodb'      w5 [ fail ]
              Test ended at 2021-12-12 10:23:28
       
      CURRENT_TEST: innodb.instant_alter_crash
      mysqltest: At line 200: query 'SHOW CREATE TABLE t3' failed: ER_GET_ERRNO (1030): Got error 194 "Tablespace is missing for a table" from storage engine InnoDB
       
      innodb.truncate_crash 'innodb'           w4 [ fail ]
              Test ended at 2021-12-12 10:24:16
       
      CURRENT_TEST: innodb.truncate_crash
      mysqltest: At line 21: query 'SELECT COUNT(*) FROM t1' failed: ER_GET_ERRNO (1030): Got error 194 "Tablespace is missing for a table" from storage engine InnoDB
      
      

          Activity

            marko Marko Mäkelä added a comment (edited)

            The trx_t::evict_table() is necessary for evicting table definitions on the rollback of DDL transactions, for example, on a CREATE TABLE that failed due to incorrect FOREIGN KEY constraints.

            My current work-in-progress fix still shows the index mismatch message for data.tar.xz:

            2022-03-21 17:11:23 3 [ERROR] Cannot find index Marvão_uidx1 in InnoDB index dictionary.
            2022-03-21 17:11:23 3 [ERROR] InnoDB indexes are inconsistent with what defined in .frm for table ./test/t3
            2022-03-21 17:11:23 3 [ERROR] InnoDB could not find key no 0 with name Marvão_uidx1 from dict cache for table test/t3
            2022-03-21 17:11:23 3 [ERROR] InnoDB: Table test/t3 contains 3 indexes inside InnoDB, which is different from the number of indexes 3 defined in the .frm file. See https://mariadb.com/kb/en/innodb-troubleshooting/
            

            marko Marko Mäkelä added a comment

            I spent a lot of time on the test case before I confirmed with rr replay that the ADD PRIMARY KEY had actually been committed before the server was killed. The actual reason why the #sql-ib file was not dropped is that the fil_delete_tablespace() call in row_purge_remove_clust_if_poss_low() did not find the tablespace, because dict_check_sys_tables() was skipping delete-marked records of SYS_TABLES.

            My work-in-progress fix of that broke a number of mtr tests (mostly IMPORT TABLESPACE) and still fails to recover data.tar.xz correctly.

            marko Marko Mäkelä added a comment

            The data.tar.xz cannot be helped by this fix. The warning message mentions an unexpectedly missing index. In fact, in dict_load_index_low(), a delete-marked record of SYS_INDEXES for it was encountered and correctly filtered out in my development branch, because no active transaction for the DB_TRX_ID was found. We would need an rr replay trace of such a failure to understand why the DDL recovery apparently chose the wrong .frm file when recovering from a crash during DROP INDEX Marvão_uidx1.
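
            The filtering of the delete-marked SYS_INDEXES record described above can be illustrated with a minimal sketch. It assumes a simplified record type and a plain set of active transaction IDs. The names SysRecord, ActiveTrxSet and delete_is_committed are hypothetical and exist only for illustration; they are not InnoDB's actual data structures or functions, and older record versions in the undo log are ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical, simplified stand-in for a SYS_INDEXES record.
struct SysRecord {
  std::uint64_t db_trx_id;  // DB_TRX_ID: transaction that last modified the record
  bool delete_marked;       // delete-mark flag of the record
};

// Assumption: the active (not yet committed) transactions are known as a set of IDs.
using ActiveTrxSet = std::set<std::uint64_t>;

// Returns true if the deletion of a delete-marked record is committed, that is,
// no active transaction owns its DB_TRX_ID. Such a record can be filtered out
// when the index definitions are loaded.
bool delete_is_committed(const SysRecord& rec, const ActiveTrxSet& active) {
  assert(rec.delete_marked);  // this sketch only covers delete-marked records
  return active.count(rec.db_trx_id) == 0;
}
```

            In this simplified model, a delete-marked record whose DB_TRX_ID is still in the active set would be kept, because the deleting transaction might yet be rolled back.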

            marko Marko Mäkelä added a comment

            I was able to fix the recovery of data.tar.xz. It turned out that there was a SYS_INDEXES record for an ADD INDEX stub that had been deleted by a committed transaction. The table->def_trx_id was wrongly updated to the DB_TRX_ID of that record, and InnoDB wrongly informed the DDL log recovery that the ALTER TABLE transaction had been committed inside InnoDB. Yes, we must pay attention to the DB_TRX_ID of committed delete-marked SYS_INDEXES records, but only if the SYS_INDEXES.NAME does not identify the index as incomplete (by starting with a 0xff byte).
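
            The rule stated above can be written as a small predicate. This is only a sketch under stated assumptions: SysIndexesRecord, is_incomplete_index and counts_for_def_trx_id are hypothetical names, the record layout is simplified, and the caller is assumed to have already established that the DB_TRX_ID of the delete-marked record belongs to a committed transaction.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical, simplified stand-in for a SYS_INDEXES record.
struct SysIndexesRecord {
  std::string name;         // SYS_INDEXES.NAME
  std::uint64_t db_trx_id;  // DB_TRX_ID of the last modification
  bool delete_marked;       // delete-mark flag of the record
};

// An incomplete index (an ADD INDEX stub) has a NAME that starts with a 0xff byte.
bool is_incomplete_index(const std::string& name) {
  return !name.empty() && static_cast<unsigned char>(name[0]) == 0xff;
}

// For a delete-marked record whose DB_TRX_ID is known to be committed:
// should that DB_TRX_ID be taken into account for table->def_trx_id?
bool counts_for_def_trx_id(const SysIndexesRecord& rec) {
  assert(rec.delete_marked);  // sketch covers only committed delete-marked records
  return !is_incomplete_index(rec.name);
}
```

            With this predicate, the deleted ADD INDEX stub would be ignored, while a delete-marked record of a completed index that was dropped by a committed transaction would still be considered.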

            For the crash in the recovery of the test innodb.instant_alter_crash I got an idea: btr_cur_instant_init_low() should perform a READ COMMITTED of the instant ADD COLUMN metadata record. In the case of the crash, that record had been inserted by an uncommitted transaction.
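
            The difference between the two isolation levels for the metadata record can be sketched as follows. The types and names here (MetadataRecord, Isolation, metadata_visible) are hypothetical and are not taken from the InnoDB source. The sketch also assumes that the record was freshly inserted, so there is no older committed version to fall back to.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical, simplified stand-in for the instant ADD COLUMN metadata record.
struct MetadataRecord {
  std::uint64_t db_trx_id;  // DB_TRX_ID of the transaction that inserted the record
  unsigned n_instant_cols;  // illustrative payload only
};

enum class Isolation { ReadUncommitted, ReadCommitted };

// Returns true if the record is visible at the given isolation level.
// READ UNCOMMITTED sees the record unconditionally; READ COMMITTED hides it
// while the inserting transaction is still active (not committed).
bool metadata_visible(const MetadataRecord& rec,
                      const std::set<std::uint64_t>& active_trx_ids,
                      Isolation level) {
  if (level == Isolation::ReadUncommitted) {
    return true;
  }
  return active_trx_ids.count(rec.db_trx_id) == 0;
}

// Usage: an uncommitted insert is seen by READ UNCOMMITTED but not by READ COMMITTED.
void metadata_visibility_example() {
  const std::set<std::uint64_t> active = {9};
  const MetadataRecord rec = {9, 1};
  assert(metadata_visible(rec, active, Isolation::ReadUncommitted));
  assert(!metadata_visible(rec, active, Isolation::ReadCommitted));
}
```

            In the crash scenario described above, the READ COMMITTED variant would treat the metadata record as absent, because the transaction that inserted it had not committed.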

            mleich Matthias Leich added a comment (edited)

            origin/bb-10.6-MDEV-27234 13dfdd3953c076177f8b61355cb70a887a958b4b 2022-03-24T12:13:54+02:00
            behaved well in RQG testing (battery for broad range coverage only).

            People

              marko Marko Mäkelä
              thiru Thirunarayanan Balathandayuthapani
