Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-25655

Atomic DDL: InnoDB: Cannot replay rename of tablespace, error Data structure corruption

Details

    Description

      Two examples follow. I have a few more if needed.


      Case 1

      bb-10.6-monty-innodb 790a52644c

      2021-05-11 19:04:37 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=54303,54303
      2021-05-11 19:04:41 0 [Note] InnoDB: Starting final batch to recover 2264 pages from redo log.
      2021-05-11 19:04:44 0 [ERROR] InnoDB: Cannot replay rename of tablespace 81 from './test/tt6.ibd' to './test/table75_innodb_int_autoinc.ibd' because the target file exists
      2021-05-11 19:04:44 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1473] with error Data structure corruption
      

      In the case above, the test was running concurrent DDL in 4 threads and was sigkilled in the process. The last executed statements in each thread, according to the general log, were

      28 Query    DROP TABLE `table53_innodb_int_autoinc`, `table76_innodb_int_autoinc`, vv2
      30 Query    ALTER TABLE `table89_innodb_int_autoinc` DROP KEY `ncol4`, ADD KEY IF NOT EXISTS n ( `ncol4` ) COMMENT 'pad'
      31 Query    UNLOCK TABLES
      29 Query    RENAME TABLE `t15` TO tt6
      

      The logs, datadirs and rr profiles are available.


      Case 2

      bb-10.6-monty-innodb 790a52644c

      2021-05-11 17:59:28 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=54303,54303
      2021-05-11 17:59:29 0 [Note] InnoDB: Starting final batch to recover 456 pages from redo log.
      2021-05-11 17:59:30 0 [ERROR] InnoDB: Cannot replay rename of tablespace 13 from './test/tt11.ibd' to './test/t5_innodb.ibd' because the target file exists
      2021-05-11 17:59:30 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1473] with error Data structure corruption
      

      In the case above, the test was running various DDL in a single thread in PS protocol, got sigkilled upon running a query, and failed with the error above upon recovery.
      The logs and the data directories are available. Unfortunately, in PS protocol the last statement may be missing from the general log.
      rr profiles are also available, but the profile before crash seems to be broken somehow.
      So, this case is better than the previous one in the sense that there was only one thread involved, but worse in everything else.

      I think I've seen similar failures on previous releases without atomic DDL, so it's possibly not anything new, but I would expect it to be gone with atomic DDL. Strangely, I didn't see them in previous tests on atomic DDL, but now they have suddenly appeared in bulk.

      Attachments

        Issue Links

          Activity

            I can repeat the recovery failure of atomic_ddl/MDEV-25655-1/data_before_recovery locally. It is a corner case:

            2021-05-12 10:03:54 0 [Note] InnoDB: Starting final batch to recover 2264 pages from redo log.
            2021-05-12 10:03:55 0 [ERROR] InnoDB: Cannot replay rename of tablespace 81 from './test/tt6.ibd' to './test/table75_innodb_int_autoinc.ibd' because the target file exists
            

            In the output produced by --debug=d,ib_log, we see quite a few records for that tablespace 81. The file test/table75_innodb_int_autoinc.ibd consists of 622,592 NUL bytes before and after recovery. Also the file test/tt6.ibd initially consists of 212,992 NUL bytes, but after recovery (both in the original run and on my local rerun) I see the tablespace ID 432 in it.

            One thinkable fix might be to allow the rename to happen, with the motivation that both files were garbage to begin with. I will try to determine the exact sequence of relevant events and come up with a less intrusive fix. The last durable commit of the server before the SIGKILL was for DROP /* QNO 1101 CON_ID 28 */ TABLE `table53_innodb_int_autoinc.

            marko Marko Mäkelä added a comment - I can repeat the recovery failure of atomic_ddl/ MDEV-25655 -1/data_before_recovery locally. It is a corner case: 2021-05-12 10:03:54 0 [Note] InnoDB: Starting final batch to recover 2264 pages from redo log. 2021-05-12 10:03:55 0 [ERROR] InnoDB: Cannot replay rename of tablespace 81 from './test/tt6.ibd' to './test/table75_innodb_int_autoinc.ibd' because the target file exists In the output produced by --debug=d,ib_log , we see quite a few records for that tablespace 81. The file test/table75_innodb_int_autoinc.ibd consists of 622,592 NUL bytes before and after recovery. Also the file test/tt6.ibd initially consists of 212,992 NUL bytes, but after recovery (both in the original run and on my local rerun) I see the tablespace ID 432 in it. One thinkable fix might be to allow the rename to happen, with the motivation that both files were garbage to begin with. I will try to determine the exact sequence of relevant events and come up with a less intrusive fix. The last durable commit of the server before the SIGKILL was for DROP /* QNO 1101 CON_ID 28 */ TABLE `table53_innodb_int_autoinc .

            Before the SIGKILL, the server executed 2 related DDL statements (and many unrelated ones):

            RENAME /* QNO 881 CON_ID 30 */ TABLE `table75_innodb_int_autoinc` TO tt6, vv19 TO `table69_innodb_int_autoinc`, `table35_innodb_int_autoinc` TO vv10, vv15 TO vv7, tt9 TO tt4;
            CREATE /* QNO 1443 CON_ID 30 */ TABLE IF NOT EXISTS tt6 AS SELECT * FROM `table61_innodb_int_autoinc`;
            

            The RENAME was rolled back on the high level. In InnoDB, not only the rename of table75_innodb_int_autoinc to tt6 was committed, but also the rename of tt6 back to table75_innodb_int_autoinc was committed.
            Also the CREATE TABLE tt6 was committed inside InnoDB.

            At the time of the SIGKILL, both files contain only NUL bytes.
            I tried and failed to repeat this with the following test:

            --source include/have_innodb.inc
             
            SET GLOBAL innodb_max_dirty_pages_pct=0.0;
            let $wait_condition =
            SELECT variable_value = 0
            FROM information_schema.global_status
            WHERE variable_name = 'INNODB_BUFFER_POOL_PAGES_DIRTY';
            --source include/wait_condition.inc
             
            create table t1(a int primary key) engine=innodb;
            rename table t1 to t2;
            rename table t2 to t1;
            create table t2(a int primary key) engine=innodb;
            --let $restart_timeout=0
            --source include/restart_mysqld.inc
            drop table t1,t2;
            

            I tried a few things to ensure that both files would be recovered as filled with NUL bytes, but my reduced test did not fail for me:

            ./mtr --mysqld=--innodb-write-io-threads=2 --mysqld=--skip-innodb-use-native-aio
            ./mtr --mysqld=--innodb-page-cleaner-disabled-debug=1 --mysqld=--skip-innodb-use-native-aio
            

            marko Marko Mäkelä added a comment - Before the SIGKILL, the server executed 2 related DDL statements (and many unrelated ones): RENAME /* QNO 881 CON_ID 30 */ TABLE `table75_innodb_int_autoinc` TO tt6, vv19 TO `table69_innodb_int_autoinc`, `table35_innodb_int_autoinc` TO vv10, vv15 TO vv7, tt9 TO tt4; CREATE /* QNO 1443 CON_ID 30 */ TABLE IF NOT EXISTS tt6 AS SELECT * FROM `table61_innodb_int_autoinc`; The RENAME was rolled back on the high level. In InnoDB, not only the rename of table75_innodb_int_autoinc to tt6 was committed, but also the rename of tt6 back to table75_innodb_int_autoinc was committed. Also the CREATE TABLE tt6 was committed inside InnoDB. At the time of the SIGKILL, both files contain only NUL bytes. I tried and failed to repeat this with the following test: --source include/have_innodb.inc   SET GLOBAL innodb_max_dirty_pages_pct=0.0; let $wait_condition = SELECT variable_value = 0 FROM information_schema.global_status WHERE variable_name = 'INNODB_BUFFER_POOL_PAGES_DIRTY' ; --source include/wait_condition.inc   create table t1(a int primary key ) engine=innodb; rename table t1 to t2; rename table t2 to t1; create table t2(a int primary key ) engine=innodb; --let $restart_timeout=0 --source include/restart_mysqld.inc drop table t1,t2; I tried a few things to ensure that both files would be recovered as filled with NUL bytes, but my reduced test did not fail for me: ./mtr --mysqld=--innodb-write-io-threads=2 --mysqld=--skip-innodb-use-native-aio ./mtr --mysqld=--innodb-page-cleaner-disabled-debug=1 --mysqld=--skip-innodb-use-native-aio

            This is an anomaly of the refined MDEV-24626 recovery. Before starting to apply any log to deferred-recovery tablespaces, we must complete all renames.

            marko Marko Mäkelä added a comment - This is an anomaly of the refined MDEV-24626 recovery. Before starting to apply any log to deferred-recovery tablespaces, we must complete all renames.

            root cause analysis and fix (in the MDEV-24626 branch)

            marko Marko Mäkelä added a comment - root cause analysis and fix (in the MDEV-24626 branch)

            People

              marko Marko Mäkelä
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.