[MDEV-25655] Atomic DDL: InnoDB: Cannot replay rename of tablespace, error Data structure corruption Created: 2021-05-11  Updated: 2021-05-12  Resolved: 2021-05-12

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: N/A
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-17567 Atomic DDL Closed
relates to MDEV-24626 Remove synchronous write of page0 and... Closed

 Description   

Two examples follow. I have a few more if needed.


Case 1

bb-10.6-monty-innodb 790a52644c

2021-05-11 19:04:37 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=54303,54303
2021-05-11 19:04:41 0 [Note] InnoDB: Starting final batch to recover 2264 pages from redo log.
2021-05-11 19:04:44 0 [ERROR] InnoDB: Cannot replay rename of tablespace 81 from './test/tt6.ibd' to './test/table75_innodb_int_autoinc.ibd' because the target file exists
2021-05-11 19:04:44 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1473] with error Data structure corruption

In the case above, the test was running concurrent DDL in 4 threads and was sigkilled in the process. The last executed statements in each thread, according to the general log, were

28 Query    DROP TABLE `table53_innodb_int_autoinc`, `table76_innodb_int_autoinc`, vv2
30 Query    ALTER TABLE `table89_innodb_int_autoinc` DROP KEY `ncol4`, ADD KEY IF NOT EXISTS n ( `ncol4` ) COMMENT 'pad'
31 Query    UNLOCK TABLES
29 Query    RENAME TABLE `t15` TO tt6

The logs, datadirs and rr profiles are available.


Case 2

bb-10.6-monty-innodb 790a52644c

2021-05-11 17:59:28 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=54303,54303
2021-05-11 17:59:29 0 [Note] InnoDB: Starting final batch to recover 456 pages from redo log.
2021-05-11 17:59:30 0 [ERROR] InnoDB: Cannot replay rename of tablespace 13 from './test/tt11.ibd' to './test/t5_innodb.ibd' because the target file exists
2021-05-11 17:59:30 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1473] with error Data structure corruption

In the case above, the test was running various DDL in a single thread in PS protocol, got sigkilled upon running a query, and failed with the error above upon recovery.
The logs and the data directories are available. Unfortunately, in PS protocol the last statement may be missing from the general log.
rr profiles are also available, but the profile before crash seems to be broken somehow.
So, this case is better than the previous one in the sense that there was only one thread involved, but worse in everything else.

I think I've seen similar failures on previous releases without atomic DDL, so it's possibly not anything new, but I would expect it to be gone with atomic DDL. Strangely, I didn't see them in previous tests on atomic DDL, but now they have suddenly appeared in bulk.



 Comments   
Comment by Marko Mäkelä [ 2021-05-12 ]

I can repeat the recovery failure of atomic_ddl/MDEV-25655-1/data_before_recovery locally. It is a corner case:

2021-05-12 10:03:54 0 [Note] InnoDB: Starting final batch to recover 2264 pages from redo log.
2021-05-12 10:03:55 0 [ERROR] InnoDB: Cannot replay rename of tablespace 81 from './test/tt6.ibd' to './test/table75_innodb_int_autoinc.ibd' because the target file exists

In the output produced by --debug=d,ib_log, we see quite a few records for that tablespace 81. The file test/table75_innodb_int_autoinc.ibd consists of 622,592 NUL bytes before and after recovery. Also the file test/tt6.ibd initially consists of 212,992 NUL bytes, but after recovery (both in the original run and on my local rerun) I see the tablespace ID 432 in it.

One thinkable fix might be to allow the rename to happen, with the motivation that both files were garbage to begin with. I will try to determine the exact sequence of relevant events and come up with a less intrusive fix. The last durable commit of the server before the SIGKILL was for DROP /* QNO 1101 CON_ID 28 */ TABLE `table53_innodb_int_autoinc.

Comment by Marko Mäkelä [ 2021-05-12 ]

Before the SIGKILL, the server executed 2 related DDL statements (and many unrelated ones):

RENAME /* QNO 881 CON_ID 30 */ TABLE `table75_innodb_int_autoinc` TO tt6, vv19 TO `table69_innodb_int_autoinc`, `table35_innodb_int_autoinc` TO vv10, vv15 TO vv7, tt9 TO tt4;
CREATE /* QNO 1443 CON_ID 30 */ TABLE IF NOT EXISTS tt6 AS SELECT * FROM `table61_innodb_int_autoinc`;

The RENAME was rolled back on the high level. In InnoDB, not only the rename of table75_innodb_int_autoinc to tt6 was committed, but also the rename of tt6 back to table75_innodb_int_autoinc was committed.
Also the CREATE TABLE tt6 was committed inside InnoDB.

At the time of the SIGKILL, both files contain only NUL bytes.
I tried and failed to repeat this with the following test:

--source include/have_innodb.inc
 
SET GLOBAL innodb_max_dirty_pages_pct=0.0;
let $wait_condition =
SELECT variable_value = 0
FROM information_schema.global_status
WHERE variable_name = 'INNODB_BUFFER_POOL_PAGES_DIRTY';
--source include/wait_condition.inc
 
create table t1(a int primary key) engine=innodb;
rename table t1 to t2;
rename table t2 to t1;
create table t2(a int primary key) engine=innodb;
--let $restart_timeout=0
--source include/restart_mysqld.inc
drop table t1,t2;

I tried a few things to ensure that both files would be recovered as filled with NUL bytes, but my reduced test did not fail for me:

./mtr --mysqld=--innodb-write-io-threads=2 --mysqld=--skip-innodb-use-native-aio
./mtr --mysqld=--innodb-page-cleaner-disabled-debug=1 --mysqld=--skip-innodb-use-native-aio

Comment by Marko Mäkelä [ 2021-05-12 ]

This is an anomaly of the refined MDEV-24626 recovery. Before starting to apply any log to deferred-recovery tablespaces, we must complete all renames.

Comment by Marko Mäkelä [ 2021-05-12 ]

root cause analysis and fix (in the MDEV-24626 branch)

Generated at Thu Feb 08 09:39:20 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.