[MDEV-27111] atomic.rename_table test case fails on bb-10.6-MDEV-27022 branch Created: 2021-11-23  Updated: 2022-06-22  Resolved: 2021-12-12

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.6
Fix Version/s: 10.6.6, 10.7.2

Type: Bug Priority: Major
Reporter: Thirunarayanan Balathandayuthapani Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Blocks
blocks MDEV-27022 Buffer pool is being flushed during r... Closed
blocks MDEV-28923 atomic.rename_table occasionally fails Closed
Duplicate
duplicates MDEV-26954 Failed ALTER causes table corruption ... Closed
Relates
relates to MDEV-27234 InnoDB dictionary recovery wrongly us... Closed

 Description   

InnoDB fails to identify the tablespace after recovery in rename_table test case.

2021-11-23 18:35:55 0 [ERROR] InnoDB: Tablespace 6 was not found at ./test/t2.ibd.
2021-11-23 18:35:55 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore this and to permanently lose all changes to the tablespace.
2021-11-23 18:35:55 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1447] with error Tablespace not found

InnoDB removes the space id from deferred tablespace when name is being matched. I think defer tablespace should handle rename table in a better way



 Comments   
Comment by Thirunarayanan Balathandayuthapani [ 2021-11-23 ]

Patch is in bb-10.6-MDEV-27111

Comment by Thirunarayanan Balathandayuthapani [ 2021-11-26 ]

matthias reported the following issue:

pluto:/data/rqg/results/1637683377/TBR-1105/data/rqg/1637683377/102/1/rr
_RR_TRACE_DIR="." rr replay --mark-stdio mysqld-1     # Fate of DB server till SIGKILL
_RR_TRACE_DIR="." rr replay --mark-stdio mysqld-2     # Restart attempt which fails with
 
[rr 590711 6386]2021-11-23 19:16:54 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=50236030,53957790
[rr 590711 6503]2021-11-23 19:16:54 0 [Note] InnoDB: Ignoring data file './test/#sql-alter-6442a-12.ibd' with space ID 63. Another data file called ./test/t4.ibd exists with the same space ID.
[rr 590711 6581]2021-11-23 19:16:54 0 [Note] InnoDB: Ignoring data file './test/#sql-alter-6442a-19.ibd' with space ID 65. Another data file called ./test/t2.ibd exists with the same space ID.
[rr 590711 6919]2021-11-23 19:16:55 0 [ERROR] InnoDB: Tablespace 72 was not found at ./test/t5.ibd.
[rr 590711 6923]2021-11-23 19:16:55 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore this and to permanently lose all changes to the tablespace.
[rr 590711 6927]2021-11-23 19:16:55 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1447] with error Tablespace not found
[rr 590711 6931]2021-11-23 19:16:55 0 [Note] InnoDB: Starting shutdown...

Comment by Thirunarayanan Balathandayuthapani [ 2021-11-26 ]

I think there could be race condition between rename operation and fil_names_clear() written during checkpoint operation
we do write redo log for rename operation before renaming the tablespace (space->chain->start)->name

 
fil_name_write_rename(id, old_path, path);
  }
 
  return fil_rename_tablespace(id, old_path, path) ? DB_SUCCESS : DB_ERROR;

If we do checkpoint(fil_names_clear) between these 2 operation then InnoDB will try to recover the table using old_name
some like this:

lsn: 1000
mlog_file_rename 5, t1, #sql-ib183
 
lsn:1100
mlog_file_name 5, t1
 
LOG_CHECKPOINT_END_LSN is 1100

Comment by Marko Mäkelä [ 2021-12-13 ]

MDEV-27234 is another bug in the recovery of rename operations.

Comment by Marko Mäkelä [ 2022-06-20 ]

For the record, before the fix of MDEV-28870 I was able to reproduce the following failure locally:

10.6 0e4cf497ca11a7298e2bd896cb594bd52085a1d4

atomic.rename_table 'innodb'             w3 [ fail ]
        Test ended at 2022-06-20 11:21:37
 
CURRENT_TEST: atomic.rename_table
mysqltest: At line 155: query 'let $res=`select t1.a+t2.b+t3.c+t4.d from t1,t2,t3,t4`' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'InnoDB'
2022-06-20 11:21:36 0 [ERROR] InnoDB: Tablespace 8 was not found at ./test/t5.ibd.
2022-06-20 11:21:36 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1197] with error Tablespace not found
2022-06-20 11:21:36 0 [ERROR] Plugin 'InnoDB' init function returned error.
2022-06-20 11:21:36 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.

With the fix of MDEV-28870, the test passed for 120×10 runs (on a system where nproc reports 40).
My test invocation was the following, on a debug build:

nice ./mtr --mysqld=--innodb-max-dirty-pages-pct=0 --repeat=10 --parallel=120 atomic.rename_table{,,,,,,,,,,,,,,,,,,,}{,,,,,}

I think that innodb_max_dirty_pages_pct=0 should increase the chances of hitting the MDEV-28870 scenario, that is, a race condition between log checkpoint and processing FILE_RENAME or FILE_DELETE.

Generated at Thu Feb 08 09:50:25 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.