Details
Bug
Status: Closed
Major
Resolution: Fixed
10.6, 10.7(EOL), 10.8(EOL), 10.9(EOL), 10.10(EOL)
Description
While MDEV-28870 reduced the failure probability of test atomic.rename_table, the test still occasionally fails, especially on the kvm-ubsan builder. Here is the relevant part of one failure:
10.6 0fa19fdebf0925be6ec5503938d541332f259cb5
CURRENT_TEST: atomic.rename_table
mysqltest: At line 155: query 'let $res=`select t1.a+t2.b+t3.c+t4.d from t1,t2,t3,t4`' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'InnoDB'
…
2022-06-22 8:00:48 0 [ERROR] InnoDB: Tablespace 5 was not found at ./test/t5.ibd.
2022-06-22 8:00:48 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore this and to permanently lose all changes to the tablespace.
2022-06-22 8:00:48 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1447] with error Tablespace not found
I would need at least a copy of a data directory that failed to recover, preferably from 10.8 or a later branch, because the MDEV-14425 log file format is easier to read, thanks to the lack of any block framing. An rr record trace of the killed server could significantly ease the analysis.
Attachments
Issue Links
- is blocked by
  - MDEV-27111 atomic.rename_table test case fails on bb-10.6-MDEV-27022 branch (Closed)
  - MDEV-28870 InnoDB: Missing FILE_CREATE, FILE_DELETE or FILE_MODIFY before FILE_CHECKPOINT during crash recovery (Closed)
- is caused by
  - MDEV-24626 Remove synchronous write of page0 and flushing file during file creation (Closed)
I checked a copy of the data directory for a 10.9 kvm-ubsan failure that had failed like this:
10.9 f421d8f50dc6f730b1d01b0443880178e550e03a
CURRENT_TEST: atomic.rename_table
mysqltest: At line 155: query 'let $res=`select t1.a+t2.b+t3.c+t4.d from t1,t2,t3,t4`' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'InnoDB'
…
***Warnings generated in error logs during shutdown after running tests: atomic.rename_table
2022-06-22 16:21:45 0 [ERROR] InnoDB: Tablespace 8 was not found at ./test/t5.ibd.
wget https://hasky.askmonty.org/logs/kvm-ubsan/1824-var.tar.gz
tar xzf 1824-var.tar.gz
cd 10.9
rsync -avil --delete ../var/log/atomic.rename_table-innodb/mysqld.1/data/ /dev/shm/data/
sql/mariadbd --datadir /dev/shm/data
data.tar.xz is a copy of the data directory, for posterity. I think that the original copy will be available for a limited time only.
The first mention of the name test/t5 in the ib_logfile0 looks like a WRITE|0x80 record for SYS_TABLES at LSN (and file byte offset) 0x10e71, presumably updating an earlier record:
00010e70: 80bf 030b 0201 0774 6573 742f 7435 7800 .......test/t5x.
00010e80: 80bf 004e b296 cd01 7c68 b37f 2029 0008 ...N....|h.. )..
The second mention is a FILE_RENAME mini-transaction at 0x10f00 (the last 5 bytes are the end marker and the checksum):
00010f00: a00f 0500 2e2f 7465 7374 2f74 312e 6962 ...../test/t1.ib
00010f10: 6400 2e2f 7465 7374 2f74 352e 6962 6401 d../test/t5.ibd.
00010f20: 935c 03a1
Note: At this point, the tablespace ID was 5.
There is quite a bit of renaming going on. It is best to search the ib_logfile0 for test/t5.ibd and not test/t5 (which would be updates of SYS_TABLES.NAME). There is also a FILE_RENAME for renaming test/t5.ibd back to test/t1.ibd as tablespace 5.
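The search for test/t5.ibd can be scripted. The following is a minimal sketch that lists every byte offset of a pattern in a file; the helper names find_all and report are hypothetical, and the script only locates the bytes, it does not parse the surrounding log records.

```python
from pathlib import Path


def find_all(data, needle):
    """Return every byte offset at which `needle` occurs in `data`."""
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets


def report(log_path, needle=b"test/t5.ibd"):
    """Print the offsets of `needle` in the file at `log_path` in hex."""
    data = Path(log_path).read_bytes()
    for offset in find_all(data, needle):
        print(f"{offset:#x}")
```

For example, report("/dev/shm/data/ib_logfile0") would print the offsets of the file name, which can then be inspected in a hex dump. Searching for b"test/t5" instead would also match the SYS_TABLES.NAME updates.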
The last record that mentions test/t5.ibd is a FILE_RENAME of test/t4.ibd to test/t5.ibd:
00077610: ____ ____ ____ ____ __a0 0f08 002e 2f74 ............../t
00077620: 6573 742f 7434 2e69 6264 002e 2f74 6573 est/t4.ibd../tes
00077630: 742f 7435 2e69 6264 0144 89f4 65__ ____ t/t5.ibd........
The tablespace identifier here is 8, just like in the message.
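To double-check the tablespace IDs and file names, the two FILE_RENAME dumps above can be split mechanically. This is only a sketch: the layout it assumes (type byte 0xa0, one length byte, a one-byte tablespace ID, a one-byte page number, two names separated by NUL, then the 5-byte end marker and checksum) is read off these two dumps, not taken from the InnoDB source, and decode_file_rename is a hypothetical helper.

```python
def decode_file_rename(record):
    """Split one FILE_RENAME mini-transaction as it appears in the dumps.

    Assumed layout (inferred from the dumps, not from the InnoDB source):
    byte 0 is the type 0xa0, byte 1 a length byte, byte 2 the tablespace ID,
    byte 3 the page number, followed by old name, NUL, new name, and a
    trailer of 1 end-marker byte plus 4 checksum bytes.
    """
    if record[0] != 0xA0:
        raise ValueError("not a FILE_RENAME record")
    space_id, page_no = record[2], record[3]
    if space_id >= 0x80 or page_no >= 0x80:
        # Larger values would need a multi-byte encoding, not handled here.
        raise ValueError("multi-byte integer encoding is not handled")
    body, trailer = record[4:-5], record[-5:]
    old_name, new_name = body.split(b"\x00")
    return {
        "space_id": space_id,
        "page_no": page_no,
        "old_name": old_name.decode("ascii"),
        "new_name": new_name.decode("ascii"),
        "end_marker": trailer[0],
        "checksum": trailer[1:].hex(),
    }


# Bytes copied from the dump at 0x10f00.
FIRST = bytes.fromhex(
    "a00f0500" "2e2f746573742f74312e696264" "00"
    "2e2f746573742f74352e696264" "01935c03a1"
)
# Bytes copied from the dump at 0x77610 (masked bytes omitted).
LAST = bytes.fromhex(
    "a00f0800" "2e2f746573742f74342e696264" "00"
    "2e2f746573742f74352e696264" "014489f465"
)

for rec in (FIRST, LAST):
    print(decode_file_rename(rec))
```

Under these assumptions, the first record decodes to tablespace 5 renaming ./test/t1.ibd to ./test/t5.ibd, and the last one to tablespace 8 renaming ./test/t4.ibd to ./test/t5.ibd.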
All data files test/t1.ibd, test/t2.ibd, test/t4.ibd, test/t5.ibd in the data directory are 6*16384 bytes long and consist of NUL bytes.
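The size and NUL-byte observation can be re-checked with a short script. This is a sketch under the assumption of the 16384-byte page size implied by the 6*16384 figure; check_ibd and is_all_nul are hypothetical helper names.

```python
import os

PAGE_SIZE = 16384  # assumed page size, matching the 6*16384 figure


def is_all_nul(path, chunk_size=1 << 20):
    """Return True if the file at `path` contains only NUL bytes."""
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            if chunk.count(0) != len(chunk):
                return False
    return True


def check_ibd(path):
    """Return (size in bytes, size in whole pages, all-NUL flag)."""
    size = os.path.getsize(path)
    return size, size // PAGE_SIZE, is_all_nul(path)
```

For a file such as /dev/shm/data/test/t5.ibd from the copied data directory, a result of (98304, 6, True) would match the description above.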
The last checkpoint is at LSN 0xec63, which seems to be right at the end of ./mtr --bootstrap. All of the test/ tables were created after that, so the data directory should be fully recoverable.
thiru, it looks like the deferred recovery logic of MDEV-24626 still has some trouble with rename operations.