[MDEV-25852] Orphan #sql*.ibd files are occasionally left behind after killed ALTER TABLE
Created: 2021-06-03 Updated: 2021-06-14 Resolved: 2021-06-09
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Data Definition - Alter Table, Storage Engine - InnoDB |
| Affects Version/s: | 10.6 |
| Fix Version/s: | 10.6.2 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Marko Mäkelä | Assignee: | Thirunarayanan Balathandayuthapani |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | recovery |
| Description |
Even with the third fix related to
Occasionally, list_files will list a file matching #sql-ib*.ibd or #sql-backup-*.ibd. If that last line is commented out, the test will continue, and the discrepancy will be reported by check-testcase (which mtr does not count as a failure). Here are a couple of examples:
and
Nothing else is failing. Attached is a copy of a data directory where we end up with an orphan file. The following patch would fix that, but as collateral damage, we would fail in a different way, by prematurely deleting a file t2.ibd during the test:
Here is my incorrect patch:
| Comments |
| Comment by Marko Mäkelä [ 2021-06-03 ] |
dd.tar.gz
| Comment by Marko Mäkelä [ 2021-06-03 ] |
I debugged dd-t2.tar.gz
That file name belongs to another tablespace, with id=7:
The incorrect name for tablespace 6 was assigned here:
I think that we must make the deferred_spaces aware of FILE_RENAME records. I made an attempt at that, and managed to fix both dd.tar.gz
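To illustrate the idea of making deferred_spaces aware of FILE_RENAME records, here is a minimal Python sketch, not the actual InnoDB code. The class DeferredSpaces, its methods, and the file names are hypothetical. It only assumes that a deferred tablespace is tracked as a mapping from tablespace ID to file name, and that a rename record identifies the tablespace and its new name.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DeferredSpaces:
    """Hypothetical stand-in for the recovery-time deferred_spaces collection."""

    names: dict[int, str] = field(default_factory=dict)  # tablespace ID -> file name

    def add(self, space_id: int, file_name: str) -> None:
        """Remember a tablespace whose first page could not be validated yet."""
        self.names[space_id] = file_name

    def on_file_rename(self, space_id: int, new_name: str) -> None:
        """Apply a rename record (the old name it also carries is ignored here)."""
        if space_id in self.names:
            self.names[space_id] = new_name

    def owners(self, file_name: str) -> list[int]:
        """Return the IDs of all deferred tablespaces that claim file_name."""
        return sorted(i for i, n in self.names.items() if n == file_name)


# Hypothetical scenario: tablespace 6 is renamed away, then tablespace 7
# takes over the old file name.
deferred = DeferredSpaces()
deferred.add(6, "./test/t2.ibd")
deferred.on_file_rename(6, "./test/#sql-backup-example.ibd")
deferred.add(7, "./test/t2.ibd")
```

Without the on_file_rename() step, tablespaces 6 and 7 would both claim ./test/t2.ibd in this sketch, which resembles the mix-up described above, where tablespace 6 was assigned a name that belongs to tablespace 7.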
| Comment by Marko Mäkelä [ 2021-06-04 ] |
thiru, thank you, your refinement of my fix does the trick. There is also a race condition in the test. We must wait for purge, so that any #sql-ib.ibd files for dropped tables will be removed before the test ends. With that fixed, a debug build of the server test failed due to an unrelated message:
The release build of the same completed successfully:
I plan to run another variant of the test, with FULLTEXT INDEX instead of a normal INDEX, so that more files will be created and deleted. But I think that this problem is solved now, and I will take care of pushing the fix. Thank you for finishing it!
| Comment by Marko Mäkelä [ 2021-06-04 ] |
The following worked too, for 27000 runs of the release build. However, if I replace either --echo with send, it would occasionally report orphan FTS_*.ibd files for the common tables (but no orphan tables). That should be fixed by MDEV-25850.
| Comment by Marko Mäkelä [ 2021-06-07 ] |
Unfortunately, this still needs more work. data_copy.tar.bz2
Furthermore, I started to think that it is useless to open the same files over and over again:
We should already have filtered out duplicate names at this point. Currently, we are paying the overhead of several system calls in fil_ibd_load(). If some file t.ibd did not contain a valid tablespace ID, it will not contain one on a subsequent call either, because we are checking the consistency of the files before we are applying any changes, right?
Maybe we should have a mapping from file names to tablespace IDs. If the first page of the file was not correctly written before recovery, then we would store 0 as the tablespace ID for that file name.
Also, I do not see why Datafile::read_first_page() has to be so complicated. We could simply attempt to read innodb_page_size bytes once. If the file is too short or the data is all-zero, we will add it to deferred_spaces. If its FSP_SPACE_FLAGS do not match innodb_page_size, we will reject the file (normally, refuse recovery). Finally, on checksum mismatch we might also simply add the file to deferred_spaces. Page 0 will never be reinitialized, and when it is initialized during tablespace creation, all other pages will be reinitialized as well. So it should be of no help to try to find the tablespace ID in subsequent pages of the file.
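The two suggestions above (read the first page once and classify it, and remember the result per file name) can be sketched roughly as follows. This is an illustrative Python model, not InnoDB code: Verdict, PageInfo, classify_first_page, FirstPageCache, and the toy page layout in toy_decode are all hypothetical, and the real decoding of FSP_SPACE_FLAGS and the page checksum is abstracted behind a caller-supplied decode function.

```python
from __future__ import annotations

from enum import Enum
from typing import Callable, NamedTuple


class Verdict(Enum):
    VALID = "valid"
    DEFERRED = "deferred"   # would be added to deferred_spaces
    REJECTED = "rejected"   # would normally refuse recovery


class PageInfo(NamedTuple):
    space_id: int
    page_size: int      # page size implied by the flags on the page
    checksum_ok: bool


def classify_first_page(data: bytes, innodb_page_size: int,
                        decode: Callable[[bytes], PageInfo]) -> tuple[Verdict, int]:
    """Classify the result of a single read of the first page."""
    page = data[:innodb_page_size]
    if len(page) < innodb_page_size or not any(page):
        return Verdict.DEFERRED, 0          # too short or all-zero
    info = decode(page)
    if info.page_size != innodb_page_size:
        return Verdict.REJECTED, 0          # flags do not match the page size
    if not info.checksum_ok:
        return Verdict.DEFERRED, 0          # checksum mismatch
    return Verdict.VALID, info.space_id


class FirstPageCache:
    """Maps file name -> tablespace ID, storing 0 for files without a valid ID."""

    def __init__(self, read_page: Callable[[str], bytes], innodb_page_size: int,
                 decode: Callable[[bytes], PageInfo]) -> None:
        self._read_page = read_page
        self._page_size = innodb_page_size
        self._decode = decode
        self._ids: dict[str, int] = {}
        self.reads = 0                      # number of files actually read

    def space_id(self, name: str) -> int:
        if name not in self._ids:
            self.reads += 1
            verdict, sid = classify_first_page(
                self._read_page(name), self._page_size, self._decode)
            if verdict is Verdict.REJECTED:
                raise ValueError(f"{name}: page size mismatch")
            self._ids[name] = sid
        return self._ids[name]


def toy_decode(page: bytes) -> PageInfo:
    # Toy layout for this example only: 4-byte ID, then 4-byte page size.
    return PageInfo(int.from_bytes(page[0:4], "big"),
                    int.from_bytes(page[4:8], "big"), checksum_ok=True)


PAGE = 16384
files = {
    "t1.ibd": (7).to_bytes(4, "big") + PAGE.to_bytes(4, "big") + bytes(PAGE - 8),
    "t2.ibd": bytes(PAGE),       # all-zero first page
    "t3.ibd": b"\x01" * 100,     # shorter than one page
}
cache = FirstPageCache(files.__getitem__, PAGE, toy_decode)
ids = [cache.space_id(n) for n in ("t1.ibd", "t2.ibd", "t1.ibd", "t3.ibd", "t2.ibd")]
```

In this model, repeated lookups of the same file name do not read the file again, and files with an unusable first page are remembered with ID 0 rather than being reopened on each call.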