[MDEV-27551] mariabackup --backup aborts if a file is deleted during enumerate_ibd_files() Created: 2022-01-20  Updated: 2023-12-14

Status: Open
Project: MariaDB Server
Component/s: mariabackup
Affects Version/s: 10.2, 10.3, 10.4, 10.5, 10.6, 10.7, 10.8
Fix Version/s: 10.4, 10.5, 10.6

Type: Bug
Priority: Major
Reporter: Marko Mäkelä
Assignee: Vladislav Lesin
Resolution: Unresolved
Votes: 0
Labels: affects-tests, rr-profile-analyzed

Issue Links:
Relates
relates to MDEV-14992 BACKUP: in-server backup Open
relates to MDEV-31904 mariabackup --backup: [ERROR] InnoDB:... Open

 Description   

The following failure was caught while testing MDEV-14425.

[rr 1612533 2732][00] 2022-01-19 10:05:20 mariabackup: Generating a list of tablespaces
[rr 1612533 2945]2022-01-19 10:05:20 0 [ERROR] InnoDB: File ./test/FTS_000000000000086a_DELETED.ibd was not found
[rr 1612533 2949]2022-01-19 10:05:20 0 [ERROR] InnoDB: Operating system error number 2 in a file operation.
[rr 1612533 2953]2022-01-19 10:05:20 0 [ERROR] InnoDB: Error number 2 means 'No such file or directory'
[rr 1612533 2957]2022-01-19 10:05:20 0 [Note] InnoDB: Some operating system error numbers are described at https://mariadb.com/kb/en/library/operating-system-error-codes/
[rr 1612533 2961]2022-01-19 10:05:20 0 [Warning] InnoDB: Cannot open './test/FTS_000000000000086a_DELETED.ibd'.
[rr 1612533 2965][00] FATAL ERROR: 2022-01-19 10:05:20 Failed to validate first page of the file test/FTS_000000000000086a_DELETED, error 62

I analyzed the root cause.

ssh sdp
rr replay /data/results/1642613284/TBR-1327/dev/shm/rqg/1642613284/48/1_clone/rr/latest-trace

The tablespace object was created in:

#0  0x000056552dfa610d in fil_space_t::create (id=2151, flags=23, 
    purpose=FIL_TYPE_TABLESPACE, crypt_data=0x0, mode=FIL_ENCRYPTION_DEFAULT)
    at /data/Server/bb-10.8-MDEV-14425_2/storage/innobase/fil/fil0fil.cc:930
#1  0x000056552ca1c5e8 in xb_load_single_table_tablespace (
    dirname=0x7ffdcbe6c370 "test", 
    filname=0x7ffdcbe6d3d0 "FTS_", '0' <repeats 13 times>, "86a_DELETED.ibd", 
    is_remote=false, skip_node_page0=false, defer_space_id=0)
    at /data/Server/bb-10.8-MDEV-14425_2/extra/mariabackup/xtrabackup.cc:3427
#2  0x000056552ca1de92 in enumerate_ibd_files (
    callback=0x56552ca1bd44 <xb_load_single_table_tablespace(char const*, char const*, bool, bool, uint32_t)>)

I was first going to blame MDEV-24626. No, this is a simple race condition and should be repeatable in all supported versions. This is where we got the idea to read that file:

#2  0x000056552ca1d02a in os_file_readdir_next_file (
    dirname=0x60b000006550 "./test", dir=0x62d00008c400, info=0x7ffdcbe6d3d0)
    at /data/Server/bb-10.8-MDEV-14425_2/extra/mariabackup/xtrabackup.cc:3609
#3  0x000056552ca1d5ce in fil_file_readdir_next_file (err=0x7ffdcbe6c360, 
    dirname=0x60b000006550 "./test", dir=0x62d00008c400, info=0x7ffdcbe6d3d0)
    at /data/Server/bb-10.8-MDEV-14425_2/extra/mariabackup/xtrabackup.cc:3680
#4  0x000056552ca1deb9 in enumerate_ibd_files (
    callback=0x56552ca1bd44 <xb_load_single_table_tablespace(char const*, char const*, bool, bool, uint32_t)>)

Apparently, the file was deleted by the server right after the readdir() call.
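
The window between listing a name and opening it can be reproduced outside of mariabackup. The following is a minimal POSIX sketch, not mariabackup code: list_names(), race_errno() and the file name victim.ibd are hypothetical, and the unlink() call stands in for the server dropping the table.

```cpp
#include <cerrno>
#include <dirent.h>
#include <fcntl.h>
#include <string>
#include <unistd.h>
#include <vector>

// Hypothetical helper: collect the entry names of a directory via readdir().
static std::vector<std::string> list_names(const std::string& dir) {
  std::vector<std::string> names;
  DIR* d = opendir(dir.c_str());
  if (!d) return names;
  while (dirent* e = readdir(d)) {
    const std::string n = e->d_name;
    if (n != "." && n != "..") names.push_back(n);
  }
  closedir(d);
  return names;
}

// Returns the errno seen when opening a file that was unlinked after the
// directory listing was taken, 0 if the open succeeded, -1 on setup failure.
int race_errno(const std::string& dir) {
  const std::string path = dir + "/victim.ibd";
  int fd = open(path.c_str(), O_CREAT | O_WRONLY, 0600);
  if (fd < 0) return -1;
  close(fd);

  // Step 1: the "backup" observes the name.
  const std::vector<std::string> names = list_names(dir);

  // Step 2: the "server" removes the file (assumption: simulated here).
  unlink(path.c_str());

  // Step 3: the "backup" opens a name from its now stale listing.
  for (const std::string& n : names) {
    if (n != "victim.ibd") continue;
    fd = open((dir + "/" + n).c_str(), O_RDONLY);
    if (fd < 0) return errno;
    close(fd);
    return 0;
  }
  return -1;
}
```

Called on an empty scratch directory, race_errno() should report ENOENT, which is error number 2 as in the log above.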

My suggested fix would be to make enumerate_ibd_files() tolerate missing files in the following type of stack trace:

#0  os_file_create_func (name=0x0, create_mode=0, purpose=61, type=100, 
    read_only=true, success=0x7ffdcbe6ba70)
    at /data/Server/bb-10.8-MDEV-14425_2/storage/innobase/os/os0file.cc:1187
#1  0x000056552dfa034e in fil_node_open_file_low (node=0x606000001e20)
    at /data/Server/bb-10.8-MDEV-14425_2/storage/innobase/fil/fil0fil.cc:355
#2  0x000056552dfa15ae in fil_node_open_file (node=0x606000001e20)
    at /data/Server/bb-10.8-MDEV-14425_2/storage/innobase/fil/fil0fil.cc:450
#3  0x000056552dfa7e25 in fil_space_t::read_page0 (this=0x6120000034c0)
    at /data/Server/bb-10.8-MDEV-14425_2/storage/innobase/fil/fil0fil.cc:1078
#4  0x000056552ca1c798 in xb_load_single_table_tablespace (
    dirname=0x7ffdcbe6c370 "test", 
    filname=0x7ffdcbe6d3d0 "FTS_", '0' <repeats 13 times>, "86a_DELETED.ibd", 
    is_remote=false, skip_node_page0=false, defer_space_id=0)
    at /data/Server/bb-10.8-MDEV-14425_2/extra/mariabackup/xtrabackup.cc:3438
#5  0x000056552ca1de92 in enumerate_ibd_files (
    callback=0x56552ca1bd44 <xb_load_single_table_tablespace(char const*, char const*, bool, bool, uint32_t)>)
    at /data/Server/bb-10.8-MDEV-14425_2/extra/mariabackup/xtrabackup.cc:3791
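
The idea of tolerating missing files can be sketched as follows. This is an illustration under stated assumptions, not a patch for xtrabackup.cc: list_ibd(), load_listed() and LoadResult are hypothetical names, a plain open() stands in for the page 0 read, and treating ENOENT as benign is itself an assumption that the real fix would have to justify.

```cpp
#include <cerrno>
#include <dirent.h>
#include <fcntl.h>
#include <string>
#include <unistd.h>
#include <vector>

// Hypothetical result type for the sketch.
struct LoadResult {
  int loaded = 0;       // files opened successfully
  int vanished = 0;     // files that disappeared after the listing
  bool failed = false;  // any error other than ENOENT
};

// Hypothetical helper: names of *.ibd entries in a directory.
std::vector<std::string> list_ibd(const std::string& dir) {
  std::vector<std::string> names;
  DIR* d = opendir(dir.c_str());
  if (!d) return names;
  while (dirent* e = readdir(d)) {
    const std::string n = e->d_name;
    if (n.size() > 4 && n.compare(n.size() - 4, 4, ".ibd") == 0)
      names.push_back(n);
  }
  closedir(d);
  return names;
}

// Open every listed file. A file that is gone (ENOENT) is skipped instead of
// aborting the whole enumeration; any other error still stops the loop.
LoadResult load_listed(const std::string& dir,
                       const std::vector<std::string>& names) {
  LoadResult r;
  for (const std::string& n : names) {
    const int fd = open((dir + "/" + n).c_str(), O_RDONLY);
    if (fd < 0) {
      if (errno == ENOENT) {  // deleted between readdir() and open()
        ++r.vanished;
        continue;
      }
      r.failed = true;
      break;
    }
    close(fd);  // a real implementation would read and validate page 0 here
    ++r.loaded;
  }
  return r;
}
```

In the real code, a skipped file would presumably also require discarding the tablespace object that fil_space_t::create() had already registered for it.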



 Comments   
Comment by Marko Mäkelä [ 2022-07-01 ]

This would be trivially fixed by implementing MDEV-14992. There could be other fixes as well.
