[MDEV-20755] InnoDB: Database page corruption on disk or a failed file read of tablespace upon prepare of mariabackup incremental backup Created: 2019-10-05 Updated: 2023-05-18 Resolved: 2020-10-23 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Encryption, mariabackup, Storage Engine - InnoDB |
| Affects Version/s: | 10.2, 10.3, 10.4 |
| Fix Version/s: | 10.2.35, 10.3.26, 10.4.16, 10.5.7, 10.6.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Elena Stepanova | Assignee: | Vladislav Lesin |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | corruption, not-10.5, recovery, rr-profile-analyzed | ||
| Issue Links: |
|
| Description |
|
Note: Can't be a duplicate of
The test case is non-deterministic; run it with --repeat=N and the other options listed below. It usually fails for me within 5 attempts, but this can vary between machines and builds. The probability of failure seems higher with a larger number of tables, columns and rows. Running in shm also seems to help.
Fails on 10.2-10.4, on all of debug, ASAN and non-debug builds. |
| Comments |
| Comment by Marko Mäkelä [ 2019-10-07 ] | ||||||||||||||||||||||||||||||||||
|
Incremental backup is specific to Mariabackup (and Percona XtraBackup). | ||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2020-08-21 ] | ||||||||||||||||||||||||||||||||||
|
I analyzed the trace with the following breakpoints:
The MLOG_INIT_FILE_PAGE2 suggests that we should not read the page 89:9. But we are doing exactly that! | ||||||||||||||||||||||||||||||||||
| Comment by Thirunarayanan Balathandayuthapani [ 2020-08-26 ] | ||||||||||||||||||||||||||||||||||
|
InnoDB skips creating the page from the redo log because an MLOG_INDEX_LOAD LSN is recorded for the tablespace.
So it forces the page to be read from disk. IMO, there is no issue in InnoDB recovery.
It proves that innodb_log_optimize_ddl is enabled. | ||||||||||||||||||||||||||||||||||
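The decision described in this comment can be sketched as a toy function. This is only a model of what the comment states, not InnoDB code: the function name and parameters are hypothetical, and the real condition in recovery may involve additional LSN comparisons.

```python
def must_read_page_from_disk(has_init_record: bool, index_load_lsn: int) -> bool:
    """Toy decision: does recovery have to read the page from disk?

    has_init_record: a page-initialisation record (such as
        MLOG_INIT_FILE_PAGE2) was seen for the page.
    index_load_lsn: LSN of an MLOG_INDEX_LOAD record seen for the
        tablespace, or 0 if none was seen (hypothetical encoding).
    """
    if not has_init_record:
        return True   # nothing in the redo log to rebuild the page from
    if index_load_lsn != 0:
        return True   # some changes may have bypassed the redo log
    return False      # the page can be rebuilt from redo alone


# With innodb_log_optimize_ddl in effect, the init record alone is not enough.
print(must_read_page_from_disk(True, 0))
print(must_read_page_from_disk(True, 1234))
```

Under this model, the read of page 89:9 is expected behaviour whenever an MLOG_INDEX_LOAD record exists for the tablespace, which matches the conclusion that recovery itself is not at fault.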
| Comment by Marko Mäkelä [ 2020-08-26 ] | ||||||||||||||||||||||||||||||||||
|
Because I suspect that the problem is that the function backup_fix_ddl() is not compatible with incremental backup. thiru observed that xtrabackup_apply_delta() was applying changes to this tablespace. Maybe backup_fix_ddl() should have replaced or removed the previously generated .delta files when re-copying the entire data file? | ||||||||||||||||||||||||||||||||||
| Comment by Daniel Kessel [ 2020-09-14 ] | ||||||||||||||||||||||||||||||||||
|
We are also encountering this on version 10.4.14 when using mariabackup as the Galera SST method. It would help us to have this bug fixed so that we don't have to fall back to rsync or mysqldump for SST. | ||||||||||||||||||||||||||||||||||
| Comment by Vladislav Lesin [ 2020-10-06 ] | ||||||||||||||||||||||||||||||||||
Briefly.

The first page of the tablespace is rewritten while the incremental backup is being prepared, because the tablespace is "created" instead of the existing tablespace being opened. This happens because the tablespace id from the *.meta file is not equal to the tablespace id of the data file in the full backup directory.

Detailed.

I have repeated the bug locally with the test provided by Elena, and done the following analysis.

1) The content of the "corrupted" page read by mariabackup is the same during incremental --backup and --prepare. But during --backup the page passes validation, and during --prepare it does not.

2) The "corrupted" page is encrypted. In mariabackup, if the page is encrypted, buf_page_is_corrupted() is invoked during backup only if the --extended_validation option is set. But even if we set the option, buf_page_is_corrupted() returns false on --backup and true on --prepare. The difference is that during --backup the page is passed to buf_page_is_corrupted() after decryption, but during --prepare the page is not decrypted before it is passed to the function.

3) The page is not decrypted in --prepare because space->crypt_data is NULL. space->crypt_data is NULL because fil_space_read_crypt_data() can't find CRYPT_MAGIC in the first page. So it turned out that the first page of the tablespace does not contain crypt data after the second --prepare, although it does contain it after the first --prepare. This means that the first page was rewritten during incremental --prepare.

4) When innodb_log_optimize_ddl is enabled, the backup_optimized_ddl_op() callback is invoked when MLOG_INDEX_LOAD is read during incremental --backup. The tablespace is pushed into a special container, is then treated as a newly created tablespace in backup_fix_ddl(), and is re-copied to a *.new file. But as this is an incremental backup, all copied data files are passed through the wf_incremental filter, which creates a *.delta file and copies only the pages with an LSN greater than incremental_lsn. So after backup_fix_ddl() is executed there will be a *.new.delta file containing a single page, the "corrupted" page from (2); the file does not contain the first page of the tablespace.

5) When incremental --prepare is executed (the second --prepare in Elena's test), the *.new.delta file is renamed to *.ibd.delta and applied as a regular *.delta file. It turned out that the first page of the tablespace is rewritten when the delta is applied, with the following stack trace:

So the new tablespace is "created" instead of the existing tablespace being opened, because the tablespace id from the *.meta file (the tablespace id of the *.new.delta file created in (4)) cannot be found in fil_system->spaces (fil_space_get_by_id(), invoked from xb_delta_open_matching_space(), returns NULL). fil_system->spaces is filled in xb_load_tablespaces() before the deltas are applied; that function iterates over the data files in the full backup directory.

So the first page of the tablespace is rewritten while the incremental backup is being prepared, because the tablespace is "created" instead of the existing tablespace being opened, because the tablespace id from the *.meta file is not equal to the tablespace id of the data file in the full backup directory.

How to fix?

I see the following variants:

1) I suppose that if the tablespace id is changed during incremental backup, all non-zero pages of this tablespace contain the new tablespace id, i.e. all non-zero pages were changed during the incremental backup. So there is no need to produce a delta in backup_fix_ddl() for such data files; we could just copy the whole data file as *.new.

2) Check in xb_delta_open_matching_space() whether the data file exists in the full backup directory; if it does, change its id, and invoke fil_space_create() only for non-existing files. | ||||||||||||||||||||||||||||||||||
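The chain of events in steps (4) and (5), and fix variant (1), can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not mariabackup code: the classes, the functions incremental_filter(), apply_delta() and replace_with_full_copy(), the CRYPT_MAGIC placeholder value, and all ids and LSNs are hypothetical and only mimic the behaviour described above.

```python
from dataclasses import dataclass, field

CRYPT_MAGIC = b"CRYPT"  # placeholder marker, not the real on-disk value


@dataclass
class Page:
    lsn: int
    payload: bytes = b""


@dataclass
class Tablespace:
    space_id: int
    pages: dict = field(default_factory=dict)  # page number -> Page

    def has_crypt_data(self) -> bool:
        page0 = self.pages.get(0)
        return page0 is not None and CRYPT_MAGIC in page0.payload


def incremental_filter(space, incremental_lsn):
    """Keep only pages newer than incremental_lsn (models wf_incremental)."""
    changed = {no: p for no, p in space.pages.items() if p.lsn > incremental_lsn}
    return Tablespace(space.space_id, changed)


def apply_delta(full_backup, name, delta):
    """Apply a delta to the file `name` in the full backup (toy model).

    If no known tablespace has the delta's id, a fresh tablespace is
    "created" in place of the file, with a blank page 0.
    """
    by_id = {ts.space_id: ts for ts in full_backup.values()}
    target = by_id.get(delta.space_id)
    if target is None:
        target = Tablespace(delta.space_id, {0: Page(lsn=0)})
        full_backup[name] = target  # the old page 0 is lost here
    target.pages.update(delta.pages)
    return target


def replace_with_full_copy(full_backup, name, space):
    """Fix variant (1): take the whole re-copied file, not a delta."""
    full_backup[name] = Tablespace(space.space_id, dict(space.pages))
    return full_backup[name]


def make_full_backup():
    # Full backup: the table has id 5 and crypt data on page 0.
    return {"t1.ibd": Tablespace(5, {0: Page(10, CRYPT_MAGIC), 9: Page(20, b"old")})}


# On the server the table was rebuilt with a new id; the LSN of page 0 is
# not greater than incremental_lsn, so the filter drops it.
incremental_lsn = 100
server_space = Tablespace(6, {0: Page(0, CRYPT_MAGIC), 9: Page(120, b"encrypted")})

delta = incremental_filter(server_space, incremental_lsn)
buggy = apply_delta(make_full_backup(), "t1.ibd", delta)
fixed = replace_with_full_copy(make_full_backup(), "t1.ibd", server_space)

print(sorted(delta.pages), buggy.has_crypt_data(), fixed.has_crypt_data())
```

In the model, the delta carries only page 9, the id lookup fails, and the freshly "created" tablespace has no crypt data on page 0, so a later validation of the encrypted page 9 would see it undecrypted. Copying the whole file sidesteps both the id mismatch and the missing page 0.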
| Comment by Vladislav Lesin [ 2020-10-14 ] | ||||||||||||||||||||||||||||||||||
|
I failed to create a stable test case for it. The test scenario I am trying to implement is the following: 1) Disable page flushing. As page 0 is not flushed by the server, its LSN is 0 just after the table is optimized. That is why the page is not copied to the .delta file during incremental backup, and why the tablespace is created during incremental prepare: there is no tablespace with the same id during prepare, so the tablespace is created without the CRYPT_MAGIC mark on page 0. I supposed that the following page_is_corrupted() call must fail, as it will check CRCs for an encrypted page that was not decrypted, because fil_space_read_crypt_data(), invoked on tablespace open, does not read crypt data from page 0. But the test is unstable because the order of recv_recover_page() calls for different pages is not specified during recovery. Sometimes fil_parse_write_crypt_data() for page 0 is invoked before page_is_corrupted() for any other page, and the page is then decrypted successfully before the CRC check, because fil_parse_write_crypt_data() sets the crypt data correctly for the tablespace. We could insert DBUG_EXECUTE_IF() in fil_parse_write_crypt_data() to avoid initializing space->crypt_data when the MLOG_FILE_WRITE_CRYPT_DATA redo record is applied, which would make the test stable. But I am not sure this is correct, because I am not sure it is correct for the recovery of one page to change data that can influence the recovery of other pages. It is supposed that recovery can be done independently for different pages. So the fil_parse_write_crypt_data() call from recv_recover_page() looks very suspicious. | ||||||||||||||||||||||||||||||||||
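The instability described in this comment comes from an ordering dependency, which a toy model makes visible. This is a sketch only: recover() and the page numbers are hypothetical, and the model assumes, as the comment states, that crypt data becomes available only once the redo record for page 0 has been processed.

```python
from itertools import permutations


def recover(order):
    """Return the pages that fail validation for a given recovery order.

    Page 0 carries the crypt-data redo record; pages 9 and 12 are
    encrypted and can be validated only once crypt data is known.
    """
    crypt_data_known = False
    failed = []
    for page_no in order:
        if page_no == 0:
            crypt_data_known = True  # models fil_parse_write_crypt_data()
        elif not crypt_data_known:
            failed.append(page_no)   # checksum checked on an undecrypted page
    return failed


# Enumerate every possible recovery order of the three pages.
outcomes = {order: recover(order) for order in permutations([0, 9, 12])}
for order, failed in sorted(outcomes.items()):
    print(order, failed)
```

Only the orders in which page 0 comes first report no failure, so a test that relies on the failure passes or fails depending on an order that recovery does not guarantee.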
| Comment by Vladislav Lesin [ 2020-10-20 ] | ||||||||||||||||||||||||||||||||||
|
Pushed https://github.com/MariaDB/server/tree/bb-10.2-MDEV-20755-incr-copy-new-files for testing. | ||||||||||||||||||||||||||||||||||
| Comment by Vladislav Lesin [ 2020-10-22 ] | ||||||||||||||||||||||||||||||||||
|
There will be conflicts when merging it to 10.4; they are resolved in the https://github.com/MariaDB/server/tree/10.4-MDEV-20755-incr-copy-new-files branch. The fix should also be merged into 10.5+ even though the issue does not affect 10.5, because it processes files created by DDL during backup more efficiently for incremental backup and prepare. As it is only a performance fix for 10.5, there is no need to merge the test case for it. Besides, the test case will always pass on 10.5+, as DDL log optimization is deprecated in those versions. The 10.5 conflicts are resolved here: https://github.com/MariaDB/server/tree/10.5-MDEV-20755-incr-copy-new-files. | ||||||||||||||||||||||||||||||||||
| Comment by Vladislav Lesin [ 2020-10-23 ] | ||||||||||||||||||||||||||||||||||
|
The fix was reviewed by wlad. |