[MDEV-15270] mariabackup.data_directory, mariabackup.partial_exclude failed in buildbot with error on exec Created: 2018-02-11 Updated: 2022-07-26 Resolved: 2022-07-26 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Backup, Tests |
| Affects Version/s: | 10.2 |
| Fix Version/s: | 10.2.13, 10.3.5, 10.4.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Elena Stepanova | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Description |
|
10.2 cdb7a8fa6928f3fb103ed7f66486dc91 kvm-deb-xenial-x86 2018-01-10 23:54:03 4987 nm Normal run, no --ps-protocol
Logs not available |
| Comments |
| Comment by Elena Stepanova [ 2018-02-11 ] |
|
Looks similar, but with log (for now)
|
| Comment by Vladislav Vaintroub [ 2018-02-11 ] |
|
marko, can you check where this message comes from? I recall that doublewrite was disabled in mariabackup in 10.1. There should not be a "Restoring from doublewrite buffer" message; neither backup nor prepare should ever use it. |
| Comment by Vladislav Vaintroub [ 2018-02-11 ] |
|
|
| Comment by Marko Mäkelä [ 2018-02-12 ] |
|
I pushed a simple clean-up patch to prevent any potential disaster (writing to a source file in --backup). The patch probably changes nothing in practice, because the source files would have been opened in read-only mode, which blocks the writes anyway.
thiru, please try to reproduce this problem. It probably is most feasible with DBUG_EXECUTE_IF("backup_read_page_fail", …) and something like
Implement appropriate retry logic when a data file page appears corrupted. Test this with the system tablespace, the undo tablespace files, and with .ibd files, both with the first page and with some subsequent page (say, page 1).
At least in the function xb_load_single_table_tablespace() we should try rereading, and we should not call exit(EXIT_FAILURE) but instead return any errors to the caller. Also, we could try to tolerate missing files, because the user could be executing DROP TABLE or RENAME TABLE in parallel with the backup.
Once you have worked on the 10.2 fix, you should be able to tell if this fix is needed in Mariabackup 10.1. |
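The retry idea described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the Mariabackup implementation: `PageReadResult`, `read_page_with_retry()`, `backup_can_continue()` and both callback types are hypothetical names invented for this sketch, and the real page I/O and checksum validation are replaced by caller-supplied callbacks. It shows the three points from the comment: reread a page that looks corrupted, return errors to the caller instead of calling exit(EXIT_FAILURE), and let the caller treat a missing file as tolerable.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical result codes for this sketch; not names used in Mariabackup.
enum class PageReadResult { Ok, Corrupted, FileMissing, IoError };

using Page = std::vector<std::uint8_t>;

// Hypothetical callbacks standing in for the real file I/O and page checksum code.
using ReadPageFn = std::function<PageReadResult(std::size_t page_no, Page& out)>;
using PageIsValidFn = std::function<bool(const Page&)>;

// Read one page, rereading it up to max_attempts times while it looks corrupted.
// Every failure is reported through the return value; nothing here calls exit().
PageReadResult read_page_with_retry(std::size_t page_no,
                                    const ReadPageFn& read_page,
                                    const PageIsValidFn& page_is_valid,
                                    unsigned max_attempts,
                                    Page& out)
{
    for (unsigned attempt = 0; attempt < max_attempts; ++attempt) {
        const PageReadResult r = read_page(page_no, out);
        if (r != PageReadResult::Ok) {
            // FileMissing or IoError: rereading will not help, let the caller decide.
            return r;
        }
        if (page_is_valid(out)) {
            return PageReadResult::Ok;
        }
        // The page may have been read while the server was writing it; try again.
    }
    return PageReadResult::Corrupted;
}

// Caller-side policy: a file that disappeared (for example because of a
// concurrent DROP TABLE or RENAME TABLE) is skipped rather than treated as fatal.
bool backup_can_continue(PageReadResult r)
{
    return r == PageReadResult::Ok || r == PageReadResult::FileMissing;
}
```

Keeping the I/O and the validity check behind callbacks is a choice made for this sketch so that a test can inject a failing read, in the spirit of the DBUG_EXECUTE_IF suggestion, without touching real data files.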
| Comment by Marko Mäkelä [ 2018-02-14 ] |
|
I pushed a cleanup to MariaDB 10.2.13 that removes Mariabackup's use of the doublewrite buffer. With that fix merged, a test failed in bb-10.2-ext:
Just as I anticipated in my previous comment, Mariabackup fails to reread the page. |
| Comment by Marko Mäkelä [ 2022-07-26 ] |
|
According to the cross-reference, the last regular failures of the mentioned tests in main branches occurred in 2018. The bogus use of the doublewrite buffer was removed in MariaDB 10.2.13. The probability of intermittent failures should have been drastically reduced by |