[MDEV-27621] Backup fails with FATAL ERROR: Was only able to copy log from .. to .., not .. Created: 2022-01-25 Updated: 2022-02-01 Resolved: 2022-01-27
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | mariabackup |
| Affects Version/s: | 10.8 |
| Fix Version/s: | 10.8.1 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Elena Stepanova | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Issue Links: |
| Description |
|
The test performs a number of tiny updates on a single table and, after they are done, runs mariabackup --backup. On earlier versions it works fine, but on 10.8 it fails almost reliably with the error quoted in the summary (FATAL ERROR: Was only able to copy log from .. to .., not ..).
The value of innodb_log_file_size is indeed small in order to minimize the test, but the issue is scalable. I started from the default 96M, with several tables with more columns, more keys, and a larger number of UPDATEs, but an otherwise very similar load. I suppose it scales further up the same way; I didn't try. It fails for me most of the time, both in shm and on SSD, but it is not deterministic, and the result can vary on different machines and builds. Possibly it can be made more stable by more precise handling of the InnoDB log.
The failure started happening on 10.8 after this commit:
|
| Comments |
| Comment by Marko Mäkelä [ 2022-01-26 ] |
|
The test works for me with cycles values of 2000, 5000, and 10000 when running on /dev/shm and using libpmem (the mmap interface to the server’s 2-megabyte ib_logfile0). I then switched to the pwrite- and pread-based interface:
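As a rough illustration of the interface being switched to here, the following is a minimal Python sketch of positioned (pwrite/pread-style) log file access, as opposed to the libpmem path where the server simply memcpy()s into the mmap-ed file and persists it. This is not MariaDB code; all names are made up.

```python
import os
import tempfile

BLOCK = 512  # the redo log is written in 512-byte blocks

def pwrite_pread_roundtrip() -> bytes:
    """Write a payload at a block-aligned offset with a positioned write
    (no seek), then read it back with a positioned read."""
    fd, path = tempfile.mkstemp()
    try:
        payload = b"redo-record"
        os.pwrite(fd, payload, BLOCK)             # positioned write
        return os.pread(fd, len(payload), BLOCK)  # positioned read
    finally:
        os.close(fd)
        os.unlink(path)
```

On the libpmem path, the analogous operation would be a plain memcpy into the mapped 2 MiB ib_logfile0 followed by a persist/flush of the written range.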
Also with this (and cycles=10000) the test passes for me:
On /dev/shm (fake PMEM) it was slightly faster, again with cycles=10000:
On NVMe, I saw about 16 MiB/s writes and 2.5 MiB/s reads happening while the test is running. And the test finally failed (already with cycles=2000):
The likely reason why you see this happening after … I tried once more, with innodb_flush_method=O_DSYNC so that O_DIRECT I/O will be used on the log file. Again, I am seeing about the same write and read rates.
I do not think that there is any easy fix for this. Because there are many more writes happening than reads, the issue cannot even be that mariadb-backup --backup would be unnecessarily re-reading too large portions of the log file. elenst, when you use a larger redo log (and a larger workload), will the test pass if you invoke mariadb-backup with a larger innodb_log_buffer_size? The minimum value is 2MiB (changed in …).

Ever since I heard about the closed-source InnoDB Hot Backup product (ibbackup) in 2003 or 2004, I have thought that inter-process communication via the database log and data files is error-prone and broken by design. This is just another proof of that. As far as I can tell, the only reason why the separate ibbackup process was created was licensing: Innobase Oy wanted to get additional revenue directly from end users by offering a closed-source online backup program. Later, Percona XtraBackup was created; it reimplemented the fairly small closed-source part of the code. At that time, some #ifdef UNIV_HOTBACKUP hints were helpfully available in the InnoDB storage engine code. Yet later, mariabackup (mariadb-backup) was created out of xtrabackup, initially to support file format changes that are unique to MariaDB.

This problem would ultimately be fixed by implementing server-side backup (MDEV-14992), so that the same server process will produce both the main (circular) server log and a replica (append-only) log that will be needed to recover or prepare the backup.
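The race between the server and the backup copier can be modeled in a few lines. This is a hypothetical sketch, not server code: the redo log is circular, so once the copier's read position falls behind the server's current LSN by more than the payload capacity of ib_logfile0, the bytes it still needs have already been overwritten and the copy must fail. The constants mirror the 2 MiB log and 12 KiB header area discussed in this issue.

```python
LOG_FILE_SIZE = 2 * 1024 * 1024         # the test's 2 MiB ib_logfile0
HEADER_SIZE = 12288                     # first 12 KiB: header/checkpoint blocks
CAPACITY = LOG_FILE_SIZE - HEADER_SIZE  # circular payload capacity

def copier_lost_data(copier_lsn: int, server_lsn: int) -> bool:
    """True when the server has wrapped around and overwritten log bytes
    that the backup copier has not read yet, i.e. the situation behind
    'Was only able to copy log from .. to .., not ..'."""
    return server_lsn - copier_lsn > CAPACITY
```

For example, `copier_lost_data(0, CAPACITY)` is still False, while one more byte of generated log makes it True. Presumably a larger innodb_log_buffer_size in mariadb-backup helps only by letting the copier read bigger chunks per iteration; it does not change this bound.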
| Comment by Elena Stepanova [ 2022-01-26 ] |
Normal, non-MTR-downgraded tests run (and now frequently fail) with the default innodb_log_file_size and innodb_log_buffer_size, which for 10.8 are now 96M and 16M respectively. The same tests running with default values never failed this way before.
| Comment by Marko Mäkelä [ 2022-01-26 ] |
|
I tested the following modification to the test case:
That test passed on my NVMe, using a gcc -Og debug build:
Note: I saw a great reduction of the write rate, possibly because the server would be less busy with checkpoint flushing due to the ridiculously small redo log size. We should keep in mind that normally we recommend the buffer pool size to be something like 80% of the available RAM, and (now that …).

Perhaps mleich can play with this further? I think that we need reasonable parameters; let's say at least innodb_log_file_size=96M, which I believe is the default. Increase the cycles until it starts to fail (if it starts to fail at all), and then see what the minimum value of --innodb-log-buffer-size is with which mariadb-backup will start to work. Its default value of 2M can definitely be too small. The server's innodb_log_buffer_size should not matter.
| Comment by Elena Stepanova [ 2022-01-26 ] |
|
I don't know what you mean by "default value 2M". As I said, the default value of --innodb-log-buffer-size is 16M, assuming that mariabackup uses the same defaults as the server.
| Comment by Marko Mäkelä [ 2022-01-26 ] |
|
For mariadb-backup, the default innodb_log_buffer_size (which as of …)
| Comment by Elena Stepanova [ 2022-01-26 ] |
|
Okay. In my tests it also fails with mariabackup --backup --innodb-log-buffer-size=96M, plus at least server-side …
With this, I'm passing over further experiments to mleich.
| Comment by Marko Mäkelä [ 2022-01-27 ] |
|
I can confirm that this is indeed directly caused by an implementation deficiency in ….

The problem is that when the circular ib_logfile0 wraps around, mariadb-backup --backup will only (re)read the portion from the current position. I added rr record to the exec line in the provided script, and on the second try it failed with an extreme case: we were re-reading the last 512 bytes, and each time got PREMATURE_EOF from the log parser for trying to parse the last 16 bytes of that block.
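The wrap-around handling that the fix has to perform can be simulated as follows. This is a hypothetical sketch, not the actual patch: the payload area of ib_logfile0 is circular behind a 12 KiB header, so a read that runs past the end of the file must be split into two chunks, the tail of the file plus a second chunk resuming at byte offset 12288.

```python
LOG_FILE_SIZE = 2 * 1024 * 1024
HEADER_SIZE = 12288
CAPACITY = LOG_FILE_SIZE - HEADER_SIZE

def lsn_to_offset(lsn: int, base_lsn: int) -> int:
    """Byte offset of an LSN inside the circular payload area."""
    return HEADER_SIZE + (lsn - base_lsn) % CAPACITY

def read_span(log: bytes, start_lsn: int, length: int, base_lsn: int = 0) -> bytes:
    """Copy `length` contiguous log bytes starting at start_lsn,
    splitting the read where the circular file wraps around."""
    assert 0 < length <= CAPACITY
    offset = lsn_to_offset(start_lsn, base_lsn)
    first = min(length, LOG_FILE_SIZE - offset)
    # second slice is empty when the span does not cross the wrap point
    return log[offset:offset + first] + log[HEADER_SIZE:HEADER_SIZE + length - first]
```

A record whose first half sits in the last bytes of the file and whose second half sits right after the header comes back contiguous from `read_span`; a copier that only reads "from the current position" to the end of the file would return just the first half.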
With the above patch, the simple test also passed on my NVMe, with 12 tests running in parallel. The read rates were about 1/5 of the write rates throughout the test.
| Comment by Marko Mäkelä [ 2022-01-27 ] |
|
A test of the code before the above fix produced a backup with an invalid log file:
The problem with the log file is that at the offset where we expect to find optional FILE_MODIFY records and a FILE_CHECKPOINT record that points to the checkpoint LSN, we instead find some data that could be from the middle of some mini-transaction. The data is in clear text and does appear in several .ibd files.

A possible explanation is that due to this bug, mariadb-backup --backup kept retrying a read of the (say) 2048 last bytes of the server’s ib_logfile0 until the server had generated (innodb_log_file_size-12288)·n bytes more log, and mariadb-backup would finally observe that a mini-transaction luckily ended at the very end of that block. This then allowed the log file parsing in the backup to wrap around to byte offset 12288. The effect of this would be that (innodb_log_file_size-12288)·n bytes were omitted from the backed-up log.
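To put a number on this explanation, here is a sketch using the values from this issue (an illustration, not output from the actual tool): each full circuit of the circular log carries innodb_log_file_size − 12288 payload bytes, so a parser that silently skips n circuits omits n times that many bytes.

```python
LOG_FILE_SIZE = 2 * 1024 * 1024         # the test's innodb_log_file_size
HEADER_SIZE = 12288
CAPACITY = LOG_FILE_SIZE - HEADER_SIZE  # payload bytes per circuit

def omitted_bytes(n_missed_circuits: int) -> int:
    """Redo bytes absent from the backup if log parsing only resumed
    after the server's log had wrapped around n more times."""
    return n_missed_circuits * CAPACITY
```

With the 2 MiB log used here, even n = 1 means 2084864 bytes of redo are missing from the backup copy, which would easily explain finding unrelated data where a FILE_CHECKPOINT record was expected.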