[MDEV-28994] Backup produces garbage when using memory-mapped log (PMEM) Created: 2022-07-01  Updated: 2022-07-05  Resolved: 2022-07-01

Status: Closed
Project: MariaDB Server
Component/s: Backup
Affects Version/s: 10.8
Fix Version/s: 10.8.4, 10.9.2, 10.10.1

Type: Bug Priority: Major
Reporter: Marko Mäkelä Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: affects-tests, corruption

Issue Links:
Problem/Incident
is caused by MDEV-14425 Change the InnoDB redo log format to ... Closed
Relates
relates to MDEV-14992 BACKUP: in-server backup Open

 Description   

MDEV-14425 introduced a memory-mapped interface to the redo log when the log is stored in persistent memory, or in the /dev/shm file system. This improves performance and makes debugging more convenient, because the entire circular log file will be accessible at log_sys.buf in rr replay traces.

However, this also exposed us to a (very reasonable) limitation of the rr debugger: it is assumed that the memory-mapped file will not be changed by processes that are not traced by the same rr record instance.

Because it may be hard to remember the rule that mariadb-backup --backup must not be traced with rr record when the PMEM interface is being used, we’d better make backup refuse the use of the "fake PMEM" (mmap on redo log located in /dev/shm) when the process is apparently being run under rr record. Normal file system calls should be fine. We can use innodb_use_native_aio=0 as a proxy for detecting rr record, because both the old io_setup (libaio) nor the newer io_uring interface will return ENOSYS under rr record.

Access to log files on real persistent memory (mount -o dax) will not be affected. It is assumed that rr record mariadb-backup --backup seldomly needs to be run such that the log is stored in persistent memory.



 Comments   
Comment by Marko Mäkelä [ 2022-07-01 ]

mleich did some more testing. Unfortunately, the problems occur even when not using rr. Both /dev/shm and PMEM are affected. We even tried the following patch to force mariadb-backup --backup to always read the server’s ib_logfile0 via file system calls (pread(2)):

diff --git a/storage/innobase/log/log0log.cc b/storage/innobase/log/log0log.cc
index efdd527a28f..60d661f95f1 100644
--- a/storage/innobase/log/log0log.cc
+++ b/storage/innobase/log/log0log.cc
@@ -176,7 +176,7 @@ void log_t::attach(log_file_t file, os_offset_t size)
 #ifdef HAVE_PMEM
   ut_ad(!buf);
   ut_ad(!flush_buf);
-  if (size && !(size_t(size) & 4095))
+  if (size && !(size_t(size) & 4095) && srv_operation != SRV_OPERATION_BACKUP)
   {
     void *ptr=
       my_mmap(0, size_t(size),

mleich, can you please fill in more details?

I am afraid that the only way to fix this is to make the server process provide a copy of the log to the backup process. That would be a subset of MDEV-14992.

Comment by Marko Mäkelä [ 2022-07-01 ]

I forgot to mention the most important part: The error that was reported by mariadb-backup --prepare was

[ERROR] InnoDB: Missing FILE_CHECKPOINT…

In the dataset that I analyzed, the ib_logfile0 that had been produced by mariadb-backup --backup looked like garbage. I did not recognize proper mini-transaction boundaries. The checkpoint header pointed to the middle of something that looked a little like updating some FULLTEXT INDEX related fields, while it should have been a FILE_CHECKPOINT record, optionally preceded by a number of FILE_MODIFY records. Because of this mismatch, no mini-transaction was successfully parsed from the log. The end-of-minitransaction byte should always be 1 and never 0 in a log file that is produced by backup, but here the parser reached the byte 0. That triggered the end of file.

What could have happened is the following:

  1. Backup parsed a mini-transaction from the memory-mapped buffer. This took some time.
  2. Meanwhile, the server might have overwritten the circular log buffer with something else, that is, backup was too slow to keep up and the backup should actually have failed.
  3. Backup did memcpy() of the overwritten data to the output.

Never using mmap for reading the log in mariadb-backup --backup should fix this type of scenario.

Generated at Thu Feb 08 10:05:03 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.