[MDEV-27939] Log buffer wrap-around errors on PMEM Created: 2022-02-24  Updated: 2022-02-25  Resolved: 2022-02-25

Status: Closed
Project: MariaDB Server
Component/s: Backup, Storage Engine - InnoDB
Affects Version/s: 10.8.1, 10.8.2
Fix Version/s: 10.8.3

Type: Bug Priority: Blocker
Reporter: Marko Mäkelä Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: crash, recovery
Environment:

GNU/Linux with libpmem


Issue Links:
Problem/Incident
is caused by MDEV-14425 Change the InnoDB redo log format to ... Closed

 Description   

mleich reproduced an assertion failure in mariadb-backup --prepare:

mariabackup: /data/Server/bb-10.8-MDEV-26827/extra/mariabackup/xtrabackup.cc:3017: bool xtrabackup_copy_logfile():
Assertion `recv_sys.offset < recv_sys.len' failed.

In the core dump, we actually have recv_sys.offset == recv_sys.len, so it looks like the assertion should be relaxed to include the equality. Furthermore, at the end of the loop body, the recv_sys.offset needs to wrap around to log_sys.START_OFFSET (0x3000).

To be able to repeat that bug, I modified an existing test case to trigger log file wrap-around exactly at a record boundary:

diff --git a/mysql-test/suite/mariabackup/huge_lsn.test b/mysql-test/suite/mariabackup/huge_lsn.test
index 00c7e66516c..62b16ef93fd 100644
--- a/mysql-test/suite/mariabackup/huge_lsn.test
+++ b/mysql-test/suite/mariabackup/huge_lsn.test
@@ -40,27 +40,33 @@ do "$ENV{MTR_SUITE_DIR}/../innodb/include/crc32.pl";
 my $file= "$ENV{MYSQLD_DATADIR}/ib_logfile0";
 open(FILE, ">", $file) or die "Unable to open $file\n";
 binmode FILE;
-# the desired log sequence number, plus 16
 my $extra_repeat = 139820;
+# The desired log sequence number is 17596481011216 (0x1000fffffe10).
+my $file_size=4<<20;
 my $lsn_hi=4096,$lsn_lo=0xfffffe00 - $extra_repeat * 15;
 my $polynomial = 0x82f63b78; # CRC-32C
 my ($header, $checkpoint, $log);
-$header = "Phys" . pack("x[4]NN", $lsn_hi, $lsn_lo) .
+$header = "Phys" . pack("x[4]NN", $lsn_hi, $lsn_lo - $file_size + 0x300f) .
           "some Perl code" . pack("x[478]");
 $header .= pack("Nx[3584]", mycrc32($header, 0, $polynomial));
 $checkpoint = pack("NNNNx[44]", $lsn_hi, $lsn_lo, $lsn_hi, $lsn_lo);
 $checkpoint .= pack("Nx[8128]", mycrc32($checkpoint, 0, $polynomial));
 $log = pack("CxxNN", 0xfa, $lsn_hi, $lsn_lo);
-$log .= pack("CN", 1, mycrc32($log, 0, $polynomial));
+$log .= pack("CN", 0, mycrc32($log, 0, $polynomial));
 
 # Write more than 2MiB of FILE_MODIFY mini-transactions to exercise the parser.
 my $extra = pack("CCxa*", 0xb9, 127, "a/b.ibd");
-$extra .= pack("CN", 1, mycrc32($extra, 0, $polynomial));
+$extra .= pack("CN", 0, mycrc32($extra, 0, $polynomial));
 
-print FILE $header, $checkpoint, $extra x $extra_repeat, $log;
+print FILE $header, $checkpoint, $extra x ($extra_repeat - 1), $log;
+seek(FILE, $file_size - 15, 0);
+$extra = pack("CCxa*", 0xb9, 127, "c/d.ibd");
+$extra .= pack("CN", 1, mycrc32($extra, 0, $polynomial));
+print FILE $extra;
 close(FILE) or die "Unable to close $file\n";
 EOF
---let SEARCH_PATTERN= redo log from 2\\.012MiB to [0-9.]*[KMGT]iB; LSN=17596481011216\\b
+--let SEARCH_PATTERN= InnoDB: log sequence number 17596481011216
+--let $restart_parameters=--innodb-log-file-size=4M
 }
 
 --source include/start_mysqld.inc
@@ -80,6 +86,7 @@ INSERT INTO t VALUES(2);
 echo # xtrabackup prepare;
 --disable_result_log
 exec $XTRABACKUP  --prepare --target-dir=$targetdir;
+--let $restart_parameters=
 --source include/restart_and_restore.inc
 --enable_result_log
 SELECT * FROM t;

This test works fine when the mmap() based interface to the log is not being used. That interface would be used when InnoDB is linked with libpmem and the data directory resides in /dev/shm or in a mount -o dax mounted file system. If those conditions are met, InnoDB would fail to start up like this:

10.8

CURRENT_TEST: mariabackup.huge_lsn
mysqltest: At line 74: query 'CREATE TABLE t(i INT) ENGINE INNODB' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'INNODB'
2022-02-24 17:03:33 0 [ERROR] InnoDB: Missing FILE_CHECKPOINT(17596478913900) at 17596478913900

An analysis shows that parsing wrongly continues at recv_sys.offset=0x400000, that is, right past the end of the mmap() buffer. The offset should have wrapped around to log_sys.START_OFFSET.

So, there is something to fix in both the server and the backup.



 Comments   
Comment by Marko Mäkelä [ 2022-02-24 ]

The following fixes the server startup for the changed test:

diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
index e3e27d02969..f53d088b99a 100644
--- a/storage/innobase/log/log0recv.cc
+++ b/storage/innobase/log/log0recv.cc
@@ -2323,6 +2323,11 @@ inline recv_sys_t::parse_mtr_result recv_sys_t::parse(store_t store, source &l)
   ut_d(const source el{l});
   lsn+= l - begin;
   offset= l.ptr - log_sys.buf;
+  if (!log_sys.is_pmem());
+  else if (offset == log_sys.file_size)
+    offset= log_sys.START_OFFSET;
+  else
+    ut_ad(offset < log_sys.file_size);
 
   ut_d(std::set<page_id_t> freed);
 #if 0 && defined UNIV_DEBUG /* MDEV-21727 FIXME: enable this */

However, the test is not exercising the suspected backup bug, because backup is starting from recv_sys.offset=log_sys.START_LSN in the trace that I analyzed. The reason is that the checkpoint LSN had advanced to 0x1000fffffe20 (one extra checkpoint after startup). The reason for that is that the log is being rebuilt on startup because the log file was being rebuilt in order to add encryption.

With these additional tweaks, I finally exercised the backup code:

diff --git a/mysql-test/suite/mariabackup/huge_lsn.test b/mysql-test/suite/mariabackup/huge_lsn.test
index ceb5983ec8c..6a53b97f959 100644
--- a/mysql-test/suite/mariabackup/huge_lsn.test
+++ b/mysql-test/suite/mariabackup/huge_lsn.test
@@ -66,7 +66,7 @@ print FILE $extra;
 close(FILE) or die "Unable to close $file\n";
 EOF
 --let SEARCH_PATTERN= InnoDB: log sequence number 17596481011216
---let $restart_parameters=--innodb-log-file-size=4M
+--let $restart_parameters=--innodb-log-file-size=4M --innodb-encrypt-log=0
 }
 
 --source include/start_mysqld.inc

And this temporary fault injection, to ignore the unwanted new checkpoint during backup, specific to this test case:

diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
index f53d088b99a..f852b530fed 100644
--- a/storage/innobase/log/log0recv.cc
+++ b/storage/innobase/log/log0recv.cc
@@ -1796,7 +1796,8 @@ dberr_t recv_sys_t::find_checkpoint()
         continue;
       }
 
-      if (checkpoint_lsn >= log_sys.next_checkpoint_lsn)
+      if (checkpoint_lsn >= log_sys.next_checkpoint_lsn &&
+          checkpoint_lsn != 0x1000fffffe10)
       {
         log_sys.next_checkpoint_lsn= checkpoint_lsn;
         log_sys.next_checkpoint_no= field == log_t::CHECKPOINT_1;

Backup did not crash in the expected place, because the pointer was being wrapped to recv_sys.offset=log_sys.START_OFFSET thanks to the above fix. But, something else failed:

diff --git a/extra/mariabackup/xtrabackup.cc b/extra/mariabackup/xtrabackup.cc
index 851c83c76db..486ee5cdb57 100644
--- a/extra/mariabackup/xtrabackup.cc
+++ b/extra/mariabackup/xtrabackup.cc
@@ -803,9 +803,8 @@ static std::string filename_to_spacename(const void *filename, size_t len)
 	ut_a(table);
 	*table = 0;
 	char *db = strrchr(f, '/');
-	ut_a(db);
 	*table = '/';
-	std::string s(db+1);
+	std::string s(db ? db+1 : f);
 	free(f);
 	return s;
 }

The function expected a file name like "./c/b.ibd" and not just "c/b.ibd" like the test case is writing it. This wasn’t exercised earlier because the server had always rebuilt the log file at startup before backup got to read it. Finally, with that fixed, we got the next problem at mariadb-backup --prepare:

2022-02-24 19:21:22 0 [ERROR] InnoDB: Missing FILE_CHECKPOINT(17596478913900) at 17596478913900

I will continue fixing this tomorrow.

Comment by Marko Mäkelä [ 2022-02-25 ]

There was one more bug in backup, which caused 2 extra bytes to be written to the log, thus corrupting the backup. Also this one affected PMEM only:

@@ -3032,8 +3031,8 @@ static bool xtrabackup_copy_logfile()
                 ds_write(dst_log_file, &seq_1, 1))
               goto write_error;
             if (so < -1 &&
-                ds_write(dst_log_file, log_sys.buf + recv_sys.len - (1 - so),
-                         1 - so))
+                ds_write(dst_log_file, log_sys.buf + recv_sys.len + (1 + so),
+                         -(1 + so)))
               goto write_error;
             if (ds_write(dst_log_file, log_sys.buf + log_sys.START_OFFSET,
                          recv_sys.offset - log_sys.START_OFFSET))

Generated at Thu Feb 08 09:56:47 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.