Details
-
Bug
-
Status: Closed (View Workflow)
-
Blocker
-
Resolution: Fixed
-
10.8.1, 10.8.2
-
GNU/Linux with libpmem
Description
mleich reproduced an assertion failure in mariadb-backup --prepare:
mariabackup: /data/Server/bb-10.8-MDEV-26827/extra/mariabackup/xtrabackup.cc:3017: bool xtrabackup_copy_logfile():
|
Assertion `recv_sys.offset < recv_sys.len' failed.
|
In the core dump, we actually have recv_sys.offset == recv_sys.len, so it looks like the assertion should be relaxed to include the equality. Furthermore, at the end of the loop body, the recv_sys.offset needs to wrap around to log_sys.START_OFFSET (0x3000).
To be able to repeat that bug, I modified an existing test case to trigger log file wrap-around exactly at a record boundary:
diff --git a/mysql-test/suite/mariabackup/huge_lsn.test b/mysql-test/suite/mariabackup/huge_lsn.test
|
index 00c7e66516c..62b16ef93fd 100644
|
--- a/mysql-test/suite/mariabackup/huge_lsn.test
|
+++ b/mysql-test/suite/mariabackup/huge_lsn.test
|
@@ -40,27 +40,33 @@ do "$ENV{MTR_SUITE_DIR}/../innodb/include/crc32.pl";
|
my $file= "$ENV{MYSQLD_DATADIR}/ib_logfile0";
|
open(FILE, ">", $file) or die "Unable to open $file\n";
|
binmode FILE;
|
-# the desired log sequence number, plus 16
|
my $extra_repeat = 139820;
|
+# The desired log sequence number is 17596481011216 (0x1000fffffe10).
|
+my $file_size=4<<20;
|
my $lsn_hi=4096,$lsn_lo=0xfffffe00 - $extra_repeat * 15;
|
my $polynomial = 0x82f63b78; # CRC-32C
|
my ($header, $checkpoint, $log);
|
-$header = "Phys" . pack("x[4]NN", $lsn_hi, $lsn_lo) .
|
+$header = "Phys" . pack("x[4]NN", $lsn_hi, $lsn_lo - $file_size + 0x300f) .
|
"some Perl code" . pack("x[478]");
|
$header .= pack("Nx[3584]", mycrc32($header, 0, $polynomial));
|
$checkpoint = pack("NNNNx[44]", $lsn_hi, $lsn_lo, $lsn_hi, $lsn_lo);
|
$checkpoint .= pack("Nx[8128]", mycrc32($checkpoint, 0, $polynomial));
|
$log = pack("CxxNN", 0xfa, $lsn_hi, $lsn_lo);
|
-$log .= pack("CN", 1, mycrc32($log, 0, $polynomial));
|
+$log .= pack("CN", 0, mycrc32($log, 0, $polynomial));
|
|
# Write more than 2MiB of FILE_MODIFY mini-transactions to exercise the parser.
|
my $extra = pack("CCxa*", 0xb9, 127, "a/b.ibd");
|
-$extra .= pack("CN", 1, mycrc32($extra, 0, $polynomial));
|
+$extra .= pack("CN", 0, mycrc32($extra, 0, $polynomial));
|
|
-print FILE $header, $checkpoint, $extra x $extra_repeat, $log;
|
+print FILE $header, $checkpoint, $extra x ($extra_repeat - 1), $log;
|
+seek(FILE, $file_size - 15, 0);
|
+$extra = pack("CCxa*", 0xb9, 127, "c/d.ibd");
|
+$extra .= pack("CN", 1, mycrc32($extra, 0, $polynomial));
|
+print FILE $extra;
|
close(FILE) or die "Unable to close $file\n";
|
EOF
|
---let SEARCH_PATTERN= redo log from 2\\.012MiB to [0-9.]*[KMGT]iB; LSN=17596481011216\\b
|
+--let SEARCH_PATTERN= InnoDB: log sequence number 17596481011216
|
+--let $restart_parameters=--innodb-log-file-size=4M
|
}
|
|
--source include/start_mysqld.inc
|
@@ -80,6 +86,7 @@ INSERT INTO t VALUES(2);
|
echo # xtrabackup prepare;
|
--disable_result_log
|
exec $XTRABACKUP --prepare --target-dir=$targetdir;
|
+--let $restart_parameters=
|
--source include/restart_and_restore.inc
|
--enable_result_log
|
SELECT * FROM t; |
This test works fine when the mmap() based interface to the log is not being used. That interface would be used when InnoDB is linked with libpmem and the data directory resides in /dev/shm or in a mount -o dax mounted file system. If those conditions are met, InnoDB would fail to start up like this:
10.8 |
CURRENT_TEST: mariabackup.huge_lsn
|
mysqltest: At line 74: query 'CREATE TABLE t(i INT) ENGINE INNODB' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'INNODB'
|
…
|
2022-02-24 17:03:33 0 [ERROR] InnoDB: Missing FILE_CHECKPOINT(17596478913900) at 17596478913900
|
An analysis shows that parsing wrongly continues at recv_sys.offset=0x400000, that is, right past the end of the mmap() buffer. The offset should have wrapped around to log_sys.START_OFFSET.
So, there is something to fix in both the server and the backup.
Attachments
Issue Links
- is caused by
-
MDEV-14425 Change the InnoDB redo log format to reduce write amplification
-
- Closed
-
The following fixes the server startup for the changed test:
diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
index e3e27d02969..f53d088b99a 100644
--- a/storage/innobase/log/log0recv.cc
+++ b/storage/innobase/log/log0recv.cc
@@ -2323,6 +2323,11 @@ inline recv_sys_t::parse_mtr_result recv_sys_t::parse(store_t store, source &l)
ut_d(const source el{l});
lsn+= l - begin;
offset= l.ptr - log_sys.buf;
+ if (!log_sys.is_pmem());
+ else if (offset == log_sys.file_size)
+ offset= log_sys.START_OFFSET;
+ else
+ ut_ad(offset < log_sys.file_size);
ut_d(std::set<page_id_t> freed);
However, the test is not exercising the suspected backup bug, because backup is starting from recv_sys.offset=log_sys.START_LSN in the trace that I analyzed. The reason is that the checkpoint LSN had advanced to 0x1000fffffe20 (one extra checkpoint after startup). The reason for that is that the log is being rebuilt on startup because the log file was being rebuilt in order to add encryption.
With these additional tweaks, I finally exercised the backup code:
diff --git a/mysql-test/suite/mariabackup/huge_lsn.test b/mysql-test/suite/mariabackup/huge_lsn.test
index ceb5983ec8c..6a53b97f959 100644
--- a/mysql-test/suite/mariabackup/huge_lsn.test
+++ b/mysql-test/suite/mariabackup/huge_lsn.test
@@ -66,7 +66,7 @@ print FILE $extra;
close(FILE) or die "Unable to close $file\n";
EOF
--let SEARCH_PATTERN= InnoDB: log sequence number 17596481011216
---let $restart_parameters=--innodb-log-file-size=4M
+--let $restart_parameters=--innodb-log-file-size=4M --innodb-encrypt-log=0
}
And this temporary fault injection, to ignore the unwanted new checkpoint during backup, specific to this test case:
diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
index f53d088b99a..f852b530fed 100644
--- a/storage/innobase/log/log0recv.cc
+++ b/storage/innobase/log/log0recv.cc
@@ -1796,7 +1796,8 @@ dberr_t recv_sys_t::find_checkpoint()
continue;
}
- if (checkpoint_lsn >= log_sys.next_checkpoint_lsn)
+ if (checkpoint_lsn >= log_sys.next_checkpoint_lsn &&
+ checkpoint_lsn != 0x1000fffffe10)
{
log_sys.next_checkpoint_lsn= checkpoint_lsn;
Backup did not crash in the expected place, because the pointer was being wrapped to recv_sys.offset=log_sys.START_OFFSET thanks to the above fix. But, something else failed:
diff --git a/extra/mariabackup/xtrabackup.cc b/extra/mariabackup/xtrabackup.cc
index 851c83c76db..486ee5cdb57 100644
--- a/extra/mariabackup/xtrabackup.cc
+++ b/extra/mariabackup/xtrabackup.cc
@@ -803,9 +803,8 @@ static std::string filename_to_spacename(const void *filename, size_t len)
ut_a(table);
*table = 0;
char *db = strrchr(f, '/');
- ut_a(db);
*table = '/';
- std::string s(db+1);
+ std::string s(db ? db+1 : f);
free(f);
return s;
The function expected a file name like "./c/b.ibd" and not just "c/b.ibd" like the test case is writing it. This wasn’t exercised earlier because the server had always rebuilt the log file at startup before backup got to read it. Finally, with that fixed, we got the next problem at mariadb-backup --prepare:
2022-02-24 19:21:22 0 [ERROR] InnoDB: Missing FILE_CHECKPOINT(17596478913900) at 17596478913900
I will continue fixing this tomorrow.