[MDEV-14874] innodb_encrypt_log corrupts the log when the LSN crosses 32-bit boundary Created: 2018-01-05  Updated: 2020-08-25  Resolved: 2018-01-08

Status: Closed
Project: MariaDB Server
Component/s: Backup, Storage Engine - InnoDB, Storage Engine - XtraDB
Affects Version/s: 10.1.3
Fix Version/s: 10.2.9, 10.3.2, 10.1.31

Type: Bug
Priority: Major
Reporter: Marko Mäkelä
Assignee: Marko Mäkelä
Resolution: Fixed
Votes: 0
Labels: backup, encryption, recovery

Issue Links:
Relates
relates to MDEV-13318 Crash recovery failure after the serv... Closed
relates to MDEV-11782 Redefine the innodb_encrypt_log format Closed
relates to MDEV-13416 mariabackup --backup fails to copy lo... Closed

Description

If the InnoDB log sequence number (LSN) is at least 4294967296 (2^32), then mariabackup --backup will fail to decrypt encrypted redo log blocks.

Furthermore, if the most significant 32 bits of the LSN change while crash recovery is scanning the redo log, MariaDB will fail to decrypt the redo log blocks. This can be reproduced by porting the following 10.2 test to 10.1:

cp mysql-test/suite/mariabackup/{xb_file_key_management,huge_lsn}.opt
--- ../10.1/mysql-test/suite/mariabackup/huge_lsn.test	2018-01-05 16:56:51.798820291 +0200
+++ mysql-test/suite/mariabackup/huge_lsn.test	2017-10-10 09:07:34.950402715 +0300
@@ -16,7 +16,7 @@
 my $ps= $ENV{INNODB_PAGE_SIZE};
 my $page;
 die "Unable to read $file" unless sysread(FILE, $page, $ps) == $ps;
-substr($page,26,8) = pack("NN", 4096, ~1024);
+substr($page,26,8) = pack("NN", 4096, 0);
 substr($page,0,4)=pack("N",0xdeadbeef);
 substr($page,$ps-8,4)=pack("N",0xdeadbeef);
 sysseek(FILE, 0, 0) || die "Unable to rewind $file\n";
@@ -28,7 +28,7 @@
 
 --source include/start_mysqld.inc
 let SEARCH_FILE= $MYSQLTEST_VARDIR/log/mysqld.1.err;
---let SEARCH_PATTERN= InnoDB: .*started; log sequence number 17596481010700
+--let SEARCH_PATTERN= InnoDB: 5\.7\.\d+ started; log sequence number 17592186044428
 --source include/search_pattern_in_file.inc
 
 CREATE TABLE t(i INT) ENGINE INNODB;
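
To see how close the ported test starts to the boundary, the LSNs in the diff above can be split into 32-bit halves. The sketch below is illustrative arithmetic only, not MariaDB code: the helper names (lsn_high, lsn_low, bytes_to_next_boundary, make_lsn, check_issue_values) are made up for this example, and it assumes that Perl's pack("NN", ...) keeps the low 32 bits of each argument.

```cpp
#include <cassert>
#include <cstdint>

using lsn_t = std::uint64_t;  // 64-bit LSN; alias chosen for this sketch

// Upper and lower 32-bit halves of an LSN.
constexpr std::uint32_t lsn_high(lsn_t lsn) { return static_cast<std::uint32_t>(lsn >> 32); }
constexpr std::uint32_t lsn_low(lsn_t lsn) { return static_cast<std::uint32_t>(lsn); }

// Number of bytes left before the upper half increments.
constexpr lsn_t bytes_to_next_boundary(lsn_t lsn)
{
    return (lsn_t{1} << 32) - lsn_low(lsn);
}

// Same layout as two big-endian 32-bit words (high word first).
constexpr lsn_t make_lsn(std::uint32_t high, std::uint32_t low)
{
    return (lsn_t{high} << 32) | low;
}

// Checks the values that appear in the diff above.
inline void check_issue_values()
{
    // 10.2 test: page LSN (4096, 0), server starts at 17592186044428.
    assert(make_lsn(4096, 0) == 17592186044416ULL);
    assert(lsn_high(17592186044428ULL) == 4096);
    assert(lsn_low(17592186044428ULL) == 12);

    // 10.1 port: page LSN (4096, ~1024), server starts at 17596481010700.
    assert(make_lsn(4096, ~std::uint32_t{1024}) == 17596481010687ULL);
    assert(lsn_high(17596481010700ULL) == 4096);
    assert(bytes_to_next_boundary(17596481010700ULL) == 1012);
}
```

Both variants start with the upper half equal to 4096. The 10.2 test starts 12 bytes into that range, whereas the 10.1 port starts 1012 bytes below the next boundary, so only about 1 KiB of further redo log is needed before the upper half becomes 4097.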

Without the following patch, mariabackup would fail even earlier:

diff --git a/extra/mariabackup/xtrabackup.cc b/extra/mariabackup/xtrabackup.cc
index 437fc4aa7f9..9d8e8cd1061 100644
--- a/extra/mariabackup/xtrabackup.cc
+++ b/extra/mariabackup/xtrabackup.cc
@@ -4039,6 +4039,7 @@ xtrabackup_backup_func(void)
 
 	mutex_enter(&log_sys->mutex);
 	xtrabackup_choose_lsn_offset(checkpoint_lsn_start);
+	srv_start_lsn = checkpoint_lsn_start;
 	mutex_exit(&log_sys->mutex);
 
 	/* copy log file by current position */

It looks like we have to replace srv_start_lsn in log_blocks_crypt() with an explicit parameter, similar to the 10.2 function log_crypt().
Only in that way can we keep the current semantics of the variable srv_start_lsn while still being able to decrypt the entire redo log, even if the most significant 32 bits of the LSN change. The dependence on srv_start_lsn was already present in the initial implementation that appeared in MariaDB 10.1.3.
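
One way to picture this dependence is the simplified model below. It is a sketch under stated assumptions, not the actual log_blocks_crypt() code: it supposes that a log block records only the low 32 bits of its LSN and that the reader restores the upper 32 bits from a reference LSN. The names rebuild_block_lsn, stored_low, check_reference_lsn_example and the example positions are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

using lsn_t = std::uint64_t;  // 64-bit LSN; alias chosen for this sketch

// Assumption of this model: only the low 32 bits are kept with the block.
constexpr std::uint32_t stored_low(lsn_t block_lsn)
{
    return static_cast<std::uint32_t>(block_lsn);
}

// Restore the full LSN: upper half from the reference, lower half from the block.
constexpr lsn_t rebuild_block_lsn(lsn_t reference_lsn, std::uint32_t low)
{
    return (reference_lsn & ~lsn_t{0xFFFFFFFF}) | low;
}

// Example positions: startup happens 1024 bytes below a 32-bit boundary.
constexpr lsn_t start_lsn = (lsn_t{4096} << 32) | 0xFFFFFC00u;
constexpr lsn_t block_before = start_lsn + 512;   // upper half is still 4096
constexpr lsn_t block_after = start_lsn + 2048;   // upper half is now 4097
constexpr lsn_t scan_lsn = block_after - 512;     // scanner position, also 4097

inline void check_reference_lsn_example()
{
    // A reference frozen at startup works only until the upper half changes.
    assert(rebuild_block_lsn(start_lsn, stored_low(block_before)) == block_before);
    assert(rebuild_block_lsn(start_lsn, stored_low(block_after)) != block_after);

    // Passing the current scan position as a parameter restores the right LSN.
    assert(rebuild_block_lsn(scan_lsn, stored_low(block_after)) == block_after);
}
```

In this model, a reader that keeps using the startup LSN rebuilds every block past the boundary with the wrong upper half, and any cipher input derived from that LSN would then be wrong as well. Supplying the reference as a parameter lets each caller pass an LSN that matches the blocks it is processing.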



Comments
Comment by Marko Mäkelä [ 2018-01-08 ]

Mariabackup 10.1 would trivially fail to read any encrypted redo log if the LSN exceeds 4294967295, but a nastier problem is that InnoDB in MariaDB 10.1 could encrypt the redo log with wrong parameters. That is, not only backup, but also crash recovery may fail.

In the MariaDB 10.2 series, a similar fix was part of MDEV-13318.
