[MDEV-13318] Crash recovery failure after the server is killed during innodb_encrypt_log startup Created: 2017-07-13  Updated: 2020-10-19  Resolved: 2017-09-12

Status: Closed
Project: MariaDB Server
Component/s: Backup, Storage Engine - InnoDB
Affects Version/s: 10.1.3, 10.2.4, 10.3.0
Fix Version/s: 10.2.9, 10.3.2

Type: Bug Priority: Major
Reporter: Andrii Nikitin (Inactive) Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Problem/Incident
causes MDEV-13813 Assertion `size % 512U == 0' failed ... Closed
causes MDEV-23982 Mariabackup hangs during backup Closed
is caused by MDEV-8041 InnoDB redo log encryption Closed
Relates
relates to MDEV-11782 Redefine the innodb_encrypt_log format Closed
relates to MDEV-12041 Implement key rotation for innodb_enc... Closed
relates to MDEV-13044 Do not create a redo log checkpoint a... Closed
relates to MDEV-14874 innodb_encrypt_log corrupts the log w... Closed

 Description   

basic tests work, but e.g. if I force checkpoint in xb_file_key_management test by restarting Server right before backup:

--source include/restart_mysqld.inc

Then backup operation

exec $XTRABACKUP --defaults-file=$MYSQLTEST_VARDIR/my.cnf --backup --target-dir=$targetdir

will not be able parse logs ( note 'aborted at' is bigger than 'recovery from' ):

2017-07-13 16:24:34 139729790940992 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1637690
2017-07-13 16:24:34 139729790940992 [ERROR] InnoDB: Incorrect log record type:73
2017-07-13 16:24:34 139729790940992 [Warning] InnoDB: Log scan aborted at LSN 1637376
2017-07-13 16:24:34 139729790940992 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
xtrabackup: innodb_init() returned 11 (Generic error).

Full log:

bash-4.2$ ./mtr xb_file_key_management
Logging: ./mtr  xb_file_key_management
vardir: /home/a/spd/_depot/m-tar/10.2.7/mysql-test/var
Checking leftover processes...
Removing old var directory...
Creating var directory '/home/a/spd/_depot/m-tar/10.2.7/mysql-test/var'...
Checking supported features...
MariaDB Version 10.2.7-MariaDB
 - SSL connections supported
Collecting tests...
Installing system database...
 
==============================================================================
 
TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------
 
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
mariabackup.xb_file_key_management       [ fail ]
        Test ended at 2017-07-13 16:26:36
 
CURRENT_TEST: mariabackup.xb_file_key_management
mysqltest: At line 26: exec of '/home/a/spd/_depot/m-tar/10.2.7/bin/mariabackup  --prepare --target-dir=/home/a/spd/_depot/m-tar/10.2.7/mysql-test/var/tmp/backup 2>&1' failed, error: 256, status: 1, errno: 11
Output from before failure:
/home/a/spd/_depot/m-tar/10.2.7/bin/mariabackup based on MariaDB server 10.2.7-MariaDB Linux (x86_64) 
xtrabackup: cd to /home/a/spd/_depot/m-tar/10.2.7/mysql-test/var/tmp/backup/
Loading encryption plugin
         Encryption plugin parameter :  '--file_key_management_encryption_algorithm=aes_cbc'
         Encryption plugin parameter :  '--file_key_management_filekey=FILE:/home/a/spd/_depot/m-tar/10.2.7/mysql-test/suite/mariabackup/filekeys-data.key'
         Encryption plugin parameter :  '--file_key_management_filename=/home/a/spd/_depot/m-tar/10.2.7/mysql-test/suite/mariabackup/filekeys-data.enc'
         Encryption plugin parameter :  '--prepare'
         Encryption plugin parameter :  '--target-dir=/home/a/spd/_depot/m-tar/10.2.7/mysql-test/var/tmp/backup'
xtrabackup: This target seems to be not prepared yet.
xtrabackup: using the following InnoDB configuration for recovery:
xtrabackup:   innodb_data_home_dir = .
xtrabackup:   innodb_data_file_path = ibdata1:12M:autoextend
xtrabackup:   innodb_log_group_home_dir = .
xtrabackup: Starting InnoDB instance for recovery.
xtrabackup: Using 104857600 bytes for buffer pool (set by --use-memory parameter)
2017-07-13 16:26:35 140656125790016 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2017-07-13 16:26:35 140656125790016 [Note] InnoDB: Uses event mutexes
2017-07-13 16:26:35 140656125790016 [Note] InnoDB: Compressed tables use zlib 1.2.3
2017-07-13 16:26:35 140656125790016 [Note] InnoDB: Number of pools: 1
2017-07-13 16:26:35 140656125790016 [Note] InnoDB: Using SSE2 crc32 instructions
2017-07-13 16:26:35 140656125790016 [Note] InnoDB: Initializing buffer pool, total size = 100M, instances = 1, chunk size = 100M
2017-07-13 16:26:35 140656125790016 [Note] InnoDB: Completed initialization of buffer pool
2017-07-13 16:26:35 140655682647808 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2017-07-13 16:26:35 140656125790016 [Note] InnoDB: Highest supported file format is Barracuda.
2017-07-13 16:26:35 140656125790016 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1637690
2017-07-13 16:26:35 140656125790016 [ERROR] InnoDB: Incorrect log record type:104
2017-07-13 16:26:35 140656125790016 [Warning] InnoDB: Log scan aborted at LSN 1637376
2017-07-13 16:26:35 140656125790016 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
xtrabackup: innodb_init() returned 11 (Generic error).
 
 
 
The result from queries just before the failure was:
CREATE TABLE t(c VARCHAR(10)) ENGINE INNODB encrypted=yes;
INSERT INTO t VALUES('foobar1');
# xtrabackup backup
NOT FOUND /foobar1/ in ib_logfile0
# expect NOT FOUND
INSERT INTO t VALUES('foobar2');
# xtrabackup prepare
 
 - saving '/home/a/spd/_depot/m-tar/10.2.7/mysql-test/var/log/mariabackup.xb_file_key_management/' to '/home/a/spd/_depot/m-tar/10.2.7/mysql-test/var/log/mariabackup.xb_file_key_management/'
--------------------------------------------------------------------------
The servers were restarted 0 times
Spent 0.000 of 8 seconds executing testcases
 
Failure: Failed 1/1 tests, 0.00% were successful.
 
Failing test(s): mariabackup.xb_file_key_management
 
The log files in var/log may give you some hint of what went wrong.
 
If you want to report this error, please read first the documentation
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
 
mysql-test-run: *** ERROR: there were failing test cases



 Comments   
Comment by Marko Mäkelä [ 2017-08-29 ]

LSN=1637690 rounded down to a multiple of 512 is LSN 1637376. This (together with the invalid redo log record type 71) means that Mariabackup was unable to read the encrypted redo log file.

Where exactly was the line

--source include/restart_mysqld.inc

added? Please post an exact patch.

Comment by Andrii Nikitin (Inactive) [ 2017-09-11 ]

Sorry - I wanted to verify in current tree and it somehow escaped my queue.

Exact patch:

diff --git a/mysql-test/suite/mariabackup/xb_file_key_management.test b/mysql-test/suite/mariabackup/xb_file_key_management.test
index 3887a88..1a969f5 100644
--- a/mysql-test/suite/mariabackup/xb_file_key_management.test
+++ b/mysql-test/suite/mariabackup/xb_file_key_management.test
@@ -6,6 +6,7 @@ INSERT INTO t VALUES('foobar1');
 echo # xtrabackup backup;
 let $targetdir=$MYSQLTEST_VARDIR/tmp/backup;
 --disable_result_log
+--source include/restart_mysqld.inc
 exec $XTRABACKUP --defaults-file=$MYSQLTEST_VARDIR/my.cnf --backup --target-dir=$targetdir;
 --enable_result_log

I've checked 10.2.8 fails similar way:

mysqltest: At line 23: exec of '/home/a/mariadb-environs/_depot/m-tar/10.2.8/bin/mariabackup  --prepare --target-dir=/home/a/mariadb-environs/_depot/m-tar/10.2.8/mysql-test/var/tmp/backup 2>&1' failed, error: 256, status: 1, errno: 11
Output from before failure:
/home/a/mariadb-environs/_depot/m-tar/10.2.8/bin/mariabackup based on MariaDB server 10.2.8-MariaDB Linux (x86_64) 
xtrabackup: cd to /home/a/mariadb-environs/_depot/m-tar/10.2.8/mysql-test/var/tmp/backup/
Loading encryption plugin
	 Encryption plugin parameter :  '--file_key_management_encryption_algorithm=aes_cbc'
	 Encryption plugin parameter :  '--file_key_management_filekey=FILE:/home/a/mariadb-environs/_depot/m-tar/10.2.8/mysql-test/suite/mariabackup/filekeys-data.key'
	 Encryption plugin parameter :  '--file_key_management_filename=/home/a/mariadb-environs/_depot/m-tar/10.2.8/mysql-test/suite/mariabackup/filekeys-data.enc'
	 Encryption plugin parameter :  '--prepare'
	 Encryption plugin parameter :  '--target-dir=/home/a/mariadb-environs/_depot/m-tar/10.2.8/mysql-test/var/tmp/backup'
xtrabackup: This target seems to be not prepared yet.
xtrabackup: using the following InnoDB configuration for recovery:
xtrabackup:   innodb_data_home_dir = .
xtrabackup:   innodb_data_file_path = ibdata1:12M:autoextend
xtrabackup:   innodb_log_group_home_dir = .
xtrabackup: Starting InnoDB instance for recovery.
xtrabackup: Using 104857600 bytes for buffer pool (set by --use-memory parameter)
2017-09-11  8:42:08 140484947158528 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2017-09-11  8:42:08 140484947158528 [Note] InnoDB: Uses event mutexes
2017-09-11  8:42:08 140484947158528 [Note] InnoDB: Compressed tables use zlib 1.2.3
2017-09-11  8:42:08 140484947158528 [Note] InnoDB: Number of pools: 1
2017-09-11  8:42:08 140484947158528 [Note] InnoDB: Using SSE2 crc32 instructions
2017-09-11  8:42:08 140484947158528 [Note] InnoDB: Initializing buffer pool, total size = 100M, instances = 1, chunk size = 100M
2017-09-11  8:42:08 140484947158528 [Note] InnoDB: Completed initialization of buffer pool
2017-09-11  8:42:08 140484638906112 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2017-09-11  8:42:08 140484947158528 [Note] InnoDB: Highest supported file format is Barracuda.
2017-09-11  8:42:08 140484947158528 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1637690
2017-09-11  8:42:08 140484947158528 [ERROR] InnoDB: Incorrect log record type:78
2017-09-11  8:42:08 140484947158528 [Warning] InnoDB: Log scan aborted at LSN 1637376
2017-09-11  8:42:08 140484947158528 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
xtrabackup: innodb_init() returned 11 (Generic error).

Current 10.2 tree fails little differently: backup succeeds, just InnoDB cannot start after restore:

mysqltest: At line 29: query 'SELECT * FROM t' failed: 1286: Unknown storage engine 'InnoDB'

With original error in Server log:

2017-09-11  8:50:19 139838501836608 [Note] /home/a/mariadb-environs/m1-10.2/build/sql/mysqld (mysqld 10.2.9-MariaDB-log) starting as process 26091 ...
2017-09-11  8:50:19 139838501836608 [Warning] Could not increase number of max_open_files to more than 1024 (request: 4162)
2017-09-11  8:50:19 139838501836608 [Note] Plugin 'SEQUENCE' is disabled.
2017-09-11  8:50:19 139838501836608 [Note] Plugin 'partition' is disabled.
2017-09-11  8:50:19 139838501836608 [Warning] InnoDB: Using innodb_file_format is deprecated and the parameter may be removed in future releases. See http://dev.mysql.com/doc/refman/5.7/en/innodb-file-format.html
2017-09-11  8:50:19 139838501836608 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2017-09-11  8:50:19 139838501836608 [Note] InnoDB: Uses event mutexes
2017-09-11  8:50:19 139838501836608 [Note] InnoDB: Compressed tables use zlib 1.2.8
2017-09-11  8:50:19 139838501836608 [Note] InnoDB: Using Linux native AIO
2017-09-11  8:50:19 139838501836608 [Note] InnoDB: Number of pools: 1
2017-09-11  8:50:19 139838501836608 [Note] InnoDB: Using SSE2 crc32 instructions
2017-09-11  8:50:19 139838501836608 [Note] InnoDB: Initializing buffer pool, total size = 8M, instances = 1, chunk size = 8M
2017-09-11  8:50:19 139838501836608 [Note] InnoDB: Completed initialization of buffer pool
2017-09-11  8:50:19 139838246328064 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2017-09-11  8:50:19 139838501836608 [Note] InnoDB: Highest supported file format is Barracuda.
2017-09-11  8:50:19 139838501836608 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1637178
2017-09-11  8:50:19 139838501836608 [ERROR] InnoDB: Incorrect log record type:40
2017-09-11  8:50:19 139838501836608 [Warning] InnoDB: Log scan aborted at LSN 1636864
2017-09-11  8:50:19 139838501836608 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
2017-09-11  8:50:20 139838501836608 [Note] InnoDB: Starting shutdown...
2017-09-11  8:50:20 139838501836608 [ERROR] Plugin 'InnoDB' init function returned error.
2017-09-11  8:50:20 139838501836608 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.

Comment by Andrii Nikitin (Inactive) [ 2017-09-11 ]

10.1 still isn't affected.
10.3 fails to start identically after restore:

2017-09-11  9:24:50 0 [Note] /home/a/mariadb-environs/m9-10.3/build/sql/mysqld (mysqld 10.3.2-MariaDB-log) starting as process 17721 ...
2017-09-11  9:24:50 0 [Warning] Could not increase number of max_open_files to more than 1024 (request: 4162)
2017-09-11  9:24:50 0 [Note] Plugin 'partition' is disabled.
2017-09-11  9:24:50 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2017-09-11  9:24:50 0 [Note] InnoDB: Uses event mutexes
2017-09-11  9:24:50 0 [Note] InnoDB: Compressed tables use zlib 1.2.8
2017-09-11  9:24:50 0 [Note] InnoDB: Using Linux native AIO
2017-09-11  9:24:50 0 [Note] InnoDB: Number of pools: 1
2017-09-11  9:24:50 0 [Note] InnoDB: Using SSE2 crc32 instructions
2017-09-11  9:24:50 0 [Note] InnoDB: Initializing buffer pool, total size = 8M, instances = 1, chunk size = 8M
2017-09-11  9:24:50 0 [Note] InnoDB: Completed initialization of buffer pool
2017-09-11  9:24:50 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2017-09-11  9:24:50 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1636966
2017-09-11  9:24:50 0 [ERROR] InnoDB: Incorrect log record type:93
2017-09-11  9:24:50 0 [Warning] InnoDB: Log scan aborted at LSN 1636864
2017-09-11  9:24:50 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
2017-09-11  9:24:50 0 [Note] InnoDB: Starting shutdown...
2017-09-11  9:24:50 0 [ERROR] Plugin 'InnoDB' init function returned error.
2017-09-11  9:24:50 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.

Comment by Marko Mäkelä [ 2017-09-11 ]

This is a bug that was introduced in MDEV-11782.
It looks like innodb_encrypt_log might not be fully crash-safe.

The function log_crypt() was computing the initialization vector incorrectly. A log write that comprises multiple redo log would derive the initialization vector for every block from the same LSN. Recovery or backup cannot know how many blocks were written per request. Maybe most writes actually are performed one 512-byte redo log block at a time?

In this modified test, apparently there is only one redo log block that needs to be decrypted. Mariabackup --prepare is using the wrong initialization vector, because its log-scanning logic slightly differs from the normal server startup.

This bug should not only affect backup, but also the crash recovery from MariaDB 10.2.4 or later when using innodb_encrypt_log=ON.

Comment by Marko Mäkelä [ 2017-09-11 ]

The bug that I suspected to have been introduced in MDEV-11782 actually pre-exists in the 10.1 server. To experience that bug, you would seem to need a server crash and recovery around the time when the most significant 32 bits of log_sys->lsn are changing, so that the most significant bits of log_block_get_start_lsn() would be wrong. Not very likely.

The problem at hand seems to be unique to Mariabackup. I am still debugging it.

Comment by Marko Mäkelä [ 2017-09-11 ]

Starting with MariaDB 10.2.2, log_write_up_to() introduces the variable pad_size. The unused pad_size bytes would be encrypted as well, wasting CPU. I think that it is cleanest to invoke log_crypt() directly from log_write_up_to(), on the actual payload bytes.

Comment by Marko Mäkelä [ 2017-09-11 ]

The test case revealed several problems:

  • a glitch in the way how log_crypt() converts LSN to initialization vector, and the most significant 32 bits of the LSN are about to change
  • lack of input validation in Mariabackup 10.2 (data_len in xtrabackup_copy_log() was out of range)
  • inherently unsafe redo log rekeying at server startup; crash recovery would fail before the first log checkpoint. To hide this problem, InnoDB startup used to do an extra redo log checkpoint, but luckily this test case was quick enough to invoke xtrabackup before the checkpoint had been completed.

MDEV-12041 has been filed to properly implement redo log encryption key rotation. Until then, we must not call log_crypt_init() except when creating new redo log files.

Comment by Marko Mäkelä [ 2017-09-11 ]

bb-10.2-marko

Comment by Marko Mäkelä [ 2017-09-11 ]

I believe that many of these problems exist in the MariaDB 10.1 impelmentation of innodb_encrypt_log. Backporting the fix would be nontrivial, because the redo log encryption code was significantly cleaned up by MDEV-11782 in 10.2.4.
The extra checkpoint in innobase_start_or_create_for_mysql() was added in MariaDB 10.1.5 as part of MDEV-8041.

Comment by Jan Lindström (Inactive) [ 2017-09-12 ]

ok to push.

Generated at Thu Feb 08 08:04:38 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.