[MDEV-9948] Failing assertion: new_state->key_version != ENCRYPTION_KEY_VERSION_INVALID in fil0crypt.cc Created: 2016-04-19  Updated: 2023-07-18  Resolved: 2022-04-15

Status: Closed
Project: MariaDB Server
Component/s: Encryption
Affects Version/s: 10.1, 10.5, 10.6, 10.7, 10.8
Fix Version/s: 10.5.16, 10.6.8, 10.7.4, 10.8.3, 10.9.1

Type: Bug Priority: Minor
Reporter: Elena Stepanova Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-9909 Enable plugins for MTR Closed
relates to MDEV-12863 No table can be created after second ... Closed
relates to MDEV-28280 Multiple key management plugins refus... Open

 Description   

Start server with

--plugin-load-add=file_key_management --plugin-load-add=example_key_management --innodb-encrypt-tables --innodb-encryption-threads=1

otherwise defaults.
Doesn't matter whether file-key-management-filename is provided or not.

Error log after startup

2016-04-19 20:38:43 139999079106432 [ERROR] Plugin 'file_key_management' registration as a ENCRYPTION failed.
2016-04-19 20:38:43 139999079106432 [Note] InnoDB: Using mutexes to ref count buffer pool pages
2016-04-19 20:38:43 139999079106432 [Note] InnoDB:  InnoDB: !!!!!!!! UNIV_DEBUG switched on !!!!!!!!!
2016-04-19 20:38:43 139999079106432 [Note] InnoDB:  InnoDB: !!!!!!!! UNIV_SYNC_DEBUG switched on !!!!!!!!!
2016-04-19 20:38:43 139999079106432 [Note] InnoDB: The InnoDB memory heap is disabled
2016-04-19 20:38:43 139999079106432 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2016-04-19 20:38:43 139999079106432 [Note] InnoDB: Memory barrier is not used
2016-04-19 20:38:43 139999079106432 [Note] InnoDB: Compressed tables use zlib 1.2.8
2016-04-19 20:38:43 139999079106432 [Note] InnoDB: Using Linux native AIO
2016-04-19 20:38:43 139999079106432 [Note] InnoDB: Using generic crc32 instructions
2016-04-19 20:38:43 139999079106432 [Note] InnoDB: Initializing buffer pool, size = 128.0M
2016-04-19 20:38:43 139999079106432 [Note] InnoDB: Completed initialization of buffer pool
2016-04-19 20:38:43 139999079106432 [Note] InnoDB: Highest supported file format is Barracuda.
2016-04-19 20:38:45 139999079106432 [Note] InnoDB: 128 rollback segment(s) are active.
2016-04-19 20:38:45 139999079106432 [Note] InnoDB: Waiting for purge to start
2016-04-19 20:38:45 139999079106432 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.28-76.1 started; log sequence number 1616819
2016-04-19 20:38:45 139999079106432 [Note] InnoDB: Creating #1 thread id 139998336632576 total threads 1.
2016-04-19 20:38:45 139998361810688 [Note] InnoDB: Dumping buffer pool(s) not yet started
2016-04-19 20:38:45 139999079106432 [Note] Plugin 'FEEDBACK' is disabled.
2016-04-19 20:38:45 139999079106432 [Note] Server socket created on IP: '::'.
2016-04-19 20:38:45 139999079106432 [Note] /bld/10.1/bin/mysqld: ready for connections.
Version: '10.1.13-MariaDB-debug'  socket: '/bld/10.1/data/tmp/mysql.sock'  port: 3306  Source distribution

Run

SELECT COUNT(*) FROM INFORMATION_SCHEMA.INNODB_TABLESPACES_ENCRYPTION;

Stack trace from 10.1 commit 1cf852d874b0e82ebfa3854300abaacd04d3eb01

2016-04-19 20:31:09 139814445774592 [ERROR] InnoDB: Used key_id 1 can't be found from key file.
2016-04-19 20:31:09 7f29165ff700  InnoDB: Assertion failure in thread 139814445774592 in file fil0crypt.cc line 994
InnoDB: Failing assertion: new_state->key_version != ENCRYPTION_KEY_VERSION_INVALID
 
#5  0x00007f2940b0e538 in abort () from /lib64/libc.so.6
#6  0x000056211447fa76 in fil_crypt_get_key_state (new_state=0x7f29165fee00) at /src/10.1/storage/xtradb/fil/fil0crypt.cc:994
#7  0x00005621144803e4 in fil_crypt_space_needs_rotation (state=0x7f29165fee10, key_state=0x7f29165fee00, recheck=0x7f29165fedeb) at /src/10.1/storage/xtradb/fil/fil0crypt.cc:1339
#8  0x0000562114480a61 in fil_crypt_find_space_to_rotate (key_state=0x7f29165fee00, state=0x7f29165fee10, recheck=0x7f29165fedeb) at /src/10.1/storage/xtradb/fil/fil0crypt.cc:1587
#9  0x00005621144820a5 in fil_crypt_thread (arg=0x0) at /src/10.1/storage/xtradb/fil/fil0crypt.cc:2260
#10 0x00007f2942a4f0a4 in start_thread () from /lib64/libpthread.so.0
#11 0x00007f2940bbd04d in clone () from /lib64/libc.so.6



 Comments   
Comment by Elena Stepanova [ 2022-04-10 ]

Another case when the failure happens is if upon mariabackup prepare a key management plugin cannot be loaded on whatever reason. E.g. with file key management:

--source include/have_innodb.inc
 
--exec echo "1;0123456789012345678901234567890A" > $MYSQL_TMP_DIR/mykeys.txt
 
--let $restart_parameters= --plugin-load-add=file_key_management --file-key-management-filename=$MYSQL_TMP_DIR/mykeys.txt
--source include/restart_mysqld.inc
 
CREATE TABLE t1 (a VARCHAR(8)) ENGINE=InnoDB ENCRYPTED=YES;
INSERT INTO t1 VALUES ('foo'),('bar');
 
--exec $XTRABACKUP --backup -uroot --protocol=tcp --port=$MASTER_MYPORT --target-dir=$MYSQL_TMP_DIR/bk
--remove_file $MYSQL_TMP_DIR/mykeys.txt
--exec $XTRABACKUP --prepare --target-dir=$MYSQL_TMP_DIR/bk
 
 
# Cleanup
DROP TABLE t1;

10.5 cfdb6212

2022-04-11 00:25:10 0x7fba02efb700  InnoDB: Assertion failure in file /data/src/10.5/storage/innobase/fil/fil0crypt.cc line 549
InnoDB: Failing assertion: key_version != ENCRYPTION_KEY_VERSION_INVALID
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
InnoDB: about forcing recovery.
220411  0:25:10 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.5.16-MariaDB-debug
key_buffer_size=0
read_buffer_size=131072
max_used_connections=0
max_threads=1
thread_count=0
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 6154 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x49000
/mnt8t/bld/10.5-debug-nightly/bin/mariabackup(my_print_stacktrace+0x40)[0x5651d504f404]
mysys/stacktrace.c:212(my_print_stacktrace)[0x5651d48096de]
sigaction.c:0(__restore_rt)[0x7fba044c8140]
linux/raise.c:51(__GI_raise)[0x7fba03ff7ce1]
stdlib/abort.c:81(__GI_abort)[0x7fba03fe1537]
ut/ut0mem.cc:45(ut_str3cat(char const*, char const*, char const*)::{lambda()#1}::operator()() const)[0x5651d4ef0792]
fil/fil0crypt.cc:552(fil_encrypt_buf_for_full_crc32(fil_space_crypt_t*, unsigned long, unsigned long, unsigned long, unsigned char const*, unsigned char*))[0x5651d4c169ba]
fil/fil0crypt.cc:598(fil_encrypt_buf(fil_space_crypt_t*, unsigned long, unsigned long, unsigned char const*, unsigned long, unsigned char*, bool))[0x5651d4c16bcb]
fil/fil0crypt.cc:649(fil_space_encrypt(fil_space_t const*, unsigned long, unsigned char*, unsigned char*))[0x5651d4c16d97]
buf/buf0flu.cc:723(buf_page_encrypt(fil_space_t*, buf_page_t*, unsigned char*, unsigned long*))[0x5651d4baa384]
buf/buf0flu.cc:918(buf_flush_page(buf_page_t*, bool, fil_space_t*))[0x5651d4bab2a0]
buf/buf0flu.cc:1519(buf_do_flush_list_batch(unsigned long, unsigned long))[0x5651d4bade2a]
buf/buf0flu.cc:1595(buf_flush_list(unsigned long, unsigned long))[0x5651d4bae39c]
buf/buf0flu.cc:2385(buf_flush_page_cleaner(void*))[0x5651d4bb1b0f]
nptl/pthread_create.c:478(start_thread)[0x7fba044bcea7]
x86_64/clone.S:97(__GI___clone)[0x7fba040b9def]
The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
information that should help you find out what is causing the crash.
Writing a core file...
Working directory at /dev/shm/var_auto_JT0M/tmp/bk
Resource Limits:
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             385901               385901               processes 
Max open files            1024                 1024                 files     
Max locked memory         12659513856          12659513856          bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       385901               385901               signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us        
Core pattern: core

Also happens with hashicorp plugin on 10.9.

Comment by Marko Mäkelä [ 2022-04-14 ]

On 10.6 and later, for the test case, the call stack where the object was initialized is related to MDEV-24626:

10.6 f7f0bc748e88f911b957d3dbf3bc0828a33675fb

#0  fil_space_crypt_t::fil_space_crypt_t (new_encryption=FIL_ENCRYPTION_ON, new_key_id=<optimized out>, new_min_key_version=0, new_type=0, this=0x55ba3d52d800)
    at /mariadb/10.6/storage/innobase/include/fil0crypt.h:128
#1  fil_space_create_crypt_data (type=0, min_key_version=0, key_id=<optimized out>, encrypt_mode=FIL_ENCRYPTION_ON) at /mariadb/10.6/storage/innobase/fil/fil0crypt.cc:226
#2  fil_space_create_crypt_data (key_id=<optimized out>, encrypt_mode=FIL_ENCRYPTION_ON) at /mariadb/10.6/storage/innobase/fil/fil0crypt.cc:245
#3  fil_space_read_crypt_data (zip_size=<optimized out>, page=page@entry=0x7ff359cfc000 "") at /mariadb/10.6/storage/innobase/fil/fil0crypt.cc:313
#4  0x000055ba3a2d5fe9 in recv_sys_t::recover_deferred (this=<optimized out>, p=<optimized out>, name="./test/t1.ibd", free_block=<optimized out>) at /mariadb/10.6/storage/innobase/log/log0recv.cc:897
#5  0x000055ba3a2dba33 in reinit_all (this=<optimized out>) at /mariadb/10.6/storage/innobase/log/log0recv.cc:757

I will check an older version to get a simpler view of this.

Comment by Marko Mäkelä [ 2022-04-14 ]

Before MDEV-12353 in 10.5, preparing a backup would fail due to the following:

	case MLOG_FILE_WRITE_CRYPT_DATA:
		dberr_t err;
		ptr = const_cast<byte*>(fil_parse_write_crypt_data(ptr, end_ptr, &err));
 
		if (err != DB_SUCCESS) {
			recv_sys->found_corrupt_log = TRUE;
		}
		break;

10.3 f130a5ea3c1cc504acecbad7615fb27c6e28dbfa

2022-04-14 12:11:57 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1636426
2022-04-14 12:11:57 0 [ERROR] InnoDB: ############### CORRUPT LOG RECORD FOUND ##################
2022-04-14 12:11:57 0 [Note] InnoDB: Log record type 100, page 5:0. Log parsing proceeded successfully up to 1636494. Previous log record type 1, is multi 1 Recv offset 527, prev 521

I do not think that it makes sense to address that error in the older versions.

Starting with 10.5 the assertion failure occurs during page flushing. The invalid object was initialized here:

10.5 e41500e4f2914af8142e06ca17925f6c03c7b7af

#0  fil_space_crypt_t::fil_space_crypt_t (this=0x55b1dd1fee50, new_type=0, 
    new_min_key_version=<optimized out>, new_key_id=1, 
    new_encryption=FIL_ENCRYPTION_ON)
    at /mariadb/10.5m/storage/innobase/include/fil0crypt.h:128
#1  0x000055b1dad33dc0 in fil_space_create_crypt_data (type=type@entry=0, 
    encrypt_mode=FIL_ENCRYPTION_ON, min_key_version=min_key_version@entry=0, 
    key_id=1) at /mariadb/10.5m/storage/innobase/fil/fil0crypt.cc:218
#2  0x000055b1dad33de3 in fil_space_create_crypt_data (
    encrypt_mode=<optimized out>, key_id=<optimized out>)
    at /mariadb/10.5m/storage/innobase/fil/fil0crypt.cc:243
#3  0x000055b1dad33eaf in fil_space_read_crypt_data (zip_size=<optimized out>, 
    page=0x55b1dd21c000 "")
    at /mariadb/10.5m/storage/innobase/fil/fil0crypt.cc:311
#4  0x000055b1dad279a9 in fil_ibd_load (space_id=5, filename=<optimized out>, 
    space=@0x7ffe724b95f8: 0x0)
    at /mariadb/10.5m/storage/innobase/fil/fil0fil.cc:3118
#5  0x000055b1dadf38e0 in fil_name_process (name=<optimized out>, 
    name@entry=0x7fb60376801c "./test/t1.ibd", len=len@entry=13, 
    space_id=<optimized out>, deleted=deleted@entry=false)
    at /mariadb/10.5m/storage/innobase/log/log0recv.cc:834
#6  0x000055b1dadf5769 in recv_sys_t::parse (
    this=this@entry=0x55b1db9198c0 <recv_sys>, checkpoint_lsn=47846, 
    store=store@entry=0x7ffe724b9ce4, apply=apply@entry=true)

I believe that an appropriate fix is to treat the file as if it was corrupted or non-existing when the encryption key is unavailable. Recovery would only succeed if innodb_force_recovery is set, and any recovered log records for the file would be ignored.

Comment by Marko Mäkelä [ 2022-04-14 ]

For a while, I was thinking if it would be better if mariadb-backup should fail to start up if the encryption plugin fails to load, like it does with the test case. Then I realized that also a properly working encryption plugin could fail to find a specific key. In the test case, it is one of the fallback stubs no_key() or no_get_key() that will be invoked.

The following fixes the test case for me. It should also fix crash recovery:

diff --git a/storage/innobase/fil/fil0fil.cc b/storage/innobase/fil/fil0fil.cc
index 94ea5182d35..5f5732a2a09 100644
--- a/storage/innobase/fil/fil0fil.cc
+++ b/storage/innobase/fil/fil0fil.cc
@@ -3118,6 +3118,10 @@ fil_ibd_load(
 		? fil_space_read_crypt_data(fil_space_t::zip_size(flags),
 					    first_page)
 		: NULL;
+	if (crypt_data && !crypt_data->is_key_found()) {
+		return FIL_LOAD_INVALID;
+	}
+
 	space = fil_space_t::create(
 		file.name(), space_id, flags, FIL_TYPE_TABLESPACE, crypt_data);
 

When it comes to creating a proper test case, I would appreciate some help. Otherwise, I would push this without a test case.

10.5 e41500e4f2914af8142e06ca17925f6c03c7b7af with the above patch

/dev/shm/10.5m/extra/mariabackup/mariabackup: Can't find file: '/dev/shm/10.5m/mysql-test/var/tmp/mykeys.txt' (errno: 2 "No such file or directory")
2022-04-14 17:37:26 0 [ERROR] Plugin 'file_key_management' init function returned error.
2022-04-14 17:37:26 0 [ERROR] Plugin 'file_key_management' registration as a ENCRYPTION failed.
[00] 2022-04-14 17:37:26 This target seems to be not prepared yet.
[00] 2022-04-14 17:37:26 mariabackup: using the following InnoDB configuration for recovery:
[00] 2022-04-14 17:37:26 innodb_data_home_dir = .
[00] 2022-04-14 17:37:26 innodb_data_file_path = ibdata1:12M:autoextend
[00] 2022-04-14 17:37:26 innodb_log_group_home_dir = .
[00] 2022-04-14 17:37:26 InnoDB: Using Linux native AIO
[00] 2022-04-14 17:37:26 Starting InnoDB instance for recovery.
[00] 2022-04-14 17:37:26 mariabackup: Using 104857600 bytes for buffer pool (set by --use-memory parameter)
2022-04-14 17:37:26 0 [Note] InnoDB: !!!!!!!! UNIV_DEBUG switched on !!!!!!!!!
2022-04-14 17:37:26 0 [Note] InnoDB: Uses event mutexes
2022-04-14 17:37:26 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2022-04-14 17:37:26 0 [Note] InnoDB: Number of pools: 1
2022-04-14 17:37:26 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
2022-04-14 17:37:26 0 [Note] InnoDB: Using Linux native AIO
2022-04-14 17:37:26 0 [Note] InnoDB: Initializing buffer pool, total size = 104857600, chunk size = 104857600
2022-04-14 17:37:26 0 [Note] InnoDB: Completed initialization of buffer pool
2022-04-14 17:37:26 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=47870,47870
2022-04-14 17:37:26 0 [Warning] InnoDB: We do not continue the crash recovery, because the table may become corrupt if we cannot apply the log records in the InnoDB log to it. To fix the problem and start mysqld:
2022-04-14 17:37:26 0 [Note] InnoDB: 1) If there is a permission problem in the file and mysqld cannot open the file, you should modify the permissions.
2022-04-14 17:37:26 0 [Note] InnoDB: 2) If the tablespace is not needed, or you can restore an older version from a backup, then you can remove the .ibd file, and use --innodb_force_recovery=1 to force startup without this file.
2022-04-14 17:37:26 0 [Note] InnoDB: 3) If the file system or the disk is broken, and you cannot remove the .ibd file, you can set --innodb_force_recovery.
2022-04-14 17:37:26 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1462] with error Generic error
[00] FATAL ERROR: 2022-04-14 17:37:26 mariabackup: innodb_init() returned 11 (Generic error).
mysqltest: At line 13: exec of '/dev/shm/10.5m/extra/mariabackup/mariabackup --prepare --target-dir=/dev/shm/10.5m/mysql-test/var/tmp/bk' failed, error: 256, status: 1, errno: 11
Output from before failure:
INSERT INTO t1 VALUES ('foo'),('bar');

Comment by Marko Mäkelä [ 2022-04-14 ]

The 10.6 version needed additional changes to the deferred recovery of tablespaces:

bb-10.6-MDEV-9948 50252088d5a2e202604462fb3b28e39923831642

2022-04-14 17:53:58 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=44267,44267
2022-04-14 17:53:58 0 [ERROR] InnoDB: Cannot apply log to [page id: space=5, page number=0] of corrupted file './test/t1.ibd'
2022-04-14 17:53:58 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1444] with error Data structure corruption
[00] FATAL ERROR: 2022-04-14 17:53:58 mariabackup: innodb_init() returned 37 (Data structure corruption).

Please test if that works for you, and if you can, create a polished test case. I think that recovery must succeed with innodb_force_recovery=1.

Comment by Marko Mäkelä [ 2022-04-15 ]

I modified the test encryption.innodb-redo-nokeys to cover this scenario. The test was flawed in a number of ways. It failed to ensure that InnoDB did not start, and it specified the parameter innodb_encrypt_tables=ON, which would cause InnoDB to refuse startup if the encryption plugin was not loaded. We want InnoDB to actually attempt startup and fail during the crash recovery.

Generated at Thu Feb 08 07:38:31 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.