Details
-
Bug
-
Status: Closed (View Workflow)
-
Critical
-
Resolution: Fixed
-
10.5, 10.6, 10.7(EOL)
Description
An encryption test failed on a mandatory builder:
10.5 c8346c0bacfdbe3fd61c67becff9934e75e08ed3 |
encryption.innodb-redo-nokeys 'ctr,innodb' w12 [ fail ]
|
...
|
2023-12-08 7:46:08 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=53205,1435867
|
2023-12-08 7:46:08 0 [ERROR] InnoDB: Recovery cannot access file ./test/t1.ibd (tablespace 5)
|
2023-12-08 7:46:08 0 [Note] InnoDB: You may set innodb_force_recovery=1 to ignore this and possibly get a corrupted database.
|
mariadbd: /home/buildbot/amd64-ubuntu-2204-debug-ps/build/storage/innobase/log/log0recv.cc:3510: dberr_t recv_recovery_from_checkpoint_start(lsn_t): Assertion `!recv_sys.found_corrupt_fs' failed.
|
This is reproducible also when attempting to start up 10.6 on a copy of the data directory (data.tar.xz).
The code path that outputs the error message when processing a FILE_MODIFY record will also set recv_sys.found_corrupt_fs. This is triggered by a fil_ibd_load() return value FIL_LOAD_INVALID because of missing encryption information:
if (crypt_data && !crypt_data->is_key_found()) { |
crypt_data->~fil_space_crypt_t();
|
ut_free(crypt_data);
|
return FIL_LOAD_INVALID; |
}
|
In 10.6 the code is different:
if (crypt_data && !fil_crypt_check(crypt_data, filename)) { |
return FIL_LOAD_INVALID; |
}
|
All we need to do is to relax the too strict debug assertion:
diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
|
index d52a62cec99..05120871b0a 100644
|
--- a/storage/innobase/log/log0recv.cc
|
+++ b/storage/innobase/log/log0recv.cc
|
@@ -3507,7 +3507,7 @@ recv_recovery_from_checkpoint_start(lsn_t flush_lsn)
|
recv_group_scan_log_recs(checkpoint_lsn, &contiguous_lsn, false);
|
/* The first scan should not have stored or applied any records. */
|
ut_ad(recv_sys.pages.empty());
|
- ut_ad(!recv_sys.found_corrupt_fs);
|
+ ut_ad(!recv_sys.found_corrupt_fs || !srv_force_recovery);
|
|
if (srv_read_only_mode && recv_needed_recovery) {
|
mysql_mutex_unlock(&log_sys.mutex); |
In this way, the server startup will fail gracefully, just like it is expected by the test:
10.5 c8346c0bacfdbe3fd61c67becff9934e75e08ed3 with patch |
2023-12-08 10:47:50 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=53205,1435867
|
2023-12-08 10:47:50 0 [ERROR] InnoDB: Recovery cannot access file ./test/t1.ibd (tablespace 5)
|
2023-12-08 10:47:50 0 [Note] InnoDB: You may set innodb_force_recovery=1 to ignore this and possibly get a corrupted database.
|
2023-12-08 10:47:50 0 [ERROR] InnoDB: Missing FILE_CHECKPOINT at 1435867 between the checkpoint 53205 and the end 1501184.
|
2023-12-08 10:47:50 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1480] with error Generic error
|
2023-12-08 10:47:50 0 [Note] InnoDB: Starting shutdown...
|
2023-12-08 10:47:51 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
|
In 10.8, the log format was changed and the logic around FILE_CHECKPOINT was rewritten in MDEV-14425. We would still invoke fil_name_process() on the FILE_MODIFY records, and recv_sys.found_corrupt_fs could be set for a similar reason, but there is no assertion failure in recv_recovery_from_checkpoint_start() after the initial call to recv_scan_log(). In our CI systems, I only found this type of failure on the 10.5 and 10.6 branches.
Attachments
Issue Links
- relates to
-
MDEV-14425 Change the InnoDB redo log format to reduce write amplification
- Closed