[MDEV-32971] Assertion !recv_sys.is_corrupt_fs() failed on recovery - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Critical
Resolution: Fixed
Affects Version/s: 10.5, 10.6, 10.7(EOL)
Fix Version/s: 10.5.24, 10.6.17
Component/s: Storage Engine - InnoDB, Tests
Labels:
- debug
- not-10.8

Description

An encryption test failed on a mandatory builder:

10.5 c8346c0bacfdbe3fd61c67becff9934e75e08ed3
encryption.innodb-redo-nokeys 'ctr,innodb' w12 [ fail ]
...
2023-12-08 7:46:08 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=53205,1435867
2023-12-08 7:46:08 0 [ERROR] InnoDB: Recovery cannot access file ./test/t1.ibd (tablespace 5)
2023-12-08 7:46:08 0 [Note] InnoDB: You may set innodb_force_recovery=1 to ignore this and possibly get a corrupted database.
mariadbd: /home/buildbot/amd64-ubuntu-2204-debug-ps/build/storage/innobase/log/log0recv.cc:3510: dberr_t recv_recovery_from_checkpoint_start(lsn_t): Assertion `!recv_sys.found_corrupt_fs' failed.

This is reproducible also when attempting to start up 10.6 on a copy of the data directory (data.tar.xz).
The code path that outputs the error message when processing a FILE_MODIFY record will also set recv_sys.found_corrupt_fs. This is triggered by a fil_ibd_load() return value FIL_LOAD_INVALID because of missing encryption information:

	if (crypt_data && !crypt_data->is_key_found()) {

		crypt_data->~fil_space_crypt_t();

		ut_free(crypt_data);

		return FIL_LOAD_INVALID;

In 10.6 the code is different:

	if (crypt_data && !fil_crypt_check(crypt_data, filename)) {

		return FIL_LOAD_INVALID;

All we need to do is to relax the too strict debug assertion:

diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc

index d52a62cec99..05120871b0a 100644

--- a/storage/innobase/log/log0recv.cc

+++ b/storage/innobase/log/log0recv.cc

@@ -3507,7 +3507,7 @@ recv_recovery_from_checkpoint_start(lsn_t flush_lsn)

 	recv_group_scan_log_recs(checkpoint_lsn, &contiguous_lsn, false);

 	/* The first scan should not have stored or applied any records. */

 	ut_ad(recv_sys.pages.empty());

-	ut_ad(!recv_sys.found_corrupt_fs);

+	ut_ad(!recv_sys.found_corrupt_fs || !srv_force_recovery);

 	if (srv_read_only_mode && recv_needed_recovery) {

 		mysql_mutex_unlock(&log_sys.mutex);

In this way, the server startup will fail gracefully, just like it is expected by the test:

10.5 c8346c0bacfdbe3fd61c67becff9934e75e08ed3 with patch
2023-12-08 10:47:50 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=53205,1435867
2023-12-08 10:47:50 0 [ERROR] InnoDB: Recovery cannot access file ./test/t1.ibd (tablespace 5)
2023-12-08 10:47:50 0 [Note] InnoDB: You may set innodb_force_recovery=1 to ignore this and possibly get a corrupted database.
2023-12-08 10:47:50 0 [ERROR] InnoDB: Missing FILE_CHECKPOINT at 1435867 between the checkpoint 53205 and the end 1501184.
2023-12-08 10:47:50 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1480] with error Generic error
2023-12-08 10:47:50 0 [Note] InnoDB: Starting shutdown...
2023-12-08 10:47:51 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.

In 10.8, the log format was changed and the logic around FILE_CHECKPOINT was rewritten in ~~MDEV-14425~~. We would still invoke fil_name_process() on the FILE_MODIFY records, and recv_sys.found_corrupt_fs could be set for a similar reason, but there is no assertion failure in recv_recovery_from_checkpoint_start() after the initial call to recv_scan_log(). In our CI systems, I only found this type of failure on the 10.5 and 10.6 branches.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

data.tar.xz
2.33 MB
2023-12-08 08:18

Issue Links

relates to

MDEV-14425 Change the InnoDB redo log format to reduce write amplification

Closed

Activity

There are no comments yet on this issue.

People

Assignee:: Marko Mäkelä

Reporter:: Marko Mäkelä

Votes:: 0 Vote for this issue

Watchers:: 1 Start watching this issue

Dates

Created:: 2023-12-08 08:53

Updated:: 2023-12-08 10:01

Resolved:: 2023-12-08 10:01

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server

Details

Description

Attachments

Attachments

Issue Links

Activity

People

Dates

Git Integration