[MDEV-12103] Reduce the time of looking for MLOG_CHECKPOINT during crash recovery Created: 2017-02-21  Updated: 2019-01-25  Resolved: 2017-03-03

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.2.2
Fix Version/s: 10.2.5

Type: Bug Priority: Major
Reporter: Marko Mäkelä Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Blocks
is blocked by MDEV-18377 Assertion `!recv_sys->mlog_checkpoint... Closed
Relates
relates to MDEV-11027 InnoDB log recovery is too noisy Closed
relates to MDEV-11432 Change the informational redo log for... Closed
relates to MDEV-11782 Redefine the innodb_encrypt_log format Closed

 Description   

We should fix MySQL Bug #80788 in MariaDB 10.2.

When I made the InnoDB crash recovery more robust by implementing WL#7142, I also introduced an extra redo log scan pass that could be avoided.
Before MariaDB 10.2 is released as GA, we are free to change the redo log format and add extra information to the redo log checkpoint page, so that the extra scan can be avoided.



 Comments   
Comment by Marko Mäkelä [ 2017-03-01 ]

bb-10.2-marko

Comment by Jan Lindström (Inactive) [ 2017-03-02 ]

OK to push after considering the comment about removing the error message. There should be a mechanism to detect that two or more redo log files do not form one consistent redo log.

Comment by Marko Mäkelä [ 2017-03-02 ]

The individual redo log files form one logical redo log file, as if the files had been catenated together. I am afraid that we cannot easily extend the consistency checks. In the long term, I would like to have a single log file only. Starting with 10.2 (MDEV-12061 Allow innodb_log_files_in_group=1) we can use a single file.
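To illustrate the "catenated" view, here is a minimal sketch of how a logical log offset could be mapped to a physical file. It assumes equally sized files and ignores file headers and every other on-disk detail. The names LogPosition and locate_in_log_group are hypothetical and do not exist in InnoDB.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration, not InnoDB code: locate a byte of the
// logical redo log when it is stored as n_files equally sized files
// that are treated as one catenated, circular file.
struct LogPosition {
    uint64_t file_index;   // which physical file holds the byte
    uint64_t file_offset;  // byte offset inside that file
};

inline LogPosition locate_in_log_group(uint64_t logical_offset,
                                       uint64_t file_size,
                                       uint64_t n_files)
{
    assert(file_size > 0 && n_files > 0);
    // Wrap the offset around the total capacity of the group.
    const uint64_t wrapped = logical_offset % (file_size * n_files);
    return LogPosition{wrapped / file_size, wrapped % file_size};
}
```

With n_files == 1 the file index is always 0, which is the single-file case mentioned above.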

The innodb.innodb_bug59641 test failure makes it clear that the logic needs some revision. I can occasionally reproduce the failure locally by running a few of the preceding tests on the same instance:

./mtr --no-reorder innodb.innodb_bug52663 innodb.innodb_bug53290 innodb.innodb_bug53592 innodb.innodb_bug54044 innodb.innodb_bug56143 innodb.innodb_bug56716 innodb.innodb_bug57252 innodb.innodb_bug57255 innodb.innodb_bug57904 innodb.innodb_bug59410 innodb.innodb_bug59641

The following should fix it:

diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
index b0e0652470b..218e1367e83 100644
--- a/storage/innobase/log/log0recv.cc
+++ b/storage/innobase/log/log0recv.cc
@@ -1156,6 +1156,7 @@ recv_parse_or_apply_log_rec_body(
 		ut_d(page_type = fil_page_get_type(page));
 	} else if (apply
 		   && !is_predefined_tablespace(space_id)
+		   && recv_sys->scanned_lsn >= recv_sys->mlog_checkpoint_lsn
 		   && recv_spaces.find(space_id) == recv_spaces.end()) {
 		ib::fatal() << "Missing MLOG_FILE_NAME or MLOG_FILE_DELETE"
 			" for redo log record " << type << " (page "

I think that we need something more to ensure that we catch tablespaces that are entered into recv_sys->addr_hash but are missing from recv_spaces. A test for this kind of redo log corruption will also be needed.
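The condition in the patch can be read as a standalone predicate. The following is only a sketch under stated assumptions: RecoveryState, is_predefined and missing_file_name_is_fatal are hypothetical names, the predefined-tablespace test is simplified to space ID 0, and the LSN to compare is left as a parameter rather than tied to a particular recv_sys field.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

typedef uint64_t lsn_t;

// Hypothetical stand-in for the recovery state consulted by the check.
struct RecoveryState {
    lsn_t mlog_checkpoint_lsn;        // LSN of the MLOG_CHECKPOINT record
    std::set<uint32_t> known_spaces;  // IDs seen in MLOG_FILE_NAME/MLOG_FILE_DELETE
};

// Assumption for this sketch: only space ID 0 counts as predefined.
inline bool is_predefined(uint32_t space_id) { return space_id == 0; }

// True when a record for an unknown tablespace should be treated as
// log corruption. Records before the checkpoint LSN are exempt.
inline bool missing_file_name_is_fatal(const RecoveryState& st,
                                       bool apply,
                                       uint32_t space_id,
                                       lsn_t lsn)
{
    return apply
        && !is_predefined(space_id)
        && lsn >= st.mlog_checkpoint_lsn
        && st.known_spaces.find(space_id) == st.known_spaces.end();
}
```

The added LSN comparison is what keeps records that precede MLOG_CHECKPOINT from triggering the fatal error.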

Comment by Marko Mäkelä [ 2017-03-02 ]

After extending the test innodb.log_corruption, I found out that recv_sys->recovered_lsn should be used instead of recv_sys->scanned_lsn. The inconsistency will be reported in recv_init_crash_recovery_spaces().

Comment by Marko Mäkelä [ 2017-03-02 ]

bb-10.2-marko

Comment by Jan Lindström (Inactive) [ 2017-03-02 ]

ok to push.
