Details
Description
The test mariabackup.innodb_force_recovery occasionally failed in a 10.6 based development branch that contains a fix of MDEV-26827 and MDEV-26055. We have some rather rare failures of the test occurring on buildbot as well, like this:
10.10 4ce6e78059c1a91765c91afa5330737bd7568d0a |
mariabackup.innodb_force_recovery 'innodb' w13 [ fail ] timeout after 480 seconds
|
Test ended at 2022-07-28 12:27:20
|
|
Test case timeout after 480 seconds
|
|
== /buildbot/amd64-ubuntu-2004-msan/build/mysql-test/var/13/log/innodb_force_recovery.log ==
|
CREATE TABLE t(i INT) ENGINE INNODB;
|
INSERT INTO t VALUES(1);
|
# "innodb_force_recovery=1" should be allowed with "--prepare" only (mariabackup)
|
FOUND 1 /should only be used with "--prepare"/ in backup.log
|
# "innodb_force_recovery=1" should be allowed with "--apply-log" only (innobackupex)
|
FOUND 1 /should only be used with "--apply-log"/ in backup.log
|
# "innodb_force_recovery" should be limited to "SRV_FORCE_IGNORE_CORRUPT" (mariabackup)
|
FOUND 1 /innodb_force_recovery = 1/ in backup.log
|
# "innodb_force_recovery" should be limited to "SRV_FORCE_IGNORE_CORRUPT" (innobackupex)
|
FOUND 1 /innodb_force_recovery = 1/ in backup.log
|
# "innodb_force_recovery" should be read from "backup-my.cnf" (mariabackup)
|
FOUND 1 /innodb_force_recovery = 1/ in backup.log
|
# "innodb_force_recovery=1" should be read from "backup-my.cnf" (innobackupex)
|
FOUND 1 /innodb_force_recovery = 1/ in backup.log
|
# "innodb_force_recovery" from the command line should override "backup-my.cnf" (mariabackup)
|
NOT FOUND /innodb_force_recovery = 1/ in backup.log
|
# "innodb_force_recovery" from the command line should override "backup-my.cnf" (innobackupex)
|
NOT FOUND /innodb_force_recovery = 1/ in backup.log
|
# shutdown server
|
|
== /buildbot/amd64-ubuntu-2004-msan/build/mysql-test/var/13/tmp/analyze-timeout-mysqld.1.err ==
|
mysqltest: Could not open connection 'default' after 500 attempts: 2002 Can't connect to local server through socket '/buildbot/amd64-ubuntu-2004-msan/build/mysql-test/var/tmp/13/mys' (2)
|
This fix worked in the case that I was able to reproduce:
diff --git a/storage/innobase/srv/srv0start.cc b/storage/innobase/srv/srv0start.cc
|
index 5bf085079c8..4a01326047a 100644
|
--- a/storage/innobase/srv/srv0start.cc
|
+++ b/storage/innobase/srv/srv0start.cc
|
@@ -905,9 +905,7 @@ static lsn_t srv_prepare_to_delete_redo_log_file(bool old_exists)
|
{
|
DBUG_ENTER("srv_prepare_to_delete_redo_log_file");
|
|
- /* Disable checkpoints in the page cleaner. */
|
- ut_ad(!recv_sys.recovery_on);
|
- recv_sys.recovery_on= true;
|
+ ut_ad(recv_sys.recovery_on);
|
|
/* Clean the buffer pool. */
|
buf_flush_sync();
|
@@ -1589,10 +1587,10 @@ dberr_t srv_start(bool create_new_db)
|
}
|
}
|
|
- recv_sys.debug_free();
|
-
|
if (srv_operation == SRV_OPERATION_RESTORE
|
|| srv_operation == SRV_OPERATION_RESTORE_EXPORT) {
|
+ buf_flush_sync();
|
+ recv_sys.debug_free();
|
/* After applying the redo log from
|
SRV_OPERATION_BACKUP, flush the changes
|
to the data files and truncate or delete the log.
|
@@ -1684,6 +1682,8 @@ dberr_t srv_start(bool create_new_db)
|
return(srv_init_abort(err));
|
}
|
}
|
+
|
+ recv_sys.debug_free();
|
}
|
|
ut_ad(err == DB_SUCCESS); |
The failure that I observed was caused by the fact that buf_flush_page_cleaner() took a little longer to write out the recovered modified pages, and this thread had already executed recv_sys.debug_free(), which causes the page cleaner to attempt to write checkpoint information. The failures that I observed were caused by the fact that this thread had already closed and deleted the ib_logfile0 when buf_flush_page_cleaner() unexpectedly invoked log_checkpoint_low().
This type of failure should be caused by MDEV-23855, which made log checkpoints the responsibility of the page cleaner thread.
Because I have no stack traces of the test case timeouts, I am unable say if the above patch would fix them.
Attachments
Issue Links
- is caused by
-
MDEV-23855 InnoDB log checkpointing causes regression for write-heavy OLTP
-
- Closed
-
Activity
Field | Original Value | New Value |
---|---|---|
Link |
This issue is caused by |
issue.field.resolutiondate | 2023-03-16 12:40:38.0 | 2023-03-16 12:40:38.903 |
Fix Version/s | 10.5.20 [ 28512 ] | |
Fix Version/s | 10.6.13 [ 28514 ] | |
Fix Version/s | 10.8.8 [ 28518 ] | |
Fix Version/s | 10.9.6 [ 28520 ] | |
Fix Version/s | 10.10.4 [ 28522 ] | |
Fix Version/s | 10.11.3 [ 28524 ] | |
Fix Version/s | 11.0.2 [ 28706 ] | |
Fix Version/s | 10.7.8 [ 28515 ] | |
Fix Version/s | 10.5 [ 23123 ] | |
Fix Version/s | 10.6 [ 24028 ] | |
Fix Version/s | 10.7 [ 24805 ] | |
Fix Version/s | 10.8 [ 26121 ] | |
Fix Version/s | 10.9 [ 26905 ] | |
Fix Version/s | 10.10 [ 27530 ] | |
Fix Version/s | 10.11 [ 27614 ] | |
Fix Version/s | 11.0 [ 28320 ] | |
Resolution | Fixed [ 1 ] | |
Status | Open [ 1 ] | Closed [ 6 ] |