MariaDB Server / MDEV-31827

InnoDB multi-batch recovery stops prematurely

Details

    Description

      In the final testing of MDEV-31791, I observed the following failure:

      11.2 e81fa345020ec6a067583db6a7019d6404b26f93 with patches

      innodb.recovery_memory 'innodb,release'  w4 [ 89 fail ]  Found warnings/errors in server log file!
              Test ended at 2023-08-02 12:21:21
      line
      2023-08-02 12:21:18 0 [ERROR] InnoDB: We scanned the log up to 418951162. A checkpoint was at 418061736 and the maximum LSN on a database page was 421016108. It is possible that the database is now corrupt!
      2023-08-02 12:21:18 0 [ERROR] InnoDB: Page [page id: space=1, page number=88] log sequence number 421016108 is in the future! Current system log sequence number 418966369.
      2023-08-02 12:21:18 0 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.
      2023-08-02 12:21:18 0 [ERROR] InnoDB: Page [page id: space=1, page number=0] log sequence number 421016108 is in the future! Current system log sequence number 418966369.
      2023-08-02 12:21:18 0 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.
      

      A larger excerpt of the server error log indicates that we actually scanned more of the log, but for some reason failed to fully apply the final recovery batch and stopped prematurely at LSN 418951162 instead of 421869561:

      2023-08-02 12:21:18 0 [Note] InnoDB: Multi-batch recovery needed at LSN 418950921
      2023-08-02 12:21:18 0 [Note] InnoDB: End of log at LSN=421869561
      2023-08-02 12:21:18 0 [Note] InnoDB: To recover: LSN 418950921/421869561; 106 pages
      2023-08-02 12:21:18 0 [ERROR] InnoDB: We scanned the log up to 418951162. A checkpoint was at 418061736 and the maximum LSN on a database page was 421016108. It is possible that the database is now corrupt!
      2023-08-02 12:21:18 0 [Note] InnoDB: 1 transaction(s) which must be rolled back or cleaned up in total 15548 row operations to undo
      2023-08-02 12:21:18 0 [Note] InnoDB: Trx id counter is 652795
      2023-08-02 12:21:18 0 [Note] InnoDB: Opened 3 undo tablespaces
      2023-08-02 12:21:18 0 [Note] InnoDB: 128 rollback segments in 3 undo tablespaces are active.
      2023-08-02 12:21:18 0 [Note] InnoDB: Starting in background the rollback of recovered transactions
      2023-08-02 12:21:18 0 [ERROR] InnoDB: Page [page id: space=1, page number=88] log sequence number 421016108 is in the future! Current system log sequence number 418966369.
      ...
      2023-08-02 12:21:19 0 [Note] InnoDB: log sequence number 418951162; transaction id 652796
      

      This might be a regression due to MDEV-29911.

    Attachments

      MDEV-31827-data.tar.xz

    Activity

            The LSN check in buf_page_check_lsn() is guarded both by the function parameter check_lsn and by the global variable recv_lsn_checks_on. That variable would be set in recv_recovery_from_checkpoint_start(), after parsing the log.
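
            A minimal standalone sketch of that two-level guard, using the LSN values from the error log above (the names check_lsn and recv_lsn_checks_on match the source; the rest is illustrative, not the actual buf0buf.cc code):

            // Illustrative model only: the real check lives in buf0buf.cc.
            // It demonstrates that the "page LSN in the future" check is a
            // no-op until recv_lsn_checks_on is set after log parsing.
            #include <cstdint>
            #include <cstdio>

            typedef uint64_t lsn_t;

            static bool recv_lsn_checks_on= false; // set only after parsing
            static lsn_t current_lsn= 418966369;   // log_sys.lsn (example)

            // Returns true if the page LSN check fires ("in the future").
            static bool page_lsn_in_future(bool check_lsn, lsn_t page_lsn)
            {
              if (!check_lsn || !recv_lsn_checks_on)
                return false; // both guards must be on for the check to run
              return page_lsn > current_lsn;
            }

            int main()
            {
              const lsn_t page_lsn= 421016108; // FIL_PAGE_LSN of page [1:88]
              recv_lsn_checks_on= true; // recv_recovery_from_checkpoint_start()
              if (page_lsn_in_future(true, page_lsn))
                std::printf("page LSN %llu is in the future (current %llu)\n",
                            (unsigned long long) page_lsn,
                            (unsigned long long) current_lsn);
              return 0;
            }

            Had recovery correctly advanced log_sys.lsn to the end of the log (421869561), the same page LSN 421016108 would have passed this check; that is exactly the premature stop reported above.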

            In MDEV-29911, the parsing of the log was changed, and recv_recovery_from_checkpoint_start() may no longer assign the latest LSN to log_sys.lsn. In the core dump that I am analyzing, I see log_sys.write_lsn=recv_sys.lsn=272313244, which corresponds to the LSN at which we noticed that multi-batch recovery would be needed. The end of the log was parsed at 276035576.

            I have yet to evaluate whether this also affects the 10.6 version of MDEV-29911, or only 10.9 and later versions.

            marko Marko Mäkelä added a comment

            The recovery fails nondeterministically. An attempt to recover a copy of the failed data directory happened to work just fine, at least under rr record:

            2023-08-02 16:43:13 0 [Note] InnoDB: Initializing buffer pool, total size = 5.000MiB, chunk size = 1.000MiB
            2023-08-02 16:43:13 0 [Note] InnoDB: Completed initialization of buffer pool
            2023-08-02 16:43:13 0 [Note] InnoDB: Buffered log writes (block size=512 bytes)
            2023-08-02 16:43:13 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=271813827
            2023-08-02 16:43:13 0 [Note] InnoDB: Multi-batch recovery needed at LSN 272340327
            2023-08-02 16:43:13 0 [Note] InnoDB: End of log at LSN=276035576
            2023-08-02 16:43:13 0 [Note] InnoDB: To recover: LSN 272340327/276035576; 278 pages
            2023-08-02 16:43:14 0 [Note] InnoDB: To recover: LSN 273245127/276035576; 54 pages
            2023-08-02 16:43:14 0 [Note] InnoDB: To recover: LSN 274181976/276035576; 55 pages
            2023-08-02 16:43:14 0 [Note] InnoDB: To recover: LSN 275118954/276035576; 54 pages
            2023-08-02 16:43:14 0 [Note] InnoDB: To recover: LSN 276008436/276035576; 59 pages
            2023-08-02 16:43:14 0 [Note] InnoDB: To recover: 7 pages
            2023-08-02 16:43:14 0 [Warning] InnoDB: Cannot change innodb_undo_tablespaces=0 because previous shutdown was not with innodb_fast_shutdown=0
            …
            2023-08-02 16:43:14 0 [Note] InnoDB: log sequence number 276035576; transaction id 535920
            

            In the rr replay of that execution, I see that recv_sys.lsn (which would be assigned to log_sys.lsn before recv_lsn_checks_on is set) was last updated to the correct value here:

            #0  recv_sys_t::parse<recv_buf, true> (
                this=this@entry=0x55dd04a86c40 <recv_sys>, 
                l=@0x7ffffe498a68: {ptr = 0x7f29b0eab3f8 ""}, 
                if_exists=if_exists@entry=true)
                at /mariadb/11/storage/innobase/include/log0log.h:313
            #1  0x000055dd04733452 in recv_sys_t::parse_mtr<true> (if_exists=true)
                at /mariadb/11/storage/innobase/log/log0recv.cc:2996
            #2  recv_sys_t::parse_pmem<true> (if_exists=if_exists@entry=true)
                at /mariadb/11/storage/innobase/log/log0recv.cc:3007
            #3  0x000055dd04724f98 in recv_scan_log (last_phase=last_phase@entry=true)
                at /mariadb/11/storage/innobase/log/log0recv.cc:4009
            #4  0x000055dd04725d07 in recv_recovery_from_checkpoint_start ()
                at /mariadb/11/storage/innobase/log/log0recv.cc:4506
            #5  0x000055dd047ba383 in srv_start (create_new_db=false)
                at /mariadb/11/storage/innobase/srv/srv0start.cc:1513
            

            It looks like we were reading the log incorrectly, hitting a premature EOF while re-reading it. In the core dump, I saw recv_sys.lsn=272313244 and recv_sys.scanned_lsn=276035576. I will rerun the test with an additional assertion, hoping to catch this earlier.

            marko Marko Mäkelä added a comment

            The attached MDEV-31827-data.tar.xz triggers a failure of the assertion added by the following patch:

            diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
            index 6f58be6ed35..07887e082c9 100644
            --- a/storage/innobase/log/log0recv.cc
            +++ b/storage/innobase/log/log0recv.cc
            @@ -4662,6 +4662,7 @@ dberr_t recv_recovery_from_checkpoint_start()
             	}
             
             	mysql_mutex_lock(&recv_sys.mutex);
            +	ut_a(recv_sys.scanned_lsn == recv_sys.lsn);
             	recv_sys.apply_log_recs = true;
             	recv_no_ibuf_operations = false;
             	ut_d(recv_no_log_write = srv_operation == SRV_OPERATION_RESTORE
            

            If the datadir is located in /dev/shm and therefore the "fake PMEM" is being used, the dataset will recover fine as far as InnoDB is concerned:

            10.9 8760fe11b89450d56269ed062b36b4238d45707f

            2023-08-02 17:31:00 0 [Note] InnoDB: Initializing buffer pool, total size = 5.000MiB, chunk size = 1.000MiB
            2023-08-02 17:31:00 0 [Note] InnoDB: Completed initialization of buffer pool
            2023-08-02 17:31:00 0 [Note] InnoDB: Setting O_DIRECT on file ./ibdata1 failed
            2023-08-02 17:31:00 0 [Note] InnoDB: Opened 3 undo tablespaces
            2023-08-02 17:31:00 0 [Warning] InnoDB: innodb_undo_tablespaces=0 disables dedicated undo log tablespaces
            2023-08-02 17:31:00 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=72811914
            2023-08-02 17:31:00 0 [Note] InnoDB: Multi-batch recovery needed at LSN 73324548
            2023-08-02 17:31:00 0 [Note] InnoDB: End of log at LSN=73844162
            2023-08-02 17:31:00 0 [Note] InnoDB: To recover: LSN 73324548/73844162; 328 pages
            2023-08-02 17:31:00 0 [Note] InnoDB: To recover: LSN 73808457/73844162; 315 pages
            2023-08-02 17:31:00 0 [ERROR] InnoDB: The change buffer is corrupted or has been removed on upgrade to MariaDB 11.0 or later
            2023-08-02 17:31:00 0 [Note] InnoDB: Opened 3 undo tablespaces
            2023-08-02 17:31:00 0 [Warning] InnoDB: innodb_undo_tablespaces=0 disables dedicated undo log tablespaces
            2023-08-02 17:31:00 0 [Note] InnoDB: To recover: 5 pages
            2023-08-02 17:31:00 0 [Note] InnoDB: 128 rollback segments in 3 undo tablespaces are active.
            

            marko Marko Mäkelä added a comment

            This patch did the trick:

            diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
            index 6f58be6ed35..d8d976f5e67 100644
            --- a/storage/innobase/log/log0recv.cc
            +++ b/storage/innobase/log/log0recv.cc
            @@ -4147,7 +4147,8 @@ static bool recv_scan_log(bool last_phase)
                 if (recv_sys.is_corrupt_log())
                   break;
             
            -    if (recv_sys.offset < log_sys.get_block_size())
            +    if (recv_sys.offset < log_sys.get_block_size() &&
            +        recv_sys.lsn == recv_sys.scanned_lsn)
                   goto got_eof;
             
                 if (recv_sys.offset > buf_size / 4 ||
            

            The failure scenario was a premature goto got_eof when the very first mini-transaction at the "rewind LSN" wrapped around the circular ib_logfile0 to byte offset 12288. The mini-transaction started 2048-4 bytes before the end of ib_logfile0: 2048 bytes were read from the end of the file into the parsing buffer, and after parsing up to the end of the file, recv_sys.offset was 4. Because 4 is less than the 512-byte block size, the old condition alone declared end-of-file, even though parsing (recv_sys.lsn) had not yet caught up with the end of the scanned log (recv_sys.scanned_lsn).
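
            To make the EOF condition concrete, here is a minimal standalone simulation (not InnoDB code; only the numbers 512, 2048-4, 4, 12288 and the two LSNs come from the analysis above, everything else is illustrative):

            // Illustrative model only; this is not the InnoDB source. It
            // replays the state right after the 2048-byte tail of
            // ib_logfile0 was read and parsed: 4 bytes of a wrapped-around
            // record remain buffered.
            #include <cstddef>
            #include <cstdint>
            #include <cstdio>

            typedef uint64_t lsn_t;

            int main()
            {
              const size_t block_size= 512;  // log_sys.get_block_size()
              size_t offset= 4;              // recv_sys.offset: leftover bytes
              lsn_t lsn= 272313244;          // recv_sys.lsn: parse position
              lsn_t scanned_lsn= 276035576;  // recv_sys.scanned_lsn: log end

              // Old condition: a short leftover alone was treated as EOF.
              const bool old_eof= offset < block_size;
              // Fixed condition: additionally require that parsing caught up
              // with scanning; otherwise keep reading (from byte offset
              // 12288, where the circular file wraps around).
              const bool new_eof= offset < block_size && lsn == scanned_lsn;

              std::printf("old EOF test: %s at LSN %llu (premature)\n",
                          old_eof ? "stop" : "continue",
                          (unsigned long long) lsn);
              std::printf("new EOF test: %s (resume at the wrap-around)\n",
                          new_eof ? "stop" : "continue");
              return 0;
            }

            Once parsing reaches scanned_lsn with fewer than block_size bytes left over, both conditions hold and got_eof is taken exactly as before the patch.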

            After applying that patch (and retaining all of the above-mentioned assertions), while testing on /run/user/$UID (tmpfs with pwrite() and pread() based ib_logfile0 access), I only got a timing-related failure that does not affect correctness:

            11.2 e81fa345020ec6a067583db6a7019d6404b26f93 with patches

            innodb.recovery_memory 'innodb,release'  w79 [ 41 fail ]  Found warnings/errors in server log file!
                    Test ended at 2023-08-02 18:21:00
            line
            2023-08-02 18:20:55 0 [Warning] InnoDB: Difficult to find free blocks in the buffer pool (21 search iterations)! 21 failed attempts to flush a page! Consider increasing innodb_buffer_pool_size. Pending flushes (fsync): 0. 480 OS file reads, 430 OS file writes, 0 OS fsyncs.
            

            I restarted the test with innodb_log_file_size=5m (half the mtr default size), which should make any remaining failure easier to reproduce. I got the same non-failure (only the buffer pool warning) again:

            11.2 e81fa345020ec6a067583db6a7019d6404b26f93 with patches

            innodb.recovery_memory 'innodb,release'  w34 [ 20 fail ]  Found warnings/errors in server log file!
                    Test ended at 2023-08-02 18:40:52
            line
            2023-08-02 18:40:48 0 [Warning] InnoDB: Difficult to find free blocks in the buffer pool (21 search iterations)! 21 failed attempts to flush a page! Consider increasing innodb_buffer_pool_size. Pending flushes (fsync): 0. 595 OS file reads, 432 OS file writes, 0 OS fsyncs.
            

            I temporarily suppressed that warning (it should eventually be investigated and fixed), hoping to get 100×100 successful test runs with regular file I/O on RAM disk. We should have the results in 30 minutes.

            marko Marko Mäkelä added a comment

            The following test, with the patch and the suppressed warning, was successful on /run/user/$UID (the shell brace expansion {,,,,,,,,,}{,,,,,,,,,} repeats the test name 10×10 = 100 times; combined with --repeat=100, that amounts to the 10,000 test runs reported below):

            ./mtr --mysqld=--innodb-log-file-size=5m --parallel=100 --repeat=100 innodb.recovery_memory{,,,,,,,,,}{,,,,,,,,,}
            

            11.2 e81fa345020ec6a067583db6a7019d6404b26f93 with patches

            …
            innodb.recovery_memory 'innodb,release'  w11 [ 100 pass ]  14533
            --------------------------------------------------------------------------
            The servers were restarted 0 times
            Spent 158268.156 of 1651 seconds executing testcases
             
            Completed: All 10000 tests were successful.
            

            marko Marko Mäkelä added a comment

            People

              Assignee: marko Marko Mäkelä
              Reporter: marko Marko Mäkelä