|
Here is another occurrence that looks a little different (single-batch recovery):
innodb.recovery_memory 'innodb,release' w29 [ 13 fail ]
|
Test ended at 2023-08-02 13:44:18
|
|
CURRENT_TEST: innodb.recovery_memory
|
mysqltest: At line 50: query 'SHOW CREATE TABLE t1' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'InnoDB'
|
…
|
2023-08-02 13:44:17 0 [Note] InnoDB: Completed initialization of buffer pool
|
2023-08-02 13:44:17 0 [Note] InnoDB: Buffered log writes (block size=512 bytes)
|
2023-08-02 13:44:17 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=62853214
|
2023-08-02 13:44:17 0 [Note] InnoDB: End of log at LSN=62856487
|
2023-08-02 13:44:17 0 [Note] InnoDB: Ignoring a doublewrite copy of page [page id: space=2, page number=36] with future log sequence number 62857383
|
…
|
2023-08-02 13:44:17 0 [ERROR] InnoDB: Page [page id: space=0, page number=7] log sequence number 62864128 is in the future! Current system log sequence number 62856588.
|
Another run, with one more patch and some additional locking to rule out a problem of misbehaving environment (MDEV-28976):
./mtr --mysqld=--external-locking=1 --parallel=100 --repeat=100 innodb.recovery_memory{,,,,,,,,,}{,,,,,,,,,}
|
diff --git a/storage/innobase/buf/buf0buf.cc b/storage/innobase/buf/buf0buf.cc
|
index f12fe1807eb..f9e0cb8284f 100644
|
--- a/storage/innobase/buf/buf0buf.cc
|
+++ b/storage/innobase/buf/buf0buf.cc
|
@@ -533,7 +533,7 @@ static void buf_page_check_lsn(bool check_lsn, const byte* read_buf)
|
const uint32_t page_no = mach_read_from_4(
|
read_buf + FIL_PAGE_OFFSET);
|
|
- ib::error() << "Page " << page_id_t(space_id, page_no)
|
+ ib::fatal() << "Page " << page_id_t(space_id, page_no)
|
<< " log sequence number " << page_lsn
|
<< " is in the future! Current system"
|
<< " log sequence number "
|
|
11.2 e81fa345020ec6a067583db6a7019d6404b26f93 with patches
|
innodb.recovery_memory 'innodb,release' w21 [ 64 fail ]
|
Test ended at 2023-08-02 14:12:27
|
|
CURRENT_TEST: innodb.recovery_memory
|
mysqltest: At line 42: query 'CREATE TABLE t1(f1 INT NOT NULL)ENGINE=InnoDB' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'InnoDB'
|
…
|
2023-08-02 14:12:26 0 [Note] InnoDB: Buffered log writes (block size=512 bytes)
|
2023-08-02 14:12:26 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=324689424
|
2023-08-02 14:12:26 0 [Note] InnoDB: End of log at LSN=324689897
|
2023-08-02 14:12:26 0 [ERROR] InnoDB: Missing FILE_CREATE, FILE_DELETE or FILE_MODIFY before FILE_CHECKPOINT for tablespace 251
|
2023-08-02 14:12:26 0 [ERROR] InnoDB: Plugin initialization aborted with error Data structure corruption
|
|
|
One more failure:
|
11.2 e81fa345020ec6a067583db6a7019d6404b26f93 with patches
|
innodb.recovery_memory 'innodb,release' w35 [ 15 fail ] Found warnings/errors in server log file!
|
Test ended at 2023-08-02 14:26:09
|
line
|
2023-08-02 14:26:05 0 [ERROR] InnoDB: Expected tablespace id 58 but found 0 in the file ./test/t1.ibd
|
This might be MDEV-31826. To prevent that, I applied a temporary fix:
diff --git a/storage/innobase/fil/fil0fil.cc b/storage/innobase/fil/fil0fil.cc
|
index a4a5dcf1aa3..f1304418870 100644
|
--- a/storage/innobase/fil/fil0fil.cc
|
+++ b/storage/innobase/fil/fil0fil.cc
|
@@ -1756,12 +1756,16 @@ pfs_os_file_t fil_delete_tablespace(uint32_t id)
|
{
|
ut_ad(!is_system_tablespace(id));
|
pfs_os_file_t handle= OS_FILE_CLOSED;
|
+ mtr_t mtr;
|
+ mtr.start();
|
+ mysql_mutex_lock(&fil_system.mutex);
|
+ if (fil_space_t *space= fil_space_get_by_id(id))
|
+ mtr.log_file_op(FILE_DELETE, id, space->chain.start->name);
|
+ mysql_mutex_unlock(&fil_system.mutex);
|
+
|
if (fil_space_t *space= fil_space_t::check_pending_operations(id))
|
{
|
/* Before deleting the file(s), persistently write a log record. */
|
- mtr_t mtr;
|
- mtr.start();
|
- mtr.log_file_op(FILE_DELETE, id, space->chain.start->name);
|
mtr.commit_file(*space, nullptr, &handle);
|
fil_space_free_low(space);
|
}
|
|
|
Here is one more failure, suggesting some kind of redo log corruption (premature end of the log when parsing):
|
11.2 e81fa345020ec6a067583db6a7019d6404b26f93 with patches
|
innodb.recovery_memory 'innodb,release' w85 [ 3 fail ]
|
Test ended at 2023-08-02 14:46:41
|
|
CURRENT_TEST: innodb.recovery_memory
|
mysqltest: At line 50: query 'SHOW CREATE TABLE t1' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'InnoDB'
|
…
|
2023-08-02 14:46:40 0 [Note] InnoDB: Buffered log writes (block size=512 bytes)
|
2023-08-02 14:46:40 0 [ERROR] InnoDB: Missing FILE_CHECKPOINT(10032912) at 10680209
|
2023-08-02 14:46:40 0 [ERROR] InnoDB: Log scan aborted at LSN 10680209
|
2023-08-02 14:46:40 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
|
I believe that in order to make any progress with this, we need rr replay traces of failures. On my system, everything appears to be fine if the databases reside in /dev/shm and the memory-mapped log ("fake PMEM") is being used. It might merely have something to do with timing.
|
|
With the memory-mapped log ("fake PMEM" on /dev/shm), the test appears rock solid to me.
I can’t tell if the rather frequent failures on Linux tmpfs when using normal pwrite() calls for the ib_logfile0 are due to timing differences or due to a bug in my kernel:
Linux version 6.3.0-2-amd64 (debian-kernel@lists.debian.org) (gcc-12 (Debian 12.3.0-5) 12.3.0, GNU ld (GNU Binutils for Debian) 2.40.50.20230625) #1 SMP PREEMPT_DYNAMIC Debian 6.3.11-1 (2023-07-01)
|
I ran one test round on ext4fs on NVMe storage and finally reproduced the bug in the Description, this time with a core dump thanks to my modification of buf_page_check_lsn():
|
11.2 e81fa345020ec6a067583db6a7019d6404b26f93 with patches
|
2023-08-02 15:41:14 0 [Note] InnoDB: Initializing buffer pool, total size = 5.000MiB, chunk size = 1.000MiB
|
2023-08-02 15:41:14 0 [Note] InnoDB: Completed initialization of buffer pool
|
2023-08-02 15:41:14 0 [Note] InnoDB: File system buffers for log disabled (block size=512 bytes)
|
2023-08-02 15:41:14 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=271813827
|
2023-08-02 15:41:14 0 [Note] InnoDB: Multi-batch recovery needed at LSN 272313244
|
2023-08-02 15:41:14 0 [Note] InnoDB: End of log at LSN=276035576
|
2023-08-02 15:41:15 0 [ERROR] [FATAL] InnoDB: Page [page id: space=1, page number=312] log sequence number 274108480 is in the future! Current system log sequence number 272313346.
|
…
|
#7 0x00005634b7132be7 in ib::fatal::~fatal (this=this@entry=0x7ffc8c806ba0, __in_chrg=<optimized out>) at /mariadb/11/storage/innobase/ut/ut0ut.cc:527
|
#8 0x00005634b6fb44de in buf_page_check_lsn (read_buf=read_buf@entry=0x7fdb3a9c8000 "", check_lsn=<optimized out>) at /mariadb/11/storage/innobase/buf/buf0buf.cc:536
|
#9 0x00005634b6fb2c91 in buf_page_check_lsn (read_buf=0x7fdb3a9c8000 "", check_lsn=true) at /mariadb/11/storage/innobase/buf/buf0buf.cc:522
|
#10 buf_page_is_corrupted (check_lsn=check_lsn@entry=true, read_buf=read_buf@entry=0x7fdb3a9c8000 "", fsp_flags=<optimized out>) at /mariadb/11/storage/innobase/buf/buf0buf.cc:611
|
#11 0x00005634b6fb3170 in buf_page_check_corrupt (node=@0x5634b924d870: {space = 0x5634b924d740, name = 0x5634b924d8c0 ".//undo001", handle = {m_file = 9}, on_ssd = 1, punch_hole = 0, atomic_write = 0, is_raw_disk = 0, deferred = 0, size = 1216, init_size = 0, max_size = 4294967295, being_extended = {m = std::atomic<bool> = { false }}, chain = {prev = 0x0, next = 0x0}, block_size = 4096}, bpage=0x7fdb3a9c3000) at /mariadb/11/storage/innobase/buf/buf0buf.cc:3160
|
#12 buf_page_t::read_complete (this=this@entry=0x7fdb3a9c3000, node=@0x5634b924d870: {space = 0x5634b924d740, name = 0x5634b924d8c0 ".//undo001", handle = {m_file = 9}, on_ssd = 1, punch_hole = 0, atomic_write = 0, is_raw_disk = 0, deferred = 0, size = 1216, init_size = 0, max_size = 4294967295, being_extended = {m = std::atomic<bool> = { false }}, chain = {prev = 0x0, next = 0x0}, block_size = 4096}) at /mariadb/11/storage/innobase/buf/buf0buf.cc:3251
|
#13 0x00005634b70009d7 in buf_read_page_low (page_id={m_id = 4294967608}, zip_size=zip_size@entry=0, chain=@0x5634b9246058: {first = 0x7fdb3a9c3000}, space=space@entry=0x5634b924d740, block=@0x7ffc8c807028: 0x0, sync=sync@entry=true) at /mariadb/11/storage/innobase/buf/buf0rea.cc:263
|
#14 0x00005634b7000dca in buf_read_page (page_id=page_id@entry={m_id = 4294967608}, zip_size=zip_size@entry=0, chain=@0x5634b9246058: {first = 0x7fdb3a9c3000}) at /mariadb/11/storage/innobase/buf/buf0rea.cc:436
|
#15 0x00005634b6fb74d8 in buf_page_get_low (page_id={m_id = 4294967608}, zip_size=0, rw_latch=2, guess=0x0, mode=10, mtr=0x7ffc8c807340, err=0x0) at /mariadb/11/storage/innobase/buf/buf0buf.cc:2363
|
#16 0x00005634b6fb2f55 in buf_page_get_gen (page_id=<optimized out>, zip_size=<optimized out>, rw_latch=<optimized out>, guess=<optimized out>, mode=<optimized out>, mtr=<optimized out>, err=<optimized out>) at /mariadb/11/storage/innobase/buf/buf0buf.cc:2656
|
#17 0x00005634b712c568 in trx_undo_mem_create_at_db_start (rseg=rseg@entry=0x5634b73df940 <trx_sys+17152>, id=id@entry=0, page_no=312) at /mariadb/11/storage/innobase/trx/trx0undo.cc:1017
|
#18 0x00005634b7123817 in trx_undo_lists_init (rseg_header=0x7fdb3aac5620, rseg=<optimized out>) at /mariadb/11/storage/innobase/trx/trx0rseg.cc:424
|
#19 trx_rseg_mem_restore (mtr=0x7ffc8c807800, rseg=<optimized out>) at /mariadb/11/storage/innobase/trx/trx0rseg.cc:495
|
#20 trx_rseg_array_init () at /mariadb/11/storage/innobase/trx/trx0rseg.cc:611
|
#21 0x00005634b7128570 in trx_lists_init_at_db_start () at /mariadb/11/storage/innobase/trx/trx0trx.cc:721
|
#22 0x00005634b710cdc0 in srv_start (create_new_db=false) at /mariadb/11/storage/innobase/srv/srv0start.cc:1561
|
The function srv_start() would invoke recv_sys.apply(true) to apply the last batch of log records a little later.
|
|
The LSN check in buf_page_check_lsn() is guarded both by the function parameter check_lsn and by the global variable recv_lsn_checks_on. That variable would be set in recv_recovery_from_checkpoint_start(), after parsing the log.
In MDEV-29911, the parsing of the log was changed, and recv_recovery_from_checkpoint_start() may no longer assign to log_sys.lsn the latest LSN. In the core dump that I am analyzing, I see log_sys.write_lsn=recv_sys.lsn=272313244, which corresponds to the LSN at which we noticed that multi-batch recovery will be needed. The end of the log was parsed at 276035576.
I am yet to evaluate whether this also affects the 10.6 version of MDEV-29911, or only 10.9 and later versions.
|
|
The recovery fails nondeterministically. An attempt to recover a copy of the failed data directory happened to works just fine, at least under rr record:
2023-08-02 16:43:13 0 [Note] InnoDB: Initializing buffer pool, total size = 5.000MiB, chunk size = 1.000MiB
|
2023-08-02 16:43:13 0 [Note] InnoDB: Completed initialization of buffer pool
|
2023-08-02 16:43:13 0 [Note] InnoDB: Buffered log writes (block size=512 bytes)
|
2023-08-02 16:43:13 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=271813827
|
2023-08-02 16:43:13 0 [Note] InnoDB: Multi-batch recovery needed at LSN 272340327
|
2023-08-02 16:43:13 0 [Note] InnoDB: End of log at LSN=276035576
|
2023-08-02 16:43:13 0 [Note] InnoDB: To recover: LSN 272340327/276035576; 278 pages
|
2023-08-02 16:43:14 0 [Note] InnoDB: To recover: LSN 273245127/276035576; 54 pages
|
2023-08-02 16:43:14 0 [Note] InnoDB: To recover: LSN 274181976/276035576; 55 pages
|
2023-08-02 16:43:14 0 [Note] InnoDB: To recover: LSN 275118954/276035576; 54 pages
|
2023-08-02 16:43:14 0 [Note] InnoDB: To recover: LSN 276008436/276035576; 59 pages
|
2023-08-02 16:43:14 0 [Note] InnoDB: To recover: 7 pages
|
2023-08-02 16:43:14 0 [Warning] InnoDB: Cannot change innodb_undo_tablespaces=0 because previous shutdown was not with innodb_fast_shutdown=0
|
…
|
2023-08-02 16:43:14 0 [Note] InnoDB: log sequence number 276035576; transaction id 535920
|
In the rr replay of that execution, I see that recv_sys.lsn (which would be assigned to the log_sys.lsn before recv_lsn_checks_on is set) was last updated to the correct value here:
#0 recv_sys_t::parse<recv_buf, true> (
|
this=this@entry=0x55dd04a86c40 <recv_sys>,
|
l=@0x7ffffe498a68: {ptr = 0x7f29b0eab3f8 ""},
|
if_exists=if_exists@entry=true)
|
at /mariadb/11/storage/innobase/include/log0log.h:313
|
#1 0x000055dd04733452 in recv_sys_t::parse_mtr<true> (if_exists=true)
|
at /mariadb/11/storage/innobase/log/log0recv.cc:2996
|
#2 recv_sys_t::parse_pmem<true> (if_exists=if_exists@entry=true)
|
at /mariadb/11/storage/innobase/log/log0recv.cc:3007
|
#3 0x000055dd04724f98 in recv_scan_log (last_phase=last_phase@entry=true)
|
at /mariadb/11/storage/innobase/log/log0recv.cc:4009
|
#4 0x000055dd04725d07 in recv_recovery_from_checkpoint_start ()
|
at /mariadb/11/storage/innobase/log/log0recv.cc:4506
|
#5 0x000055dd047ba383 in srv_start (create_new_db=false)
|
at /mariadb/11/storage/innobase/srv/srv0start.cc:1513
|
It looks like we were reading the log wrong, hitting premature EOF while re-reading it. In the core dump, I saw recv_sys.lsn=272313244 and recv_sys.scanned_lsn=276035576. I will rerun the test with an additional assertion, hoping to catch this earlier.
|
|
The attached MDEV-31827-data.tar.xz triggers an assertion failure of the following patch:
diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
|
index 6f58be6ed35..07887e082c9 100644
|
--- a/storage/innobase/log/log0recv.cc
|
+++ b/storage/innobase/log/log0recv.cc
|
@@ -4662,6 +4662,7 @@ dberr_t recv_recovery_from_checkpoint_start()
|
}
|
|
mysql_mutex_lock(&recv_sys.mutex);
|
+ ut_a(recv_sys.scanned_lsn == recv_sys.lsn);
|
recv_sys.apply_log_recs = true;
|
recv_no_ibuf_operations = false;
|
ut_d(recv_no_log_write = srv_operation == SRV_OPERATION_RESTORE
|
If the datadir is located in /dev/shm and therefore the "fake PMEM" is being used, the dataset will recover fine as far as InnoDB is concerned:
|
10.9 8760fe11b89450d56269ed062b36b4238d45707f
|
2023-08-02 17:31:00 0 [Note] InnoDB: Initializing buffer pool, total size = 5.000MiB, chunk size = 1.000MiB
|
2023-08-02 17:31:00 0 [Note] InnoDB: Completed initialization of buffer pool
|
2023-08-02 17:31:00 0 [Note] InnoDB: Setting O_DIRECT on file ./ibdata1 failed
|
2023-08-02 17:31:00 0 [Note] InnoDB: Opened 3 undo tablespaces
|
2023-08-02 17:31:00 0 [Warning] InnoDB: innodb_undo_tablespaces=0 disables dedicated undo log tablespaces
|
2023-08-02 17:31:00 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=72811914
|
2023-08-02 17:31:00 0 [Note] InnoDB: Multi-batch recovery needed at LSN 73324548
|
2023-08-02 17:31:00 0 [Note] InnoDB: End of log at LSN=73844162
|
2023-08-02 17:31:00 0 [Note] InnoDB: To recover: LSN 73324548/73844162; 328 pages
|
2023-08-02 17:31:00 0 [Note] InnoDB: To recover: LSN 73808457/73844162; 315 pages
|
2023-08-02 17:31:00 0 [ERROR] InnoDB: The change buffer is corrupted or has been removed on upgrade to MariaDB 11.0 or later
|
2023-08-02 17:31:00 0 [Note] InnoDB: Opened 3 undo tablespaces
|
2023-08-02 17:31:00 0 [Warning] InnoDB: innodb_undo_tablespaces=0 disables dedicated undo log tablespaces
|
2023-08-02 17:31:00 0 [Note] InnoDB: To recover: 5 pages
|
2023-08-02 17:31:00 0 [Note] InnoDB: 128 rollback segments in 3 undo tablespaces are active.
|
|
|
This patch did the trick:
diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
|
index 6f58be6ed35..d8d976f5e67 100644
|
--- a/storage/innobase/log/log0recv.cc
|
+++ b/storage/innobase/log/log0recv.cc
|
@@ -4147,7 +4147,8 @@ static bool recv_scan_log(bool last_phase)
|
if (recv_sys.is_corrupt_log())
|
break;
|
|
- if (recv_sys.offset < log_sys.get_block_size())
|
+ if (recv_sys.offset < log_sys.get_block_size() &&
|
+ recv_sys.lsn == recv_sys.scanned_lsn)
|
goto got_eof;
|
|
if (recv_sys.offset > buf_size / 4 ||
|
The failure scenario was that we did premature goto got_eof when the very first mini-transaction at the "rewind LSN" wraps around the circular ib_logfile0 to byte offset 12288. The mini-transaction started 2048-4 bytes before the end of ib_logfile0. 2048 bytes were read from the end of the file into the parsing buffer, and the recv_sys.offset was 4.
After applying that patch (and retaining all above mentioned assertions), while testing on /run/user/$UID (tmpfs with pwrite() and pread() based ib_logfile0 access) I only got a timing-related failure that does not affect correctness:
|
11.2 e81fa345020ec6a067583db6a7019d6404b26f93 with patches
|
innodb.recovery_memory 'innodb,release' w79 [ 41 fail ] Found warnings/errors in server log file!
|
Test ended at 2023-08-02 18:21:00
|
line
|
2023-08-02 18:20:55 0 [Warning] InnoDB: Difficult to find free blocks in the buffer pool (21 search iterations)! 21 failed attempts to flush a page! Consider increasing innodb_buffer_pool_size. Pending flushes (fsync): 0. 480 OS file reads, 430 OS file writes, 0 OS fsyncs.
|
I restarted the test with innodb_log_file_size=5m (half the mtr default size), which should improve the failure rate. Got the same non-failure again:
|
11.2 e81fa345020ec6a067583db6a7019d6404b26f93 with patches
|
innodb.recovery_memory 'innodb,release' w34 [ 20 fail ] Found warnings/errors in server log file!
|
Test ended at 2023-08-02 18:40:52
|
line
|
2023-08-02 18:40:48 0 [Warning] InnoDB: Difficult to find free blocks in the buffer pool (21 search iterations)! 21 failed attempts to flush a page! Consider increasing innodb_buffer_pool_size. Pending flushes (fsync): 0. 595 OS file reads, 432 OS file writes, 0 OS fsyncs.
|
I temporarily suppressed that warning (it should eventually be investigated and fixed), hoping to get 100×100 successful test runs with regular file I/O on RAM disk. We should have the results in 30 minutes.
|
|
The following test with the patch and the suppressed warning was successful, on /run/user/$UID:
./mtr --mysqld=--innodb-log-file-size=5m --parallel=100 --repeat=100 innodb.recovery_memory{,,,,,,,,,}{,,,,,,,,,}
|
|
11.2 e81fa345020ec6a067583db6a7019d6404b26f93 with patches
|
…
|
innodb.recovery_memory 'innodb,release' w11 [ 100 pass ] 14533
|
--------------------------------------------------------------------------
|
The servers were restarted 0 times
|
Spent 158268.156 of 1651 seconds executing testcases
|
|
Completed: All 10000 tests were successful.
|
|