|
I posted some more analysis to MDEV-24448, where we will address the debug assertion failure that should be possible in 10.5 and 10.6, but on buildbot.askmonty.org has only occurred on 10.6 (after MDEV-21452). I do not think that any corruption is possible in the 10.5 or 10.6 series, thanks to MDEV-19514.
However, I do think that in earlier series, this race condition could cause the very last recovered page to be corrupted, if that page was a secondary index leaf page for which some changes exist in the change buffer. I hope that introducing some delay would repeat the problem:
diff --git a/storage/innobase/buf/buf0buf.cc b/storage/innobase/buf/buf0buf.cc
|
index 9f3d743ada1..b6ad559f079 100644
|
--- a/storage/innobase/buf/buf0buf.cc
|
+++ b/storage/innobase/buf/buf0buf.cc
|
@@ -6133,6 +6133,7 @@ buf_page_io_complete(buf_page_t* bpage, bool dblwr, bool evict)
|
&& fil_page_get_type(frame) == FIL_PAGE_INDEX
|
&& page_is_leaf(frame)) {
|
|
+ /* This may be executed too late! */
|
ibuf_merge_or_delete_for_page(
|
(buf_block_t*) bpage, bpage->id,
|
bpage->size);
|
diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
|
index 4c3886caeaf..3bafd3f874c 100644
|
--- a/storage/innobase/log/log0recv.cc
|
+++ b/storage/innobase/log/log0recv.cc
|
@@ -2265,8 +2265,10 @@ void recv_recover_page(buf_page_t* bpage)
|
|
mtr.commit();
|
func_exit:
|
+ const ulint n_addrs = recv_sys->n_addrs;
|
mutex_exit(&recv_sys->mutex);
|
ut_ad(mtr.has_committed());
|
+ if (!n_addrs) os_thread_sleep(100000);
|
}
|
|
/** Reads in pages which have hashed log records, from an area around a given
|
Concurrently while the ibuf_merge_or_delete_for_page() is executing, recv_apply_hashed_log_recs(true) may complete. It seems that mariabackup --prepare would be able to exit without having properly written out the change buffer merge, in innobase_start_or_create_for_mysql():
buf_flush_sync_all_buf_pools();
|
err = fil_write_flushed_lsn(log_get_lsn());
|
ut_ad(!buf_pool_check_no_pending_io());
|
fil_close_log_files(true);
|
This might explain MDEV-20934 as well as the MDEV-9663 family of bugs.
I may of course be mistaken, but I hope that adding some sleep will improve the chances of reproducing the mystery corruption.
|
|
I made some success with a modified test and the above patch that adds os_thread_sleep():
|
10.2 8e3e87d2fc1e63d287f203d441dcb9360775c6b7
|
2020-12-28 11:05:32 140463661520832 [Note] InnoDB: Starting a batch to recover 30 pages from redo log.
|
2020-12-28 11:05:32 140463661520832 [ERROR] [FATAL] InnoDB: Page [page id: space=0, page number=61] still fixed or dirty
|
…
|
#4 0x00007fc03ecb9537 in __GI_abort () at abort.c:79
|
#5 0x000056215f5bf1aa in ib::fatal::~fatal (this=0x7ffdff0614e0) at /mariadb/10.2o/storage/innobase/ut/ut0ut.cc:649
|
#6 0x000056215f62f162 in buf_all_freed_instance (buf_pool=buf_pool@entry=0x56216119d820) at /mariadb/10.2o/storage/innobase/buf/buf0buf.cc:6238
|
#7 0x000056215f62c29b in buf_pool_invalidate_instance (buf_pool=0x56216119d820) at /mariadb/10.2o/storage/innobase/buf/buf0buf.cc:6295
|
#8 buf_pool_invalidate () at /mariadb/10.2o/storage/innobase/buf/buf0buf.cc:6326
|
#9 0x000056215f46f2da in recv_apply_hashed_log_recs (last_batch=<optimized out>) at /mariadb/10.2o/storage/innobase/log/log0recv.cc:2545
|
#10 0x000056215f473691 in recv_group_scan_log_recs (group=group@entry=0x56216123a450, checkpoint_lsn=3976126, contiguous_lsn=contiguous_lsn@entry=0x7ffdff062160, last_phase=true) at /mariadb/10.2o/storage/innobase/log/log0recv.cc:3492
|
This is not directly proving MDEV-9663, but I would consider it as indirect proof. It occurred on the 298th run of the test, which was modified as follows:
diff --git a/mysql-test/suite/innodb/r/ibuf_not_empty.result b/mysql-test/suite/innodb/r/ibuf_not_empty.result
|
index 7c61e74850b..6c6ad3bf7d5 100644
|
--- a/mysql-test/suite/innodb/r/ibuf_not_empty.result
|
+++ b/mysql-test/suite/innodb/r/ibuf_not_empty.result
|
@@ -20,7 +20,5 @@ INSERT INTO t1 SELECT 0,b,c FROM t1;
|
INSERT INTO t1 SELECT 0,b,c FROM t1;
|
check table t1;
|
Table Op Msg_type Msg_text
|
-test.t1 check Warning InnoDB: Index 'b' contains #### entries, should be 4096.
|
-test.t1 check error Corrupt
|
-SET GLOBAL innodb_fast_shutdown=0;
|
+test.t1 check status OK
|
DROP TABLE t1;
|
diff --git a/mysql-test/suite/innodb/t/ibuf_not_empty.test b/mysql-test/suite/innodb/t/ibuf_not_empty.test
|
index a5093892b35..5f7c2b24b7d 100644
|
--- a/mysql-test/suite/innodb/t/ibuf_not_empty.test
|
+++ b/mysql-test/suite/innodb/t/ibuf_not_empty.test
|
@@ -43,64 +43,11 @@ INSERT INTO t1 SELECT 0,b,c FROM t1;
|
let MYSQLD_DATADIR=`select @@datadir`;
|
let PAGE_SIZE=`select @@innodb_page_size`;
|
|
---source include/shutdown_mysqld.inc
|
-
|
-# Corrupt the change buffer bitmap, to claim that pages are clean
|
-perl;
|
-do "$ENV{MTR_SUITE_DIR}/include/crc32.pl";
|
-my $file = "$ENV{MYSQLD_DATADIR}/test/t1.ibd";
|
-open(FILE, "+<$file") || die "Unable to open $file";
|
-binmode FILE;
|
-my $ps= $ENV{PAGE_SIZE};
|
-my $page;
|
-sysseek(FILE, $ps, 0) || die "Unable to seek $file\n";
|
-die "Unable to read $file" unless sysread(FILE, $page, $ps) == $ps;
|
-# Clean the change buffer bitmap.
|
-substr($page,38,$ps - 38 - 8) = chr(0) x ($ps - 38 - 8);
|
-my $polynomial = 0x82f63b78; # CRC-32C
|
-my $ck= pack("N",mycrc32(substr($page, 4, 22), 0, $polynomial) ^
|
- mycrc32(substr($page, 38, $ps - 38 - 8), 0, $polynomial));
|
-substr($page,0,4)=$ck;
|
-substr($page,$ps-8,4)=$ck;
|
-sysseek(FILE, $ps, 0) || die "Unable to rewind $file\n";
|
-syswrite(FILE, $page, $ps)==$ps || die "Unable to write $file\n";
|
-close(FILE) || die "Unable to close $file";
|
-EOF
|
-
|
---let $restart_parameters= --innodb-force-recovery=6 --innodb-change-buffer-dump
|
---source include/start_mysqld.inc
|
+--let $shutdown_timeout=0
|
+--source include/restart_mysqld.inc
|
+--let $shutdown_timeout=
|
|
---replace_regex /contains \d+ entries/contains #### entries/
|
check table t1;
|
|
---source include/shutdown_mysqld.inc
|
-
|
-# Truncate the file to 5 pages, as if it were empty
|
-perl;
|
-do "$ENV{MTR_SUITE_DIR}/include/crc32.pl";
|
-my $file = "$ENV{MYSQLD_DATADIR}/test/t1.ibd";
|
-open(FILE, "+<$file") || die "Unable to open $file";
|
-binmode FILE;
|
-my $ps= $ENV{PAGE_SIZE};
|
-my $pages=5;
|
-my $page;
|
-die "Unable to read $file" unless sysread(FILE, $page, $ps) == $ps;
|
-substr($page,46,4)=pack("N", $pages);
|
-my $polynomial = 0x82f63b78; # CRC-32C
|
-my $ck= pack("N",mycrc32(substr($page, 4, 22), 0, $polynomial) ^
|
- mycrc32(substr($page, 38, $ps - 38 - 8), 0, $polynomial));
|
-substr($page,0,4)=$ck;
|
-substr($page,$ps-8,4)=$ck;
|
-sysseek(FILE, 0, 0) || die "Unable to rewind $file\n";
|
-syswrite(FILE, $page, $ps)==$ps || die "Unable to write $file\n";
|
-truncate(FILE, $ps * $pages);
|
-close(FILE) || die "Unable to close $file";
|
-EOF
|
-
|
---let $restart_parameters=
|
---source include/start_mysqld.inc
|
-SET GLOBAL innodb_fast_shutdown=0;
|
---source include/restart_mysqld.inc
|
-
|
# Cleanup
|
DROP TABLE t1;
|
diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
|
index 4c3886caeaf..fde96575c59 100644
|
--- a/storage/innobase/log/log0recv.cc
|
+++ b/storage/innobase/log/log0recv.cc
|
@@ -2265,8 +2265,10 @@ void recv_recover_page(buf_page_t* bpage)
|
|
mtr.commit();
|
func_exit:
|
+ const ulint n_addrs = recv_sys->n_addrs;
|
mutex_exit(&recv_sys->mutex);
|
ut_ad(mtr.has_committed());
|
+ if (!n_addrs) os_thread_sleep(1000);
|
}
|
|
/** Reads in pages which have hashed log records, from an area around a given
|
@@ -2407,7 +2409,7 @@ void recv_apply_hashed_log_recs(bool last_batch)
|
return;
|
}
|
|
- os_thread_sleep(500000);
|
+ os_thread_sleep(1000);
|
mutex_enter(&recv_sys->mutex);
|
}
|
|
@@ -2516,7 +2518,7 @@ void recv_apply_hashed_log_recs(bool last_batch)
|
return;
|
}
|
|
- os_thread_sleep(500000);
|
+ os_thread_sleep(1000);
|
|
mutex_enter(&(recv_sys->mutex));
|
}
|
On a second try, it occurred on the 362nd attempt:
2020-12-28 11:09:12 140184457193408 [Note] InnoDB: Starting a batch to recover 30 pages from redo log.
|
2020-12-28 11:09:12 140184457193408 [ERROR] [FATAL] InnoDB: Page [page id: space=0, page number=319] still fixed or dirty
|
On a third try, it failed on the 371st round:
2020-12-28 11:11:29 140563794130880 [Note] InnoDB: Starting a batch to recover 35 pages from redo log.
|
2020-12-28 11:11:29 140563794130880 [Note] InnoDB: Starting a batch to recover 38 pages from redo log.
|
2020-12-28 11:11:29 140563794130880 [ERROR] [FATAL] InnoDB: Page [page id: space=0, page number=331] still fixed or dirty
|
201228 11:11:29 [ERROR] mysqld got signal 6 ;
|
With the 1-line fix additionally applied, 2,000 rounds of the test would pass:
diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
|
index 4c3886caeaf..95179ec2271 100644
|
--- a/storage/innobase/log/log0recv.cc
|
+++ b/storage/innobase/log/log0recv.cc
|
@@ -2501,7 +2501,7 @@ void recv_apply_hashed_log_recs(bool last_batch)
|
|
/* Wait until all the pages have been processed */
|
|
- while (recv_sys->n_addrs != 0) {
|
+ while (recv_sys->n_addrs || buf_get_n_pending_read_ios()) {
|
const bool abort = recv_sys->found_corrupt_log
|
|| recv_sys->found_corrupt_fs;
|
|
A search for "still fixed or dirty" turns up MDEV-20481, where the message refers to page number 3 (clustered index root page), which cannot possibly be related to this bug, because the change buffer is only ever applied to secondary index pages (and never a root page).
I made one more try to reproduce a different failure, by reverting the above fix and applying one more patch to disable two assertions:
diff --git a/storage/innobase/buf/buf0buf.cc b/storage/innobase/buf/buf0buf.cc
|
index 9f3d743ada1..a86a9b3da51 100644
|
--- a/storage/innobase/buf/buf0buf.cc
|
+++ b/storage/innobase/buf/buf0buf.cc
|
@@ -1698,6 +1698,7 @@ buf_pool_contains_zip(
|
}
|
#endif /* UNIV_DEBUG */
|
|
+#if 0
|
/*********************************************************************//**
|
Checks that all file pages in the buffer chunk are in a replaceable state.
|
@return address of a non-free block, or NULL if all freed */
|
@@ -1761,6 +1762,7 @@ buf_chunk_not_freed(
|
|
return(NULL);
|
}
|
+#endif
|
|
/********************************************************************//**
|
Set buffer pool size variables after resizing it */
|
@@ -6223,6 +6225,7 @@ buf_all_freed_instance(
|
/*===================*/
|
buf_pool_t* buf_pool) /*!< in: buffer pool instancce */
|
{
|
+#if 0
|
ulint i;
|
buf_chunk_t* chunk;
|
|
@@ -6241,6 +6244,7 @@ buf_all_freed_instance(
|
}
|
|
buf_pool_mutex_exit(buf_pool);
|
+#endif
|
|
return(TRUE);
|
}
|
This would cause an assertion failure a little later:
2020-12-28 11:23:12 140594869794752 [Note] InnoDB: Starting a batch to recover 35 pages from redo log.
|
mysqld: /mariadb/10.2o/storage/innobase/buf/buf0buf.cc:6306: void buf_pool_invalidate_instance(buf_pool_t *): Assertion `(buf_pool->LRU).count == 0' failed.
|
Let me ”fix” that one as well:
@@ -6299,6 +6303,7 @@ buf_pool_invalidate_instance(
|
while (buf_LRU_scan_and_free_block(buf_pool, true)) {
|
}
|
|
+ UT_LIST_INIT(buf_pool->LRU, &buf_page_t::LRU);
|
ut_ad(UT_LIST_GET_LEN(buf_pool->LRU) == 0);
|
ut_ad(UT_LIST_GET_LEN(buf_pool->unzip_LRU) == 0);
|
|
And it fails differently again:
#5 0x0000558f6dd73fa2 in ut_dbg_assertion_failed (expr=0x558f6e1f1cb5 "UT_LIST_GET_LEN(buf_pool->LRU) == n_lru", file=<optimized out>, line=<optimized out>, line@entry=6542) at /mariadb/10.2o/storage/innobase/ut/ut0dbg.cc:60
|
#6 0x0000558f6dde4044 in buf_pool_validate_instance (buf_pool=<optimized out>) at /mariadb/10.2o/storage/innobase/buf/buf0buf.cc:6542
|
#7 0x0000558f6dddbc59 in buf_validate () at /mariadb/10.2o/storage/innobase/buf/buf0buf.cc:6577
|
#8 buf_page_optimistic_get (rw_latch=<optimized out>, block=block@entry=0x7fcd55abd580, modify_clock=modify_clock@entry=1, file=0x558f6e1ec05b "/mariadb/10.2o/storage/innobase/btr/btr0pcur.cc", line=line@entry=217, mtr=<optimized out>) at /mariadb/10.2o/storage/innobase/buf/buf0buf.cc:4926
|
“Fix”:
@@ -6562,6 +6567,7 @@ ibool
|
buf_validate(void)
|
/*==============*/
|
{
|
+#if 0
|
ulint i;
|
|
for (i = 0; i < srv_buf_pool_instances; i++) {
|
@@ -6571,6 +6577,7 @@ buf_validate(void)
|
|
buf_pool_validate_instance(buf_pool);
|
}
|
+#endif
|
return(TRUE);
|
}
|
|
With those debug checks disabled, we are finally getting something that looks like ‘random’ corruption:
2020-12-28 11:29:07 139847306946496 [Note] InnoDB: Starting final batch to recover 28 pages from redo log.
|
2020-12-28 11:29:07 139847306946496 [Note] InnoDB: Cleaning up trx with id 0x291f
|
2020-12-28 11:29:07 0x7f30bd0da7c0 InnoDB: Assertion failure in file /mariadb/10.2o/storage/innobase/fsp/fsp0fsp.cc line 3197
|
InnoDB: Failing assertion: xdes_mtr_get_bit(descr, XDES_FREE_BIT, header_page % FSP_EXTENT_SIZE, mtr) == FALSE
|
…
|
#5 0x000055ee3dd9efa2 in ut_dbg_assertion_failed (expr=0x55ee3e22faf6 "xdes_mtr_get_bit(descr, XDES_FREE_BIT, header_page % FSP_EXTENT_SIZE, mtr) == FALSE", file=<optimized out>, line=<optimized out>, line@entry=3197) at /mariadb/10.2o/storage/innobase/ut/ut0dbg.cc:60
|
#6 0x000055ee3dea7caa in fseg_free_step (header=0x7f30b3ad403c "", mtr=mtr@entry=0x7ffdf24b56b8) at /mariadb/10.2o/storage/innobase/fsp/fsp0fsp.cc:3196
|
#7 0x000055ee3dd9cd5e in trx_undo_seg_free (undo=<optimized out>, noredo=<optimized out>) at /mariadb/10.2o/storage/innobase/trx/trx0undo.cc:1095
|
#8 trx_undo_commit_cleanup (undo=<optimized out>, is_temp=<optimized out>) at /mariadb/10.2o/storage/innobase/trx/trx0undo.cc:1789
|
#9 0x000055ee3dd8e7b5 in trx_cleanup_at_db_startup (trx=trx@entry=0x7f30b8171120) at /mariadb/10.2o/storage/innobase/trx/trx0trx.cc:1999
|
#10 0x000055ee3dd7f0ce in trx_rollback_resurrected (trx=<optimized out>, all=<optimized out>) at /mariadb/10.2o/storage/innobase/trx/trx0roll.cc:703
|
#11 trx_rollback_or_clean_recovered (all=<optimized out>, all@entry=0) at /mariadb/10.2o/storage/innobase/trx/trx0roll.cc:838
|
#12 0x000055ee3dd4b7f9 in innobase_start_or_create_for_mysql () at /mariadb/10.2o/storage/innobase/srv/srv0start.cc:2468
|
#13 0x000055ee3dbd90da in innobase_init (p=<optimized out>) at /mariadb/10.2o/storage/innobase/handler/ha_innodb.cc:4290
|
This corruption, which occurred on the 410th run of the test, appears to be related to an undo log page, not a secondary index or change buffer page. But, we must keep in mind that both undo log and change buffer pages are being allocated from the system tablespace in this test.
With the same settings, I repeated this corruption on the 460th run of the test:
2020-12-28 11:35:06 139865104271296 [Note] InnoDB: Starting a batch to recover 31 pages from redo log.
|
2020-12-28 11:35:06 139865104271296 [Note] InnoDB: 1 transaction(s) which must be rolled back or cleaned up in total 0 row operations to undo
|
2020-12-28 11:35:06 139865104271296 [Note] InnoDB: Trx id counter is 13056
|
2020-12-28 11:35:06 139865104271296 [Note] InnoDB: Starting final batch to recover 28 pages from redo log.
|
2020-12-28 11:35:06 139865104271296 [Note] InnoDB: Cleaning up trx with id 0x311f
|
2020-12-28 11:35:06 0x7f34e1db47c0 InnoDB: Assertion failure in file /mariadb/10.2o/storage/innobase/fsp/fsp0fsp.cc line 3197
|
InnoDB: Failing assertion: xdes_mtr_get_bit(descr, XDES_FREE_BIT, header_page % FSP_EXTENT_SIZE, mtr) == FALSE
|
I am afraid that this is the closest I can get to repeating a MDEV-9663 style corruption. Again, if I apply the 1-line fix, 2,000 rounds of the tests would succeed without any failures.
Results might be different if I had used real storage instead of a RAM disk, or if I enabled encryption slow down page I/O. I believe that the corruption that I repeated (a data page that is expected to be allocated is marked free for reuse) could explain MDEV-15608: a page that is supposed to be an undo log page might also be used for the change buffer, and corruption would ensue. Without having access to page dumps we of course can only make educated guesses.
|