[MDEV-25886] CHECK TABLE crashes due to DB_MISSING_HISTORY in innodb_read_only mode Created: 2021-06-09  Updated: 2021-06-09  Resolved: 2021-06-09

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.3, 10.4, 10.5, 10.6
Fix Version/s: 10.6.2, 10.3.30, 10.4.20, 10.5.11

Type: Bug Priority: Major
Reporter: Marko Mäkelä Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: purge

Issue Links:
Relates
relates to MDEV-15418 innodb_force_recovery=5 displays bogu... Closed
relates to MDEV-18952 CHECK TABLE should use READ UNCOMMITE... Closed
relates to MDEV-25180 Atomic ALTER TABLE Closed

 Description   

Occasionally, the test innodb.alter_copy would fail, reporting DB_MISSING_HISTORY in CHECK TABLE. It first caught my attention during the development of MDEV-25180. That change introduced delays to the purge tasks, in the form of purge_sys.stop_SYS(). When purge is delayed further during DDL operations, the test fails almost every time. Here is the first occurrence of this kind in a main branch:

10.6 66165ae2210ac3230850fa60086564db

CURRENT_TEST: innodb.alter_copy
mysqltest: At line 84: query 'CHECK TABLE t' failed: <Unknown> (2013): Lost connection to server during query
2021-05-21 22:21:54 3 [ERROR] [FATAL] InnoDB: Unknown error Required history data has been deleted

I was able to repeat this, and it actually affects older versions as well. In 10.2, the transaction isolation level is supposed to be effectively hard-wired as READ UNCOMMITTED when innodb_read_only is set or innodb_force_recovery has the value 4, 5, or 6. However, we would actually behave as if some other isolation level were in use in trx_undo_prev_version_build():

	if (trx_undo_get_undo_rec(
		    roll_ptr, heap, rec_trx_id, index->table->name,
		    &undo_rec)) {
		if (v_status & TRX_UNDO_PREV_IN_PURGE) {
			/* We are fetching the record being purged */
			undo_rec = trx_undo_get_undo_rec_low(roll_ptr, heap);
		} else {
			/* The undo record may already have been purged,
			during purge or semi-consistent read. */
			return(false);
		}
	}

There are two possible fixes: either make CHECK TABLE explicitly use the READ UNCOMMITTED isolation level, or avoid creating the purge_sys view when purge is not going to be executed, so that the READ UNCOMMITTED mode will be used (the patch below is for MariaDB 10.2):

diff --git a/storage/innobase/trx/trx0sys.cc b/storage/innobase/trx/trx0sys.cc
index 9138e9475bf..19dfa79d0ff 100644
--- a/storage/innobase/trx/trx0sys.cc
+++ b/storage/innobase/trx/trx0sys.cc
@@ -535,7 +535,9 @@ trx_sys_init_at_db_start()
 
 	trx_sys_mutex_exit();
 
-	trx_sys->mvcc->clone_oldest_view(&purge_sys->view);
+	if (!high_level_read_only) {
+		trx_sys->mvcc->clone_oldest_view(&purge_sys->view);
+	}
 }
 
 /*****************************************************************//**

Alternatively, CHECK TABLE could explicitly use the READ UNCOMMITTED isolation level in this case as well (see MDEV-15418 and MDEV-18952).

MDEV-15418 claims the following:

Since MariaDB 10.2 (along with upstream MySQL 5.7), InnoDB already does this [READ UNCOMMITTED] for innodb_read_only=1

That hard-wiring occurs at a rather low level. For example, lock_clust_rec_cons_read_sees() checks srv_read_only_mode.
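
To illustrate what such low-level hard-wiring amounts to, here is a minimal, self-contained sketch. It is not the InnoDB source: ReadViewModel, cons_read_sees() and the single low_limit_id threshold are hypothetical simplifications invented for this illustration. The only point is that a read-only check placed in front of the read view test makes every record version visible, which is what READ UNCOMMITTED means in practice.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration only; NOT the real InnoDB types.
using trx_id_t = std::uint64_t;

// Simplified read view: a change is visible only if it was made by a
// transaction whose ID is below the view's low limit.
struct ReadViewModel {
    trx_id_t low_limit_id;

    constexpr bool changes_visible(trx_id_t id) const {
        return id < low_limit_id;
    }
};

// Model of a visibility check that is hard-wired for read-only mode:
// when read_only_mode holds, the view is not consulted at all, so the
// newest version of every record is accepted (a dirty read).
constexpr bool cons_read_sees(bool read_only_mode,
                              const ReadViewModel& view,
                              trx_id_t rec_trx_id) {
    return read_only_mode || view.changes_visible(rec_trx_id);
}

// Normal mode: a record modified by transaction 200 is hidden from a
// view whose low limit is 100, so an older version would be needed.
static_assert(!cons_read_sees(false, ReadViewModel{100}, 200),
              "newer change must be invisible in normal mode");

// Read-only mode: the same record is accepted as-is, so no older
// version (and no undo log history) is requested at this level.
static_assert(cons_read_sees(true, ReadViewModel{100}, 200),
              "read-only mode accepts the newest version");
```

In this model the short-circuit only protects callers that go through this one check. Any other code path that consults a view directly is unaffected, which is consistent with the mismatch described above for trx_undo_prev_version_build().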

The fix with the least impact and risk would seem to be to only touch CHECK TABLE, starting with 10.3, where MDEV-15418 first touched this code:

diff --git a/storage/innobase/dict/dict0dict.cc b/storage/innobase/dict/dict0dict.cc
index 10b878c0e49..1d0e2af3cd6 100644
--- a/storage/innobase/dict/dict0dict.cc
+++ b/storage/innobase/dict/dict0dict.cc
@@ -5425,7 +5425,7 @@ dict_set_corrupted(
 
 	/* If this is read only mode, do not update SYS_INDEXES, just
 	mark it as corrupted in memory */
-	if (srv_read_only_mode) {
+	if (high_level_read_only) {
 		index->type |= DICT_CORRUPT;
 		goto func_exit;
 	}
diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc
index 4e63edc65c2..65039919793 100644
--- a/storage/innobase/handler/ha_innodb.cc
+++ b/storage/innobase/handler/ha_innodb.cc
@@ -14796,10 +14796,9 @@ ha_innobase::check(
 
 	/* We must run the index record counts at an isolation level
 	>= READ COMMITTED, because a dirty read can see a wrong number
-	of records in some index; to play safe, we use always
-	REPEATABLE READ here (except when undo logs are unavailable) */
-	m_prebuilt->trx->isolation_level = srv_force_recovery
-		>= SRV_FORCE_NO_UNDO_LOG_SCAN
+	of records in some index; to play safe, we normally use
+	REPEATABLE READ here */
+	m_prebuilt->trx->isolation_level = high_level_read_only
 		? TRX_ISO_READ_UNCOMMITTED
 		: TRX_ISO_REPEATABLE_READ;
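
The effect of the ha_innobase::check() change can be sketched as a small truth table. This is an illustration under stated assumptions, not the server code: the function names are hypothetical, and high_level_read_only is modeled only as "innodb_read_only is set, or innodb_force_recovery is 4, 5, or 6", following the description above (the real variable may cover further conditions). The threshold 5 in the "before" function corresponds to SRV_FORCE_NO_UNDO_LOG_SCAN in the removed lines.

```cpp
// Hypothetical model of the isolation level choice in CHECK TABLE,
// before and after the patch. Names are invented for illustration.
enum IsolationLevel { READ_UNCOMMITTED, REPEATABLE_READ };

// Assumption: models high_level_read_only per the issue description
// (innodb_read_only set, or innodb_force_recovery of 4, 5, or 6).
constexpr bool high_level_read_only_model(bool innodb_read_only,
                                          unsigned force_recovery) {
    return innodb_read_only || force_recovery >= 4;
}

// Before the patch: only innodb_force_recovery >= 5
// (SRV_FORCE_NO_UNDO_LOG_SCAN) selected READ UNCOMMITTED;
// innodb_read_only was not considered at all.
constexpr IsolationLevel check_isolation_before(bool /*innodb_read_only*/,
                                                unsigned force_recovery) {
    return force_recovery >= 5 ? READ_UNCOMMITTED : REPEATABLE_READ;
}

// After the patch: any high-level read-only condition selects
// READ UNCOMMITTED.
constexpr IsolationLevel check_isolation_after(bool innodb_read_only,
                                               unsigned force_recovery) {
    return high_level_read_only_model(innodb_read_only, force_recovery)
        ? READ_UNCOMMITTED
        : REPEATABLE_READ;
}

// The failing scenario: innodb_read_only=1 with no forced recovery.
static_assert(check_isolation_before(true, 0) == REPEATABLE_READ,
              "old code asked for a consistent read in read-only mode");
static_assert(check_isolation_after(true, 0) == READ_UNCOMMITTED,
              "patched code no longer needs undo log history");
```

The two cases that change under this model are innodb_read_only=1 and innodb_force_recovery=4. A normal read-write server still runs CHECK TABLE at REPEATABLE READ.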
 

