[MDEV-25869] Change buffer entries for secondary indexes are lost on InnoDB restart Created: 2021-06-07  Updated: 2022-02-10  Resolved: 2021-06-08

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.2.30, 10.3.21, 10.4.11
Fix Version/s: 10.2.39, 10.3.30, 10.4.20

Type: Bug
Priority: Blocker
Reporter: Marko Mäkelä
Assignee: Marko Mäkelä
Resolution: Fixed
Votes: 0
Labels: not-10.5, regression-10.2, regression-10.3, regression-10.4, rr-profile-analyzed

Issue Links:
Blocks
Problem/Incident
is caused by MDEV-21069 Crash on DROP TABLE if the data file ... Closed
Relates
relates to MDEV-22373 Unable to find a record to delete-mar... Closed
relates to MDEV-25796 Failing assertion: !cursor->index->is... Open

 Description   

The fix for MDEV-21069 aimed to avoid a server crash when executing DROP TABLE on a corrupted table. Unfortunately, that fix introduced a bug in the function buf_read_ibuf_merge_pages() that wrongly deletes records from the change buffer for tablespaces whose first page has not been read yet. For such a tablespace, space->size is still 0, so every page number compares as being out of bounds. The following should fix it:

diff --git a/storage/innobase/buf/buf0rea.cc b/storage/innobase/buf/buf0rea.cc
--- a/storage/innobase/buf/buf0rea.cc
+++ b/storage/innobase/buf/buf0rea.cc
@@ -772,13 +772,18 @@ buf_read_ibuf_merge_pages(
 			continue;
 		}
 
-		if (UNIV_UNLIKELY(page_nos[i] >= space->size)) {
+		ulint size = space->size;
+		if (!size) {
+			size = fil_space_get_size(space->id);
+		}
+
+		if (UNIV_UNLIKELY(page_nos[i] >= size)) {
 			do {
 				ibuf_delete_recs(page_id_t(space_ids[i],
 							   page_nos[i]));
 			} while (++i < n_stored
 				 && space_ids[i - 1] == space_ids[i]
-				 && page_nos[i] >= space->size);
+				 && page_nos[i] >= size);
 			i--;
 next:
 			space->release();
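
The effect of the patch above can be shown with a small standalone model. This is only a sketch, not InnoDB code: ToySpace, out_of_bounds_before_fix(), out_of_bounds_after_fix() and the lookup_size callback are hypothetical names introduced for illustration, with lookup_size standing in for fil_space_get_size(). It assumes, as the patch implies, that a cached size of 0 means that the size is not known yet.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical stand-in for a tablespace object; this is not fil_space_t.
struct ToySpace {
    std::uint32_t id;
    std::uint32_t size;  // in pages; 0 is assumed to mean "not known yet"
};

// Check as it was before the patch: with size == 0, every page number
// compares as out of bounds, so its buffered changes would be discarded.
inline bool out_of_bounds_before_fix(const ToySpace& space,
                                     std::uint32_t page_no) {
    return page_no >= space.size;
}

// Check as it is after the patch: when the cached size is 0, resolve the
// real size first. lookup_size is a hypothetical callback that plays the
// role of fil_space_get_size() in the real code.
inline bool out_of_bounds_after_fix(
    const ToySpace& space, std::uint32_t page_no,
    const std::function<std::uint32_t(std::uint32_t)>& lookup_size) {
    std::uint32_t size = space.size;
    if (!size) {
        size = lookup_size(space.id);
    }
    return page_no >= size;
}
```

With a tablespace whose cached size is 0 and whose real size is 100 pages, the first check reports page 5 as out of bounds, while the second check does not.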

MariaDB Server 10.5 is not affected by this, because this code was removed in MDEV-19514. (However, 10.5 is affected by MDEV-25783 without even involving a server restart.)

Also, if crash recovery is needed, tablespaces for which no redo log was applied could be affected by this bug.

I believe that this kind of corruption can be detected by CHECK TABLE, and it can be fixed by dropping and re-creating the secondary indexes:

CHECK TABLE t QUICK;
ALTER TABLE t DROP INDEX idx1, DROP INDEX idx2;
ALTER TABLE t ADD INDEX idx1(a), ADD INDEX idx2(b);

A work-around would be to ensure that the change buffer is empty at shutdown (set innodb_fast_shutdown=0, which performs a slow shutdown that includes a full change buffer merge) or to always run with innodb_change_buffering=none.

This bug was originally reported and analyzed in MDEV-22373, but because MDEV-22373 is partly explained by MDEV-24449 and MDEV-25783, it is clearer to file a separate ticket for this specific case.



 Comments   
Comment by Manuel Arostegui [ 2021-06-07 ]

We have been using the combination of innodb_change_buffering=none and rebuilding the tables that were reported by CHECK TABLE after upgrading to 10.4, and so far it seems OK.

Comment by Marko Mäkelä [ 2021-06-08 ]

Pushed to 10.2 and merged to 10.3 and 10.4.
