[MDEV-31767] InnoDB tables are being flagged as corrupted on an I/O bound server Created: 2023-07-24  Updated: 2024-01-10  Resolved: 2023-07-26

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.6.12, 10.6.13, 10.7, 10.8, 10.9, 10.10, 10.11, 11.0, 11.1, 11.2, 10.6.14
Fix Version/s: 10.6.15, 10.9.8, 10.10.6, 10.11.5, 11.0.3, 11.1.2, 11.2.1

Type: Bug
Priority: Blocker
Reporter: Marko Mäkelä
Assignee: Marko Mäkelä
Resolution: Fixed
Votes: 1
Labels: CS0668311, corruption, race

Issue Links:
Blocks
blocks MDEV-30531 Corrupt index(es) on busy table when ... Closed
Problem/Incident
is caused by MDEV-30400 Assertion `height == btr_page_get_lev... Closed
Relates
relates to MDEV-13542 Crashing on a corrupted page is unhel... Closed
relates to MDEV-27058 Buffer page descriptors are too large Closed
relates to MDEV-32116 Server suddenly crashed Closed
relates to MDEV-33205 [ERROR] InnoDB: We detected index cor... Needs Feedback

 Description   

This was reproduced while trying to reproduce an older issue MDEV-30531.

Some InnoDB B-tree cursor refactoring in MDEV-30400 turns out to be unsafe, resulting in InnoDB tables being flagged as corrupted. This occurs also on PRIMARY KEY indexes (clustered indexes), not only on secondary index pages.

The root cause seems to be that some operations are accessing the buffer page frame contents while only holding a buffer-fix on the page, not a page latch. It could be the case that the page is being read into the buffer pool, or it is being decrypted or decompressed. In some core dumps of such failures (with additional instrumentation to essentially revert MDEV-13542), the corruption condition would no longer hold.
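The difference between a buffer-fix and a page latch can be illustrated with a small toy model. This is only a sketch under stated assumptions: `Block`, `frame_valid`, and `simulate_first_access` are hypothetical names, not InnoDB's actual data structures, and a plain `std::mutex` stands in for the read-write page latch. It assumes that the I/O side holds the latch until the read has completed and the page has been validated.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a buffer pool block; not InnoDB's real descriptor.
struct Block {
  std::mutex latch;                      // page latch: protects the frame contents
  std::atomic<uint32_t> fix_count{0};    // buffer-fix: only prevents eviction
  std::atomic<bool> frame_valid{false};  // true once read and validation are done
};

struct Observation {
  bool valid_with_fix_only;  // what an accessor sees with a buffer-fix only
  bool valid_with_latch;     // what it sees after acquiring the page latch
};

Observation simulate_first_access() {
  Block b;
  Observation obs{};
  std::atomic<bool> peeked{false};

  b.latch.lock();  // read in progress: the "I/O" side owns the latch

  std::thread accessor([&] {
    b.fix_count.fetch_add(1);                        // pin the block
    obs.valid_with_fix_only = b.frame_valid.load();  // unsafe: no latch held
    peeked.store(true);
    std::lock_guard<std::mutex> g(b.latch);          // waits for read completion
    obs.valid_with_latch = b.frame_valid.load();     // safe: frame is complete
    b.fix_count.fetch_sub(1);
  });

  // Complete the "read" only after the accessor has peeked at the frame.
  while (!peeked.load()) std::this_thread::yield();
  b.frame_valid.store(true);
  b.latch.unlock();

  accessor.join();
  return obs;
}
```

In this model, `simulate_first_access()` reports an incomplete frame for the access made under a buffer-fix alone, and a complete frame once the latch has been acquired. This corresponds to a spurious corruption report on a page that is in fact intact.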



 Comments   
Comment by Marko Mäkelä [ 2023-07-25 ]

There could be an even older culprit to this than MDEV-30400. In MDEV-27058, I removed the function buf_wait_for_read(), which would keep acquiring and releasing a page latch as long as the page is read-fixed. This loop would probably prevent many of these issues when a page is only being buffer-fixed.
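The loop described above might be pictured roughly as follows. This is not the removed function's actual body; it is a sketch based only on the description in this comment. `Page`, `read_fixed`, `wait_while_read_fixed`, and `demo_wait_for_read` are hypothetical names, and a `std::mutex` stands in for the page latch.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical page descriptor for illustration only.
struct Page {
  std::mutex latch;                     // page latch, held by I/O during the read
  std::atomic<bool> read_fixed{false};  // true while the read is in progress
  bool frame_valid = false;             // protected by latch
};

// Keep acquiring and releasing the latch for as long as the page is read-fixed.
void wait_while_read_fixed(Page& p) {
  while (p.read_fixed.load()) {
    std::lock_guard<std::mutex> g(p.latch);  // blocks until the I/O side unlocks
  }
}

bool demo_wait_for_read() {
  Page p;
  bool seen_valid = false;

  p.latch.lock();            // the "I/O" side starts a read
  p.read_fixed.store(true);

  std::thread waiter([&] {
    wait_while_read_fixed(p);
    std::lock_guard<std::mutex> g(p.latch);
    seen_valid = p.frame_valid;  // only inspected after the read has finished
  });

  p.frame_valid = true;      // read completed and validated
  p.read_fixed.store(false);
  p.latch.unlock();

  waiter.join();
  return seen_valid;
}
```

A caller that goes through such a wait never inspects the frame while the read is still in progress, which is why the loop would have masked accesses made with only a buffer-fix.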

I do not think that it ever is a good idea to use buffer-fixing for the first-time lookup of a data page in a mini-transaction. If the page was not in the buffer pool and had to be loaded into it, the buffer-fixing could gain access to the page before the read request was completed and the page checksum was validated. Before MDEV-13542, this was not that much of an issue; we would crash on corruption anyway.

What my fix aims to do is to acquire proper page latches upfront. To avoid deadlocks when acquiring page latches in the wrong order (not from left to right), we can safely release the current page latch briefly while waiting for the left sibling page latch. A buffer-fix will prevent the current block from being evicted from the buffer pool in the meantime.
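The latch-ordering idea can be sketched as follows. This is an illustration under stated assumptions, not the code of the actual patch: `Block`, `latch_left_sibling`, `modify_count`, and `demo_contended` are hypothetical names, the latch is modelled as an exclusive `std::mutex`, and the initial no-wait attempt on the left sibling is an assumption of this sketch.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a buffer pool block; not InnoDB's real descriptor.
struct Block {
  std::mutex latch;                    // page latch (exclusive only, for brevity)
  std::atomic<uint32_t> fix_count{0};  // buffer-fix: keeps the block in the pool
  uint64_t modify_count = 0;           // protected by latch
};

// Precondition: the caller holds cur.latch and a buffer-fix on cur.
// Postcondition: the caller holds both latches.
// Returns true if cur.latch was released in between; the caller must then
// revalidate cur, because another thread may have modified it.
bool latch_left_sibling(Block& left, Block& cur) {
  if (left.latch.try_lock()) return false;  // no waiting, so no deadlock risk
  cur.latch.unlock();   // the buffer-fix still prevents eviction of cur
  left.latch.lock();    // now wait for the left sibling
  cur.latch.lock();     // reacquire in left-to-right order
  return true;
}

struct Outcome {
  bool released;              // cur.latch was released while waiting
  bool cur_changed;           // cur was modified in the meantime
  uint32_t fix_count_held;    // buffer-fix count after both latches are held
};

Outcome demo_contended() {
  Block left, cur;
  std::atomic<bool> other_holds_left{false};

  cur.fix_count.fetch_add(1);  // pin cur, then latch it
  cur.latch.lock();
  const uint64_t seen = cur.modify_count;

  // Another thread latches left to right: left first, then cur.
  std::thread other([&] {
    std::lock_guard<std::mutex> l(left.latch);
    other_holds_left.store(true);
    std::lock_guard<std::mutex> c(cur.latch);  // waits until cur is released
    ++cur.modify_count;
  });
  while (!other_holds_left.load()) std::this_thread::yield();

  // Blocking on left.latch while still holding cur.latch would deadlock here.
  const bool released = latch_left_sibling(left, cur);
  Outcome o{released, cur.modify_count != seen, cur.fix_count.load()};

  cur.latch.unlock();
  left.latch.unlock();
  cur.fix_count.fetch_sub(1);
  other.join();
  return o;
}
```

In `demo_contended()`, the current latch is released while waiting, the other thread modifies the current page in that window, and the buffer-fix remains in place throughout. The price of avoiding the deadlock is that the caller has to revalidate the current page after reacquiring its latch.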

Comment by Thirunarayanan Balathandayuthapani [ 2023-07-26 ]

Patch looks OK to me

Comment by Matthias Leich [ 2023-07-26 ]

Two runs of the RQG test battery on a RelWithDebInfo build and one run on a debug build of
origin/bb-10.6-primary-corruption b102872ad50cce5959ad95369740766d14e9e48c 2023-07-25T11:40:58+03:00
performed well. No new or unknown problems.

Comment by Michael Widenius [ 2023-08-26 ]

This bug affects at least the long-term release versions 10.6.12 to 10.6.14 and 10.11.2 to 10.11.4.
It could potentially also affect 10.6.6 to 10.6.11.

Anyone using a short-term release should upgrade to the next long-term release or to the latest release in their series.

Comment by Marko Mäkelä [ 2023-10-20 ]

This bug had been reproduced while trying to reproduce another issue MDEV-30531. I am quoting the error log from this comment:

2023-07-15 15:41:26 0 [Note] /test/MD220623-mariadb-11.1.2-linux-x86_64-opt/bin/mariadbd: ready for connections.
Version: '11.1.2-MariaDB'  socket: '/test/MD220623-mariadb-11.1.2-linux-x86_64-opt/socket.sock'  port: 12801  MariaDB Server
2023-07-15 15:41:38 1509 [Note] InnoDB: Number of transaction pools: 2
2023-07-15 16:19:46 72616 [ERROR] InnoDB: We detected index corruption in an InnoDB type table. You have to dump + drop + reimport the table or, in a case of widespread corruption, dump all InnoDB tables and recreate the whole tablespace. If the mariadbd server crashes after the startup or when you dump the tables. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.
2023-07-15 16:19:46 72616 [ERROR] mariadbd: Index for table 't2' is corrupt; try to repair it

This bug was a race condition that would allow a page that was still being read to be accessed before it had been fully read or uncompressed. As a result, the table may be reported as corrupted even though it is not.

Generated at Thu Feb 08 10:26:19 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.