[MDEV-29374] Frequent "Data structure corruption" in InnoDB after OOM - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Blocker
Resolution: Fixed
Affects Version/s: 10.7.5, 10.6.9, 10.8.4, 10.9.2, 10.10.1
Fix Version/s: 10.6.10, 10.7.6, 10.8.5, 10.9.3, 10.10.2
Component/s: Storage Engine - InnoDB
Labels:
None
Environment:
CentOS 7

Description

My company has around 200 CentOS 7 servers running MariaDB 10.6. Last week, after the 10.6.9 update was applied, we started seeing lots of InnoDB failures after an OOM had killed MariaDB.

Our systems are kind of light on memory and do hit OOM's sometimes but it shouldn't cause a failure in InnoDB recovery. When 10.6.9 was updated, we started seeing multiple failures per day. I rolled back to 10.6.8 yesterday and have not seen any more issues. So I think something in 10.6.9 is causing a problem.

Each time the issue happens, the error log looks like this.

2022-08-21 10:41:34 0 [Note] InnoDB: Compressed tables use zlib 1.2.7

2022-08-21 10:41:34 0 [Note] InnoDB: Number of pools: 1

2022-08-21 10:41:34 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions

2022-08-21 10:41:34 0 [Note] InnoDB: Using Linux native AIO

2022-08-21 10:41:34 0 [Note] InnoDB: Initializing buffer pool, total size = 2550136832, chunk size = 134217728

2022-08-21 10:41:34 0 [Note] InnoDB: Completed initialization of buffer pool

2022-08-21 10:41:34 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1383270444633,1383271021489

2022-08-21 10:41:36 0 [Note] InnoDB: Starting final batch to recover 15643 pages from redo log.

2022-08-21 10:41:37 0 [ERROR] InnoDB: Plugin initialization aborted with error Data structure corruption

2022-08-21 10:41:37 0 [Note] InnoDB: Starting shutdown...

2022-08-21 10:41:38 0 [ERROR] Plugin 'InnoDB' init function returned error.

2022-08-21 10:41:38 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.

2022-08-21 10:41:38 0 [Note] Plugin 'FEEDBACK' is disabled.

2022-08-21 10:41:38 0 [ERROR] Unknown/unsupported storage engine: InnoDB

2022-08-21 10:41:38 0 [ERROR] Aborting

There isn't anything unusual about our config. Here is one of them.

default-storage-engine=InnoDB

innodb_file_per_table=1

innodb_buffer_pool_size=2400M

innodb_log_file_size=600M

innodb_strict_mode = 0

innodb_use_native_aio = 1

innodb_write_io_threads = 8

innodb_read_io_threads = 8

query_cache_type=0

open_files_limit=50000

thread_cache_size=4

join_buffer_size=1024K

tmp_table_size=30M

max_heap_table_size=30M

sql_mode="NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION"

table_definition_cache=14000

Fortunately, it's very easy to recover from this but it does take manual intervention to do. i.e. innodb_force_recovery=1

Any ideas what could be causing this new issue and what we can do to correct it?

Attachments

Issue Links

is caused by

MDEV-13542 Crashing on a corrupted page is unhelpful

Closed

is duplicated by

MDEV-28784 mariadb-backup prepare fails with srv_start() returned 37 (Data structure corruption).

Closed

relates to

MDEV-29364 CRITICAL - MariaDB 10.8.4 creating corrupted backups - "InnoDB: Failed to read page 4 from file 'database/table.ibd': Page read from tablespace is corrupted."

Closed

MDEV-29511 "Page read from tablespace is corrupted" when starting mariadb service after crash (signal 6)

Closed

Activity

Ascending order - Click to sort in descending order

View 7 older comments

Marko Mäkelä added a comment - 2022-08-30 14:13

I am reopening this, because mleich provided an rr replay trace of something that reproduces exactly this message, even when the fix of ~~MDEV-29383~~ is present.

What I found out so far:

The problematic page 10:255 is an index page that was freed near LSN 0x40296d.
The first recovered record for the page at LSN 0x3fee0f fails to apply because the page was read back as all-zero.
I did not check yet if a FREE_PAGE record for the page had been written or if it was ignored by recovery.

Marko Mäkelä added a comment - 2022-08-30 14:13 I am reopening this, because mleich provided an rr replay trace of something that reproduces exactly this message, even when the fix of MDEV-29383 is present. What I found out so far: The problematic page 10:255 is an index page that was freed near LSN 0x40296d. The first recovered record for the page at LSN 0x3fee0f fails to apply because the page was read back as all-zero. I did not check yet if a FREE_PAGE record for the page had been written or if it was ignored by recovery.

Marko Mäkelä added a comment - 2022-08-31 08:38

It turns out that we fail to write a FREE_PAGE record (it can only be written by invoking mtr_t::free()) for the following operation:

bb-10.6-MDEV-19229 23ab688cec4b3031262bc93921592a04749a4341
#2 0x0000564aad8eacef in buf_page_t::set_freed (this=0x7fb1125e7120,
prev_state=1610612739, count=0)
at /data/Server/bb-10.6-MDEV-19229/storage/innobase/include/buf0buf.h:741
#3 0x0000564aadca9fc5 in buf_page_free (space=0x612000041740, page=255,
mtr=0x7fb10f496ae0)
at /data/Server/bb-10.6-MDEV-19229/storage/innobase/buf/buf0buf.cc:2130
#4 0x0000564aadc00c2d in btr_page_free (index=0x6160008ba208,
block=0x7fb1125e7120, mtr=0x7fb10f496ae0, blob=false, space_latched=false)
at /data/Server/bb-10.6-MDEV-19229/storage/innobase/btr/btr0btr.cc:657
#5 0x0000564aadc186da in btr_compress (cursor=0x61a00000a9a8, adjust=false,
mtr=0x7fb10f496ae0)
at /data/Server/bb-10.6-MDEV-19229/storage/innobase/btr/btr0btr.cc:3964
#6 0x0000564aadc68034 in btr_cur_compress_if_useful (cursor=0x61a00000a9a8,
adjust=false, mtr=0x7fb10f496ae0)
at /data/Server/bb-10.6-MDEV-19229/storage/innobase/btr/btr0cur.cc:5406
#7 0x0000564aadc6b3d8 in btr_cur_pessimistic_delete (err=0x7fb10f4967d0,
has_reserved_extents=0, cursor=0x61a00000a9a8, flags=0, rollback=false,
mtr=0x7fb10f496ae0)
at /data/Server/bb-10.6-MDEV-19229/storage/innobase/btr/btr0cur.cc:5863
#8 0x0000564aadac19fa in row_purge_remove_clust_if_poss_low (
node=0x61a00000a908, mode=522)
at /data/Server/bb-10.6-MDEV-19229/storage/innobase/row/row0purge.cc:220

Due to the missing record for freeing the page, the recovery will attempt to load the page and fail, as follows.

2022-08-29  9:26:51 0 [Note] InnoDB: Starting final batch to recover 143 pages from redo log.

2022-08-29  9:26:53 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1555] with error Data structure corruption

I must make the failure diagnostics more verbose; it was probably "cleaned up" too much in ~~MDEV-13542~~.

The root cause of this failure does not look like a recent regression. I think that it will affect 10.5 as well. 10.5 has a compatible log format, and it would crash on this corrupted data directory:

10.5 29fa9bcee01cf5457c096bf37bb25c84ddee5a30

2022-08-31 11:37:26 0 [Note] InnoDB: Starting final batch to recover 143 pages from redo log.

mariadbd: /mariadb/10.5/storage/innobase/log/log0recv.cc:270: log_phys_t::apply_status log_phys_t::apply(const buf_block_t&, uint16_t&) const: Assertion `mach_read_from_4(frame + 4U) == block.page.id().page_no()' failed.

Marko Mäkelä added a comment - 2022-08-31 08:38 It turns out that we fail to write a FREE_PAGE record (it can only be written by invoking mtr_t::free() ) for the following operation: bb-10.6-MDEV-19229 23ab688cec4b3031262bc93921592a04749a4341 #2 0x0000564aad8eacef in buf_page_t::set_freed (this=0x7fb1125e7120, prev_state=1610612739, count=0) at /data/Server/bb-10.6-MDEV-19229/storage/innobase/include/buf0buf.h:741 #3 0x0000564aadca9fc5 in buf_page_free (space=0x612000041740, page=255, mtr=0x7fb10f496ae0) at /data/Server/bb-10.6-MDEV-19229/storage/innobase/buf/buf0buf.cc:2130 #4 0x0000564aadc00c2d in btr_page_free (index=0x6160008ba208, block=0x7fb1125e7120, mtr=0x7fb10f496ae0, blob=false, space_latched=false) at /data/Server/bb-10.6-MDEV-19229/storage/innobase/btr/btr0btr.cc:657 #5 0x0000564aadc186da in btr_compress (cursor=0x61a00000a9a8, adjust=false, mtr=0x7fb10f496ae0) at /data/Server/bb-10.6-MDEV-19229/storage/innobase/btr/btr0btr.cc:3964 #6 0x0000564aadc68034 in btr_cur_compress_if_useful (cursor=0x61a00000a9a8, adjust=false, mtr=0x7fb10f496ae0) at /data/Server/bb-10.6-MDEV-19229/storage/innobase/btr/btr0cur.cc:5406 #7 0x0000564aadc6b3d8 in btr_cur_pessimistic_delete (err=0x7fb10f4967d0, has_reserved_extents=0, cursor=0x61a00000a9a8, flags=0, rollback=false, mtr=0x7fb10f496ae0) at /data/Server/bb-10.6-MDEV-19229/storage/innobase/btr/btr0cur.cc:5863 #8 0x0000564aadac19fa in row_purge_remove_clust_if_poss_low ( node=0x61a00000a908, mode=522) at /data/Server/bb-10.6-MDEV-19229/storage/innobase/row/row0purge.cc:220 Due to the missing record for freeing the page, the recovery will attempt to load the page and fail, as follows. 2022-08-29 9:26:51 0 [Note] InnoDB: Starting final batch to recover 143 pages from redo log. 2022-08-29 9:26:53 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1555] with error Data structure corruption I must make the failure diagnostics more verbose; it was probably "cleaned up" too much in MDEV-13542 . The root cause of this failure does not look like a recent regression. I think that it will affect 10.5 as well. 10.5 has a compatible log format, and it would crash on this corrupted data directory: 10.5 29fa9bcee01cf5457c096bf37bb25c84ddee5a30 2022-08-31 11:37:26 0 [Note] InnoDB: Starting final batch to recover 143 pages from redo log. mariadbd: /mariadb/10.5/storage/innobase/log/log0recv.cc:270: log_phys_t::apply_status log_phys_t::apply(const buf_block_t&, uint16_t&) const: Assertion `mach_read_from_4(frame + 4U) == block.page.id().page_no()' failed.

Marko Mäkelä added a comment - 2022-08-31 09:04

The root cause turns out to be that a write of a FREE_PAGE record is being skipped due to an early return here:

#0  fseg_free_page_low (seg_inode=0x7fb112adc0f2 "", iblock=0x7fb1125ebe50,

    space=0x612000041740, offset=255, mtr=0x7fb10f496ae0, ahi=false)

    at /data/Server/bb-10.6-MDEV-19229/storage/innobase/fsp/fsp0fsp.cc:2596

#1  0x0000564aaddfcef7 in fseg_free_page (seg_header=0x7fb112ae004a "",

    space=0x612000041740, offset=255, mtr=0x7fb10f496ae0, have_latch=false)

    at /data/Server/bb-10.6-MDEV-19229/storage/innobase/fsp/fsp0fsp.cc:2629

#2  0x0000564aadc00b9e in btr_page_free (index=0x6160008ba208,

    block=0x7fb1125e7120, mtr=0x7fb10f496ae0, blob=false, space_latched=false)

    at /data/Server/bb-10.6-MDEV-19229/storage/innobase/btr/btr0btr.cc:650

#3  0x0000564aadc186da in btr_compress (cursor=0x61a00000a9a8, adjust=false,

    mtr=0x7fb10f496ae0)

    at /data/Server/bb-10.6-MDEV-19229/storage/innobase/btr/btr0btr.cc:3964

The culprit is an accidentally added extra return err; statement. (Only in case of an error, we should not write the FREE_PAGE record.)

		if (UNIV_UNLIKELY(err != DB_SUCCESS)) {

			return err;

		return err;

	mtr->free(*space, static_cast<uint32_t>(offset));

	return DB_SUCCESS;

Marko Mäkelä added a comment - 2022-08-31 09:04 The root cause turns out to be that a write of a FREE_PAGE record is being skipped due to an early return here: #0 fseg_free_page_low (seg_inode=0x7fb112adc0f2 "", iblock=0x7fb1125ebe50, space=0x612000041740, offset=255, mtr=0x7fb10f496ae0, ahi=false) at /data/Server/bb-10.6-MDEV-19229/storage/innobase/fsp/fsp0fsp.cc:2596 #1 0x0000564aaddfcef7 in fseg_free_page (seg_header=0x7fb112ae004a "", space=0x612000041740, offset=255, mtr=0x7fb10f496ae0, have_latch=false) at /data/Server/bb-10.6-MDEV-19229/storage/innobase/fsp/fsp0fsp.cc:2629 #2 0x0000564aadc00b9e in btr_page_free (index=0x6160008ba208, block=0x7fb1125e7120, mtr=0x7fb10f496ae0, blob=false, space_latched=false) at /data/Server/bb-10.6-MDEV-19229/storage/innobase/btr/btr0btr.cc:650 #3 0x0000564aadc186da in btr_compress (cursor=0x61a00000a9a8, adjust=false, mtr=0x7fb10f496ae0) at /data/Server/bb-10.6-MDEV-19229/storage/innobase/btr/btr0btr.cc:3964 The culprit is an accidentally added extra return err; statement. (Only in case of an error, we should not write the FREE_PAGE record.) if (UNIV_UNLIKELY(err != DB_SUCCESS)) { return err; } return err; } mtr-> free (*space, static_cast <uint32_t>(offset)); return DB_SUCCESS; }

Marko Mäkelä added a comment - 2022-08-31 10:38

As part of fixing this, I am going to add some error messages to identify the corrupted page:

innodb.corrupted_during_recovery 'innodb' w20 [ fail ]  Found warnings/errors in server log file!

        Test ended at 2022-08-31 13:12:59

line

2022-08-31 13:12:56 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=5, page number=3]; set innodb_force_recovery to ignore

2022-08-31 13:12:58 0 [Warning] InnoDB: Discarding log for corrupted page [page id: space=5, page number=3]

Marko Mäkelä added a comment - 2022-08-31 10:38 As part of fixing this, I am going to add some error messages to identify the corrupted page: innodb.corrupted_during_recovery 'innodb' w20 [ fail ] Found warnings/errors in server log file! Test ended at 2022-08-31 13:12:59 line 2022-08-31 13:12:56 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=5, page number=3]; set innodb_force_recovery to ignore 2022-08-31 13:12:58 0 [Warning] InnoDB: Discarding log for corrupted page [page id: space=5, page number=3]

Sergei Golubchik added a comment - 2022-09-08 11:20

Summary: a crash shortly after a page merge (can be triggered by an UPDATE or DELETE or, say, rollback of an INSERT) can cause data corruption.

Sergei Golubchik added a comment - 2022-09-08 11:20 Summary: a crash shortly after a page merge (can be triggered by an UPDATE or DELETE or, say, rollback of an INSERT) can cause data corruption.

MariaDB Server

Frequent "Data structure corruption" in InnoDB after OOM

Details

Description

Attachments

Issue Links

Activity

People

Dates

Git Integration