[MDEV-31390] Assertion id.is_corrupted() failed in buf_page_create_low() - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Critical
Resolution: Cannot Reproduce
Affects Version/s: 10.6.6, 10.6, 10.7(EOL), 10.8(EOL), 10.9(EOL), 10.10(EOL), 10.11, 11.0(EOL), 11.1(EOL)
Fix Version/s: N/A
Component/s: Storage Engine - InnoDB
Labels:

Description

~~MDEV-27058~~ introduced a race condition in the function buf_page_create_low(). A patch illustrates it:

diff --git a/storage/innobase/buf/buf0buf.cc b/storage/innobase/buf/buf0buf.cc

index 90886173b1b..3afd3840552 100644

--- a/storage/innobase/buf/buf0buf.cc

+++ b/storage/innobase/buf/buf0buf.cc

@@ -3144,7 +3144,7 @@ static buf_block_t *buf_page_create_low(page_id_t page_id, ulint zip_size,

   free_block->initialise(page_id, zip_size, buf_page_t::MEMORY);

   buf_pool_t::hash_chain &chain= buf_pool.page_hash.cell_get(page_id.fold());

-retry:

   mysql_mutex_lock(&buf_pool.mutex);

   buf_page_t *bpage= buf_pool.page_hash.get(page_id, chain);

@@ -3159,21 +3159,16 @@ static buf_block_t *buf_page_create_low(page_id_t page_id, ulint zip_size,

     if (!mtr->have_x_latch(reinterpret_cast<const buf_block_t&>(*bpage)))

       const bool got= bpage->lock.x_lock_try();

+      auto state= bpage->fix();

       if (!got)

         mysql_mutex_unlock(&buf_pool.mutex);

         bpage->lock.x_lock();

-        const page_id_t id{bpage->id()};

-        if (UNIV_UNLIKELY(id != page_id))

-        {

-          ut_ad(id.is_corrupted());

-          bpage->lock.x_unlock();

-          goto retry;

-        }

+        ut_ad(page_id == bpage->id());

+        state= bpage->state();

         mysql_mutex_lock(&buf_pool.mutex);

-      auto state= bpage->fix();

       ut_ad(state >= buf_page_t::FREED);

       ut_ad(state < buf_page_t::READ_FIX);

The above code is trying to acquire the page latch while holding the buf_pool.mutex. This would violate the latching order if we didn't use a non-blocking wait.

If the above patch was not applied, the page could be evicted and replaced with something else in the buffer pool while this thread is not holding either buf_pool.mutex or the page latch. To make this safe, we must first buffer-fix the block so that buf_page_t::can_relocate() will not hold, and then release the buf_pool.mutex.

The observed symptom was a debug assertion failure that was caught by mleich:

mysqld: /data/Server/bb-10.6-MDEV-30986B/storage/innobase/buf/buf0buf.cc:3155: buf_block_t* buf_page_create_low(page_id_t, ulint, mtr_t*, buf_block_t*): Assertion `id.is_corrupted()' failed

In the core dump, we had id.m_id==0x1c00000070 but page_id.m_id==0x9b00000007. In the stack trace, I could see that we were trying to allocate a block for tablespace 0x9b.

I believe that this could explain ~~MDEV-30531~~.

I will check if we are missing the buffer-fix in any other places that follow a similar pattern.

Attachments

Issue Links

is caused by

MDEV-27058 Buffer page descriptors are too large

Closed

Activity

Ascending order - Click to sort in descending order

Marko Mäkelä added a comment - 2023-06-02 12:35

I realized that the debug assertion failure cannot lead to any corruption, because in the code path where it is executed, we would goto retry and proceed look up the page again.

The patch that I posted will merely simplify code and slightly improve performance.

I checked the code base, and any other use of trying to acquire page latches without waiting looks safe.

Marko Mäkelä added a comment - 2023-06-02 12:35 I realized that the debug assertion failure cannot lead to any corruption, because in the code path where it is executed, we would goto retry and proceed look up the page again. The patch that I posted will merely simplify code and slightly improve performance. I checked the code base, and any other use of trying to acquire page latches without waiting looks safe.

Marko Mäkelä added a comment - 2023-06-02 13:05

In fact, the patch that I posted in the Description would trigger an assertion failure in the test innodb.innodb-wl5522-debug. We could try a simpler fix that should only invoke goto retry when the block had been read-fixed and found to be corrupted:

diff --git a/storage/innobase/buf/buf0buf.cc b/storage/innobase/buf/buf0buf.cc

index 90886173b1b..a44bcc90575 100644

--- a/storage/innobase/buf/buf0buf.cc

+++ b/storage/innobase/buf/buf0buf.cc

@@ -3159,6 +3159,7 @@ static buf_block_t *buf_page_create_low(page_id_t page_id, ulint zip_size,

     if (!mtr->have_x_latch(reinterpret_cast<const buf_block_t&>(*bpage)))

       const bool got= bpage->lock.x_lock_try();

+      auto state= bpage->fix();

       if (!got)

         mysql_mutex_unlock(&buf_pool.mutex);

@@ -3170,10 +3171,10 @@ static buf_block_t *buf_page_create_low(page_id_t page_id, ulint zip_size,

           bpage->lock.x_unlock();

           goto retry;

+        state= bpage->state();

         mysql_mutex_lock(&buf_pool.mutex);

-      auto state= bpage->fix();

       ut_ad(state >= buf_page_t::FREED);

       ut_ad(state < buf_page_t::READ_FIX);

However, this patch will actually make the same test hang. The page read completion thread would wait forever for the buffer-fix to be released so that the corrupted page can be evicted:

#0  0x000055884fdf2ea9 in LF_BACKOFF () at /mariadb/10.6/include/my_cpu.h:124

#1  buf_pool_t::corrupted_evict (this=this@entry=0x55885030ce80 <buf_pool>, bpage=bpage@entry=0x7f3d852d71a0, state=state@entry=2147483648) at /mariadb/10.6/storage/innobase/buf/buf0lru.cc:1249

#2  0x000055884fdaea9b in buf_page_t::read_complete (this=0x7f3d852d71a0, node=@0x7f3d300873b0: {space = 0x7f3d3025c2b0, name = 0x7f3d300daed0 "./test/t1.ibd", handle = {m_file = 50}, on_ssd = 0, punch_hole = 0, atomic_write = 0, is_raw_disk = 0, deferred = 0, size = 576, init_size = 4, max_size = 4294967295, being_extended = {m = std::atomic<bool> = { false }}, chain = {prev = 0x0, next = 0x0}, block_size = 4096}) at /mariadb/10.6/storage/innobase/buf/buf0buf.cc:3633

#3  0x000055884fdc2c86 in IORequest::read_complete (this=this@entry=0x5588526fae58) at /mariadb/10.6/storage/innobase/fil/fil0fil.cc:2862

The best alternative would seem to be to simply remove the failing debug assertion. Before we do that, it would be good to produce a simplified grammar for reproducing the debug assertion failure.

Marko Mäkelä added a comment - 2023-06-02 13:05 In fact, the patch that I posted in the Description would trigger an assertion failure in the test innodb.innodb-wl5522-debug . We could try a simpler fix that should only invoke goto retry when the block had been read-fixed and found to be corrupted: diff --git a/storage/innobase/buf/buf0buf.cc b/storage/innobase/buf/buf0buf.cc index 90886173b1b..a44bcc90575 100644 --- a/storage/innobase/buf/buf0buf.cc +++ b/storage/innobase/buf/buf0buf.cc @@ -3159,6 +3159,7 @@ static buf_block_t *buf_page_create_low(page_id_t page_id, ulint zip_size, if (!mtr->have_x_latch(reinterpret_cast<const buf_block_t&>(*bpage))) { const bool got= bpage->lock.x_lock_try(); + auto state= bpage->fix(); if (!got) { mysql_mutex_unlock(&buf_pool.mutex); @@ -3170,10 +3171,10 @@ static buf_block_t *buf_page_create_low(page_id_t page_id, ulint zip_size, bpage->lock.x_unlock(); goto retry; } + state= bpage->state(); mysql_mutex_lock(&buf_pool.mutex); } - auto state= bpage->fix(); ut_ad(state >= buf_page_t::FREED); ut_ad(state < buf_page_t::READ_FIX); However, this patch will actually make the same test hang. The page read completion thread would wait forever for the buffer-fix to be released so that the corrupted page can be evicted: #0 0x000055884fdf2ea9 in LF_BACKOFF () at /mariadb/10.6/include/my_cpu.h:124 #1 buf_pool_t::corrupted_evict (this=this@entry=0x55885030ce80 <buf_pool>, bpage=bpage@entry=0x7f3d852d71a0, state=state@entry=2147483648) at /mariadb/10.6/storage/innobase/buf/buf0lru.cc:1249 #2 0x000055884fdaea9b in buf_page_t::read_complete (this=0x7f3d852d71a0, node=@0x7f3d300873b0: {space = 0x7f3d3025c2b0, name = 0x7f3d300daed0 "./test/t1.ibd", handle = {m_file = 50}, on_ssd = 0, punch_hole = 0, atomic_write = 0, is_raw_disk = 0, deferred = 0, size = 576, init_size = 4, max_size = 4294967295, being_extended = {m = std::atomic<bool> = { false }}, chain = {prev = 0x0, next = 0x0}, block_size = 4096}) at /mariadb/10.6/storage/innobase/buf/buf0buf.cc:3633 #3 0x000055884fdc2c86 in IORequest::read_complete (this=this@entry=0x5588526fae58) at /mariadb/10.6/storage/innobase/fil/fil0fil.cc:2862 The best alternative would seem to be to simply remove the failing debug assertion. Before we do that, it would be good to produce a simplified grammar for reproducing the debug assertion failure.

MariaDB Server

Assertion id.is_corrupted() failed in buf_page_create_low()

Details

Description

Attachments

Issue Links

Activity

People

Dates

Git Integration