[MDEV-31390] Assertion id.is_corrupted() failed in buf_page_create_low() Created: 2023-06-02  Updated: 2023-10-24  Resolved: 2023-10-24

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.6.6, 10.6, 10.7, 10.8, 10.9, 10.10, 10.11, 11.0, 11.1
Fix Version/s: N/A

Type: Bug Priority: Critical
Reporter: Marko Mäkelä Assignee: Matthias Leich
Resolution: Cannot Reproduce Votes: 0
Labels: debug, need_rr, performance

Issue Links:
Problem/Incident
is caused by MDEV-27058 Buffer page descriptors are too large Closed

 Description   

MDEV-27058 introduced a race condition in the function buf_page_create_low(). A patch illustrates it:

diff --git a/storage/innobase/buf/buf0buf.cc b/storage/innobase/buf/buf0buf.cc
index 90886173b1b..3afd3840552 100644
--- a/storage/innobase/buf/buf0buf.cc
+++ b/storage/innobase/buf/buf0buf.cc
@@ -3144,7 +3144,7 @@ static buf_block_t *buf_page_create_low(page_id_t page_id, ulint zip_size,
   free_block->initialise(page_id, zip_size, buf_page_t::MEMORY);
 
   buf_pool_t::hash_chain &chain= buf_pool.page_hash.cell_get(page_id.fold());
-retry:
+
   mysql_mutex_lock(&buf_pool.mutex);
 
   buf_page_t *bpage= buf_pool.page_hash.get(page_id, chain);
@@ -3159,21 +3159,16 @@ static buf_block_t *buf_page_create_low(page_id_t page_id, ulint zip_size,
     if (!mtr->have_x_latch(reinterpret_cast<const buf_block_t&>(*bpage)))
     {
       const bool got= bpage->lock.x_lock_try();
+      auto state= bpage->fix();
       if (!got)
       {
         mysql_mutex_unlock(&buf_pool.mutex);
         bpage->lock.x_lock();
-        const page_id_t id{bpage->id()};
-        if (UNIV_UNLIKELY(id != page_id))
-        {
-          ut_ad(id.is_corrupted());
-          bpage->lock.x_unlock();
-          goto retry;
-        }
+        ut_ad(page_id == bpage->id());
+        state= bpage->state();
         mysql_mutex_lock(&buf_pool.mutex);
       }
 
-      auto state= bpage->fix();
       ut_ad(state >= buf_page_t::FREED);
       ut_ad(state < buf_page_t::READ_FIX);
 

The above code is trying to acquire the page latch while holding the buf_pool.mutex. This would violate the latching order if we didn't use a non-blocking wait.

If the above patch was not applied, the page could be evicted and replaced with something else in the buffer pool while this thread is not holding either buf_pool.mutex or the page latch. To make this safe, we must first buffer-fix the block so that buf_page_t::can_relocate() will not hold, and then release the buf_pool.mutex.

The observed symptom was a debug assertion failure that was caught by mleich:

mysqld: /data/Server/bb-10.6-MDEV-30986B/storage/innobase/buf/buf0buf.cc:3155: buf_block_t* buf_page_create_low(page_id_t, ulint, mtr_t*, buf_block_t*): Assertion `id.is_corrupted()' failed

In the core dump, we had id.m_id==0x1c00000070 but page_id.m_id==0x9b00000007. In the stack trace, I could see that we were trying to allocate a block for tablespace 0x9b.

I believe that this could explain MDEV-30531.

I will check if we are missing the buffer-fix in any other places that follow a similar pattern.



 Comments   
Comment by Marko Mäkelä [ 2023-06-02 ]

I realized that the debug assertion failure cannot lead to any corruption, because in the code path where it is executed, we would goto retry and proceed look up the page again.

The patch that I posted will merely simplify code and slightly improve performance.

I checked the code base, and any other use of trying to acquire page latches without waiting looks safe.

Comment by Marko Mäkelä [ 2023-06-02 ]

In fact, the patch that I posted in the Description would trigger an assertion failure in the test innodb.innodb-wl5522-debug. We could try a simpler fix that should only invoke goto retry when the block had been read-fixed and found to be corrupted:

diff --git a/storage/innobase/buf/buf0buf.cc b/storage/innobase/buf/buf0buf.cc
index 90886173b1b..a44bcc90575 100644
--- a/storage/innobase/buf/buf0buf.cc
+++ b/storage/innobase/buf/buf0buf.cc
@@ -3159,6 +3159,7 @@ static buf_block_t *buf_page_create_low(page_id_t page_id, ulint zip_size,
     if (!mtr->have_x_latch(reinterpret_cast<const buf_block_t&>(*bpage)))
     {
       const bool got= bpage->lock.x_lock_try();
+      auto state= bpage->fix();
       if (!got)
       {
         mysql_mutex_unlock(&buf_pool.mutex);
@@ -3170,10 +3171,10 @@ static buf_block_t *buf_page_create_low(page_id_t page_id, ulint zip_size,
           bpage->lock.x_unlock();
           goto retry;
         }
+        state= bpage->state();
         mysql_mutex_lock(&buf_pool.mutex);
       }
 
-      auto state= bpage->fix();
       ut_ad(state >= buf_page_t::FREED);
       ut_ad(state < buf_page_t::READ_FIX);
 

However, this patch will actually make the same test hang. The page read completion thread would wait forever for the buffer-fix to be released so that the corrupted page can be evicted:

#0  0x000055884fdf2ea9 in LF_BACKOFF () at /mariadb/10.6/include/my_cpu.h:124
#1  buf_pool_t::corrupted_evict (this=this@entry=0x55885030ce80 <buf_pool>, bpage=bpage@entry=0x7f3d852d71a0, state=state@entry=2147483648) at /mariadb/10.6/storage/innobase/buf/buf0lru.cc:1249
#2  0x000055884fdaea9b in buf_page_t::read_complete (this=0x7f3d852d71a0, node=@0x7f3d300873b0: {space = 0x7f3d3025c2b0, name = 0x7f3d300daed0 "./test/t1.ibd", handle = {m_file = 50}, on_ssd = 0, punch_hole = 0, atomic_write = 0, is_raw_disk = 0, deferred = 0, size = 576, init_size = 4, max_size = 4294967295, being_extended = {m = std::atomic<bool> = { false }}, chain = {prev = 0x0, next = 0x0}, block_size = 4096}) at /mariadb/10.6/storage/innobase/buf/buf0buf.cc:3633
#3  0x000055884fdc2c86 in IORequest::read_complete (this=this@entry=0x5588526fae58) at /mariadb/10.6/storage/innobase/fil/fil0fil.cc:2862

The best alternative would seem to be to simply remove the failing debug assertion. Before we do that, it would be good to produce a simplified grammar for reproducing the debug assertion failure.

Generated at Thu Feb 08 10:23:30 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.