[MDEV-35626] Race condition between buf_page_create_low() and read completion - Jira

XML

Word

Printable

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Critical
Resolution: Fixed
Affects Version/s: 10.6.20, 10.11.10, 11.2.6, 11.4.4, 11.6.2
Fix Version/s: 10.6.21, 10.11.11, 11.4.5, 11.7.2
Component/s: Storage Engine - InnoDB
Labels:

Description

The fix of ~~MDEV-34453~~ removed a read of a block descriptor state in case we are able to acquire an exclusive latch on the page without waiting:

diff --git a/storage/innobase/buf/buf0buf.cc b/storage/innobase/buf/buf0buf.cc

index ae604f0826c..ba49754db6a 100644

--- a/storage/innobase/buf/buf0buf.cc

+++ b/storage/innobase/buf/buf0buf.cc

@@ -3322,6 +3322,8 @@ static buf_block_t *buf_page_create_low(page_id_t page_id, ulint zip_size,

         ut_ad(!bpage->is_io_fixed(state));

         ut_ad(bpage->buf_fix_count(state));

+      else

+        state= bpage->state();

       ut_ad(state >= buf_page_t::FREED);

       ut_ad(state < buf_page_t::READ_FIX);

In an rr replay trace, we have buf_page_t::read_complete() completing between the initial read of bpage->state() (before the above code snippet) and the time the last quoted assertion on failing because state == buf_page_t::READ_FIX + 1.

As far as I can tell, this bug may cause a totally incorrect state of the block to be set in a subsequent call:

        bpage->set_reinit(state & buf_page_t::LRU_MASK);

Furthermore, there seems to be some duplicated or redundant code that affects ROW_FORMAT=COMPRESSED pages that exist in the buffer pool only in compressed form:

@@ -3345,24 +3347,12 @@ static buf_block_t *buf_page_create_low(page_id_t page_id, ulint zip_size,

       else

-        auto state= bpage->state();

-        ut_ad(state >= buf_page_t::FREED);

-        ut_ad(state < buf_page_t::READ_FIX);

         page_hash_latch &hash_lock= buf_pool.page_hash.lock_get(chain);

         /* It does not make sense to use transactional_lock_guard here,

         because buf_relocate() would likely make the memory transaction

         too large. */

         hash_lock.lock();

-        if (state < buf_page_t::UNFIXED)

-          bpage->set_reinit(buf_page_t::FREED);

-        else

-        {

-          bpage->set_reinit(state & buf_page_t::LRU_MASK);

-          ibuf_exist= (state & buf_page_t::LRU_MASK) == buf_page_t::IBUF_EXIST;

-        }

         mysql_mutex_lock(&buf_pool.flush_list_mutex);

         buf_relocate(bpage, &free_block->page);

         free_block->page.lock.x_lock();

As far as I can tell, this code should not have had any ill effect. The else branch should always be taken because a state of FREED would already have been transformed to UNFIXED.

Attachments

Issue Links

is caused by

MDEV-34453 Trying to read 16384 bytes at 70368744161280 outside the bounds of the file: ./ibdata1

Closed

is duplicated by

MDEV-35181 Assertion 'state < buf_page_t::READ_FIX' in buf_page_create_low()

Closed

Activity

People

Assignee:: Debarun Banerjee

Reporter:: Marko Mäkelä

Votes:: 0 Vote for this issue

Watchers:: 4 Start watching this issue

Dates

Created:: 2024-12-11 11:42

Updated:: 4 days ago 08:20

Resolved:: 2024-12-13 15:46

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.