[MDEV-26033] Race condition between buf_pool.page_hash and buffer pool resizing - Jira

Details

Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Affects Version/s: 10.5.4, 10.6.0
Fix Version/s: 10.5.12, 10.6.3
Component/s: Storage Engine - InnoDB
Labels:
- regression
- rr-profile-analyzed

Description

The replacement of buf_pool.page_hash with a different type of hash table in MDEV-22871 introduced a race condition with buffer pool resizing.

We have an execution trace where buf_pool.page_hash.array is changed to point to something else while page_hash_latch::read_lock() is executing. The same should also affect page_hash_latch::write_lock().

The wait loop currently fails to notice that buffer pool resizing is in progress. Part of the problem is that we are waiting too deep in the code:

    template<bool exclusive> page_hash_latch *lock(ulint fold)
    {
      for (;;)
      {
        auto n= n_cells;
        page_hash_latch *latch= lock_get(fold, n);
        latch->acquire<exclusive>();
        /* Our latch prevents n_cells from changing. */
        if (UNIV_LIKELY(n == n_cells))
          return latch;
        /* Retry, because buf_pool_t::resize_hash() affected us. */
        latch->release<exclusive>();
      }
    }

We are actually waiting inside page_hash_latch::read_lock() or page_hash_latch::write_lock() on memory that may no longer belong to buf_pool.page_hash.array. The n == n_cells check comes too late to help, because it is only reached after the latch has been acquired. We would need some notion of timeout or temporary failure when the buffer pool is being resized.
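The "temporary failure" idea can be illustrated with a minimal, self-contained C++17 sketch. It is not the InnoDB code and not necessarily the fix that was applied: Latch, LatchTable, resize_guard, try_lock() and cells() are hypothetical names, and the global std::shared_mutex is an assumption made only to keep the example memory-safe. Each acquisition attempt re-reads the array and n_cells, and all waiting happens outside the array memory.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <thread>

// Hypothetical stand-in for page_hash_latch (exclusive mode only).
struct Latch
{
  std::atomic<bool> held{false};
  bool try_lock() { return !held.exchange(true, std::memory_order_acquire); }
  void unlock() { held.store(false, std::memory_order_release); }
};

// Hypothetical stand-in for buf_pool.page_hash; illustration only.
class LatchTable
{
  std::shared_mutex resize_guard;  // assumption: protects array and n_cells
  std::unique_ptr<Latch[]> array;
  std::size_t n_cells;
public:
  explicit LatchTable(std::size_t n) : array(new Latch[n]), n_cells(n)
  { assert(n > 0); }

  // One non-blocking attempt. Returns nullptr on temporary failure:
  // either a resize is in progress or the latch is currently held.
  Latch *try_lock(std::size_t fold)
  {
    std::shared_lock<std::shared_mutex> g(resize_guard, std::try_to_lock);
    if (!g.owns_lock())
      return nullptr;              // resize in progress: do not touch array
    Latch *latch= &array[fold % n_cells];
    return latch->try_lock() ? latch : nullptr;
  }

  // Blocking variant: the retry loop lives above the array access, so
  // every iteration recomputes the latch address from the current array.
  // Assumption: a thread holds at most one latch while calling this.
  Latch *lock(std::size_t fold)
  {
    for (;;)
    {
      if (Latch *latch= try_lock(fold))
        return latch;
      std::this_thread::yield();   // wait without holding any pointer
    }
  }

  // Block new acquisitions, drain current holders, then swap the array.
  void resize(std::size_t n)
  {
    std::unique_lock<std::shared_mutex> g(resize_guard);
    for (std::size_t i= 0; i < n_cells; i++)
      while (!array[i].try_lock())
        std::this_thread::yield();
    array.reset(new Latch[n]);     // old latches are unreachable by now
    n_cells= n;
  }

  std::size_t cells()
  {
    std::shared_lock<std::shared_mutex> g(resize_guard);
    return n_cells;
  }
};
```

A caller that can back off would use try_lock() and treat nullptr as "retry later"; a caller that must wait would use lock(). The trade-off in this sketch is an extra shared guard on every acquisition, which could reintroduce some of the contention that MDEV-22871 set out to remove.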

Attachments

- grammar.yy (0.8 kB, 2021-06-28 11:55)

Issue Links

- blocks: MDEV-26826 Duplicated computations of buf_pool.page_hash addresses (Closed)
- duplicates: MDEV-24030 SUMMARY: AddressSanitizer: heap-use-after-free failure in hash_get_lock (Closed)
- is caused by: MDEV-22871 Contention on the buf_pool.page_hash (Closed)

People

Assignee: Marko Mäkelä

Reporter: Marko Mäkelä

Votes: 0

Watchers: 1

Dates

Created: 2021-06-28 11:52

Updated: 2022-04-07 05:23

Resolved: 2021-07-03 12:26
