[MDEV-38069] Heavy contention on buf_pool.flush_list_mutex - Jira

XML

Word

Printable

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Blocker
Resolution: Fixed
Affects Version/s: 10.11, 11.4, 11.8
Fix Version/s: 10.11.16, 11.4.10, 11.8.6
Component/s: Storage Engine - InnoDB
Labels:

Bug Category:
Related to performance
Release Note Summary:
In multi-threaded write-heavy workloads, buf_pool.flush_list_mutex could become a point of contention.

Description

rahulraj has identified a regression that is apparently caused by ~~MDEV-35155~~, which was contributed by alessandro.vetere.

I would say that the real cause is older, but ~~MDEV-35155~~ contributed to finding this bottleneck. When the ~~MDEV-35155~~ fix is present, we would observe that buf_flush_ahead() in mtr_t::commit() is waiting for buf_pool.flush_list_mutex, to increase the efforts of the buf_flush_page_cleaner thread.

Why would the buf_pool.flush_list_mutex be so contended? For one set of runs with performance_schema enabled, the MAX_TIMER_WAIT for wait/synch/mutex/innodb/flush_list_mutex was 69,715,544,080 for most part of a run, and 80,284,902,880 for the case with 150 concurrent virtual users. If that is in nanoseconds, 69.7 or 80.3 seconds would feel unbelievable or totally unacceptable.

I would like to note that neither buf_pool.mutex nor buf_pool.flush_list_mutex are normally being held or acquired when a page write is initiated in buf_page_t::flush() or completed in buf_page_write_complete(). The latter function would only acquire buf_pool.mutex in order to evict a page that belongs to the temporary tablespace.

Only if there are lots of blocks to "lazily evict" from buf_pool.flush_list (~~MDEV-26010~~), we’d hold buf_pool.flush_list_mutex for a long time. We implemented some logic elsewhere to avoid hogging buf_pool.mutex (~~MDEV-26827~~):

buf_flush_LRU_list_batch()
if (!bpage->oldest_modification())
{
evict:
if (state != buf_page_t::FREED &&
(state >= buf_page_t::READ_FIX \|\| (~buf_page_t::LRU_MASK & state)))
continue;
if (UNIV_UNLIKELY(to_withdraw != 0))
to_withdraw= buf_flush_LRU_to_withdraw(to_withdraw, *bpage);
buf_LRU_free_page(bpage, true);
++n->evicted;
if (UNIV_LIKELY(scanned & 31))
continue;
mysql_mutex_unlock(&buf_pool.mutex);
reacquire_mutex:
mysql_mutex_lock(&buf_pool.mutex);
continue;
}

On every 32th iteration where this function is freeing a least-recently-used block, it will release and reacquire buf_pool.mutex. We are missing such logic for buf_pool.flush_list_mutex near the continue below:

buf_do_flush_list_batch()
for (buf_page_t *bpage= UT_LIST_GET_LAST(buf_pool.flush_list);
bpage && len && count < max_n; ++scanned, len--)
{
const lsn_t oldest_modification= bpage->oldest_modification();
if (oldest_modification >= lsn)
break;
ut_ad(bpage->in_file());

{
buf_page_t *prev= UT_LIST_GET_PREV(list, bpage);

if (oldest_modification == 1)
{
clear:
buf_pool.delete_from_flush_list(bpage);
skip:
bpage= prev;
continue;
}

If there is a lot of “garbage” (clean blocks that are here being lazily removed from buf_pool.flush_list), the buf_pool.flush_list_mutex could be monopolized for a long time. This should be fairly simple to fix.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

OFF_CPU_38069.zip
872 kB
2025-11-17 12:34
ON_CPU_38069.zip
4.25 MB
2025-11-17 12:34
Perf_graph_38069.png
61 kB
2025-11-17 12:34

Issue Links

blocks

MDEV-33966 sysbench performance regression with concurrent workloads

Stalled

is caused by

MDEV-25113 Reduce effect of parallel background flush on select workload

Closed

MDEV-35155 Small innodb_log_file_size leads to excessive write amplification

Closed

relates to

MDEV-26827 Make page flushing even faster

Closed

Activity

People

Assignee:: Marko Mäkelä

Reporter:: Marko Mäkelä

Assigned for Implementation:: Marko Mäkelä

Assigned for Review:: Thirunarayanan Balathandayuthapani

Assigned for Testing:: Rahul Raj

Votes:: 1 Vote for this issue

Watchers:: 7 Start watching this issue

Dates

Created:: 2025-11-11 14:43

Updated:: 2025-11-17 16:09

Resolved:: 2025-11-17 13:34

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.