[MDEV-28708] Increased congestion on buf_pool.flush_list_mutex
Created: 2022-05-31  Updated: 2023-11-30  Resolved: 2022-06-09

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.8.3, 10.9.1
Fix Version/s: 10.8.4, 10.9.2

Type: Bug
Priority: Critical
Reporter: Marko Mäkelä
Assignee: Marko Mäkelä
Resolution: Fixed
Votes: 0
Labels: performance, regression-10.8

Issue Links:
Problem/Incident
is caused by MDEV-27868 buf_pool.flush_list is in the wrong o... Closed
Relates
relates to MDEV-32269 InnoDB after ALTER TABLE…IMPORT TABLE... Closed

 Description   

The buf_pool.flush_list_mutex became more contended after MDEV-27868 fixed a correctness issue that was caused by the performance fix MDEV-27774. We really should acquire and release buf_pool.flush_list_mutex at most once in mtr_t::commit() and reduce the amount of work performed while holding the mutex.



 Comments   
Comment by Marko Mäkelä [ 2022-06-01 ]

My first attempt at fixing this could still be improved by refactoring ReleaseModified and its caller so that buf_pool.flush_list_mutex is acquired and released exactly once, and the correct insert position is determined only once when multiple previously clean pages were modified.
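
To illustrate the idea, here is a minimal sketch of a batched insert. It is not the actual ReleaseModified code: Page, FlushList, insert_batch and lock_count are hypothetical names for a toy model, and the list is assumed to be ordered with the newest LSN first. All pages dirtied by one mini-transaction are linked into a private list outside the critical section, then spliced in at a position that is searched for only once.

```cpp
#include <cassert>
#include <cstdint>
#include <list>
#include <mutex>
#include <vector>

// Toy stand-in for a buffer page; not InnoDB's buf_page_t.
struct Page {
  int id;
  std::uint64_t oldest_modification;  // 0 means clean
};

// Toy stand-in for buf_pool.flush_list and buf_pool.flush_list_mutex.
struct FlushList {
  std::mutex mutex;
  std::list<Page*> pages;   // assumed order: newest (highest LSN) first
  unsigned lock_count = 0;  // instrumentation for this sketch only

  // Insert all previously clean pages of one mini-transaction.
  // The caller is assumed to hold exclusive latches on these pages.
  void insert_batch(const std::vector<Page*>& dirtied, std::uint64_t lsn) {
    if (dirtied.empty())
      return;  // nothing became dirty: the mutex is never acquired

    // Prepare a private list before entering the critical section.
    std::list<Page*> batch;
    for (Page* p : dirtied) {
      assert(p->oldest_modification == 0);
      p->oldest_modification = lsn;
      batch.push_back(p);
    }

    std::lock_guard<std::mutex> guard(mutex);  // acquired exactly once
    ++lock_count;

    // Determine the insert position once for the whole batch:
    // the first element that is not newer than lsn.
    auto pos = pages.begin();
    while (pos != pages.end() && (*pos)->oldest_modification > lsn)
      ++pos;

    // Relink the whole batch in O(1); no allocation under the mutex.
    pages.splice(pos, batch);
  }
};
```

In this model, committing a mini-transaction that dirtied three pages costs one mutex acquisition and one list traversal instead of three of each.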

Comment by Marko Mäkelä [ 2022-06-08 ]

I tested the performance of two variants with regard to garbage (clean blocks in buf_pool.flush_list) encountered while searching for the insert position during mtr_t::commit().

garbage \ threads         20         40         80        160        320        640
skip               173451.92  216405.04  242115.24  222376.35  232804.73  225682.87
collect            175827.91  217642.73  244303.73  232586.02  218409.98  225964.44

The glitch in the 30-second average throughput at 160 and 320 concurrent threads is probably due to checkpoint flushing. The 10-second intervals with maximum throughput from the same run are as follows:

garbage \ threads         20         40         80        160        320        640
skip               175404.17  228561.55  250756.42  255970.53  246018.71  229484.90
collect            177563.37  230077.80  252248.08  256343.42  247137.73  228197.05

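The variant with garbage collection seems to perform slightly better: it is ahead in every column except the 30-second average at 320 threads and the 10-second maximum at 640 threads.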
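
For readers unfamiliar with the two variants, the difference can be sketched roughly as follows. This is a toy model under stated assumptions, not the InnoDB code: Block, FlushList, Garbage and find_insert_position are hypothetical names, a clean block is assumed to be marked by oldest_modification == 0, and the list is assumed to be ordered with the newest LSN first. "skip" steps over clean blocks during the search, while "collect" unlinks them on the way.

```cpp
#include <cassert>
#include <cstdint>
#include <list>

// Toy stand-in for a buffer block; not InnoDB's buf_page_t.
struct Block {
  std::uint64_t oldest_modification;  // 0: clean, garbage if still listed
  bool in_flush_list;
};

using FlushList = std::list<Block*>;  // assumed order: newest LSN first

enum class Garbage { skip, collect };

// Find the position before which a block with the given LSN belongs.
// The caller is assumed to hold the mutex that protects the list.
FlushList::iterator find_insert_position(FlushList& list, std::uint64_t lsn,
                                         Garbage mode) {
  auto it = list.begin();
  while (it != list.end()) {
    Block* b = *it;
    assert(b->in_flush_list);
    if (b->oldest_modification == 0) {
      // Garbage: a clean block that was not yet removed from the list.
      if (mode == Garbage::collect) {
        b->in_flush_list = false;
        it = list.erase(it);  // unlink it; later searches will not see it
      } else {
        ++it;                 // leave it in place for someone else
      }
      continue;
    }
    if (b->oldest_modification <= lsn)
      break;                  // first dirty block that is not newer
    ++it;
  }
  return it;
}
```

Both modes return the same insert position. Collecting does a little extra work under the mutex but shortens the list for every subsequent search, which is one plausible reading of why it measured slightly better here.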
