[MDEV-23169] Optimize InnoDB code around mutexes assuming InnoDB locks are uncontended Created: 2020-07-14  Updated: 2022-01-17  Resolved: 2022-01-17

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Fix Version/s: N/A

Type: Task Priority: Minor
Reporter: Dmitriy Philimonov Assignee: Marko Mäkelä
Resolution: Incomplete Votes: 0
Labels: None

Issue Links:
Blocks
is blocked by MDEV-21452 Use condition variables and normal mu... Closed

 Description   

The InnoDB engine locks show low contention during the sysbench OLTP_PS/RO tests. Assuming this, we used compiler-specific attributes to annotate the code near mutexes, streamlining the code path that acquires a lock. Specifically, branches that lead to the ut_delay() function were marked `unlikely`, and functions that call ut_delay() were marked `cold`. We got a persistent +2% improvement in sysbench OLTP_PS with 128 threads, while the sysbench OLTP_RO/RW results remained almost unchanged.

Our environment: Kunpeng 920 (64 cores), Ubuntu 20.04, kernel 5.6.0 with 64K pages, gcc-9.3.0.

Look at the TTASEventMutex implementation, for instance. Here, we can mark while(!try_lock()) as unlikely:

void enter(
		uint32_t	max_spins,
		uint32_t	max_delay,
		const char*	filename,
		uint32_t	line)
		UNIV_NOTHROW
	{
		uint32_t	n_spins = 0;
		uint32_t	n_waits = 0;
		const uint32_t	step = max_spins;
 
		while (UNIV_UNLIKELY(!try_lock())) {
                ...
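
For illustration only, the annotation pattern described above can be sketched outside InnoDB as follows. This is a minimal sketch, not the actual patch: the names SKETCH_UNLIKELY, SKETCH_COLD and SketchSpinMutex are hypothetical, std::this_thread::yield() stands in for ut_delay(), and the attributes assume GCC or Clang.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

// Hypothetical macros (not InnoDB's); they mirror the intent of UNIV_UNLIKELY
// and of a `cold` function attribute. Assumes GCC or Clang.
#if defined(__GNUC__)
# define SKETCH_UNLIKELY(x) __builtin_expect(!!(x), 0)
# define SKETCH_COLD __attribute__((cold, noinline))
#else
# define SKETCH_UNLIKELY(x) (x)
# define SKETCH_COLD
#endif

// Hypothetical test-and-test-and-set spin mutex, for illustration only.
class SketchSpinMutex {
public:
  // Returns true if the lock was acquired.
  bool try_lock() noexcept {
    return !locked_.exchange(true, std::memory_order_acquire);
  }

  // Fast path: a single atomic exchange. The contended branch is hinted as
  // unlikely, so the compiler can lay it out away from the hot code.
  void lock() noexcept {
    if (SKETCH_UNLIKELY(!try_lock())) {
      lock_slow();
    }
  }

  void unlock() noexcept {
    locked_.store(false, std::memory_order_release);
  }

  // Number of times lock() had to take the slow path.
  uint64_t slow_path_entries() const noexcept {
    return slow_entries_.load(std::memory_order_relaxed);
  }

private:
  // Slow path: kept out of line and marked cold. The yield() call is a
  // stand-in for a delay loop such as ut_delay().
  SKETCH_COLD void lock_slow() noexcept {
    slow_entries_.fetch_add(1, std::memory_order_relaxed);
    for (;;) {
      while (locked_.load(std::memory_order_relaxed)) {
        std::this_thread::yield();
      }
      if (try_lock()) {
        return;
      }
    }
  }

  std::atomic<bool> locked_{false};
  std::atomic<uint64_t> slow_entries_{0};
};
```

The hints change only code layout and inlining decisions, not behavior, so any gain or loss has to be measured per workload.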



 Comments   
Comment by Vladislav Vaintroub [ 2020-07-14 ]

Is contention always unlikely? Try OLTP write-only with a small dataset and many users.

Comment by Marko Mäkelä [ 2020-07-14 ]

I think that this is a good idea, but I would like to see a patch and benchmarks. Some contended mutexes include buf_pool.mutex, lock_sys.mutex and log_sys.mutex. trx_sys.mutex should not be that contended since 10.3, and it saw some minor improvement in 10.5.

Comment by Vladislav Vaintroub [ 2020-07-14 ]

Ideally, such things are handled by good CPU branch prediction, or as a second choice by profile-guided optimization done on well-chosen, representative benchmarks. Programmer-guided optimization, as proposed here, is often wrong, or works for one case and not for others, and can do more harm than good.

Comment by Dmitriy Philimonov [ 2020-07-15 ]

1. Actually, not all CPUs have good branch prediction.
2. You're right, PGO is better. However, it has its own pitfalls: it optimizes the whole program for one particular workload, so it is hard to use in practice. So here we do a 'handmade PGO' for small, isolated parts of the code. The idea is to optimize the fast path and pessimize the slow path. If the load changes (like OLTP_RW, small dataset, many users), the negative effect of the patch will be much smaller than the overall degradation caused by high contention.

Finally, from our team's point of view, it makes sense.

Comment by Marko Mäkelä [ 2020-09-17 ]

In MDEV-21452, I would like to replace the custom InnoDB mutex and event implementations with normal mutex and condition variable. If that effort fails to improve performance, we may have to look at this.

Comment by Marko Mäkelä [ 2021-08-17 ]

dmitriy.philimonov, did you test MariaDB 10.6? It includes completely rewritten mutex and rw-lock code.

Comment by Dmitriy Philimonov [ 2021-08-19 ]

No, I didn't test this patch with MariaDB 10.6.

Comment by Marko Mäkelä [ 2022-01-17 ]

dmitriy.philimonov, the described change would only be applicable to older MariaDB Server releases (10.5 or earlier). I would not try to improve the performance of older releases, because it could introduce surprise regressions that are hard to reason about. See MDEV-20638 and MDEV-23475 for an example.

During the development of simpler latching for MariaDB Server 10.6, I found that the usefulness of tweaks like "use a spinloop" (MYSQL_MUTEX_INIT_FAST) depends on the actual mutex as well as on the workload. I would expect that for log_sys.mutex (which actually is a normal mutex already starting with MDEV-23855) or buf_pool.mutex, the answer is "yes, this mutex is contended". For most dict_table_t::lock_mutex the answer might be "no, this is not contended", while for some it could be "yes, this is contended", entirely depending on the workload.

In general, I think that instead of tweaking the mutex or rw-lock implementation, we should try to reduce the contention in the first place. I would like to note that std::atomic is not a silver bullet. A combination of std::atomic and appropriate partitioning of data could work.
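
As a rough illustration of combining std::atomic with partitioning, consider a counter split into shards so that concurrent writers rarely touch the same cache line. This is a hypothetical sketch, not code from MariaDB: the name ShardedCounter, the caller-supplied hint, and the 64-byte alignment are all assumptions (some CPUs use a different cache line size).

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical partitioned counter, for illustration only.
template <std::size_t N = 16>
class ShardedCounter {
  // Each shard is aligned to an assumed 64-byte cache line so that updates
  // to different shards do not contend on the same line.
  struct alignas(64) Shard {
    std::atomic<uint64_t> value{0};
  };
  std::array<Shard, N> shards_;

public:
  // The caller supplies a hint (for example, a thread or partition number)
  // that selects the shard to update.
  void add(std::size_t hint, uint64_t n = 1) noexcept {
    shards_[hint % N].value.fetch_add(n, std::memory_order_relaxed);
  }

  // Sums all shards. Under concurrent updates the result is approximate,
  // because the shards are not read as one atomic snapshot.
  uint64_t read() const noexcept {
    uint64_t sum = 0;
    for (const Shard& s : shards_) {
      sum += s.value.load(std::memory_order_relaxed);
    }
    return sum;
  }
};
```

With a single shared std::atomic every writer would still contend on one cache line; the partitioning is what reduces that contention, at the cost of a more expensive and less exact read.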

That said, I welcome a review and any improvements to the InnoDB synchronization primitives in MariaDB Server 10.6 or later.
