[MDEV-24789] Performance regression after MDEV-24671 Created: 2021-02-05  Updated: 2022-12-20  Resolved: 2021-03-02

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.6
Fix Version/s: 10.6.0

Type: Bug Priority: Blocker
Reporter: Marko Mäkelä Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: performance, regression

Issue Links:
Problem/Incident
causes MDEV-25016 Race condition between lock_sys_t::ca... Closed
causes MDEV-25371 Potential hang in wsrep_is_BF_lock_ti... Closed
causes MDEV-26883 InnoDB hang due to table lock conflict Closed
is caused by MDEV-24671 Assertion failure in lock_wait_table_... Closed
Relates
relates to MDEV-25016 Race condition between lock_sys_t::ca... Closed

 Description   

The fix of MDEV-24671 introduced a serious performance regression that is observable at 32 concurrent connections.

My current hypothesis, based on some initial investigation, is that the changed sizeof(trx->lock) caused an increase in cache misses.
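A hypothesis of this kind can be sanity-checked by comparing how many cache lines a structure spans before and after a change. The sketch below is illustrative only: the two structs are hypothetical stand-ins, not the real trx_lock_t layout, and the 64-byte cache line size is an assumption that holds on common x86-64 hardware but not everywhere.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: 64-byte cache lines (typical for x86-64, not universal).
constexpr std::size_t kCacheLine = 64;

// Hypothetical stand-in for a small per-transaction lock structure.
// This is NOT the real trx_lock_t layout.
struct lock_fields_before {
  void*    wait_lock;
  unsigned n_rec_locks;
  bool     was_chosen_as_victim;
};

// The same hypothetical structure after members were added.
struct lock_fields_after {
  void*    wait_lock;
  unsigned n_rec_locks;
  bool     was_chosen_as_victim;
  char     added_state[56];  // hypothetical new members
};

// Number of cache lines needed to hold an object of the given size,
// assuming the object starts on a cache line boundary.
constexpr std::size_t cache_lines_spanned(std::size_t size) {
  return (size + kCacheLine - 1) / kCacheLine;
}

// The grown structure no longer fits in a single cache line.
static_assert(cache_lines_spanned(sizeof(lock_fields_before)) == 1,
              "small structure fits in one cache line");
static_assert(cache_lines_spanned(sizeof(lock_fields_after)) == 2,
              "grown structure spills into a second cache line");
```

A size check like this only shows that more cache lines can be touched. Whether that actually costs throughput has to be confirmed by benchmarking.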



 Comments   
Comment by Marko Mäkelä [ 2021-02-05 ]

Reducing sizeof(trx_lock_t) did not help (I did it anyway), but some refactoring to reduce the hold time of lock_sys.mutex and lock_sys.wait_mutex seems to have done the trick. Thanks to axel for helping with the analysis and validation!

Comment by Marko Mäkelä [ 2021-02-25 ]

Some performance regression is still present. The function lock_wait() that was introduced in MDEV-24671 was holding lock_sys.wait_mutex for an unnecessarily long time.
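The general technique of shortening a mutex hold time can be sketched as follows. This is a minimal illustration using std::mutex, not the actual InnoDB code: wait_queue, prepare() and both enqueue functions are hypothetical names, and the real lock_wait() and lock_sys.wait_mutex work differently.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a wait queue protected by a mutex.
// It does not model the real lock_sys.wait_mutex.
struct wait_queue {
  std::mutex mutex;
  std::vector<int> waiters;

  // Long hold time: the preparation work runs while the mutex is held,
  // so other threads are blocked for longer than necessary.
  void enqueue_long_hold(int id) {
    std::lock_guard<std::mutex> guard(mutex);
    int prepared = prepare(id);
    waiters.push_back(prepared);
  }

  // Short hold time: work that does not touch shared state is done
  // first, and the mutex only covers the update of the shared vector.
  void enqueue_short_hold(int id) {
    int prepared = prepare(id);
    std::lock_guard<std::mutex> guard(mutex);
    waiters.push_back(prepared);
  }

  // Hypothetical thread-local preparation step; it reads no shared state.
  static int prepare(int id) { return id * 2; }
};
```

Both functions leave the queue in the same state. The difference is only in how long concurrent callers are serialized, which is what matters under many connections.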

Comment by Marko Mäkelä [ 2021-02-26 ]

Even after reducing the lock_sys.wait_mutex hold time in lock_wait() to the minimum, some regression is still present, and more work needs to be done.

Comment by Marko Mäkelä [ 2021-03-01 ]

It looks like MDEV-24671 exposed a pre-existing bottleneck on log_sys.mutex, which we hope to address in MDEV-14425. The remaining performance regression was only observed on RAM disk, not on real storage.

That said, the latest change that is being tested should reduce contention on lock_sys.latch and lock_sys.wait_mutex to the absolute minimum.
