[MDEV-24167] InnoDB unnecessarily uses complex rw-lock implementation Created: 2020-11-09  Updated: 2024-02-08  Resolved: 2020-11-24

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.5
Fix Version/s: 10.6.0

Type: Bug Priority: Major
Reporter: Marko Mäkelä Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: Scalability, performance, performance_schema

Issue Links:
Blocks
blocks MDEV-24142 rw_lock_t has unnecessarily complex w... Closed
blocks MDEV-24258 Merge dict_sys.mutex into dict_sys.latch Closed
is blocked by MDEV-24171 index_online_log is incorrectly instr... Closed
is blocked by MDEV-24271 rw_lock::read_lock_yield() may cause ... Closed
Problem/Incident
causes MDEV-24884 Hang in ssux_lock_low::write_lock() Closed
Relates
relates to MDEV-7109 Add support for INFORMATION_SCHEMA.IN... Closed
relates to MDEV-7399 Add support for INFORMATION_SCHEMA.IN... Closed
relates to MDEV-24410 Deadlock between freeing and allocati... Closed
relates to MDEV-25267 Reported latching order violation in ... Open
relates to MDEV-21330 Lock monitor doesn't print a semaphor... Closed
relates to MDEV-24630 MY_RELAX_CPU assembly instruction upg... Closed
relates to MDEV-32065 Always check whether lock is free at ... Open

 Description   

InnoDB implements its own complex rw-latch whose special features (recursive X-locks and SX-locks) are only needed for buf_block_t::lock and dict_index_t::lock.

It would be tempting to use the lightweight wrapper mysql_rwlock_t, but unfortunately it would introduce a performance regression on Linux. On Microsoft Windows, we can use a straightforward wrapper of SRWLOCK. On Linux, a std::atomic<uint32_t> and a futex will do. On anything else, we can use mysql_rwlock_t.



 Comments   
Comment by Marko Mäkelä [ 2020-11-09 ]

On my Debian GNU/Linux system, sizeof(mysql_rwlock_t) is 64 bytes, while sizeof(rw_lock_t) is 160 bytes or 88 bytes, depending on whether a debug build is being used.
The system rw-lock does not support recursive locking. InnoDB’s reimplementation supports that for the exclusive and shared-exclusive modes (possibly, to be simplified in MDEV-24142).

Comment by Marko Mäkelä [ 2020-11-16 ]

For contended rw-locks, such as the adaptive hash index latches, the custom implementation is measurably faster than mysql_rwlock_t. Hence, it makes sense to improve the custom implementation, in MDEV-24142.

Comment by Marko Mäkelä [ 2020-11-19 ]

For all InnoDB rw-locks except those that will be addressed in MDEV-24142, we can use a lightweight wrapper of Microsoft Windows SRWLOCK or std::atomic<uint32_t> and Linux futex. On other operating systems, we can use mysql_rwlock_t.

Comment by Marko Mäkelä [ 2020-11-20 ]

My std::atomic<uint32_t> based implementation was improved in MDEV-24271: rd_lock() must perform compare-and-swap instead of doing fetch_add() and then backing off if a conflicting lock was held or requested. The current implementation is causing hangs if the server is heavily overloaded, or running under rr record.

Comment by Marko Mäkelä [ 2020-12-15 ]

To prepare for MDEV-24142 I had replaced the SRWLOCK implementation on Microsoft Windows with atomic-and-futex-based implementation. According to wlad, that did not perform well. So, I separated ssux_lock and srw_lock. The srw_lock is based on Windows SRWLOCK, or outside Windows on rw_lock_t when SRW_LOCK_DUMMY is defined). The non-recursive ssux_lock is only being used as a building block of the recursive MDEV-24142 sux_lock, which is only being used for dict_index_t::lock and buf_block_t::lock.

Comment by Marko Mäkelä [ 2020-12-16 ]

Especially on Microsoft Windows (probably due to the hardware configuration rather than the operating system), we observed hung ssux_lock::update_lock(), mainly during the test mariabackup.xb_compressed_encrypted. Making ssux_lock::u_unlock() always wake up all waiters seemed to help. I suspect that the issue simply was that if multiple concurrent update_lock() ended up waiting, not all of them would be woken up. For X-locks, this problem was prevented by having a separate ‘waiting’ flag. For U-locks, we do not have such a flag.

For some reason, the issue was not observable before we replaced the InnoDB homebrew mutex implementation with system mutexes in MDEV-24142.

Comment by Marko Mäkelä [ 2020-12-16 ]

An unrelated cause of mariabackup.xb_compression_encrypted test timeouts was that the innodb_encryption_threads=4 were fighting each other and causing contention. Before MDEV-24142, the spinloop of the homebrew mutex saved the day. System mutexes seem to have significant overhead in the case of contention, at least on Microsoft Windows.

Generated at Thu Feb 08 09:27:56 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.