[MDEV-25404] read-only performance regression in 10.6 Created: 2021-04-13 Updated: 2021-10-07 Resolved: 2021-04-19 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.6 |
| Fix Version/s: | 10.6.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Axel Schwenke | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | regression | ||
| Attachments: |
|
||||||||||||||||||||
| Issue Links: |
|
||||||||||||||||||||
| Description |
|
I see a heavy performance regression in 10.6 that did not exist ~4 weeks ago. It affects all workloads, even read-only:
|
| Comments |
| Comment by Axel Schwenke [ 2021-04-13 ] | ||||||||||||
|
Full results in attachment gdigest.pdf | ||||||||||||
| Comment by Axel Schwenke [ 2021-04-14 ] | ||||||||||||
|
I tested back until e9f33b77605 (from Dec 3rd) but still see this regression. | ||||||||||||
| Comment by Axel Schwenke [ 2021-04-15 ] | ||||||||||||
|
I found a good and a bad revision now:
sysbench command line is
I switched back to standard collation (latin1) now. | ||||||||||||
| Comment by Axel Schwenke [ 2021-04-15 ] | ||||||||||||
|
Hunted down (bisecting) to commit 03ca6495df3
| ||||||||||||
| Comment by Marko Mäkelä [ 2021-04-15 ] | ||||||||||||
|
wlad already noticed something similar. To address that, we first made srw_lock a thin wrapper of SRWLOCK on Microsoft Windows, and later the futex-based implementation was disabled altogether on Microsoft Windows. axel confirmed that defining SRW_LOCK_DUMMY (switching to an implementation with one mutex and two condition variables (separate ones for pending shared and exclusive requests) helps improve performance. The problem with a futex is that both readers and writers will unnecessarily be waken up. I think that we definitely need a futex-based srw_mutex and a mutex-and-condition-variables based sux_lock (for dict_index_t::lock and buf_block_t::lock). It remains to be seen whether also srw_lock must be changed to use a mutex and two condition variables. It would be so much nicer to use only 4 bytes instead of 40+2*48 bytes, especially inside buf_block_t::lock. | ||||||||||||
| Comment by Marko Mäkelä [ 2021-04-16 ] | ||||||||||||
|
Using http://locklessinc.com/articles/sleeping_rwlocks/ for inspiration, I came up with a simpler composition for srw_lock or sux_lock. Writers (U and X lock holders) will use a combination of a writer mutex and a flag in the atomic readers lock word. If a read lock request sees that the WRITER flag is set, it will enter a loop where it acquires the mutex, increments the readers and checks the WRITER flag. I also reimplemented srw_mutex using 31 bits for counting waiting requests and 1 bit for a HOLDER flag. The counter allows us avoid unnecessary FUTEX_WAKE system calls. With SRW_LOCK_DUMMY, we will use two mutexes and one condition variable, instead of using 1 mutex and 2 condition variables. | ||||||||||||
| Comment by Marko Mäkelä [ 2021-04-17 ] | ||||||||||||
|
We can only compose the ssux_lock using a writer mutex on systems where the mutex is not re-entrant. There is no such guarantee for the POSIX pthread_mutex_t. Hence, for systems where only a generic mutex is available, we must retain the old SRW_LOCK_DUMMY implementation that consists of std::atomic<uint32_t>, a pthread_mutex_t and two pthread_cond_t so that in case the ownership of a buf_block_t::lock is transferred to a write completion callback thread, the submitting thread of the write will not wrongly acquire the writer mutex of the buf_block_t::lock while the previously submitted write is in progress. This problem was caught in Microsoft Windows on a system where the tests were run on relatively slow hard disk. Using a futex-based srw_mutex writer works correctly, because re-entrant acquisition is not allowed and the mutex does not keep track of the holding thread. I successfully tested the fix on Microsoft Windows both with and without the following patch:
| ||||||||||||
| Comment by Vladislav Vaintroub [ 2021-04-19 ] | ||||||||||||
|
Performance seems ok, the code has grown to be somewhat complicated. maybe it makes sense to document relationship between different rwlocks in Innodb. | ||||||||||||
| Comment by Marko Mäkelä [ 2021-04-19 ] | ||||||||||||
|
If there was a way to have non-recursive mutexes on all platforms, the fallback implementation for futex-less systems would be simpler. On GNU/Linux (with GNU libc), pthread_mutex_t is non-recursive by default and "just works". On Microsoft Windows, and on some proprietary UNIX systems, mutexes are recursive by default. There is a way to explicitly request a mutex to be recursive, but nothing to request them to be non-recursive. Recursive mutexes are inherently incompatible with "ownership passing", which is a requirement for the asynchronous writes of pages that are protected by buf_block_t::lock. | ||||||||||||
| Comment by Marko Mäkelä [ 2021-04-19 ] | ||||||||||||
|
SRW_LOCK_DUMMY was renamed to SUX_LOCK_GENERIC, because on Microsoft Windows, srw_lock will always wrap SRWLOCK even if that alternative implementation were enabled. | ||||||||||||
| Comment by Marko Mäkelä [ 2021-05-05 ] | ||||||||||||
|
According to https://shift.click/blog/futex-like-apis/ documented futex equivalents do exist on some operating systems beyond Linux, OpenBSD and Microsoft Windows:
Furthermore, C++20 defines std::atomic_wait and std::atomic_notify_one. |