[MDEV-24258] Merge dict_sys.mutex into dict_sys.latch Created: 2020-11-20  Updated: 2023-07-22  Resolved: 2021-08-31

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Fix Version/s: 10.6.5

Type: Task Priority: Major
Reporter: Marko Mäkelä Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: performance

Attachments: File MDEV-24258.patch    
Issue Links:
Blocks
blocks MDEV-25919 InnoDB reports misleading lock wait t... Closed
is blocked by MDEV-23484 Rollback unnecessarily acquires dict_... Closed
is blocked by MDEV-24167 InnoDB unnecessarily uses complex rw-... Closed
Problem/Incident
causes MDEV-26551 InnoDB crash on multiple concurrent S... Closed
Relates
relates to MDEV-16260 Scale the purge effort according to t... Open
relates to MDEV-26356 Performance regression after dict_sys... Closed
relates to MDEV-26636 Race conditions due to attempted upda... Closed
relates to MDEV-28289 fts_optimize_sync_table() is acquirin... Closed
relates to MDEV-28462 AddressSanitizer: use-after-poison d... Closed
relates to MDEV-29846 deadlock on dict_sys.mutex on databas... Open
relates to MDEV-31759 Large grain of dict_sys lock by table... Closed

 Description   

The InnoDB data dictionary cache is protected by both dict_sys.latch (an rw-lock) and dict_sys.mutex. One reason for the redundant synchronization primitive would be eliminated by MDEV-24167: the rw-lock should not be slower than a mutex.

Another reason for keeping a separate mutex is that sometimes, the mutex provides a mechanism to ‘upgrade’ the rw-lock. The most prominent case of that would be removed by MDEV-23484, if we could guarantee that transaction rollback is always protected by MDL. Currently, the rollback of recovered transactions is not being protected by MDL.



 Comments   
Comment by Marko Mäkelä [ 2020-11-20 ]

MDEV-24258.patch applies to a development branch of MDEV-24167. Many tests would hang. In every case that I checked, the reason was that transaction rollback would attempt to acquire an exclusive latch while already holding it in shared mode. That would be fixed by MDEV-23484.
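The hang described above can be illustrated with a toy reader-writer latch. This is a minimal sketch, not InnoDB code: ToyRwLatch and its state encoding are hypothetical, and only non-blocking try_ operations are provided so that the failure can be observed instead of hanging.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical toy latch: the top bit marks a writer, the low bits count readers.
class ToyRwLatch {
 public:
  // Succeeds unless a writer holds the latch.
  bool try_lock_shared() {
    std::uint32_t v = state_.load(std::memory_order_relaxed);
    while (!(v & kWriter)) {
      if (state_.compare_exchange_weak(v, v + 1, std::memory_order_acquire,
                                       std::memory_order_relaxed))
        return true;
    }
    return false;
  }

  void unlock_shared() {
    std::uint32_t prev = state_.fetch_sub(1, std::memory_order_release);
    assert((prev & ~kWriter) > 0);  // there must have been a reader
    (void)prev;
  }

  // Succeeds only when there are no readers and no writer at all,
  // including a shared hold owned by the calling thread itself.
  bool try_lock() {
    std::uint32_t expected = 0;
    return state_.compare_exchange_strong(expected, kWriter,
                                          std::memory_order_acquire,
                                          std::memory_order_relaxed);
  }

  void unlock() { state_.store(0, std::memory_order_release); }

 private:
  static constexpr std::uint32_t kWriter = 1u << 31;
  std::atomic<std::uint32_t> state_{0};
};
```

If a thread that holds the latch in shared mode calls try_lock(), the call fails because the reader count is nonzero. A blocking exclusive acquisition in the same situation would wait for a reader that never leaves, which matches the hang seen in the tests.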

Comment by Marko Mäkelä [ 2021-07-28 ]

I believe that this can be achieved in the 10.6 release series after all. It seems that all use of dict_sys.freeze() can be replaced with MDL. In most cases, the MDL will already have been acquired by the SQL layer.

Comment by Marko Mäkelä [ 2021-07-30 ]

thiru, please review.

Comment by Marko Mäkelä [ 2021-07-30 ]

Unfortunately, we got the following problem with rr record once (not without it):

2021-07-30 12:57:47 0 [ERROR] [FATAL] InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch. Please refer to https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/

In the trace that I analyzed, the first thread that started the wait would starve while later threads would acquire and release the latch. Nothing is really stuck, but either the scheduling is extremely unfair, or we may really have to think about making the lock waits fairer. For example, once some thread has recorded the start time of its wait, other threads that are trying to acquire dict_sys.latch would yield.
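One possible reading of the yielding idea is sketched below. It is an assumption-laden illustration, not the InnoDB implementation: PoliteLatch, its starving_ flag, and the hammer() helper are hypothetical, and a plain std::mutex stands in for the rw-lock. A thread that fails to get the latch registers itself as a waiter, and newcomers yield for as long as a waiter is registered.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical latch wrapper: newcomers back off while an earlier waiter exists.
class PoliteLatch {
 public:
  void lock() {
    for (;;) {
      if (!starving_.load()) {
        if (mutex_.try_lock()) return;  // uncontended fast path
        bool expected = false;
        // Register as the one waiter that newcomers must let through.
        if (starving_.compare_exchange_strong(expected, true)) {
          mutex_.lock();           // block until the current holder leaves
          starving_.store(false);  // allow the next waiter to register
          return;
        }
      }
      // Someone else is already waiting: step aside instead of competing.
      std::this_thread::yield();
    }
  }

  void unlock() { mutex_.unlock(); }

 private:
  std::mutex mutex_;
  std::atomic<bool> starving_{false};
};

// Small driver: several threads increment a counter under the latch.
long hammer(int threads, int iterations) {
  assert(threads > 0 && iterations > 0);
  PoliteLatch latch;
  long counter = 0;
  std::vector<std::thread> workers;
  for (int t = 0; t < threads; ++t) {
    workers.emplace_back([&] {
      for (int i = 0; i < iterations; ++i) {
        std::lock_guard<PoliteLatch> guard(latch);
        ++counter;
      }
    });
  }
  for (auto& w : workers) w.join();
  return counter;
}
```

The driver only confirms that mutual exclusion is preserved. It does not measure fairness, and whether such a scheme would help under rr record is not established here.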

Given this potential regression, this change is too risky to be included in the upcoming 10.6.4 release. We will need more testing.

Comment by Marko Mäkelä [ 2021-08-27 ]

During the testing of MDEV-25919, which depends on this change, a simple reason for excessive waits around dict_sys.latch when running with rr record was identified: Table lookups would acquire dict_sys.latch in exclusive mode even when the table definition is present in the cache. We really only have to acquire an exclusive latch when a table definition needs to be loaded into the cache. The eviction policy for the InnoDB internal table definition cache will have to be changed from LRU to FIFO, because reordering elements in dict_sys.table_LRU on every lookup would require an exclusive latch.
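The lookup pattern described here can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual dict_sys code: TableCache, TableDef, and open() are hypothetical names, std::shared_mutex stands in for dict_sys.latch, and "loading" a definition is reduced to constructing an object. The sketch also ignores the rule that in-use tables must not be evicted.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a cached table definition.
struct TableDef {
  std::string name;
};

// Hypothetical cache: hits take a shared latch, misses take an exclusive one.
class TableCache {
 public:
  explicit TableCache(std::size_t capacity) : capacity_(capacity) {
    assert(capacity_ > 0);
  }

  std::shared_ptr<const TableDef> open(const std::string& name) {
    {
      // Fast path: a hit only reads the map, so a shared latch suffices.
      // Nothing is reordered on a hit, which is why the policy is FIFO.
      std::shared_lock<std::shared_mutex> shared(latch_);
      auto it = tables_.find(name);
      if (it != tables_.end()) return it->second;
    }
    // Slow path: the shared latch has been released; acquire exclusively.
    std::unique_lock<std::shared_mutex> exclusive(latch_);
    // Re-check, because another thread may have loaded the table meanwhile.
    auto it = tables_.find(name);
    if (it != tables_.end()) return it->second;
    if (fifo_.size() >= capacity_) {
      tables_.erase(fifo_.front());  // evict the oldest loaded entry
      fifo_.pop_front();
    }
    std::shared_ptr<const TableDef> def =
        std::make_shared<TableDef>(TableDef{name});
    tables_.emplace(name, def);
    fifo_.push_back(name);
    ++loads_;
    return def;
  }

  bool contains(const std::string& name) const {
    std::shared_lock<std::shared_mutex> shared(latch_);
    return tables_.count(name) != 0;
  }

  std::size_t loads() const {
    std::shared_lock<std::shared_mutex> shared(latch_);
    return loads_;
  }

 private:
  const std::size_t capacity_;
  mutable std::shared_mutex latch_;
  std::unordered_map<std::string, std::shared_ptr<const TableDef>> tables_;
  std::deque<std::string> fifo_;  // load order, oldest at the front
  std::size_t loads_ = 0;
};
```

With a capacity of 2, opening "a", "b", "a", "c" evicts "a": the second open of "a" is a hit under the shared latch and does not move it to the back, whereas an LRU list would have evicted "b" instead.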

Comment by Marko Mäkelä [ 2021-08-31 ]

The PERFORMANCE_SCHEMA instrumentation for dict_sys_mutex was removed along with dict_sys.mutex. The dict_sys.latch will continue to be instrumented as dict_operation_lock.

Because dict_sys.mutex will no longer 'throttle' the threads that purge InnoDB transaction history, a performance degradation may be observed unless innodb_purge_threads=1.

The table cache eviction policy will become FIFO-like, because table lookup will be protected by a shared dict_sys.latch, instead of being protected by exclusive dict_sys.mutex. Note: Tables can never be evicted as long as locks exist on them or the tables are in use by some thread.
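The eviction rule in the note above can be sketched as a scan over a FIFO list that skips busy tables. This is only an illustration under stated assumptions: CachedTable, its ref_count and lock_count fields, and evict_fifo() are hypothetical and do not mirror the real dict_table_t or dict_sys interfaces. The caller is assumed to hold the exclusive latch.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>

// Hypothetical cache entry with the two conditions that pin a table.
struct CachedTable {
  std::string name;
  unsigned ref_count;   // handles currently held by threads
  unsigned lock_count;  // locks that exist on the table
  bool can_evict() const { return ref_count == 0 && lock_count == 0; }
};

// Evicts up to max_evict tables, oldest first, skipping pinned ones.
// Returns the number of tables actually evicted.
std::size_t evict_fifo(std::list<CachedTable>& fifo, std::size_t max_evict) {
  std::size_t evicted = 0;
  for (auto it = fifo.begin(); it != fifo.end() && evicted < max_evict;) {
    if (it->can_evict()) {
      it = fifo.erase(it);  // erase returns the next element
      ++evicted;
    } else {
      ++it;  // table is locked or in use: leave it in place
    }
  }
  assert(evicted <= max_evict);
  return evicted;
}
```

Because pinned tables stay where they are in the list, the resulting order is only FIFO-like: an old table that is busy during one scan remains the first candidate for the next one.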

Comment by Marko Mäkelä [ 2021-09-02 ]

The off-CPU analysis graphs (http://www.brendangregg.com/offcpuanalysis.html) provided by axel showed that purge tasks would end up waiting much more elsewhere if their table lookups are no longer serialized by exclusive dict_sys.latch. Possibly this would lead to purge lag, which in turn would lead to degraded throughput.

I managed to reproduce the phenomenon on my system today. Adding dummy synchronization fixed the regression for me. This is of course only a work-around, and a deeper investigation of the purge subsystem will be needed.

Comment by Marko Mäkelä [ 2021-09-03 ]

I tested a more complex purge throttle, and it resulted in much worse overall throughput. The previous dummy synchronization did consistently help in my tests.

I checked the graphs again, and I see that the purge coordinator is waiting for undo pages to be read into the buffer pool, which in turn can wait for a to-be-evicted page to be written. Purge workers are waiting for index page latches (theoretically conflicting with workload, but maybe more likely just waiting for the page read). Before the removal of dict_sys.mutex, both these threads were spending 2/3 or 3/4 of their waiting time on dict_sys.mutex.

I think that we must throttle purge based on buffer pool contention. That could be done in MDEV-26356.
