[MDEV-36218] InnoDB Semaphore Waits for Extended Periods Causing Lock Contention and hangs the server - Jira

XML

Word

Printable

Details

Type: Bug
Status: Open (View Workflow)
Priority: Major
Resolution: Unresolved
Affects Version/s: 10.4.33
Fix Version/s: None
Component/s: Storage Engine - InnoDB
Labels:
None

Description

We are experiencing significant lock contention in our MariaDB instance, where threads are waiting for InnoDB semaphore locks for long periods (over 240 seconds in some cases). This behavior seems to be linked to the holding of RW-latches by writer threads, preventing other threads from acquiring the necessary locks and causing a substantial delay in query execution. Specifically, the issue seems to be related to locks on the data dictionary or related resources.

The following logs highlight instances of threads waiting for locks:

Log Entry 1:

--Thread 139459816298240 has waited at row0ins.cc line 2700 for 241.00 seconds the semaphore:

SX-lock on RW-latch at 0x7f441ae0c2c8 created in file buf0buf.cc line 1390

a writer (thread id 139482392381184) has reserved it in mode exclusive

Thread 139459816298240 has been waiting for 241.00 seconds for a SX-lock on an RW-latch.
The writer thread (ID: 139482392381184) holds the lock in exclusive mode, blocking access to other threads.

Log Entry 2:

--Thread 139460081497856 has waited at btr0cur.cc line 1490 for 241.00 seconds the semaphore:

SX-lock on RW-latch at 0x7edbd225d5e0 created in file dict0dict.cc line 1925

a writer (thread id 139482392381184) has reserved it in mode SX

number of readers 103, waiters flag 1, lock_word: fffff99

Last time write locked in file btr0cur.cc line 1490

2025-02-25 12:17:43 0 [Note] InnoDB: A semaphore wait:

Thread 139460081497856 is waiting for 241.00 seconds for an SX-lock on an RW-latch related to the data dictionary.
The writer thread (ID: 139482392381184) holds the lock in SX mode (shared exclusive), with 103 readers waiting.

Log Entry 3:

--Thread 139460078425856 has waited at row0ins.cc line 1970 for 240.00 seconds the semaphore:

S-lock on RW-latch at 0x55a539cfb5b0 created in file dict0dict.cc line 833

a writer (thread id 139482701494016) has reserved it in mode wait exclusive

number of readers 7, waiters flag 1, lock_word: fffffff9

Last time write locked in file dict0stats.cc line 2479

2025-02-25 12:17:43 0 [Note] InnoDB: A semaphore wait:

Thread 139460078425856 is waiting for 240.00 seconds for an S-lock on an RW-latch related to the data dictionary, with 7 readers also waiting.
The writer thread (ID: 139482701494016) holds the lock in wait exclusive mode, causing delays for both readers and writers.

I am seeing a severe semaphore waits where multiple threads are waiting for an InnoDB RW-latch (Read-Write latch), which is held exclusively by a writer thread for an extended period.

--Thread 139460275664640 has waited at row0ins.cc line 2700 for 941.00 seconds the semaphore:

SX-lock on RW-latch at 0x7f441ae0c2c8 created in file buf0buf.cc line 1390

a writer (thread id 139482392381184) has reserved it in mode  exclusive

number of readers 0, waiters flag 1, lock_word: f0000000

Last time write locked in file btr0btr.cc line 229

OS WAIT ARRAY INFO: reservation count 0

OS WAIT ARRAY INFO: reservation count 14216156

OS WAIT ARRAY INFO: reservation count 0

OS WAIT ARRAY INFO: reservation count 14857263

OS WAIT ARRAY INFO: reservation count 0

OS WAIT ARRAY INFO: reservation count 14212809

--Thread 139459954722560 has waited at ha_innodb.cc line 14436 for 940.00 seconds the semaphore:

S-lock on RW-latch at 0x7efa11d2ce40 created in file fil0fil.cc line 1276

a writer (thread id 139482392381184) has reserved it in mode  exclusive

number of readers 0, waiters flag 1, lock_word: e0000000

Last time write locked in file fsp0fsp.cc line 2419

OS WAIT ARRAY INFO: reservation count 0

OS WAIT ARRAY INFO: reservation count 14848946

--Thread 139482764965632 has waited at ha_innodb.cc line 14436 for 940.00 seconds the semaphore:

S-lock on RW-latch at 0x7efa11d2ce40 created in file fil0fil.cc line 1276

a writer (thread id 139482392381184) has reserved it in mode  exclusive

number of readers 0, waiters flag 1, lock_word: e0000000

Last time write locked in file fsp0fsp.cc line 2419

OS WAIT ARRAY INFO: reservation count 0

OS WAIT ARRAY INFO: reservation count 14213035

--Thread 139481774987008 has waited at row0ins.cc line 2700 for 941.00 seconds the semaphore:

SX-lock on RW-latch at 0x7f441ae0c2c8 created in file buf0buf.cc line 1390

a writer (thread id 139482392381184) has reserved it in mode  exclusive

number of readers 0, waiters flag 1, lock_word: f0000000

Last time write locked in file btr0btr.cc line 229

OS WAIT ARRAY INFO: reservation count 0

OS WAIT ARRAY INFO: reservation count 14845575

I was suspecting the buffer pool to be performing in-efficiently, but the buffer pool stats looks normal

----------------------

BUFFER POOL AND MEMORY

----------------------

Total large memory allocated 571230650368

Dictionary memory allocated 470920

Buffer pool size   33650688

Free buffers       65536

Database pages     54403797

Old database pages 20081362

Modified db pages  290

Percent of dirty pages(LRU & free pages): 0.001

Max dirty pages percent: 80.000

Pending reads 0

Pending writes: LRU 0, flush list 0, single page 0

Pages made young 804830394, not young 143736088851

0.00 youngs/s, 0.00 non-youngs/s

Pages read 2300877009, created 1058711993, written 15684288049

0.00 reads/s, 2.25 creates/s, 302.13 writes/s

Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000

Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s

LRU len: 54403797, unzip_LRU len: 5440476

I/O sum[6144384]:cur[348160], unzip sum[0]:cur[0]

I am suspecting it is encountering the following MDEV-15020 error.

Can someone help to validate the possible root cause and any way I can re-produce the issue.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

primary-innodb-diag.log
2025-03-05 08:46
704 kB
Keshan Nageswaran

Activity

People

Assignee:: Unassigned

Reporter:: Keshan Nageswaran

Votes:: 0 Vote for this issue

Watchers:: 2 Start watching this issue

Dates

Created:: 2025-03-05 07:07

Updated:: 2025-03-05 20:16

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.