[MDEV-26883] InnoDB hang due to table lock conflict - Jira

Details

Type: Bug
Status: Closed
Priority: Blocker
Resolution: Fixed
Affects Version/s: 10.6.0, 10.6.1, 10.6.2, 10.6.3, 10.6.4
Fix Version/s: 10.6.5
Component/s: Storage Engine - InnoDB
Labels:
- regression-10.6
- rr-profile-analyzed

Description

In a stress test campaign of a 10.6-based branch by mleich, a deadlock between two InnoDB threads was observed, involving lock_sys.wait_mutex and a dict_table_t::lock_mutex. The cause of the hang is a latching order violation in lock_sys_t::cancel():

    resolve_table_lock:
      dict_table_t *table= lock->un_member.tab_lock.table;
      /* Latching order violation: lock_sys.wait_mutex is already held here */
      table->lock_mutex_lock();

The correct latching order is lock_sys.latch, then dict_table_t::lock_mutex, then lock_sys.wait_mutex. Because we are already holding lock_sys.wait_mutex here, we must invoke table->lock_mutex_trylock() instead of a blocking acquisition. If the table mutex is unavailable, we must first release lock_sys.wait_mutex, then acquire table->lock_mutex with a blocking wait, and finally reacquire lock_sys.wait_mutex, just like we handle the lock_sys.latch order violation in the same function.
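The trylock-and-back-out pattern described above can be sketched with standard C++ mutexes. This is only an illustration under stated assumptions, not the actual InnoDB fix: LockSysSketch, TableSketch, the waiting flag and both function names are hypothetical stand-ins for lock_sys, dict_table_t and the wait state that lock_sys.wait_mutex protects.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for lock_sys: wait_mutex guards the `waiting` flag.
struct LockSysSketch
{
  std::mutex wait_mutex;
  bool waiting= false;
};

// Hypothetical stand-in for dict_table_t and its lock_mutex.
struct TableSketch
{
  std::mutex lock_mutex;
};

// Precondition: the caller holds sys.wait_mutex.
// Postcondition: the caller holds table.lock_mutex and sys.wait_mutex.
// Returns true if wait_mutex was temporarily released, in which case
// any state guarded by it must be rechecked by the caller.
bool acquire_table_mutex_in_order(LockSysSketch &sys, TableSketch &table)
{
  if (table.lock_mutex.try_lock())
    return false;              // no blocking wait, so no deadlock risk
  sys.wait_mutex.unlock();     // back out of the out-of-order position
  table.lock_mutex.lock();     // blocking wait, now in the correct order
  sys.wait_mutex.lock();
  return true;
}

// Cancels a pending wait; returns false if there was nothing to cancel.
bool cancel_wait_sketch(LockSysSketch &sys, TableSketch &table)
{
  std::lock_guard<std::mutex> wait_guard(sys.wait_mutex);
  if (!sys.waiting)
    return false;
  const bool released= acquire_table_mutex_in_order(sys, table);
  // Take ownership of the already locked table mutex for scoped unlock.
  std::lock_guard<std::mutex> table_guard(table.lock_mutex, std::adopt_lock);
  if (!sys.waiting)
  {
    // The state can only have changed while wait_mutex was released.
    assert(released);
    return false;
  }
  sys.waiting= false;
  return true;
}
```

The recheck after the slow path matters: while wait_mutex is released, another thread may already have resolved the wait, so the state observed before the release can no longer be trusted.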

This hang should mostly affect DDL operations, and possibly LOCK TABLES. During normal DML, there will be no table lock conflicts, because IX and IS locks are compatible with each other.
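To illustrate why plain DML does not produce table lock conflicts, here is a minimal sketch of the textbook multi-granularity compatibility matrix for the IS, IX, S and X modes. The TableLockMode enum and the compatible() helper are hypothetical names, not InnoDB identifiers, and other InnoDB table lock modes such as AUTO-INC are left out for brevity.

```cpp
#include <cstddef>

// Hypothetical names; only the four classic table lock modes are modeled.
enum class TableLockMode : std::size_t { IS, IX, S, X };

// kCompatible[held][requested]: true if the request can be granted
// while another transaction holds the `held` mode on the same table.
constexpr bool kCompatible[4][4]= {
  /* held IS */ {true,  true,  true,  false},
  /* held IX */ {true,  true,  false, false},
  /* held S  */ {true,  false, true,  false},
  /* held X  */ {false, false, false, false},
};

constexpr bool compatible(TableLockMode held, TableLockMode requested)
{
  return kCompatible[static_cast<std::size_t>(held)]
                    [static_cast<std::size_t>(requested)];
}

// DML takes only intention locks on tables, and those never conflict.
static_assert(compatible(TableLockMode::IX, TableLockMode::IS),
              "IX and IS are compatible");
static_assert(compatible(TableLockMode::IX, TableLockMode::IX),
              "IX and IX are compatible");
// A table-level X request conflicts with any intention lock.
static_assert(!compatible(TableLockMode::IX, TableLockMode::X),
              "X conflicts with IX");
```

Only a request for S or X on the whole table has to wait for intention locks, which is consistent with the hang showing up mainly around DDL and LOCK TABLES.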

The final symptom was the infamous watchdog message, such as this one (copied from another log):

2021-10-21 17:57:45 0 [ERROR] [FATAL] InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch. Please refer to https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/

The fix was validated by further stress testing, and no hangs were observed. The random query generator (RQG) grammar involved partitioned tables (each partition is handled as a separate InnoDB table) and some compression and encryption to slow down the buffer pool operations.

Issue Links

is caused by: MDEV-24789 Performance regression after MDEV-24671 (Closed)

People

Assignee: Marko Mäkelä
Reporter: Marko Mäkelä
Votes: 1
Watchers: 2

Dates

Created: 2021-10-22 08:15
Updated: 2021-10-22 11:04
Resolved: 2021-10-22 11:04
