  1. MariaDB Server
  2. MDEV-26883

InnoDB hang due to table lock conflict

    Details

      Description

      In a stress test campaign of a 10.6-based branch by Matthias Leich, a deadlock between two InnoDB threads was observed, involving lock_sys.wait_mutex and a dict_table_t::lock_mutex. The cause of the hang is a latching order violation in lock_sys_t::cancel():

      resolve_table_lock:
            dict_table_t *table= lock->un_member.tab_lock.table;
            table->lock_mutex_lock();
      

      The correct latching order is lock_sys.latch, dict_table_t::lock_mutex, lock_sys.wait_mutex. Because we are already holding lock_sys.wait_mutex here, we must invoke table->lock_mutex_trylock() instead. If that mutex is unavailable, we must first release lock_sys.wait_mutex, then acquire the table mutex, and finally reacquire lock_sys.wait_mutex, just like we handle the lock_sys.latch order violation in the same function.
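      The try-lock fallback described above can be sketched in isolation. This is only an illustration of the pattern, not the actual patch: it uses std::mutex in place of the InnoDB mutex wrappers, and the names Table, wait_mutex and acquire_table_mutex_holding_wait_mutex are hypothetical stand-ins for dict_table_t, lock_sys.wait_mutex and the relevant part of lock_sys_t::cancel().

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for dict_table_t with its lock_mutex.
struct Table
{
  std::mutex lock_mutex;
};

// Hypothetical stand-in for lock_sys.wait_mutex.
std::mutex wait_mutex;

// Precondition: the caller holds wait_mutex.
// Postcondition: the caller holds both table.lock_mutex and wait_mutex.
// Returns true if wait_mutex was released and reacquired on the way.
// In that case any state protected by wait_mutex may have changed,
// and the caller has to revalidate it before continuing.
bool acquire_table_mutex_holding_wait_mutex(Table &table)
{
  // Fast path: a non-blocking attempt cannot deadlock, even though
  // it is made "out of order" while wait_mutex is held.
  if (table.lock_mutex.try_lock())
    return false;

  // Slow path: back off and reacquire in the documented order,
  // table mutex first, wait_mutex second.
  wait_mutex.unlock();
  table.lock_mutex.lock();
  wait_mutex.lock();
  return true;
}
```

      A caller that gets true back should assume that the wait it was about to cancel may already have been resolved by another thread while wait_mutex was released, and check for that before proceeding.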

      This hang should mostly affect DDL operations, and possibly LOCK TABLES. During normal DML, there will be no table lock conflicts, because DML only acquires IX and IS table locks, which are compatible with each other.
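      The compatibility claim can be illustrated with the conventional intention-lock matrix. The sketch below is a simplified model, not InnoDB code: the enum and function names are hypothetical, and modes such as AUTO_INC are left out.

```cpp
#include <cassert>

// Simplified table lock modes (hypothetical names for illustration).
enum class TableLockMode { IS = 0, IX, S, X };

// Conventional compatibility matrix: kCompat[held][requested].
constexpr bool kCompat[4][4] = {
  //            IS     IX     S      X
  /* IS */ { true,  true,  true,  false },
  /* IX */ { true,  true,  false, false },
  /* S  */ { true,  false, true,  false },
  /* X  */ { false, false, false, false },
};

// True if a lock in mode 'requested' can be granted while another
// transaction holds the table in mode 'held'.
constexpr bool compatible(TableLockMode held, TableLockMode requested)
{
  return kCompat[static_cast<int>(held)][static_cast<int>(requested)];
}

// DML takes only intention locks, which never conflict with each other.
static_assert(compatible(TableLockMode::IX, TableLockMode::IS), "IX vs IS");
static_assert(compatible(TableLockMode::IX, TableLockMode::IX), "IX vs IX");
// An exclusive table lock conflicts with everything, so a statement that
// requests X can end up waiting and later reach the cancel path.
static_assert(!compatible(TableLockMode::IX, TableLockMode::X), "IX vs X");
```

      Only requests involving S or X table locks can wait in this model, which is consistent with the expectation that the hang is tied to DDL and LOCK TABLES rather than plain DML.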

      The final symptom was the infamous watchdog message like this (copied from another log):

      2021-10-21 17:57:45 0 [ERROR] [FATAL] InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch. Please refer to https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/
      

      The fix was validated by further stress testing, and no hangs were observed. The random query generator (RQG) grammar involved partitioned tables (each partition is handled as a separate InnoDB table) and some compression and encryption to slow down the buffer pool operations.

              People

              Assignee:
              marko Marko Mäkelä
              Reporter:
              marko Marko Mäkelä