Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-31132

Deadlock between a DDL operation and a purge of InnoDB history

Details

    Description

      elenst provided an rr replay trace where InnoDB hangs:

      10.6 0976afec889d8914326f9e71b15ea215470dadba

      2023-04-25 15:29:40 0 [Note] /data/MDEV-31084/10.6d/sql/mysqld: ready for connections.
      Version: '10.6.13-MariaDB-debug-log'  socket: '/data/MDEV-31084/var3/mysql.sock'  port: 13000  Source distribution
      2023-04-25 15:30:24 17 [Note] Semi-sync replication initialized for transactions.
      2023-04-25 15:30:24 17 [Note] Semi-sync replication enabled on the master.
      2023-04-25 15:30:24 0 [Note] Starting ack receiver thread
      2023-04-25 15:31:17 0 [Note] InnoDB: Dumping buffer pool(s) to /data/MDEV-31084/var3/data/ib_buffer_pool
      2023-04-25 15:31:17 0 [Note] InnoDB: Restricted to 510 pages due to innodb_buf_pool_dump_pct=25
      2023-04-25 15:31:17 0 [Note] InnoDB: Buffer pool(s) dump completed at 230425 15:31:17
      2023-04-25 15:41:43 0 [ERROR] [FATAL] InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch. Please refer to https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/
      

      At the final state, one locked block exists in the buf_pool.flush_list, with an oldest_modification_ that requires this page to be written out, so that a log checkpoint can be made and the circular ib_logfile0 is writable again.

      The problematic page latch is being held by a purge thread, which is waiting for lock_sys.latch. The exclusive lock_sys.latch is being held by a user thread that is executing the DROP TABLE part of a CREATE OR REPLACE TABLE statement. Here is how I debugged it:

      ssh rr
      rr replay /data/MDEV-31084/var3/rr_profile_1682436579
      

      run 1364731
      break buf_flush_sync_for_checkpoint
      continue
      thread apply 18 backtrace
      thread apply 1 backtrace
      

      10.6 0976afec889d8914326f9e71b15ea215470dadba

      Thread 18 (Thread 424366.424399):
      #0  FreeState (cs=0x604754000ce0, free_state=1) at /data/MDEV-31084/10.6d/dbug/dbug.c:1645
      #1  0x000055f7328f8a55 in _db_pop_ () at /data/MDEV-31084/10.6d/dbug/dbug.c:935
      #2  0x000055f7328d4802 in safe_mutex_lock (mp=0x55f733e55a50 <buf_dblwr+16>, my_flags=0, file=0x55f732d8c8f0 "/data/MDEV-31084/10.6d/storage/innobase/include/buf0dblwr.h", line=161) at /data/MDEV-31084/10.6d/mysys/thr_mutex.c:398
      #3  0x000055f73255bdd1 in inline_mysql_mutex_lock (that=0x55f733e55a50 <buf_dblwr+16>, src_file=0x55f732d8c8f0 "/data/MDEV-31084/10.6d/storage/innobase/include/buf0dblwr.h", src_line=161) at /data/MDEV-31084/10.6d/include/mysql/psi/mysql_thread.h:750
      #4  0x000055f7325636bc in buf_dblwr_t::wait_flush_buffered_writes (this=0x55f733e55a40 <buf_dblwr>) at /data/MDEV-31084/10.6d/storage/innobase/include/buf0dblwr.h:161
      #5  0x000055f732561ad2 in os_aio_wait_until_no_pending_writes (declare=false) at /data/MDEV-31084/10.6d/storage/innobase/os/os0file.cc:3665
      #6  0x000055f732746470 in buf_flush_wait (lsn=21395826) at /data/MDEV-31084/10.6d/storage/innobase/buf/buf0flu.cc:1865
      #7  0x000055f732746967 in buf_flush_wait_flushed (sync_lsn=21395826) at /data/MDEV-31084/10.6d/storage/innobase/buf/buf0flu.cc:1911
      #8  0x000055f73251cf09 in log_checkpoint_margin () at /data/MDEV-31084/10.6d/storage/innobase/log/log0log.cc:1046
      #9  0x000055f73251cf48 in log_check_margins () at /data/MDEV-31084/10.6d/storage/innobase/log/log0log.cc:1062
      #10 0x000055f7324d8af1 in log_free_check () at /data/MDEV-31084/10.6d/storage/innobase/include/log0log.inl:309
      #11 0x000055f7324eb915 in ibuf_delete_for_discarded_space (space=484) at /data/MDEV-31084/10.6d/storage/innobase/ibuf/ibuf0ibuf.cc:4461
      #12 0x000055f7327b3177 in fil_delete_tablespace (id=484) at /data/MDEV-31084/10.6d/storage/innobase/fil/fil0fil.cc:1692
      #13 0x000055f7327a8522 in trx_t::commit (this=0x153fe3029880, deleted=std::vector of length 0, capacity 0) at /data/MDEV-31084/10.6d/storage/innobase/dict/drop.cc:274
      #14 0x000055f73245d2ce in ha_innobase::delete_table (this=0x604754316210, name=0x42486683ab50 "./test1/updates_deletes_tbl11") at /data/MDEV-31084/10.6d/storage/innobase/handler/ha_innodb.cc:13757
      #27 0x000055f731d7d592 in mysql_parse (thd=0x604754000d48, rawbuf=0x6047540130c0 "CREATE /* QNO 11738 CON_ID 17 */ OR REPLACE TABLE updates_deletes_tbl11 AS SELECT tbl1.* FROM test1.updates_deletes_tbl8 AS tbl1 WHERE tbl1.`col_enum_utf8` < 'primjer' OR 'phục vụ' != tbl1.`col_ye"..., length=253, parser_state=0x42486683d390) at /data/MDEV-31084/10.6d/sql/sql_parse.cc:8036
      

      10.6 0976afec889d8914326f9e71b15ea215470dadba

      Thread 1 (Thread 424366.424380):
      #16 0x000055f73250f1be in lock_sys_t::rd_lock (this=0x55f733e16440 <lock_sys>) at /data/MDEV-31084/10.6d/storage/innobase/include/lock0lock.h:809
      #17 0x000055f7324f4e06 in LockGuard::LockGuard (this=0x1657406ddbc0, hash=@0x55f733e16498: {n_cells = 10477, array = 0x55f7358358c0}, id={m_id = 11}) at /data/MDEV-31084/10.6d/storage/innobase/lock/lock0lock.cc:201
      #18 0x000055f7324ffca4 in lock_update_delete (block=0xd7e2e786340, rec=0xd7e2e8343e4 "") at /data/MDEV-31084/10.6d/storage/innobase/lock/lock0lock.cc:3174
      #19 0x000055f732701eab in btr_cur_optimistic_delete (cursor=0x55f7358894a0, flags=0, mtr=0x1657406de1c0) at /data/MDEV-31084/10.6d/storage/innobase/btr/btr0cur.cc:4579
      #20 0x000055f73262662b in row_purge_remove_clust_if_poss_low (node=0x55f735889400, mode=BTR_MODIFY_LEAF) at /data/MDEV-31084/10.6d/storage/innobase/row/row0purge.cc:215
      

      I see two possible fixes:

      • We must disallow holding lock_sys.latch while log_free_check() is being executed. This should only be violated in DDL operations. The fix might be as simple as executing log_free_check() only once in DDL operations, right before acquiring lock_sys.latch.
      • In any operation that needs to acquire lock_sys.latch while holding buffer page latches, we must implement a timed wait, and on timeout we have to restart the whole mini-transaction. This would be very complicated, could seriously hurt DML performance, and increase the amount of "fake hangs" due to starvation when running with rr record.

      Attachments

        Issue Links

          Activity

            Because there is no debug instrumentation for "the current thread holds a shared latch" for the MDEV-24167 rw-locks (we only have it for the MDEV-24142 block and index latches), we can only check for an exclusive latch. It should suffice, though. No caller of log_free_check() is allowed to hold any buffer page latches, and typically the shared lock_sys.latch should be acquired for a short time while holding an index leaf page latch. DDL operations are an exception: they can hold lock_sys.latch for a longer time.

            A simple assertion that I added is failing already on ./mtr --bootstrap:

            void log_free_check(): Assertion `!lock_sys.is_writer()' failed.
            

            The stack trace is similar to the one in the analyzed hang. The current statement is migrating an InnoDB table to another storage engine:

            ALTER TABLE gtid_slave_pos ENGINE=Aria transactional=0;
            

            If I comment out the problematic call, the regression test suite will pass fine even when the debug assertion in log_free_check() is present:

            diff --git a/storage/innobase/fil/fil0fil.cc b/storage/innobase/fil/fil0fil.cc
            index 48d205f428a..5e42c59baf9 100644
            --- a/storage/innobase/fil/fil0fil.cc
            +++ b/storage/innobase/fil/fil0fil.cc
            @@ -1689,7 +1689,7 @@ pfs_os_file_t fil_delete_tablespace(ulint id)
                 fil_space_free_low(space);
               }
             
            -  ibuf_delete_for_discarded_space(id);
            +  //ibuf_delete_for_discarded_space(id);
               return handle;
             }
             
            

            This call was introduced in MDEV-22090 in MariaDB 10.5.3, but it probably was not a problem until fil_delete_tablespace() was refactored to implement atomic DDL in 10.6.

            marko Marko Mäkelä added a comment - Because there is no debug instrumentation for "the current thread holds a shared latch" for the MDEV-24167 rw-locks (we only have it for the MDEV-24142 block and index latches), we can only check for an exclusive latch. It should suffice, though. No caller of log_free_check() is allowed to hold any buffer page latches, and typically the shared lock_sys.latch should be acquired for a short time while holding an index leaf page latch. DDL operations are an exception: they can hold lock_sys.latch for a longer time. A simple assertion that I added is failing already on ./mtr --bootstrap : void log_free_check(): Assertion `!lock_sys.is_writer()' failed. The stack trace is similar to the one in the analyzed hang. The current statement is migrating an InnoDB table to another storage engine: ALTER TABLE gtid_slave_pos ENGINE=Aria transactional=0; If I comment out the problematic call, the regression test suite will pass fine even when the debug assertion in log_free_check() is present: diff --git a/storage/innobase/fil/fil0fil.cc b/storage/innobase/fil/fil0fil.cc index 48d205f428a..5e42c59baf9 100644 --- a/storage/innobase/fil/fil0fil.cc +++ b/storage/innobase/fil/fil0fil.cc @@ -1689,7 +1689,7 @@ pfs_os_file_t fil_delete_tablespace(ulint id) fil_space_free_low(space); } - ibuf_delete_for_discarded_space(id); + //ibuf_delete_for_discarded_space(id); return handle; } This call was introduced in MDEV-22090 in MariaDB 10.5.3, but it probably was not a problem until fil_delete_tablespace() was refactored to implement atomic DDL in 10.6.

            My fix removes the ibuf_delete_for_discarded_space() call from fil_delete_tablespace() and adds it to its callers, after the caller has released all latches.

            marko Marko Mäkelä added a comment - My fix removes the ibuf_delete_for_discarded_space() call from fil_delete_tablespace() and adds it to its callers, after the caller has released all latches.

            Patch looks OK

            thiru Thirunarayanan Balathandayuthapani added a comment - Patch looks OK

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.