[MDEV-31132] Deadlock between a DDL operation and a purge of InnoDB history Created: 2023-04-26  Updated: 2023-04-26  Resolved: 2023-04-26

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.6, 10.7, 10.8, 10.9, 10.10, 10.11, 11.0, 11.1
Fix Version/s: 11.1.1, 10.11.3, 11.0.2, 10.6.13, 10.8.8, 10.9.6, 10.10.4

Type: Bug Priority: Critical
Reporter: Marko Mäkelä Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: hang, rr-profile-analyzed

Issue Links:
Relates
relates to MDEV-22090 Change buffer is not freed after drop... Closed
relates to MDEV-25506 Atomic DDL: .frm file is removed and ... Closed

 Description   

elenst provided an rr replay trace where InnoDB hangs:

10.6 0976afec889d8914326f9e71b15ea215470dadba

2023-04-25 15:29:40 0 [Note] /data/MDEV-31084/10.6d/sql/mysqld: ready for connections.
Version: '10.6.13-MariaDB-debug-log'  socket: '/data/MDEV-31084/var3/mysql.sock'  port: 13000  Source distribution
2023-04-25 15:30:24 17 [Note] Semi-sync replication initialized for transactions.
2023-04-25 15:30:24 17 [Note] Semi-sync replication enabled on the master.
2023-04-25 15:30:24 0 [Note] Starting ack receiver thread
2023-04-25 15:31:17 0 [Note] InnoDB: Dumping buffer pool(s) to /data/MDEV-31084/var3/data/ib_buffer_pool
2023-04-25 15:31:17 0 [Note] InnoDB: Restricted to 510 pages due to innodb_buf_pool_dump_pct=25
2023-04-25 15:31:17 0 [Note] InnoDB: Buffer pool(s) dump completed at 230425 15:31:17
2023-04-25 15:41:43 0 [ERROR] [FATAL] InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch. Please refer to https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/

At the final state, one locked block exists in the buf_pool.flush_list, with an oldest_modification_ that requires this page to be written out, so that a log checkpoint can be made and the circular ib_logfile0 is writable again.

The problematic page latch is being held by a purge thread, which is waiting for lock_sys.latch. The exclusive lock_sys.latch is being held by a user thread that is executing the DROP TABLE part of a CREATE OR REPLACE TABLE statement. Here is how I debugged it:

ssh rr
rr replay /data/MDEV-31084/var3/rr_profile_1682436579

run 1364731
break buf_flush_sync_for_checkpoint
continue
thread apply 18 backtrace
thread apply 1 backtrace

10.6 0976afec889d8914326f9e71b15ea215470dadba

Thread 18 (Thread 424366.424399):
#0  FreeState (cs=0x604754000ce0, free_state=1) at /data/MDEV-31084/10.6d/dbug/dbug.c:1645
#1  0x000055f7328f8a55 in _db_pop_ () at /data/MDEV-31084/10.6d/dbug/dbug.c:935
#2  0x000055f7328d4802 in safe_mutex_lock (mp=0x55f733e55a50 <buf_dblwr+16>, my_flags=0, file=0x55f732d8c8f0 "/data/MDEV-31084/10.6d/storage/innobase/include/buf0dblwr.h", line=161) at /data/MDEV-31084/10.6d/mysys/thr_mutex.c:398
#3  0x000055f73255bdd1 in inline_mysql_mutex_lock (that=0x55f733e55a50 <buf_dblwr+16>, src_file=0x55f732d8c8f0 "/data/MDEV-31084/10.6d/storage/innobase/include/buf0dblwr.h", src_line=161) at /data/MDEV-31084/10.6d/include/mysql/psi/mysql_thread.h:750
#4  0x000055f7325636bc in buf_dblwr_t::wait_flush_buffered_writes (this=0x55f733e55a40 <buf_dblwr>) at /data/MDEV-31084/10.6d/storage/innobase/include/buf0dblwr.h:161
#5  0x000055f732561ad2 in os_aio_wait_until_no_pending_writes (declare=false) at /data/MDEV-31084/10.6d/storage/innobase/os/os0file.cc:3665
#6  0x000055f732746470 in buf_flush_wait (lsn=21395826) at /data/MDEV-31084/10.6d/storage/innobase/buf/buf0flu.cc:1865
#7  0x000055f732746967 in buf_flush_wait_flushed (sync_lsn=21395826) at /data/MDEV-31084/10.6d/storage/innobase/buf/buf0flu.cc:1911
#8  0x000055f73251cf09 in log_checkpoint_margin () at /data/MDEV-31084/10.6d/storage/innobase/log/log0log.cc:1046
#9  0x000055f73251cf48 in log_check_margins () at /data/MDEV-31084/10.6d/storage/innobase/log/log0log.cc:1062
#10 0x000055f7324d8af1 in log_free_check () at /data/MDEV-31084/10.6d/storage/innobase/include/log0log.inl:309
#11 0x000055f7324eb915 in ibuf_delete_for_discarded_space (space=484) at /data/MDEV-31084/10.6d/storage/innobase/ibuf/ibuf0ibuf.cc:4461
#12 0x000055f7327b3177 in fil_delete_tablespace (id=484) at /data/MDEV-31084/10.6d/storage/innobase/fil/fil0fil.cc:1692
#13 0x000055f7327a8522 in trx_t::commit (this=0x153fe3029880, deleted=std::vector of length 0, capacity 0) at /data/MDEV-31084/10.6d/storage/innobase/dict/drop.cc:274
#14 0x000055f73245d2ce in ha_innobase::delete_table (this=0x604754316210, name=0x42486683ab50 "./test1/updates_deletes_tbl11") at /data/MDEV-31084/10.6d/storage/innobase/handler/ha_innodb.cc:13757
#27 0x000055f731d7d592 in mysql_parse (thd=0x604754000d48, rawbuf=0x6047540130c0 "CREATE /* QNO 11738 CON_ID 17 */ OR REPLACE TABLE updates_deletes_tbl11 AS SELECT tbl1.* FROM test1.updates_deletes_tbl8 AS tbl1 WHERE tbl1.`col_enum_utf8` < 'primjer' OR 'phục vụ' != tbl1.`col_ye"..., length=253, parser_state=0x42486683d390) at /data/MDEV-31084/10.6d/sql/sql_parse.cc:8036

10.6 0976afec889d8914326f9e71b15ea215470dadba

Thread 1 (Thread 424366.424380):
#16 0x000055f73250f1be in lock_sys_t::rd_lock (this=0x55f733e16440 <lock_sys>) at /data/MDEV-31084/10.6d/storage/innobase/include/lock0lock.h:809
#17 0x000055f7324f4e06 in LockGuard::LockGuard (this=0x1657406ddbc0, hash=@0x55f733e16498: {n_cells = 10477, array = 0x55f7358358c0}, id={m_id = 11}) at /data/MDEV-31084/10.6d/storage/innobase/lock/lock0lock.cc:201
#18 0x000055f7324ffca4 in lock_update_delete (block=0xd7e2e786340, rec=0xd7e2e8343e4 "") at /data/MDEV-31084/10.6d/storage/innobase/lock/lock0lock.cc:3174
#19 0x000055f732701eab in btr_cur_optimistic_delete (cursor=0x55f7358894a0, flags=0, mtr=0x1657406de1c0) at /data/MDEV-31084/10.6d/storage/innobase/btr/btr0cur.cc:4579
#20 0x000055f73262662b in row_purge_remove_clust_if_poss_low (node=0x55f735889400, mode=BTR_MODIFY_LEAF) at /data/MDEV-31084/10.6d/storage/innobase/row/row0purge.cc:215

I see two possible fixes:

  • We must disallow holding lock_sys.latch while log_free_check() is being executed. This should only be violated in DDL operations. The fix might be as simple as executing log_free_check() only once in DDL operations, right before acquiring lock_sys.latch.
  • In any operation that needs to acquire lock_sys.latch while holding buffer page latches, we must implement a timed wait, and on timeout we have to restart the whole mini-transaction. This would be very complicated, could seriously hurt DML performance, and increase the amount of "fake hangs" due to starvation when running with rr record.


 Comments   
Comment by Marko Mäkelä [ 2023-04-26 ]

Because there is no debug instrumentation for "the current thread holds a shared latch" for the MDEV-24167 rw-locks (we only have it for the MDEV-24142 block and index latches), we can only check for an exclusive latch. It should suffice, though. No caller of log_free_check() is allowed to hold any buffer page latches, and typically the shared lock_sys.latch should be acquired for a short time while holding an index leaf page latch. DDL operations are an exception: they can hold lock_sys.latch for a longer time.

A simple assertion that I added is failing already on ./mtr --bootstrap:

void log_free_check(): Assertion `!lock_sys.is_writer()' failed.

The stack trace is similar to the one in the analyzed hang. The current statement is migrating an InnoDB table to another storage engine:

ALTER TABLE gtid_slave_pos ENGINE=Aria transactional=0;

If I comment out the problematic call, the regression test suite will pass fine even when the debug assertion in log_free_check() is present:

diff --git a/storage/innobase/fil/fil0fil.cc b/storage/innobase/fil/fil0fil.cc
index 48d205f428a..5e42c59baf9 100644
--- a/storage/innobase/fil/fil0fil.cc
+++ b/storage/innobase/fil/fil0fil.cc
@@ -1689,7 +1689,7 @@ pfs_os_file_t fil_delete_tablespace(ulint id)
     fil_space_free_low(space);
   }
 
-  ibuf_delete_for_discarded_space(id);
+  //ibuf_delete_for_discarded_space(id);
   return handle;
 }
 

This call was introduced in MDEV-22090 in MariaDB 10.5.3, but it probably was not a problem until fil_delete_tablespace() was refactored to implement atomic DDL in 10.6.

Comment by Marko Mäkelä [ 2023-04-26 ]

My fix removes the ibuf_delete_for_discarded_space() call from fil_delete_tablespace() and adds it to its callers, after the caller has released all latches.

Comment by Thirunarayanan Balathandayuthapani [ 2023-04-26 ]

Patch looks OK

Generated at Thu Feb 08 10:21:32 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.