[MDEV-30180] Server hang with innodb_undo_log_truncate=ON Created: 2022-12-09  Updated: 2023-12-13  Resolved: 2022-12-12

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.6, 10.7, 10.8, 10.9, 10.10, 10.11
Fix Version/s: 10.11.2, 10.6.12, 10.7.8, 10.8.7, 10.9.5, 10.10.3

Type: Bug Priority: Blocker
Reporter: Marko Mäkelä Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: hang

Issue Links:
Blocks
blocks MDEV-29986 Set innodb_undo_tablespaces=3 by default Closed
Relates
relates to MDEV-30863 Server freeze, all threads in trx_ass... Closed
relates to MDEV-31343 Another server hang with innodb_undo_... Closed
relates to MDEV-33009 Server hangs for a long time with inn... Closed
relates to MDEV-27414 Server may hang when innodb_undo_log_... Closed

 Description   

The fix for MDEV-27414 turned out to be incomplete. The server can still hang because of a deadlock between buf_pool_t::release_freed_page() and trx_purge_truncate_history(). Rather than merely yielding, trx_purge_truncate_history() should actively wait for buf_pool_t::release_freed_page() to complete, by briefly acquiring buf_pool.mutex:

diff --git a/storage/innobase/trx/trx0purge.cc b/storage/innobase/trx/trx0purge.cc
index b834c5d070d..e162456e63f 100644
--- a/storage/innobase/trx/trx0purge.cc
+++ b/storage/innobase/trx/trx0purge.cc
@@ -768,11 +768,12 @@ TRANSACTIONAL_TARGET static void trx_purge_truncate_history()
         auto block= reinterpret_cast<buf_block_t*>(bpage);
         if (!bpage->lock.x_lock_try())
         {
+        rescan:
           /* Let buf_pool_t::release_freed_page() proceed. */
           mysql_mutex_unlock(&buf_pool.flush_list_mutex);
-          std::this_thread::yield();
+          mysql_mutex_lock(&buf_pool.mutex);
           mysql_mutex_lock(&buf_pool.flush_list_mutex);
-        rescan:
+          mysql_mutex_unlock(&buf_pool.mutex);
           bpage= UT_LIST_GET_LAST(buf_pool.flush_list);
           continue;
         }
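
The hand-off in the patch above can be illustrated outside InnoDB with a minimal, self-contained sketch. It is not the server code: PoolSketch, truncate_scan_sketch() and the use of std::mutex are hypothetical stand-ins, and it assumes (for illustration only) that the thread in buf_pool_t::release_freed_page() holds the page latch and buf_pool.mutex while it waits for buf_pool.flush_list_mutex. Under that assumption, blocking on the pool mutex after a failed x_lock_try() guarantees that the other thread has finished before the scan restarts, which a plain yield does not.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-ins; none of these are the real InnoDB types.
struct PoolSketch
{
  std::mutex mutex;               // stand-in for buf_pool.mutex
  std::mutex flush_list_mutex;    // stand-in for buf_pool.flush_list_mutex
  std::mutex page_latch;          // stand-in for bpage->lock
  bool page_in_flush_list= true;  // protected by flush_list_mutex
};

// Models the scanning side of the patch. Returns the number of failed
// latch attempts that went through the "rescan" path.
int truncate_scan_sketch(PoolSketch &pool)
{
  int retries= 0;
  std::atomic<bool> releaser_ready{false};

  // The scanner holds the flush list mutex while walking the list.
  pool.flush_list_mutex.lock();

  // Assumed behaviour of the page-freeing thread: it owns the page latch
  // and the pool mutex, and needs the flush list mutex to finish.
  std::thread releaser([&pool, &releaser_ready] {
    pool.page_latch.lock();
    pool.mutex.lock();
    releaser_ready.store(true);
    pool.flush_list_mutex.lock();   // blocks until the scanner lets go
    pool.page_in_flush_list= false; // remove the page from the list
    pool.flush_list_mutex.unlock();
    pool.page_latch.unlock();
    pool.mutex.unlock();
  });

  // Make the conflict deterministic for this demonstration.
  while (!releaser_ready.load())
    std::this_thread::yield();

  while (!pool.page_latch.try_lock())
  {
    ++retries;
    // Let the releaser proceed, then wait for it by taking the pool mutex.
    pool.flush_list_mutex.unlock();
    pool.mutex.lock();
    pool.flush_list_mutex.lock();
    pool.mutex.unlock();
    // The real code would restart from the tail of the flush list here.
  }

  // Both the latch and the flush list mutex are held at this point.
  assert(!pool.page_in_flush_list);
  pool.page_latch.unlock();
  pool.flush_list_mutex.unlock();
  releaser.join();
  return retries;
}
```

Calling truncate_scan_sketch() on a fresh PoolSketch returns at least 1, because the first latch attempt is guaranteed to fail while the releaser holds the latch. Both threads acquire the pool mutex before the flush list mutex, so the sketch keeps a consistent lock order.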



 Comments   
Comment by Axel Schwenke [ 2022-12-09 ]

I can confirm that commit be6a3d9a01c exhibits the hang and that this hang is gone (fixed) in commit 64231528ed6.

Comment by Marko Mäkelä [ 2023-03-21 ]

Unfortunately, MDEV-30863 is another hang in this area.
