[MDEV-30673] InnoDB recovery hangs when buf_LRU_get_free_block is being called Created: 2023-02-17  Updated: 2023-02-17  Resolved: 2023-02-17

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 11.0
Fix Version/s: 11.0.1

Type: Bug Priority: Major
Reporter: Thirunarayanan Balathandayuthapani Assignee: Thirunarayanan Balathandayuthapani
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-30551 InnoDB recovery hangs when buffer poo... Closed

 Description   

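The fix for MDEV-30551 missed one more code path that can reach buf_LRU_get_free_block() while log_sys.latch is still held: during a non-last batch of recovery, recv_read_in_area() calls buf_read_recv_pages() without releasing the latch.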

diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
index fbfe37d7d41..1662b90dcc8 100644
--- a/storage/innobase/log/log0recv.cc
+++ b/storage/innobase/log/log0recv.cc
@@ -3147,10 +3147,12 @@ bool recv_recover_page(fil_space_t* space, buf_page_t* bpage)
 }
 
 /** Read pages for which log needs to be applied.
-@param page_id first page identifier to read
-@param i        iterator to recv_sys.pages */
+@param page_id    first page identifier to read
+@param i          iterator to recv_sys.pages
+@param last_batch whether it is possible to write more redo log */
 TRANSACTIONAL_TARGET
-static void recv_read_in_area(page_id_t page_id, recv_sys_t::map::iterator i)
+static void recv_read_in_area(page_id_t page_id, recv_sys_t::map::iterator i,
+                              bool last_batch)
 {
   uint32_t page_nos[32];
   ut_ad(page_id == i->first);
@@ -3170,7 +3172,9 @@ static void recv_read_in_area(page_id_t page_id, recv_sys_t::map::iterator i)
   if (p != page_nos)
   {
     mysql_mutex_unlock(&recv_sys.mutex);
+    if (!last_batch) log_sys.latch.wr_unlock();
     buf_read_recv_pages(page_id.space(), {page_nos, p});
+    if (!last_batch) log_sys.latch.wr_lock(SRW_LOCK_CALL);
     mysql_mutex_lock(&recv_sys.mutex);
   }
 }
@@ -3459,7 +3463,7 @@ void recv_sys_t::apply(bool last_batch)
         ut_ad(p == pages.end() || p->first > page_id);
         continue;
       case page_recv_t::RECV_NOT_PROCESSED:
-        recv_read_in_area(page_id, p);
+        recv_read_in_area(page_id, p, last_batch);
       }
       p= pages.lower_bound(page_id);
       /* Ensure that progress will be made. */

The above patch should fix the issue. Credit goes to the MSAN build, which consistently hangs on the innodb.recovery_memory test case.
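
The locking pattern in the patch can be sketched outside InnoDB as follows. This is only an illustrative model, not the server code: std::shared_mutex and std::mutex are stand-ins for log_sys.latch and recv_sys.mutex, and the names log_latch, recv_mutex, read_in_area_sketch and latch_free_during_read are hypothetical. It assumes the caller enters holding both locks, and it uses a second thread to probe whether the latch can be obtained while the blocking read is in progress.

```cpp
#include <functional>
#include <mutex>
#include <shared_mutex>
#include <thread>

// Hypothetical stand-ins; the real log_sys.latch and recv_sys.mutex
// are InnoDB-specific types, not standard library mutexes.
static std::shared_mutex log_latch;
static std::mutex recv_mutex;

// Mirrors the lock ordering of the patched recv_read_in_area().
// Assumes the caller holds recv_mutex and holds log_latch exclusively.
static void read_in_area_sketch(bool last_batch,
                                const std::function<void()> &blocking_read)
{
  recv_mutex.unlock();
  if (!last_batch) log_latch.unlock();  // let other threads take the latch
  blocking_read();                      // may wait for a free buffer block
  if (!last_batch) log_latch.lock();    // restore the caller's expectation
  recv_mutex.lock();
}

// Returns true if another thread could take the latch during the read.
static bool latch_free_during_read(bool last_batch)
{
  bool acquired = false;
  log_latch.lock();
  recv_mutex.lock();
  read_in_area_sketch(last_batch, [&acquired] {
    // Probe from a separate thread: try_lock() on a mutex that the
    // calling thread already owns would be undefined behaviour.
    std::thread other([&acquired] {
      if (log_latch.try_lock()) { acquired = true; log_latch.unlock(); }
    });
    other.join();
  });
  recv_mutex.unlock();
  log_latch.unlock();
  return acquired;
}
```

In this model, latch_free_during_read(false) should find the latch obtainable during the read, while latch_free_during_read(true) should not, because the latch is released only for a non-last batch. Any thread that needs the latch in order to make free blocks available would stay blocked in the second case, which is the kind of hang this issue describes.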



 Comments   
Comment by Thirunarayanan Balathandayuthapani [ 2023-02-17 ]

Patch is in bb-11.0-MDEV-30673

Comment by Marko Mäkelä [ 2023-02-17 ]

Thank you. OK to push.
