Details
Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Versions: 10.11.3, 10.6.13
Description
After uploading MariaDB 10.11.3 to the Launchpad builders, several of them failed on the test innodb.recovery_memory:
innodb.recovery_memory 'innodb,release' w3 [ fail ] Found warnings/errors in server log file!
Test ended at 2023-05-11 22:44:39
line
2023-05-11 22:44:34 0 [Warning] InnoDB: Difficult to find free blocks in the buffer pool (21 search iterations)! 21 failed attempts to flush a page! Consider increasing innodb_buffer_pool_size. Pending flushes (fsync): 0. 166 OS file reads, 66 OS file writes, 0 OS fsyncs.
Examples from amd64 and arm64 builds:
https://launchpadlibrarian.net/665736950/buildlog_ubuntu-mantic-arm64.mariadb_1%3A10.11.3-1~ubuntu23.10.1~1683836249.0a0f09bbe32.dev.otto_BUILDING.txt.gz
https://launchpadlibrarian.net/665721775/buildlog_ubuntu-mantic-amd64.mariadb_1%3A10.11.3-1~ubuntu23.10.1~1683836249.0a0f09bbe32.dev.otto_BUILDING.txt.gz
I see that thiru worked on this test in February/March 2023; perhaps he would know best what might have regressed here?
Issue Links
- is caused by
  - MDEV-26827 Make page flushing even faster (Closed)
- relates to
  - MDEV-26827 Make page flushing even faster (Closed)
  - MDEV-31353 InnoDB recovery hangs after reporting corruption (Closed)
  - MDEV-31354 SIGSEGV in log_sort_flush_list() in InnoDB crash recovery (Closed)
  - MDEV-29911 InnoDB recovery and mariadb-backup --prepare fail to report detailed progress (Closed)
This patch looks promising so far. I will let the test run for 88×100 runs (currently about 88×41 passed):
diff --git a/storage/innobase/buf/buf0flu.cc b/storage/innobase/buf/buf0flu.cc
index 91c8de3191b..90263757c19 100644
--- a/storage/innobase/buf/buf0flu.cc
+++ b/storage/innobase/buf/buf0flu.cc
@@ -2438,6 +2438,7 @@ static void buf_flush_page_cleaner()
     else if (buf_pool.ran_out())
     {
       buf_pool.page_cleaner_set_idle(false);
+      buf_pool.get_oldest_modification(0);
       mysql_mutex_unlock(&buf_pool.flush_list_mutex);
       n= srv_max_io_capacity;
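For readers unfamiliar with the call added by the patch, the following is a toy model of what buf_pool.get_oldest_modification() appears to do in these versions. It rests on one assumption not stated in this ticket: that the call also trims already-written pages (marked with oldest_modification 1) from the old end of the flush list. ToyPage, ToyFlushList and add_dirty are hypothetical names for illustration only. This is not the InnoDB implementation, and it omits all locking.

```cpp
#include <cstdint>
#include <list>

using lsn_t = std::uint64_t;

// Hypothetical stand-in for a buffer page descriptor.
// Assumption: the value 1 marks a page that has already been written back
// but has not yet been removed from the flush list.
struct ToyPage
{
  lsn_t oldest_modification;
};

// Hypothetical stand-in for the flush list: front = newest, back = oldest.
struct ToyFlushList
{
  std::list<ToyPage> pages;

  // Register a newly dirtied page at the "new" end of the list.
  void add_dirty(lsn_t lsn) { pages.push_front(ToyPage{lsn}); }

  // Return the LSN of the oldest page that is still dirty, or dflt if
  // none is left. Side effect: stale entries (value 1) at the old end
  // of the list are removed along the way.
  lsn_t get_oldest_modification(lsn_t dflt)
  {
    while (!pages.empty())
    {
      const lsn_t om= pages.back().oldest_modification;
      if (om != 1)
        return om;       // found a page that is still dirty
      pages.pop_back();  // lazily discard an already-written page
    }
    return dflt;
  }
};
```

In this model, calling get_oldest_modification(0) and ignoring the return value, as the patch does, is useful purely for its side effect of shrinking the list.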
This is a regression due to MDEV-26827. Between MDEV-23855 and MDEV-26827, LRU eviction flushing was only initiated by user threads.
I believe that MDEV-29911 fixed the originally reported problem, because crash recovery will use a different way of allocating data pages from the buffer pool. The only additional thing that needs to be fixed is this hang due to MDEV-26827.