[MDEV-31350] test innodb.recovery_memory failed on '21 failed attempts to flush a page' - Jira

Details

Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Affects Version/s: 10.11.3, 10.6.13
Fix Version/s: 11.1.1, 11.0.2, 10.6.14, 10.9.7, 10.10.5, 10.11.4
Component/s: Storage Engine - InnoDB
Labels:
- hang
- recovery
- regression

Description

After uploading MariaDB 10.11.3 to Launchpad builders several of them failed on test innodb.recovery_memory:

innodb.recovery_memory 'innodb,release'  w3 [ fail ]  Found warnings/errors in server log file!

        Test ended at 2023-05-11 22:44:39

line

2023-05-11 22:44:34 0 [Warning] InnoDB: Difficult to find free blocks in the buffer pool (21 search iterations)! 21 failed attempts to flush a page! Consider increasing innodb_buffer_pool_size. Pending flushes (fsync): 0. 166 OS file reads, 66 OS file writes, 0 OS fsyncs.

Examples from amd64 and arm64 builds:
https://launchpadlibrarian.net/665736950/buildlog_ubuntu-mantic-arm64.mariadb_1%3A10.11.3-1~ubuntu23.10.1~1683836249.0a0f09bbe32.dev.otto_BUILDING.txt.gz
https://launchpadlibrarian.net/665721775/buildlog_ubuntu-mantic-amd64.mariadb_1%3A10.11.3-1~ubuntu23.10.1~1683836249.0a0f09bbe32.dev.otto_BUILDING.txt.gz

I see that thiru worked on this test in February/March of 2023; perhaps he would know best what might have regressed here?

Issue Links

is caused by

MDEV-26827 Make page flushing even faster

Closed

relates to

MDEV-26827 Make page flushing even faster

Closed

MDEV-31353 InnoDB recovery hangs after reporting corruption

Closed

MDEV-31354 SIGSEGV in log_sort_flush_list() in InnoDB crash recovery

Closed

MDEV-29911 InnoDB recovery and mariadb-backup --prepare fail to report detailed progress

Closed

Activity

Otto Kekäläinen created issue - 2023-05-26 05:38

Marko Mäkelä added a comment - 2023-05-26 05:48

The purpose of this test is to exercise crash recovery in the case that multiple recovery batches will be needed, because the parsed log records will not fit in the buffer pool at once.

I recently improved the memory management of crash recovery in MDEV-29911. That fix has not been merged beyond the 10.9 branch yet. However, there was a failure of that test in 10.9 yesterday, with a recovery hang:

10.9 44c9008ba65686abf1c82c9166255a8c52d61f74
2023-05-25 10:42:54 0 [Note] InnoDB: End of log at LSN=1540268
2023-05-25 10:42:54 0 [Note] InnoDB: To recover: LSN 528647/1540268; 279 pages
2023-05-25 10:42:54 0 [Note] InnoDB: To recover: LSN 1004400/1540268; 269 pages
CURRENT_TEST: innodb.recovery_memory
…
2023-05-25 10:51:38 0 [Note] Starting MariaDB 10.9.7-MariaDB-log source revision 44c9008ba65686abf1c82c9166255a8c52d61f74 as process 325396

Recovery was apparently stuck for almost 9 minutes, without any further messages being issued. After MDEV-29911, progress messages are supposed to be issued at the start of each recovery batch and every 15 seconds within a batch. Based on the reported LSN, it looks like at least 3 batches would have been needed.
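
The reporting rule described above (a message at the start of each batch, then one every 15 seconds within a batch) can be sketched as follows. This is only an illustration of the rule, not the MDEV-29911 code: the type ProgressReporter and its members are hypothetical names, and only the 15-second interval comes from the comment.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper, not InnoDB code: decides when a recovery progress
// message is due under the rule "at batch start, then every 15 seconds".
struct ProgressReporter {
  using Clock = std::chrono::steady_clock;

  std::chrono::seconds interval{15};  // interval taken from the comment above
  Clock::time_point last_report{};    // time of the most recent message
  bool reported_once = false;         // false until the first message

  // Returns true if a message should be written at time `now`.
  // `batch_start` is true when a new recovery batch is beginning.
  bool should_report(Clock::time_point now, bool batch_start) {
    // A steady clock never goes backwards.
    assert(!reported_once || now >= last_report);
    if (batch_start || !reported_once || now - last_report >= interval) {
      last_report = now;
      reported_once = true;
      return true;
    }
    return false;
  }
};
```

Under such a rule, a recovery that stays silent for about 9 minutes cannot be making normal progress through a batch, which is why the missing messages point to a hang.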

Marko Mäkelä made changes - 2023-05-26 05:48

Field	Original Value	New Value
Link		This issue relates to MDEV-29911 [ MDEV-29911 ]

Marko Mäkelä made changes - 2023-05-26 05:48

Component/s		Storage Engine - InnoDB [ 10129 ]
Fix Version/s		10.6 [ 24028 ]
Fix Version/s		10.9 [ 26905 ]
Fix Version/s		10.10 [ 27530 ]
Fix Version/s		10.11 [ 27614 ]
Fix Version/s		11.0 [ 28320 ]
Fix Version/s		11.1 [ 28549 ]
Assignee		Marko Mäkelä [ marko ]
Labels		hang recovery
Priority	Minor [ 4 ]	Critical [ 2 ]

Marko Mäkelä added a comment - 2023-05-26 06:52

2,200 runs of the test on a Debug build passed.

After I switched to a RelWithDebInfo build, I was luckier, but got no core dump or stack traces yet:

10.9 44c9008ba65686abf1c82c9166255a8c52d61f74
innodb.recovery_memory 'innodb,release' w32 [ 35 pass ] 16269
innodb.recovery_memory 'innodb,release' w66 [ fail ]
Test ended at 2023-05-26 09:39:59

CURRENT_TEST: innodb.recovery_memory
mysqltest: In included file "./include/wait_until_connected_again.inc":
included from ./include/start_mysqld.inc at line 49:
included from ./include/restart_mysqld.inc at line 11:
included from /mariadb/11/mysql-test/suite/innodb/t/recovery_memory.test at line 27:
At line 40: Server failed to restart

This was with 66 concurrently running servers, and one of the workers got stuck in the very first round. The server error log ends in the following:

10.9 44c9008ba65686abf1c82c9166255a8c52d61f74
2023-05-26 9:30:59 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=47006
2023-05-26 9:30:59 0 [Note] InnoDB: Multi-batch recovery needed at LSN 533156
2023-05-26 9:30:59 0 [Note] InnoDB: End of log at LSN=1318353
2023-05-26 9:30:59 0 [Note] InnoDB: To recover: LSN 533156/1318353; 279 pages
2023-05-26 9:30:59 0 [Note] InnoDB: To recover: LSN 1013329/1318353; 269 pages

I think that the hang and the reported message are the same problem, because a hang very often follows when the message InnoDB: Difficult to find free blocks is output.

I also found a recent occurrence of a hang that predates the merge of the MDEV-29911 fix:

10.10 a089ebd0dd11547019bed8bb8495b57c73666b83 plus a merge
2023-05-11 9:10:34 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=47038
2023-05-11 9:10:34 0 [Note] InnoDB: Starting a batch to recover 278 pages from redo log.
CURRENT_TEST: innodb.recovery_memory
…
2023-05-11 9:19:15 0 [Note] Starting MariaDB 10.10.5-MariaDB-log source revision 8311dbeb1f60e1687e4663f02de3c15441049239 as process 326426

Marko Mäkelä made changes - 2023-05-26 06:52

Status	Open [ 1 ]	In Progress [ 3 ]

Marko Mäkelä added a comment - 2023-05-26 07:49

I finally reproduced a hang. The main recovery thread is waiting for a write batch to happen:

10.9 44c9008ba65686abf1c82c9166255a8c52d61f74
#5 0x0000559d47c23e35 in buf_flush_wait (lsn=lsn@entry=11873250) at /mariadb/11/storage/innobase/buf/buf0flu.cc:1981
#6 0x0000559d47c24e27 in buf_flush_sync_batch (lsn=11873250) at /mariadb/11/storage/innobase/buf/buf0flu.cc:2624
#7 0x0000559d47cbecad in recv_sys_t::apply_batch (this=this@entry=0x559d47fe6500 <recv_sys>, space_id=<optimized out>, space_id@entry=4294967295, space=@0x7fffa986e0b0: 0x559d4a3172f0, free_block=@0x7fffa986e0c0: 0x7f33faa3aa40, last_batch=false) at /mariadb/11/storage/innobase/log/log0recv.cc:3597
#8 0x0000559d47cbf333 in recv_sys_t::apply (this=this@entry=0x559d47fe6500 <recv_sys>, last_batch=false) at /mariadb/11/storage/innobase/log/log0recv.cc:3902
#9 0x0000559d47ccceac in recv_sys_t::parse<recv_buf, true> (this=this@entry=0x559d47fe6500 <recv_sys>, l=@0x7fffa9872440: {ptr = 0x7f33f9283be2 "5"}, if_exists=true) at /mariadb/11/storage/innobase/log/log0recv.cc:2920
#10 0x0000559d47cc0362 in recv_sys_t::parse_mtr<true> (if_exists=true) at /mariadb/11/storage/innobase/log/log0recv.cc:3088
#11 recv_sys_t::parse_pmem<true> (if_exists=true) at /mariadb/11/storage/innobase/log/log0recv.cc:3099
#12 recv_scan_log (last_phase=true) at /mariadb/11/storage/innobase/log/log0recv.cc:4104
#13 0x0000559d47cbfd96 in recv_recovery_from_checkpoint_start () at /mariadb/11/storage/innobase/log/log0recv.cc:4601

The page cleaner thread is periodically invoking a batch:

#0  buf_flush_LRU_list_batch (max=2000, evict=false, n=<optimized out>) at /mariadb/11/storage/innobase/buf/buf0flu.cc:1211
#1  buf_do_LRU_batch (max=2000, evict=false, n=<optimized out>) at /mariadb/11/storage/innobase/buf/buf0flu.cc:1343
#2  buf_flush_LRU (max_n=max_n@entry=2000, evict=false) at /mariadb/11/storage/innobase/buf/buf0flu.cc:1689
#3  0x0000559d47c24c3e in buf_flush_page_cleaner () at /mariadb/11/storage/innobase/buf/buf0flu.cc:2445

Despite this, buf_pool.free remains empty and all 209 pages of buf_pool.LRU are also in buf_pool.flush_list. At least some of those pages are actually clean (oldest_modification_==1, MDEV-25113).

Marko Mäkelä added a comment - 2023-05-26 08:00

For all pages in the buffer pool, oldest_modification_ is 1, that is, the pages are clean. Because buf_flush_LRU_list_batch() does not hold buf_pool.flush_list_mutex, it cannot evict such clean pages. I think that ensuring that buf_flush_page_cleaner() invokes buf_pool_t::get_oldest_modification() along this code path must resolve this, because that call should collect any garbage (clean pages) from buf_pool.flush_list.
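
The mechanism described above can be sketched as a toy model. This is a simplified illustration, not the InnoDB implementation: Page and BufferPoolModel are hypothetical types, and the model assumes that 0 means "not in the flush list", 1 means "clean but still linked in the flush list", and any larger value is the LSN of the oldest unflushed change. Writing out dirty pages and all locking are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>
#include <vector>

// Hypothetical, simplified model of a buffer page.
struct Page {
  // 0: not in flush_list; 1: clean but still linked; >1: oldest dirty LSN
  uint64_t oldest_modification = 0;
};

struct BufferPoolModel {
  std::vector<Page*> lru;       // pages in use
  std::list<Page*> flush_list;  // front = newest, back = oldest
  std::vector<Page*> free_list; // pages available for reuse

  // Models an LRU batch that does not hold the flush list mutex:
  // it may only evict pages that are no longer linked in flush_list.
  std::size_t lru_batch() {
    std::size_t evicted = 0;
    for (auto it = lru.begin(); it != lru.end();) {
      if ((*it)->oldest_modification == 0) {
        free_list.push_back(*it);
        it = lru.erase(it);
        ++evicted;
      } else {
        ++it;  // still linked in flush_list: skipped in this model
      }
    }
    return evicted;
  }

  // Models the side effect attributed to get_oldest_modification():
  // clean entries are unlinked from the tail until a dirty page is found.
  uint64_t get_oldest_modification(uint64_t lsn_if_empty) {
    while (!flush_list.empty()) {
      Page* page = flush_list.back();
      assert(page->oldest_modification != 0);  // linked pages are never 0
      if (page->oldest_modification != 1)
        return page->oldest_modification;
      page->oldest_modification = 0;           // now evictable
      flush_list.pop_back();
    }
    return lsn_if_empty;
  }
};
```

In this model, if every page has oldest_modification equal to 1, repeated calls to lru_batch() free nothing, which corresponds to the observed hang. A single call to get_oldest_modification() unlinks the clean pages, after which the next lru_batch() can move them to the free list.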

Marko Mäkelä made changes - 2023-05-26 08:26

Link		This issue relates to MDEV-26827 [ MDEV-26827 ]

Marko Mäkelä added a comment - 2023-05-26 08:26

This patch looks promising so far. I will let the test run for 88×100 runs (currently about 88×41 passed):

diff --git a/storage/innobase/buf/buf0flu.cc b/storage/innobase/buf/buf0flu.cc
index 91c8de3191b..90263757c19 100644
--- a/storage/innobase/buf/buf0flu.cc
+++ b/storage/innobase/buf/buf0flu.cc
@@ -2438,6 +2438,7 @@ static void buf_flush_page_cleaner()
     else if (buf_pool.ran_out())
     {
       buf_pool.page_cleaner_set_idle(false);
+      buf_pool.get_oldest_modification(0);
       mysql_mutex_unlock(&buf_pool.flush_list_mutex);
       n= srv_max_io_capacity;
       mysql_mutex_lock(&buf_pool.mutex);

This is a regression due to MDEV-26827. Between MDEV-23855 and MDEV-26827, LRU eviction flushing was only initiated by user threads.

I believe that MDEV-29911 fixed the originally reported problem, because crash recovery will use a different way of allocating data pages from the buffer pool. The only additional thing that needs to be fixed is this hang due to MDEV-26827.

Marko Mäkelä added a comment - 2023-05-26 08:43 - edited

My test was interrupted by a different failure, which I reported as MDEV-31353. wc -l mysql-test/var/*/log/mysqld.1.err reported between 16413 and 16844 lines of output for each server instance, so there was no hang.

10.9 44c9008ba65686abf1c82c9166255a8c52d61f74 with patch
innodb.recovery_memory 'innodb,release' w49 [ 54 fail ]
Test ended at 2023-05-26 11:29:12

CURRENT_TEST: innodb.recovery_memory
mysqltest: At line 42: query 'CREATE TABLE t1(f1 INT NOT NULL)ENGINE=InnoDB' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'InnoDB'

The server error log says:

10.9 44c9008ba65686abf1c82c9166255a8c52d61f74 with patch
2023-05-26 11:29:11 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=293063978
2023-05-26 11:29:11 0 [Note] InnoDB: Multi-batch recovery needed at LSN 293582048
2023-05-26 11:29:11 0 [Note] InnoDB: End of log at LSN=294552683
2023-05-26 11:29:11 0 [Note] InnoDB: To recover: LSN 293785020/294552683; 347 pages
2023-05-26 11:29:11 0 [Note] InnoDB: Set innodb_force_recovery=1 to ignore corrupted pages.
2023-05-26 11:29:11 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=216, page number=4]

The data file is indeed test/t1.ibd, carrying the tablespace ID 216, and innochecksum does not report it as corrupted. However, an attempted single-batch recovery of the saved data directory fails as well; actually, it hangs after reporting a failure, which is to be fixed in MDEV-31353.

Marko Mäkelä made changes - 2023-05-26 09:12

Link		This issue relates to MDEV-31353 [ MDEV-31353 ]

Marko Mäkelä made changes - 2023-05-26 10:20

Link		This issue relates to MDEV-31354 [ MDEV-31354 ]

Marko Mäkelä added a comment - 2023-05-26 10:53

I reproduced the same hang on 10.6, with the page cleaner invoking this constantly:

#3  buf_flush_LRU_list_batch (max=2000, evict=false, n=<optimized out>) at /mariadb/10.6/storage/innobase/buf/buf0flu.cc:1275
#4  buf_do_LRU_batch (max=2000, evict=false, n=<optimized out>) at /mariadb/10.6/storage/innobase/buf/buf0flu.cc:1382
#5  buf_flush_LRU (max_n=max_n@entry=2000, evict=false) at /mariadb/10.6/storage/innobase/buf/buf0flu.cc:1728
#6  0x0000556f50449cc4 in buf_flush_page_cleaner () at /mariadb/10.6/storage/innobase/buf/buf0flu.cc:2337

Again, buf_pool.free was empty, and all 210 entries of buf_pool.LRU or buf_pool.flush_list were actually clean (oldest_modification_ was 1).

Marko Mäkelä added a comment - 2023-05-26 11:27

A test with the one-liner fix applied on 10.6 RelWithDebInfo passed uneventfully:

innodb.recovery_memory 'innodb,release'  w42 [ 100 pass ]  14777
--------------------------------------------------------------------------
The servers were restarted 0 times
Spent 143884.706 of 1701 seconds executing testcases
Completed: All 8800 tests were successful.

Marko Mäkelä made changes - 2023-05-26 14:03

issue.field.resolutiondate	2023-05-26 14:03:13.0	2023-05-26 14:03:13.745

Marko Mäkelä made changes - 2023-05-26 14:03

Fix Version/s		10.6.14 [ 28914 ]
Fix Version/s		10.9.7 [ 28916 ]
Fix Version/s		10.10.5 [ 28917 ]
Fix Version/s		10.11.4 [ 28918 ]
Fix Version/s		11.0.3 [ 28920 ]
Fix Version/s		11.1.2 [ 28921 ]
Fix Version/s	10.6 [ 24028 ]
Fix Version/s	10.9 [ 26905 ]
Fix Version/s	10.10 [ 27530 ]
Fix Version/s	10.11 [ 27614 ]
Fix Version/s	11.0 [ 28320 ]
Fix Version/s	11.1 [ 28549 ]
Resolution		Fixed [ 1 ]
Status	In Progress [ 3 ]	Closed [ 6 ]

Ralf Gebhardt made changes - 2023-05-30 08:40

Link		This issue is caused by MDEV-26827 [ MDEV-26827 ]

Ralf Gebhardt made changes - 2023-05-30 08:40

Labels	hang recovery	hang recovery regression

Ralf Gebhardt made changes - 2023-05-30 12:15

Affects Version/s		10.6.13 [ 28514 ]

Julien Fritsch made changes - 2023-05-31 13:13

Link		This issue blocks MENT-1832 [ MENT-1832 ]

Julien Fritsch made changes - 2023-06-02 16:28

Link		This issue blocks MENT-1835 [ MENT-1835 ]

Marko Mäkelä added a comment - 2023-06-06 13:49

Note: the unscheduled releases 10.6.14, 10.9.7, and so on only fix the regression due to MDEV-26827; they do not include MDEV-29911. Therefore, in those releases the test may still fail as described.

Daniel Bartholomew made changes - 2023-06-07 13:29

Fix Version/s		10.6.15 [ 29013 ]
Fix Version/s		10.9.8 [ 29015 ]
Fix Version/s		10.10.6 [ 29017 ]
Fix Version/s		10.11.5 [ 29019 ]
Fix Version/s	10.6.14 [ 28914 ]
Fix Version/s	10.9.7 [ 28916 ]
Fix Version/s	10.10.5 [ 28917 ]
Fix Version/s	10.11.4 [ 28918 ]

Daniel Bartholomew made changes - 2023-06-07 14:06

Fix Version/s		10.6.14 [ 28914 ]
Fix Version/s		10.9.7 [ 28916 ]
Fix Version/s		10.10.5 [ 28917 ]
Fix Version/s		10.11.4 [ 28918 ]
Fix Version/s		11.0.2 [ 28706 ]
Fix Version/s		11.1.1 [ 28704 ]
Fix Version/s	11.0.3 [ 28920 ]
Fix Version/s	11.1.2 [ 28921 ]
Fix Version/s	10.6.15 [ 29013 ]
Fix Version/s	10.9.8 [ 29015 ]
Fix Version/s	10.10.6 [ 29017 ]
Fix Version/s	10.11.5 [ 29019 ]

People

Assignee: Marko Mäkelä
Reporter: Otto Kekäläinen
Votes: 0
Watchers: 4

Dates

Created: 2023-05-26 05:38
Updated: 2023-06-07 14:06
Resolved: 2023-05-26 14:03
