[MDEV-26784] [Warning] InnoDB: Difficult to find free blocks in the buffer pool Created: 2021-10-07  Updated: 2023-03-22  Resolved: 2023-02-10

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.5, 10.6, 10.7
Fix Version/s: 10.5.14, 10.6.6, 10.7.2

Type: Bug Priority: Critical
Reporter: Nilnandan Joshi Assignee: Thirunarayanan Balathandayuthapani
Resolution: Fixed Votes: 1
Labels: recovery

Issue Links:
Relates
relates to MDEV-24278 InnoDB page cleaner keeps waking up o... Closed
relates to MDEV-27461 Server hangs when the bufferpool is s... Closed
relates to MDEV-27326 Mariabackup being overwhelmed during ... Closed
relates to MDEV-30551 InnoDB recovery hangs when buffer poo... Closed

 Description   

The mariabackup --prepare shows this warning:

2021-10-07  1:13:23 0 [Note] InnoDB: Starting a batch to recover 9962 pages from redo log.
2021-10-07  1:13:28 0 [Warning] InnoDB: Difficult to find free blocks in the buffer pool (21 search iterations)! 21 failed attempts to flush a page! Consider increasing innodb_buffer_pool_size. Pending flushes (fsync) log: 0; buffer pool: 0. 582296 OS file reads, 504266 OS file writes, 2396 OS fsyncs.

The error is similar to:
https://jira.mariadb.org/browse/MDEV-20679
https://jira.mariadb.org/browse/MDEV-19176

But looks like they are already fixed in 10.5.1



 Comments   
Comment by Marko Mäkelä [ 2021-10-07 ]

This should also affect InnoDB crash recovery. I think that we must try to reproduce this on 10.5 or 10.6. I would suggest the following:

  1. Use a large innodb_log_file_size and innodb_buffer_pool_size. (What values is the customer using?)
  2. Run a write-heavy workload (UPDATE of an indexed column should be fine).
  3. Kill the server.
  4. Restart with a smaller innodb_buffer_pool_size. The minimum value is 5M.

We are not interested in failures of older versions than 10.5; we know that those can run out of memory easily.
Roel, given that mleich is currently on vacation, maybe you could take up this challenge?

Comment by Roel Van de Paar [ 2021-10-07 ]

Plan to test this tomorrow.

Comment by Roel Van de Paar [ 2021-10-09 ]

Easy to reproduce, without Mariabackup, as described. Present in 10.5 trunk as of today.

10.5.13 4eb7217ec33fef8d23f2dda0c97b442508c81b1d (Debug)

2021-10-09 13:02:06 0 [Note] InnoDB: Initializing buffer pool, total size = 5242880, chunk size = 5242880
2021-10-09 13:02:06 0 [Note] InnoDB: Completed initialization of buffer pool
2021-10-09 13:02:06 0 [Note] InnoDB: Small buffer pool size (5M), the flst_validate() debug function can cause a deadlock if the buffer pool fills up.
2021-10-09 13:02:06 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=45690,45690
2021-10-09 13:02:06 0 [Warning] InnoDB: Difficult to find free blocks in the buffer pool (21 search iterations)! 21 failed attempts to flush a page! Consider increasing innodb_buffer_pool_size. Pending flushes (fsync) log: 0; buffer pool: 0. 153 OS file reads, 2 OS file writes, 2 OS fsyncs.
2021-10-09 13:02:07 0 [Note] InnoDB: Starting final batch to recover 19 pages from redo log.
2021-10-09 13:02:07 0 [Note] InnoDB: Resizing redo log from 12614565888 to 100663296 bytes; LSN=352278

Comment by Roel Van de Paar [ 2021-10-09 ]

Reproducible testcase (w/ thanks marko for the perfect lead-in!)

1. Choose suitable machine (sufficiently large memory), create clean/init data dir
2. Start server with --innodb_buffer_pool_size=50458263552 --innodb_log_file_size=12614565888 
3. Execute the following SQL (single thread suffices) for a minute orso:
   (Note that the data dir size will continue to increase as long as the testcase is running)
 
CREATE DATABASE IF NOT EXISTS test;
USE test;
CREATE TABLE t (c TEXT, KEY(c(3072))) ENGINE=InnoDB;
DELIMITER //
CREATE PROCEDURE dorepeat()
  LOOP
    INSERT INTO t VALUES ('abc');
    UPDATE t SET c='cba';
  END LOOP
//
DELIMITER ;
CALL dorepeat();
 
4. Kill the server randomly while this is happening
5. Start server with --innodb_buffer_pool_size=5242880 and immediately check for issue occurence
   grep 'free blocks' log/master.err

Comment by Roel Van de Paar [ 2021-10-09 ]

Thus, the MariaBackup issue is an effect only, and MariaBackup is not a/the cause.

Comment by Roel Van de Paar [ 2021-10-09 ]

Issue present in 10.6 trunk

10.6.5 583516bbb9c14f82866942e4cc159edfdc3a1bfa (Debug)

2021-10-09 13:16:43 0 [Warning] InnoDB: Difficult to find free blocks in the buffer pool (21 search iterations)! 21 failed attempts to flush a page! Consider increasing innodb_buffer_pool_size. Pending flushes (fsync) log: 0; buffer pool: 0. 185 OS file reads, 0 OS file writes, 0 OS fsyncs.

Comment by Roel Van de Paar [ 2021-10-09 ]

Issue present in 10.7 trunk

10.7.1 5cc9cf9a05582307abefd8f3b57548945da86b1a (Debug)

2021-10-09 13:18:55 0 [Warning] InnoDB: Difficult to find free blocks in the buffer pool (21 search iterations)! 21 failed attempts to flush a page! Consider increasing innodb_buffer_pool_size. Pending flushes (fsync) log: 0; buffer pool: 0. 149 OS file reads, 2 OS file writes, 2 OS fsyncs.

Comment by Marko Mäkelä [ 2022-01-20 ]

I diagnosed this with thiru today. Similar to MDEV-27461, the buffer pool is filled with modified pages, which will not be written out because the page cleaner is in untimed sleep (MDEV-24278).

We agreed on a fix that buf_flush_LRU_list_batch() should disregard the 256-page minimum length of buf_pool.LRU (BUF_LRU_MIN_LEN) during recovery, and simply initiate writes of all dirty pages whenever a page needs to be allocated. In the hang that thiru reproduced, the length of buf_pool.LRU was 215 pages.

That fix is OK to push, along with the regression test case that reproduces the issue.

Comment by Michael Widenius [ 2023-02-10 ]

The Innodb.recovery_memory test fails constantly in bb-11.0 with:

CURRENT_TEST: innodb.recovery_memory
mysqltest: In included file "./include/wait_until_connected_again.inc":
included from ./include/start_mysqld.inc at line 49:
included from ./include/restart_mysqld.inc at line 11:
included from /home/my/maria-11.0-org/mysql-test/suite/innodb/t/recovery_memory.test at line 21:
At line 40: Server failed to restart

I assume most developers has not noticed this as this is marked as 'big'.

Comment by Marko Mäkelä [ 2023-02-10 ]

monty, I think that your observation needs to be considered in the scope of the follow-up fix MDEV-30551 that thiru is currently working on.

Comment by Thirunarayanan Balathandayuthapani [ 2023-02-22 ]

monty Issue has been fixed as MDEV-30673

Generated at Thu Feb 08 09:47:55 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.