[MDEV-27868] buf_pool.flush_list is in the wrong order Created: 2022-02-17  Updated: 2022-06-09  Resolved: 2022-02-17

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: N/A
Fix Version/s: 10.9.0, 10.8.3

Type: Bug Priority: Blocker
Reporter: Marko Mäkelä Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: corruption, race, recovery, regression-10.8

Issue Links:
Problem/Incident
causes MDEV-28708 Increased congestion on buf_pool.flus... Closed
is caused by MDEV-27774 Reduce scalability bottlenecks in mtr... Closed
Relates
relates to MDEV-25113 Reduce effect of parallel background ... Closed

 Description   

In MDEV-27774, log_sys.mutex and log_sys.flush_order_mutex were replaced with a shared log_sys.latch. This means that concurrently executing mtr_t::commit() calls may insert blocks into buf_pool.flush_list roughly concurrently. Each insertion is still protected by buf_pool.flush_list_mutex.

In buf_pool_t::insert_into_flush_list() we attempted to compensate for this by searching for an appropriate insert position instead of unconditionally inserting each block at the head of buf_pool.flush_list. This compensation does not appear to work in all cases. With the following command, I was able to reproduce a crash on my system:

./mtr --parallel=auto --repeat=100 encryption.innodb_encryption_filekeys{,,,,}{,,,}

10.8 8251a9fb93075a72074bd7fd10faee5165014b7f

encryption.innodb_encryption_filekeys 'cbc,innodb' w1 [ 26 fail ]
        Test ended at 2022-02-17 10:28:31
 
CURRENT_TEST: encryption.innodb_encryption_filekeys
 
 
Server [mysqld.1 - pid: 488142, winpid: 488142, exit: 256] failed during test run
Server log from this test:
----------SERVER LOG START-----------
2022-02-17 10:28:30 104 [Note] InnoDB: Creating #1 encryption thread id 140526397404736 total threads 4.
2022-02-17 10:28:30 104 [Note] InnoDB: Creating #2 encryption thread id 140526389012032 total threads 4.
2022-02-17 10:28:30 104 [Note] InnoDB: Creating #3 encryption thread id 140526405797440 total threads 4.
2022-02-17 10:28:30 104 [Note] InnoDB: Creating #4 encryption thread id 140526414190144 total threads 4.
mariadbd: /mariadb/10.8/storage/innobase/buf/buf0flu.cc:2538: void buf_flush_validate_low(): Assertion `om == 1 || !bpage || __builtin_expect(recv_sys.recovery_on, (0)) || om >= bpage->oldest_modification()' failed.

The assertion reports that buf_pool.flush_list is not ordered by buf_page_t::oldest_modification(), as it must be.
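
For illustration, the ordering condition in the failed assertion can be modelled outside the server. The following is only a minimal sketch, not the InnoDB code: page_model, lsn_t and flush_list_is_ordered() are hypothetical stand-ins, and a std::list replaces the intrusive buf_pool.flush_list. It assumes, as the assertion suggests, that the head of the list holds the newest oldest_modification() and that the value 1 is exempt from the comparison.

```cpp
#include <cstdint>
#include <list>

using lsn_t = std::uint64_t;  // assumption: LSNs modelled as 64-bit integers

// Hypothetical stand-in for buf_page_t; only the field the check needs.
struct page_model {
  lsn_t oldest_modification;
};

// Mirrors the condition of the failed assertion for each adjacent pair:
//   om == 1 || !next || recovery_on || om >= next->oldest_modification()
// Returns true if no pair violates it.
bool flush_list_is_ordered(const std::list<page_model>& flush_list,
                           bool recovery_on = false)
{
  for (auto it = flush_list.begin(); it != flush_list.end();) {
    const lsn_t om = it->oldest_modification;
    ++it;  // 'it' now plays the role of the next block (bpage in the assertion)
    const bool ok = om == 1 || it == flush_list.end() || recovery_on ||
                    om >= it->oldest_modification;
    if (!ok)
      return false;
  }
  return true;
}
```

In this model, a list holding 30, 20, 20, 10 passes the check, while 10 followed by 20 fails it unless recovery_on is set.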

The impact of this bug is that log checkpoints and thus crash recovery and backup may work incorrectly.



 Comments   
Comment by Marko Mäkelä [ 2022-02-17 ]

Another failure:

CURRENT_TEST: encryption.innodb-checksum-algorithm
mysqltest: At line 53: query 'ALTER TABLE tc DISCARD TABLESPACE' failed: <Unknown> (2013): Lost connection to server during query
buf/buf0flu.cc:2585(buf_flush_validate_low())[0x55d918032422]
buf/buf0flu.cc:112(buf_flush_validate_skip())[0x55d9180367fd]
buf/buf0flu.cc:220(buf_pool_t::delete_from_flush_list(buf_page_t*, bool))[0x55d918036a4c]
buf/buf0flu.cc:254(buf_flush_remove_pages(unsigned int))[0x55d917ef81d5]
row/row0mysql.cc:2543(row_discard_tablespace_for_mysql(dict_table_t*, trx_t*))[0x55d917d74fc3]

Comment by Marko Mäkelä [ 2022-02-17 ]

I had forgotten that since MDEV-25113, the buf_pool.flush_list may contain clean blocks that are identified by buf_page_t::oldest_modification()==1. Those blocks must be removed or disregarded when buf_pool_t::insert_into_flush_list() determines the correct insert position.
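
One way to read the comment above is sketched below. This is a simplified model under stated assumptions, not the actual fix: page_model, lsn_t and insert_ordered() are hypothetical names, a std::list stands in for buf_pool.flush_list, and no locking is shown (in the server, each insertion is protected by buf_pool.flush_list_mutex). The search starts at the head, removes any clean block (value 1) it meets, and stops at the first dirty block that is not newer than the block being inserted.

```cpp
#include <cstdint>
#include <list>

using lsn_t = std::uint64_t;  // assumption: LSNs modelled as 64-bit integers

// Hypothetical stand-in for buf_page_t; only the field the search needs.
struct page_model {
  lsn_t oldest_modification;
};

// Inserts a block with the given LSN so that the list stays ordered from
// newest (head) to oldest (tail). Clean blocks (oldest_modification == 1)
// encountered during the search are removed instead of being compared.
void insert_ordered(std::list<page_model>& flush_list, lsn_t lsn)
{
  auto it = flush_list.begin();
  while (it != flush_list.end()) {
    if (it->oldest_modification == 1)
      it = flush_list.erase(it);  // clean block: drop it, do not compare
    else if (it->oldest_modification > lsn)
      ++it;                       // newer dirty block: keep searching
    else
      break;                      // first block that is not newer than lsn
  }
  flush_list.insert(it, page_model{lsn});
}
```

In this model, inserting LSN 20 into a list holding 30, 1, 10 yields 30, 20, 10. Clean blocks that lie beyond the insert position are left in place; the sketch assumes lsn is greater than 1.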
