[MDEV-32067] InnoDB linear read ahead had better be logical Created: 2023-09-01  Updated: 2023-11-15

Status: Open
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.1, 10.2, 10.3, 10.4, 10.5, 10.6, 10.7, 10.8, 10.9, 10.10, 10.11, 11.0, 11.1, 11.2, 11.3
Fix Version/s: 11.4

Type: Bug Priority: Major
Reporter: Marko Mäkelä Assignee: Marko Mäkelä
Resolution: Unresolved Votes: 1
Labels: performance

Issue Links:
Blocks
is blocked by MDEV-32068 Some calls to buf_read_ahead_linear()... Closed
Relates
relates to MDEV-11378 AliSQL: [Perf] Issue#23 MERGE INNODB ... Open
relates to MDEV-24854 Change innodb_flush_method=O_DIRECT b... Closed
relates to MDEV-30986 Slow full index scan in 10.6 vs 10.5 ... Closed

 Description   

MDEV-30986 was originally filed due to a change that was made in MDEV-24854: Starting with MariaDB Server 10.6, InnoDB disables the file system cache by default (innodb_flush_method=O_DIRECT).

While disabling the file system cache does improve write performance, it can hurt the performance of those read workloads that cannot be satisfied directly by the InnoDB buffer pool.

Testing in MDEV-30986 on several types of storage and operating system versions showed that when data needs to be loaded into the InnoDB buffer pool, the load is faster when the file system cache of the operating system is used. This holds even when the cache is initially empty, which suggests that the InnoDB read-ahead mechanism could be improved.

In key range scans or table scans, it would seem to make sense to post read-ahead requests for the index leaf pages when the level right above the leaf is reached. At that point, we would know which pages will have to be accessed by the query. Possibly, it would help to issue a single larger read instead of several single-page requests (MDEV-11378).
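
To illustrate the "single larger read" idea, here is a minimal sketch that groups the child page numbers found in a level-1 page into contiguous runs, each of which could be submitted as one multi-page request. The names ReadRequest, coalesce_leaf_reads and max_run are hypothetical and chosen for illustration only; this is not InnoDB code, and it leaves out the buffer pool lookups and the actual I/O submission.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical type for illustration; not an InnoDB structure.
struct ReadRequest {
  uint32_t first_page;  // first page number of a contiguous run
  uint32_t n_pages;     // number of consecutive pages in the run
};

// Group the child page numbers of a level-1 page into contiguous runs.
// Each run could be issued as one multi-page read instead of n_pages
// single-page reads. max_run caps the size of one request.
std::vector<ReadRequest> coalesce_leaf_reads(std::vector<uint32_t> child_pages,
                                             uint32_t max_run) {
  assert(max_run > 0);
  // Key order in the B-tree need not match page number order, so sort by
  // page number and drop duplicates before looking for adjacent pages.
  std::sort(child_pages.begin(), child_pages.end());
  child_pages.erase(std::unique(child_pages.begin(), child_pages.end()),
                    child_pages.end());

  std::vector<ReadRequest> out;
  for (uint32_t page : child_pages) {
    if (!out.empty() &&
        out.back().first_page + out.back().n_pages == page &&
        out.back().n_pages < max_run) {
      ++out.back().n_pages;      // extend the current run
    } else {
      out.push_back({page, 1});  // start a new run
    }
  }
  return out;
}
```

With child pages {7, 5, 6, 10, 11, 20} and max_run = 16, the sketch yields three requests: pages 5 to 7, pages 10 to 11, and page 20 alone. How much this helps in practice would depend on how fragmented the index is, because only physically adjacent pages can be merged.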



 Comments   
Comment by Marko Mäkelä [ 2023-09-01 ]

I rebased the work-in-progress prototype. MDEV-32068 implements a subset of this: a smarter invocation of buf_read_ahead_linear().

The main challenge for completing the logical read-ahead is to pass the end_key from handler::read_range_first() to btr_cur_t::search_leaf() in order to determine which pages will have to be read ahead. The current patch always tries to read up to 16 siblings between the requested leaf page and the first or last child page of the current page, regardless of whether we are in a point select (which will not access any pages other than the current one) or in a table scan (which will access all child pages).
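
The difference between the current window and a window bounded by end_key can be sketched as follows. This is one reading of the description above, not the actual patch: the function names, the representation of a level-1 page as a vector of child page numbers in key order, and the last_needed index (standing in for the child that covers end_key) are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// The limit of 16 siblings is taken from the comment above.
constexpr std::size_t MAX_SIBLINGS = 16;

// Unbounded variant: read up to MAX_SIBLINGS siblings between the requested
// child (index pos) and the last child (forward) or first child (backward).
// children holds the child page numbers of the level-1 page in key order.
std::vector<uint32_t> siblings_to_read_ahead(
    const std::vector<uint32_t>& children, std::size_t pos, bool forward) {
  assert(pos < children.size());
  std::vector<uint32_t> out;
  if (forward) {
    const std::size_t end = std::min(children.size(), pos + 1 + MAX_SIBLINGS);
    for (std::size_t i = pos + 1; i < end; ++i) out.push_back(children[i]);
  } else {
    const std::size_t n = std::min(pos, MAX_SIBLINGS);
    for (std::size_t i = 0; i < n; ++i) out.push_back(children[pos - 1 - i]);
  }
  return out;
}

// Bounded variant (forward scans only): last_needed is the hypothetical index
// of the child whose key range covers end_key. A point select would have
// last_needed == pos and therefore request no read-ahead at all.
std::vector<uint32_t> siblings_within_range(
    const std::vector<uint32_t>& children, std::size_t pos,
    std::size_t last_needed) {
  assert(pos <= last_needed && last_needed < children.size());
  const std::size_t end = std::min(last_needed + 1, pos + 1 + MAX_SIBLINGS);
  std::vector<uint32_t> out;
  for (std::size_t i = pos + 1; i < end; ++i) out.push_back(children[i]);
  return out;
}
```

In this sketch, a point select on a level-1 page with 40 children triggers 16 sibling reads in the unbounded variant and none in the bounded one, while a scan that covers the whole page requests the same 16 siblings in both.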
