Details
-
Bug
-
Status: Stalled (View Workflow)
-
Critical
-
Resolution: Unresolved
-
10.6, 10.11, 11.4, 11.8
-
Related to performance
-
Q4/2025 Server Maintenance
Description
Reverse index scans in InnoDB currently make use of rather expensive logic for moving to the previous index page in the function btr_pcur_move_backward_from_page(). InnoDB will release all latches and then search for a smaller key than the current one.
We are missing an opportunity to acquire a latch on the preceding index page directly if it is available without any waiting. If any waiting is involved, then we had better use the existing logic, whose motivation is to avoid lock order inversion and therefore deadlocks between concurrent forward and reverse index scans.
The following test demonstrates this:
--source include/have_innodb.inc
|
--source include/have_innodb_4k.inc
|
--source include/have_profiling.inc
|
--source include/have_sequence.inc
|
CREATE TABLE tbig2 (id int auto_increment not null, |
c2 char(255) not null default '', |
c3 char(255) not null default '', |
c4 char(255) not null default '', primary key(id,c2,c3,c4(20))) |
ENGINE=InnoDB default charset=ucs2; |
SET STATEMENT unique_checks=0,foreign_key_checks=0 FOR |
insert into tbig2 (c2) select '' from seq_1_to_524288; |
--disable_result_log
|
set profiling=ON; |
select id from tbig2 order by id desc; |
select id from tbig2 order by id desc; |
select id from tbig2 order by id desc; |
select id from tbig2 order by id desc; |
select id from tbig2 order by id desc; |
select id from tbig2 order by id desc; |
select id from tbig2 order by id desc; |
--enable_result_log
|
show profiles;
|
DROP TABLE tbig2; |
This will create a 1.7 GiB data file, with 2 rows per index leaf page and 3 records per non-leaf page. That would be 262,144 leaf pages of 4 KiB each (1 GiB) and 0.7 GiB of non-leaf pages. I padded the PRIMARY KEY with the data, so that the tree would grow higher.
If I run the test on a large enough innodb_buffer_pool_size so that the data fits there (say, 2 GiB), then my current implementation of the optimization in btr_pcur_move_backward_from_page() would improve throughput by about 10% in one environment. However, if I run the test with a smaller buffer pool, so that the data cannot be resident, then the old implementation turns out to perform faster. The reason ought to be that while InnoDB is searching for the key, it may trigger read-ahead of some preceding pages.
Attachments
Issue Links
- is blocked by
-
MDEV-37244 buf_read_page() should return a buffer-fixed block
-
- In Progress
-
- relates to
-
MDEV-32067 InnoDB linear read ahead had better be logical
-
- Stalled
-