[MDEV-27576] Use DESC indexes for MIN/MAX optimization Created: 2022-01-22  Updated: 2023-12-27  Resolved: 2023-12-14

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Fix Version/s: 11.4.1

Type: Task Priority: Critical
Reporter: Elena Stepanova Assignee: Yuchen Pei
Resolution: Fixed Votes: 0
Labels: Preview_11.4, optimizer-easy, optimizer-feature

Issue Links:
PartOf
Problem/Incident
is caused by MDEV-13756 Implement descending index: KEY (a DE... Closed
Relates
relates to MDEV-32732 Support DESC indexes in loose scan op... Open

 Description   

As of c10e10c6, opt_sum.cc is still stubbed for DESC indexes

      if (part->key_part_flag & HA_REVERSE_SORT)
        break; // TODO MDEV-13756

Given that we are about to close MDEV-13756, it looks strange: either it was forgotten and needs to be enabled, or maybe it requires a better reference.

Given the above, the optimization expectedly doesn't work.

--source include/have_sequence.inc
 
create or replace table t1 (id int, key(id));
insert into t1 select seq from seq_1_to_100 order by rand(1);
explain extended select max(id) from t1 where id > 50;
 
create or replace table t1 (id int, key(id desc));
insert into t1 select seq from seq_1_to_100 order by rand(1);
explain extended select max(id) from t1 where id > 50;
 
# Cleanup
drop table t1;

With the ASC index, it is

preview-10.8-MDEV-13756-desc-indexes c10e10c6

+------+-------------+-------+------+---------------+------+---------+------+------+----------+------------------------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                        |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+------------------------------+
|    1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL |     NULL | Select tables optimized away |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+------------------------------+

With the DESC index, it is

+------+-------------+-------+-------+---------------+------+---------+------+------+----------+--------------------------+
| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                    |
+------+-------------+-------+-------+---------------+------+---------+------+------+----------+--------------------------+
|    1 | SIMPLE      | t1    | range | id            | id   | 5       | NULL | 50   |   100.00 | Using where; Using index |
+------+-------------+-------+-------+---------------+------+---------+------+------+----------+--------------------------+



 Comments   
Comment by Elena Stepanova [ 2022-01-23 ]

I don't know what it will take to enable it, but it doesn't appear to be as simple as removing that if – having done so locally, I ran a quick set of tests and it returned quite a few failures. I can't say whether they genuinely relate to DESC indexes or just reveal some old problems, known or not, but they would certainly need to be investigated at least. So, given the release schedule, it may be safer to keep it as is now and, if at all needed, create a separate task for 10.9 about using DESC indexes in the optimizations. Or converting this item into such a task.

However, there are at least some effects of the disabled optimization which can come as a surprise. For example, this

CREATE TABLE t1 (a FLOAT, KEY (a DESC));
INSERT INTO t1 VALUES (0.1234),(0.6789);
SELECT MAX(a) FROM t1 WHERE a <= 0.6789;

returns

preview-10.8-MDEV-13756-desc-indexes c10e10c6

MAX(a)
0.1234

I can't raise it to a bug level since it relies on float comparison which is not to be relied upon, but looking at existing MTR tests, for "normal" ascending indexes it is assumed that it will return the bigger value. Thus a surprise.

Comment by Yuchen Pei [ 2023-06-06 ]

Trying to understand where the issue comes from. If I remove the if, I get NULL with descending index. The code path looks similar between the two, but after the call to find_key_for_maxmin(), get_index_max_value() assigns the max (*((Item *) conds)->cmp->a)->val_int(), which is 100 in the usual index and 1 in the descending index. So I assume the fix should come before or during the call to get_index_max_value().

The main actions of finding the max happens in the storage engine, though it is probably independent of storage engines as defined by the contract of index_last(), and the change for this ticket presumably should only touch the optimizer / sql layer. The storage engine code is rather low level, but to find out why things don't work as expected I have to look into it. With InnoDB, I notice that one difference between the two scenarios is page_rec_get_prev_const() counts up to 100 in the usual index and counts down to 1 in the descending index. To be further investigated...

Comment by Yuchen Pei [ 2023-11-01 ]

The difference between counting up and down is simply because
index_last() looks for the last value according to the index: asc
-> last is max; desc -> last is min. So what needs to be done is to
invert these logic when doing desc, i.e. use index_first() etc.
Here's a PoC patch

upstream/bb-11.3-mdev-27576 5bd89cc733f4553612e9eb4ae74161059154a683
MDEV-27576 [PoC] Use reverse index for max/min optimization

Comment by Yuchen Pei [ 2023-11-02 ]

Hi psergei, ptal thanks

upstream/bb-11.3-mdev-27576 c4c4bd8e47e5bfd4d0a650e286ea9daa3e6276f0
MDEV-27576 Use reverse index for max/min optimization
 
We add a field in st_table_ref to indicate that the key used is
a descending index, which will flip the functions and flags used in
get_index_max_value() and get_index_min_value(), that allows correct
optimization for max/min for descending index.

Comment by Sergei Petrunia [ 2023-11-08 ]

ycp, replied:
https://lists.mariadb.org/hyperkitty/list/developers@lists.mariadb.org/thread/IWKG5FXYNP44SUCNRVJU2RLMZGMARIH5/

Comment by Yuchen Pei [ 2023-11-09 ]

Thanks for the comments psergei. I addressed them and updated my commit. ptal thanks (I understand the fixversion is 11.5 so no rush):

upstream/bb-11.3-mdev-27576 8398e5c153320efbbeeeeac9d9029e0f932aaaa2
MDEV-27576 Use reverse index for max/min optimization
 
We use a bool to indicate that the key part used is a descending
index, which will flip the functions and flags used in
get_index_max_value() and get_index_min_value(), that allows correct
optimization for max/min for descending index.

Comment by Sergei Petrunia [ 2023-11-13 ]

The patch is ok, I've only requested some formatting changes via email.

Then, we need to submit it for testing.

Comment by Yuchen Pei [ 2023-11-14 ]

Thanks for the review psergei. Applied the formatting patch and
pushed to branch bb-11.4-mdev-27576-preview for testing:

upstream/bb-11.4-mdev-27576-preview 41566ac3896c6a6a01df10062b2fc29df3f84f86
MDEV-27576 Use reverse index for max/min optimization
 
We use a bool to indicate that the key part used is a descending
index, which will flip the functions and flags used in
get_index_max_value() and get_index_min_value(), that allows correct
optimization for max/min for descending index.

BTW, is this still targeted at 11.5?

Comment by Lena Startseva [ 2023-12-11 ]

Testing done. Ok to push.

But there are some notes about optimization strategies used by the query optimizer when there is ascending/descending index in the table. These moments were discussed with psergei and at this time they do not require improvement (because there is a task for improvement, or there is no need for improvement), but I think it is necessary keep this information.

For the table:

CREATE TABLE t1
(a VARCHAR(10),
b VARCHAR(10),
PRIMARY KEY (a DESC, b DESC),
KEY ab_asc (a ASC, b ASC),
KEY a_asc_b_desc (a ASC, b DESC),
key a_desc_b_asc (a DESC, b ASC))
ENGINE = InnoDB;

The following query plans are:

EXPLAIN  SELECT max(a)  FROM t1;
id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	SIMPLE	NULL	NULL	NULL	NULL	NULL	NULL	NULL	Select tables optimized away
EXPLAIN  SELECT max(a)  FROM t1 ORDER BY a DESC;
id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	SIMPLE	t1	index	NULL	ab_asc	24	NULL	13	Using index

Thus second query:

  1. is using index, but in this case the query will return only one result and the "group by" can be ignored
  2. optimizer uses index "ab_asc" instead of Primary key with descending order
Comment by Yuchen Pei [ 2023-12-14 ]

pushed the following to 11.4

875377ad824473774c833b1aff4346ba3133f092 MDEV-27576 Use reverse index for max/min optimization

Generated at Thu Feb 08 09:53:58 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.