[MDEV-26996] Support descending indexes in the range optimizer Created: 2021-11-08  Updated: 2023-03-21  Resolved: 2022-02-01

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Fix Version/s: 10.8.1

Type: Task Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Golubchik
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
PartOf
is part of MDEV-13756 Implement descending index: KEY (a DE... Closed
Relates
relates to MDEV-26939 support descending indexes for simple... Closed

 Description   

Trying to figure out what does one needs to support descending index key parts in the range optimizer.

Range analyzer

The range analyzer generates "ascending" SEL_ARG data structures. For example, for tbl.key < 10 it will generate a SEL_ARG(min=NULL, max=10).

There are two possible ways to handle DESC key parts:

  • Change the generated SEL_ARG structures (so that for the above example, SEL_ARG(min=10, max=NULL) will be generated).
  • Leave the SEL_ARG graph as-is, but flip min and max when interpreting it.

(Yes, the second is possible, because there's no "overlap" between key parts in SEL_ARG graphs. They are connected through SEL_ARG::next_key_part, but that's it).

When converting SEL_ARGs to range trees, one can't ignore the DESC attribute anymore.

Example: Consider an index key1(kp1, kp2 DESC)

and a WHERE clause

kp1 BETWEEN 10 and 20 AND kp2 BETWEEN 100 and 200

if both kp1 and kp2 were ASC, one would get this range

 (10, 100) <= (kp1,kp2) <= (20, 200)

but since kp2 is DESC, the range needs to be

 (10, 200) <= (kp1, kp2) <= (20, 100)

Suggestion how to handle range analyzer

Current solution candidate:

  • Leave the range analyzer as-is
  • When we are enumerating SEL_ARG graph, "flip" the SEL_ARG graph direction for DESC key parts
    • Use min_value instead of max_value and vice versa.
    • The same for min_flag and max_flag
    • Also, flip SEL_ARG::last() with SEL_ARG::first() and next() with prev().

Overview of things to fix

The fix will be limited to the functions that "convert" SEL_ARG graphs into range bounds:

  1. sel_arg_range_seq_next() and step_down_to().
  2. get_quick_keys()

get_quick_keys() has a 'key' parameter, key[key_tree->part] has the key descriptor.
sel_arg_range_seq_next() has a pointer to the SEL_ARG_RANGE_SEQ, which can have a pointer to the key with a descriptor.

Handling QUICK_GROUP_MIN_MAX_SELECT

This will also need handling and is outside of the scope of this task.

What about MySQL

They do have SEL_ARG::is_ascending (Yes, in each SEL_ARG). I do not fully follow their reasoning.



 Comments   
Comment by Sergei Krivonos (Inactive) [ 2021-11-12 ]

Igor noticed that asc/desc distinction is irrelevant for optimizer because both has same estimated performance.

Comment by Sergei Petrunia [ 2021-11-16 ]

I think he's missing the point.
Think about this example:

kp1 >= 10 and kp2 >= 100

if the index is (kp1 ASC, kp2 ASC), then the range is

(10, 100) <= (kp1, kp2)  

but if the index is (kp1 ASC, kp2 DESC), then one can only extract this range:

(10) <= (kp1) 

And I think determining this is to be done on the SQL layer.

Comment by Sergei Petrunia [ 2021-12-05 ]

Note: My suggestion in the description breaks down on SEL_ARG::store_min_key()

class SEL_ARG:
  ...
  int store_min_key(KEY_PART *key,
                    uchar **range_key,
                    uint *range_key_flag,
                    uint last_part)
  {
    SEL_ARG *key_tree= first();
    uint res= key_tree->store_min(key[key_tree->part].store_length,
                                  range_key, *range_key_flag);
 
    ...
      res+= key_tree->next_key_part->store_min_key(key,
                                                   range_key,
                                                   range_key_flag,
                                                   last_part);
 
}

Consider an index

INDEX IDX1 (kp1, kp2 DESC, kp3, kp4 DESC, ...)

Consider a call

 sel_arg(kp1)->store_min_key() 

It should get the minimum for kp1, maximum for kp2, minimum for kp3 and so forth.

Comment by Sergei Petrunia [ 2021-12-05 ]

... which means that SEL_ARG::store_min_key() should have knowledge about which keyparts are inverted and which are not.

Comment by Sergei Petrunia [ 2021-12-10 ]

An interesting property of MySQL's implementation.

Consider an example:

create table t10 (
 ... key (d desc)
);
select * from t10 force index(d) where d=2 or d=4 or d=8;

The ranges to be scanned are: d=8, d=4, d=2.

But the handler::multi_range_read_info_const() call receives them in the ascending order: d=2, d=4, d=8.

This might be ok for MySQL, their handler::multi_range_read_info_const() treats each range independently.

But is it good for MariaDB's handler::multi_range_read_info_const which tries to take into account gap size between adjacent ranges?

Comment by Sergei Petrunia [ 2021-12-14 ]

Ready for testing in preview-10.8-MDEV-26996-desc-indexes-range-optimizer

Comment by Elena Stepanova [ 2021-12-16 ]

As of today, the correct branch name is preview-10.8-MDEV-13756-desc-indexes.
So, testing will be done under MDEV-13756, while this one will remain "In testing" until MDEV-13756 with all the included parts is either accepted or rejected.

Comment by Elena Stepanova [ 2022-01-26 ]

Tested in the scope of MDEV-13756, bugs linked to it.

Generated at Thu Feb 08 09:49:33 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.