[MDEV-9586] Sub-optimal query plan is picked from range vs another range vs ref Created: 2016-02-18  Updated: 2023-12-22

Status: Open
Project: MariaDB Server
Component/s: Optimizer
Fix Version/s: 11.5

Type: Task Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Unresolved Votes: 2
Labels: None

Attachments: File psergey-ref-or-range-issue-fix1.diff     File psergey-ref-or-range-issue-fix1.diff    

 Description   

CREATE TABLE tbl (
  ...
 
  PRIMARY KEY (pk),
  KEY key1 (key1,pk)
) ENGINE=InnoDB;

And two query plans:

# The bad one:
mysql> explain SELECT * FROM tbl1 WHERE key1 = const1 AND pk > const2A limit 1,1;
+----+-------------+-------+------+---------------+------+---------+-------+-------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref   | rows  | Extra       |
+----+-------------+-------+------+---------------+------+---------+-------+-------+-------------+
|  1 | SIMPLE      | tbl1  | ref  | PRIMARY,key1  | key1 | 9       | const | 24398 | Using where |
+----+-------------+-------+------+---------------+------+---------+-------+-------+-------------+

# The good one one:
mysql> explain SELECT * FROM tbl1 WHERE key1 = const1 AND pk > const2B limit 1,1;
+----+-------------+-------+-------+----------------+------+---------+------+---------+-------------+
| id | select_type | table | type  | possible_keys  | key  | key_len | ref  | rows    | Extra       |
+----+-------------+-------+-------+----------------+------+---------+------+---------+-------------+
|  1 | SIMPLE      | tbl1  | range | PRIMARY,key1   | key1 | 17      | NULL | 1052604 | Using where |
+----+-------------+-------+-------+-------------- -+------+---------+------+---------+-------------+

optimizer's choice between Good and Bad query plans depends on the values of const2A and const2B.

Note that both query plans use the same index, but the Good plan uses more key
parts. The point is that Good plan should always be used, because it scans a
proper subset of rows that the Bad plan is scanning.



 Comments   
Comment by Sergei Petrunia [ 2016-02-18 ]

== First query plan ==

The plan with type=ref, key=key1, key_len=9 is achieved as follows:

1. Range optimizer chooses a range plan over PK:

quick range select, key PRIMARY, length: 8
const2A < X

2. Join optimizer chooses ref access key1

3. We end up using ref access over key1.

== Second query plan ==

The plan with type=range, key=key1, key_len=17 is chosen as
follows:

1. Range optimizer choses a range_plan over key1:

quick range select, key key1, length: 17
const1/const2B < X <= const1

2. Join optimizer chooses ref access key1

3. There is a heuristic:
if ref(const) used over key X, and there is a range access over the same
key X which uses more key parts, switch to key X.

== Conclusions ==

1. Range access and ref access cost estimates do not agree with one another. See in the "Second plan". It first choses a range access, then on step #2
it decides that ref(const) on the same index is cheaper, although it reads a superset of rows.

2. What to do? I can't fix the cost model.
I can develop a patch that extends the heuristic applied on step#3 of the "Second query plan" to handle cases First Query plan, too.

Comment by Sergei Petrunia [ 2016-02-18 ]

The relevant piece of code from make_join_select():

      if (tab->type == JT_REF && tab->quick &&
	  (((uint) tab->ref.key == tab->quick->index &&
	    tab->ref.key_length < tab->quick->max_used_key_length) ||
	    tab->table->intersect_keys.is_set(tab->ref.key)))
      {
	/* Range uses longer key;  Use this instead of ref on key */
	tab->type=JT_ALL;
	use_quick_range=1;
	tab->use_quick=1;
        tab->ref.key= -1;
	tab->ref.key_parts=0;		// Don't use ref key.

It's a very old code.

Comment by Sergei Petrunia [ 2016-02-18 ]

So, the logic in this piece is

if  (we're using a ref access on $KEY AND
     the best possible quick select for this table is a QUICK_RANGE_SELECT($KEY) AND
     that quick select uses more key parts)
{
    use the quick select
}

and the change I'm suggesting is:

-  the best possible quick select for this table is a QUICK_RANGE_SELECT($KEY) AND
+  there is a possible QUICK_RANGE_SELECT($KEY)

that is, the heuristics should not break just becase there was another index with a cheaper quick select.

Comment by Sergei Petrunia [ 2016-02-19 ]

psergey-ref-or-range-issue-fix1.diff is my first attempt to fix this issue before I saw the above code. It tries to do what is basically described but at another phase.

Comment by Sergei Petrunia [ 2016-02-19 ]

Discussed with igor.

Solutions:

  • in 10.2, the plan change should be moved into best_access_path()
  • in 10.1, a conservative fix can be made. make_join_select is a better location than create_ref_for_key, which psergey-ref-or-range-issue-fix1.diff uses.

So, I'll now take psergey-ref-or-range-issue-fix1.diff and move the code there from create_ref_for_key to make_join_select.

Generated at Thu Feb 08 07:35:48 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.