[MDEV-17155] Incorrect ORDER BY optimization: full index scan is used instead of range - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Fixed
Affects Version/s: 10.1(EOL), 10.2(EOL), 10.3(EOL)
Fix Version/s: 10.2.18, 10.3.10, 10.1.37
Component/s: Optimizer
Labels:
- order-by-optimization

Description

This is based on a customer testcase.

I wasn't able to construct an actual testcase so far, but I was able to get the invalid query plan by tampering with rows/cost estimates in the debugger.

The problem looks like this:

EXPLAIN

SELECT

  non_idex_col

FROM

t2

WHERE

  key1part1 = 1500 and key1part2 IN (5, 60, 133, 387) AND

  pk1 = 700000

ORDER BY

  pk2 DESC LIMIT 1;

+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+

| id   | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra       |

+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+

|    1 | SIMPLE      | t2    | index | PRIMARY,key1  | PRIMARY | 6       | NULL |   26 | Using where |

+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+

The plan uses a full index scan, even if there is a potential ref (or range) access over the PRIMARY key: pk1 = 700000. Costs and selectivity numbers are irrelevant - if we pick to do a full index scan, then range on that index will not be slower.

The optimizer actually has this logic and was expected to use range access. It even typically does. Getting the above query plan requires some peculiar relationship between costs of range/ref accesses through key1 and primary key.

Attachments

Activity

Ascending order - Click to sort in descending order

Sergei Petrunia added a comment - 2018-09-07 11:31 - edited

Steps to reproduce:

1. Fill the dataset

CREATE TABLE t2 (

  pk1 int(10) unsigned NOT NULL,

  pk2 smallint(6) NOT NULL DEFAULT '0',

  key1part1 mediumint(8) unsigned NOT NULL,

  key1part2 smallint(5) unsigned NOT NULL,

  non_idex_col smallint(6) NOT NULL,

  other_data varchar(100),

  PRIMARY KEY (pk1, pk2),

  KEY key1 (key1part1, key1part2)

) ENGINE=InnoDB DEFAULT CHARSET=latin1;

insert into t2 select A.a+1000*B.a, 123, A.a+1000*B.a, A.a, 123, 'filler-data' from one_k A, one_k B;

update t2 set key1part1=1500, key1part2=5 where pk1 between 700000 and 700000+20000;

set @a=0;

update t2 set pk2=(@a:=@a+1) where pk1 between 700000 and 700000+20000;

update t2 set pk1=700000 where pk1 between 700000 and 700000+20000;

2. In the debugger, put a breakpoint in get_key_scans_params after this call:

      found_records= check_quick_select(param, idx, read_index_only, key,

                                        update_tbl_stats, &mrr_flags,

                                        &buf_size, &cost);

Condition for breakpoint: keynr == 1

Actions to do:

set cost.io_count=10

set found_records=found_records * 0.5

Continue execution. It should be that best_idx=1 after loop exit.

3. Put a breakpoint in best_access_path BEFORE this code:

=>      if (tmp + 0.0001 < best_time - records/(double) TIME_FOR_COMPARE)

          best_time= tmp + records/(double) TIME_FOR_COMPARE;

          best= tmp;

          best_records= records;

          best_key= start_key;

          best_max_key_part= max_key_part;

          best_ref_depends_map= found_ref;

On the second breakpoint hit, you will see tmp=10291 or something like that.

Actions to do:

set tmp=200

It should pick ref access on key1

Then continue execution and test_if_skip_sort_order/test_if_cheaper_ordering will choose to switch to PRIMARY key and use index scan.

Sergei Petrunia added a comment - 2018-09-07 11:31 - edited Steps to reproduce: 1. Fill the dataset CREATE TABLE t2 ( pk1 int (10) unsigned NOT NULL , pk2 smallint (6) NOT NULL DEFAULT '0' , key1part1 mediumint(8) unsigned NOT NULL , key1part2 smallint (5) unsigned NOT NULL , non_idex_col smallint (6) NOT NULL , other_data varchar (100), PRIMARY KEY (pk1, pk2), KEY key1 (key1part1, key1part2) ) ENGINE=InnoDB DEFAULT CHARSET=latin1; insert into t2 select A.a+1000*B.a, 123, A.a+1000*B.a, A.a, 123, 'filler-data' from one_k A, one_k B; update t2 set key1part1=1500, key1part2=5 where pk1 between 700000 and 700000+20000; set @a=0; update t2 set pk2=(@a:=@a+1) where pk1 between 700000 and 700000+20000; update t2 set pk1=700000 where pk1 between 700000 and 700000+20000; 2. In the debugger, put a breakpoint in get_key_scans_params after this call: found_records= check_quick_select(param, idx, read_index_only, key, update_tbl_stats, &mrr_flags, &buf_size, &cost); Condition for breakpoint: keynr == 1 Actions to do: set cost.io_count=10 set found_records=found_records * 0.5 Continue execution. It should be that best_idx=1 after loop exit. 3. Put a breakpoint in best_access_path BEFORE this code: => if (tmp + 0.0001 < best_time - records/( double ) TIME_FOR_COMPARE) { best_time= tmp + records/( double ) TIME_FOR_COMPARE; best= tmp; best_records= records; best_key= start_key; best_max_key_part= max_key_part; best_ref_depends_map= found_ref; } On the second breakpoint hit, you will see tmp=10291 or something like that. Actions to do: set tmp=200 It should pick ref access on key1 Then continue execution and test_if_skip_sort_order/test_if_cheaper_ordering will choose to switch to PRIMARY key and use index scan.

Sergei Petrunia added a comment - 2018-09-07 12:41

I then follow the execution all the way to here

  #0  SQL_SELECT::test_quick_select (this=0x7ff8b402b078, thd=0x7ff8b4000b00, keys_to_use=..., prev_tables=0, limit=1, force_quick_range=true, ordered_output=false, remove_false_parts_of_where=false) at /home/psergey/dev-git/10.3-r3/sql/opt_range.cc:2545

  #1  0x00007ff92ad930b2 in test_if_skip_sort_order (tab=0x7ff8b4029f90, order=0x7ff8b4027430, select_limit=26, no_changes=false, map=0x7ff8b40a8a70) at /home/psergey/dev-git/10.3-r3/sql/sql_select.cc:22128

  #2  0x00007ff92ad5ebb0 in JOIN::optimize_stage2 (this=0x7ff8b4027bd8) at /home/psergey/dev-git/10.3-r3/sql/sql_select.cc:2567

  #3  0x00007ff92ad5c7eb in JOIN::optimize_inner (this=0x7ff8b4027bd8) at /home/psergey/dev-git/10.3-r3/sql/sql_select.cc:1921

  #4  0x00007ff92ad5acff in JOIN::optimize (this=0x7ff8b4027bd8) at /home/psergey/dev-git/10.3-r3/sql/sql_select.cc:1448

  #5  0x00007ff92ad648be in mysql_select (thd=0x7ff8b4000b00, tables=0x7ff8b4025f98, wild_num=0, fields=..., conds=0x7ff8b4026ec0, og_num=1, order=0x7ff8b4027430, group=0x0, having=0x0, proc_param=0x0, select_options=2147748612, result=0x7ff8b4027550, unit=0x7ff8b40049b0, select_lex=0x7ff8b4005120) at /home/psergey/dev-git/10.3-r3/sql/sql_select.cc:4220

The calculations show that yes, it is better to switch to a quick select for the key PRIMARY.
And it calls SQL_SELECT::test_quick_select to get the quick select.
And it's not going to succeed, because the condition that is passed to get_mm_tree is taken from SQL_SELECT::cond and it is:

(gdb) p dbug_print_item(cond)

  $141 = 0x7ff92c4a0c80 <dbug_item_print_buf> "t2.key1part1 <=> 1500"

Sergei Petrunia added a comment - 2018-09-07 12:41 I then follow the execution all the way to here #0 SQL_SELECT::test_quick_select (this=0x7ff8b402b078, thd=0x7ff8b4000b00, keys_to_use=..., prev_tables=0, limit=1, force_quick_range=true, ordered_output=false, remove_false_parts_of_where=false) at /home/psergey/dev-git/10.3-r3/sql/opt_range.cc:2545 #1 0x00007ff92ad930b2 in test_if_skip_sort_order (tab=0x7ff8b4029f90, order=0x7ff8b4027430, select_limit=26, no_changes=false, map=0x7ff8b40a8a70) at /home/psergey/dev-git/10.3-r3/sql/sql_select.cc:22128 #2 0x00007ff92ad5ebb0 in JOIN::optimize_stage2 (this=0x7ff8b4027bd8) at /home/psergey/dev-git/10.3-r3/sql/sql_select.cc:2567 #3 0x00007ff92ad5c7eb in JOIN::optimize_inner (this=0x7ff8b4027bd8) at /home/psergey/dev-git/10.3-r3/sql/sql_select.cc:1921 #4 0x00007ff92ad5acff in JOIN::optimize (this=0x7ff8b4027bd8) at /home/psergey/dev-git/10.3-r3/sql/sql_select.cc:1448 #5 0x00007ff92ad648be in mysql_select (thd=0x7ff8b4000b00, tables=0x7ff8b4025f98, wild_num=0, fields=..., conds=0x7ff8b4026ec0, og_num=1, order=0x7ff8b4027430, group=0x0, having=0x0, proc_param=0x0, select_options=2147748612, result=0x7ff8b4027550, unit=0x7ff8b40049b0, select_lex=0x7ff8b4005120) at /home/psergey/dev-git/10.3-r3/sql/sql_select.cc:4220 The calculations show that yes, it is better to switch to a quick select for the key PRIMARY. And it calls SQL_SELECT::test_quick_select to get the quick select. And it's not going to succeed, because the condition that is passed to get_mm_tree is taken from SQL_SELECT::cond and it is: (gdb) p dbug_print_item(cond) $141 = 0x7ff92c4a0c80 <dbug_item_print_buf> "t2.key1part1 <=> 1500"

Sergei Petrunia added a comment - 2018-09-07 12:46 - edited

Where did the condition go...

In the case that I am debugging it seems to be pushed down by Index Condition Pushdown:

(gdb) p dbug_print_item(tab->pre_idx_push_select_cond)

  $150 = 0x7ff92c4a0c80 <dbug_item_print_buf> "t2.key1part1 <=> 1500 and t2.pk1 = 700000 and t2.key1part2 in (5,60,133,387)"

Another potential concern is that if ref access was chosen, then the equalities that used to guarantee to be true were removed?

JOIN::conds:

(gdb) p dbug_print_item(join->conds)

  $154 = 0x7ff92c4a0c80 <dbug_item_print_buf> "t2.key1part1 = 1500 and t2.pk1 = 700000 and t2.key1part2 in (5,60,133,387)"

Sergei Petrunia added a comment - 2018-09-07 12:46 - edited Where did the condition go... In the case that I am debugging it seems to be pushed down by Index Condition Pushdown: (gdb) p dbug_print_item(tab->pre_idx_push_select_cond) $150 = 0x7ff92c4a0c80 <dbug_item_print_buf> "t2.key1part1 <=> 1500 and t2.pk1 = 700000 and t2.key1part2 in (5,60,133,387)" Another potential concern is that if ref access was chosen, then the equalities that used to guarantee to be true were removed? JOIN::conds: (gdb) p dbug_print_item(join->conds) $154 = 0x7ff92c4a0c80 <dbug_item_print_buf> "t2.key1part1 = 1500 and t2.pk1 = 700000 and t2.key1part2 in (5,60,133,387)"

Sergei Petrunia added a comment - 2018-09-07 14:04

Another potential concern is that if ref access was chosen, then the equalities that used to guarantee to be true were removed?

They would be, but then add_ref_to_table_cond is called to put them back. So this is not a concern.

Sergei Petrunia added a comment - 2018-09-07 14:04 Another potential concern is that if ref access was chosen, then the equalities that used to guarantee to be true were removed? They would be, but then add_ref_to_table_cond is called to put them back. So this is not a concern.

Sergei Petrunia added a comment - 2018-09-09 18:09

The patch: http://lists.askmonty.org/pipermail/commits/2018-September/012901.html

Sergei Petrunia added a comment - 2018-09-09 18:09 The patch: http://lists.askmonty.org/pipermail/commits/2018-September/012901.html

Sergei Petrunia added a comment - 2018-09-09 18:28

Igor please review

Sergei Petrunia added a comment - 2018-09-09 18:28 Igor please review

Igor Babaev (Inactive) added a comment - 2018-09-11 06:42

ok to push

Igor Babaev (Inactive) added a comment - 2018-09-11 06:42 ok to push

Manuel Arostegui added a comment - 2018-12-05 10:08

I had this issue on my environment too, and 10.1.37 fixed it. The optimizer now behaves correctly

Manuel Arostegui added a comment - 2018-12-05 10:08 I had this issue on my environment too, and 10.1.37 fixed it. The optimizer now behaves correctly

People

Assignee:: Sergei Petrunia

Reporter:: Sergei Petrunia

Votes:: 0 Vote for this issue

Watchers:: 4 Start watching this issue

Dates

Created:: 2018-09-07 11:16

Updated:: 2024-07-07 23:38

Resolved:: 2018-09-12 14:01

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server