[MDEV-17155] Incorrect ORDER BY optimization: full index scan is used instead of range Created: 2018-09-07  Updated: 2020-08-25  Resolved: 2018-09-12

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.1, 10.2, 10.3
Fix Version/s: 10.2.18, 10.3.10, 10.1.37

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: order-by-optimization


 Description   

This is based on a customer testcase.

I wasn't able to construct an actual testcase so far, but I was able to get the invalid query plan by tampering with rows/cost estimates in the debugger.

The problem looks like this:

EXPLAIN 
SELECT 
  non_idex_col 
FROM 
  t2
WHERE 
  key1part1 = 1500 and key1part2 IN (5, 60, 133, 387) AND 
  pk1 = 700000
ORDER BY
  pk2 DESC LIMIT 1;

+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id   | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
|    1 | SIMPLE      | t2    | index | PRIMARY,key1  | PRIMARY | 6       | NULL |   26 | Using where |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+

The plan uses a full index scan, even if there is a potential ref (or range) access over the PRIMARY key: pk1 = 700000. Costs and selectivity numbers are irrelevant - if we pick to do a full index scan, then range on that index will not be slower.

The optimizer actually has this logic and was expected to use range access. It even typically does. Getting the above query plan requires some peculiar relationship between costs of range/ref accesses through key1 and primary key.



 Comments   
Comment by Sergei Petrunia [ 2018-09-07 ]

Steps to reproduce:

1. Fill the dataset

CREATE TABLE t2 (
  pk1 int(10) unsigned NOT NULL,
  pk2 smallint(6) NOT NULL DEFAULT '0',
 
  key1part1 mediumint(8) unsigned NOT NULL,
  key1part2 smallint(5) unsigned NOT NULL,
  non_idex_col smallint(6) NOT NULL,
  other_data varchar(100),
 
  PRIMARY KEY (pk1, pk2),
  KEY key1 (key1part1, key1part2)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
 
insert into t2 select A.a+1000*B.a, 123, A.a+1000*B.a, A.a, 123, 'filler-data' from one_k A, one_k B;
 
update t2 set key1part1=1500, key1part2=5 where pk1 between 700000 and 700000+20000;
set @a=0;
update t2 set pk2=(@a:=@a+1) where pk1 between 700000 and 700000+20000;
update t2 set pk1=700000 where pk1 between 700000 and 700000+20000;

2. In the debugger, put a breakpoint in get_key_scans_params after this call:

      found_records= check_quick_select(param, idx, read_index_only, key,
                                        update_tbl_stats, &mrr_flags,
                                        &buf_size, &cost);

Condition for breakpoint: keynr == 1

Actions to do:

set cost.io_count=10
set found_records=found_records * 0.5

Continue execution. It should be that best_idx=1 after loop exit.

3. Put a breakpoint in best_access_path BEFORE this code:

=>      if (tmp + 0.0001 < best_time - records/(double) TIME_FOR_COMPARE)
        {
          best_time= tmp + records/(double) TIME_FOR_COMPARE;
          best= tmp;
          best_records= records;
          best_key= start_key;
          best_max_key_part= max_key_part;
          best_ref_depends_map= found_ref;
        }

On the second breakpoint hit, you will see tmp=10291 or something like that.

Actions to do:

set tmp=200

It should pick ref access on key1

Then continue execution and test_if_skip_sort_order/test_if_cheaper_ordering will choose to switch to PRIMARY key and use index scan.

Comment by Sergei Petrunia [ 2018-09-07 ]

I then follow the execution all the way to here

  #0  SQL_SELECT::test_quick_select (this=0x7ff8b402b078, thd=0x7ff8b4000b00, keys_to_use=..., prev_tables=0, limit=1, force_quick_range=true, ordered_output=false, remove_false_parts_of_where=false) at /home/psergey/dev-git/10.3-r3/sql/opt_range.cc:2545
  #1  0x00007ff92ad930b2 in test_if_skip_sort_order (tab=0x7ff8b4029f90, order=0x7ff8b4027430, select_limit=26, no_changes=false, map=0x7ff8b40a8a70) at /home/psergey/dev-git/10.3-r3/sql/sql_select.cc:22128
  #2  0x00007ff92ad5ebb0 in JOIN::optimize_stage2 (this=0x7ff8b4027bd8) at /home/psergey/dev-git/10.3-r3/sql/sql_select.cc:2567
  #3  0x00007ff92ad5c7eb in JOIN::optimize_inner (this=0x7ff8b4027bd8) at /home/psergey/dev-git/10.3-r3/sql/sql_select.cc:1921
  #4  0x00007ff92ad5acff in JOIN::optimize (this=0x7ff8b4027bd8) at /home/psergey/dev-git/10.3-r3/sql/sql_select.cc:1448
  #5  0x00007ff92ad648be in mysql_select (thd=0x7ff8b4000b00, tables=0x7ff8b4025f98, wild_num=0, fields=..., conds=0x7ff8b4026ec0, og_num=1, order=0x7ff8b4027430, group=0x0, having=0x0, proc_param=0x0, select_options=2147748612, result=0x7ff8b4027550, unit=0x7ff8b40049b0, select_lex=0x7ff8b4005120) at /home/psergey/dev-git/10.3-r3/sql/sql_select.cc:4220

The calculations show that yes, it is better to switch to a quick select for the key PRIMARY.
And it calls SQL_SELECT::test_quick_select to get the quick select.
And it's not going to succeed, because the condition that is passed to get_mm_tree is taken from SQL_SELECT::cond and it is:

(gdb) p dbug_print_item(cond)
  $141 = 0x7ff92c4a0c80 <dbug_item_print_buf> "t2.key1part1 <=> 1500"

Comment by Sergei Petrunia [ 2018-09-07 ]

Where did the condition go...

In the case that I am debugging it seems to be pushed down by Index Condition Pushdown:

(gdb) p dbug_print_item(tab->pre_idx_push_select_cond)
  $150 = 0x7ff92c4a0c80 <dbug_item_print_buf> "t2.key1part1 <=> 1500 and t2.pk1 = 700000 and t2.key1part2 in (5,60,133,387)"

Another potential concern is that if ref access was chosen, then the equalities that used to guarantee to be true were removed?

JOIN::conds:

(gdb) p dbug_print_item(join->conds)
  $154 = 0x7ff92c4a0c80 <dbug_item_print_buf> "t2.key1part1 = 1500 and t2.pk1 = 700000 and t2.key1part2 in (5,60,133,387)"

Comment by Sergei Petrunia [ 2018-09-07 ]

Another potential concern is that if ref access was chosen, then the equalities that used to guarantee to be true were removed?

They would be, but then add_ref_to_table_cond is called to put them back. So this is not a concern.

Comment by Sergei Petrunia [ 2018-09-09 ]

The patch: http://lists.askmonty.org/pipermail/commits/2018-September/012901.html

Comment by Sergei Petrunia [ 2018-09-09 ]

Igor please review

Comment by Igor Babaev [ 2018-09-11 ]

ok to push

Comment by Manuel Arostegui [ 2018-12-05 ]

I had this issue on my environment too, and 10.1.37 fixed it. The optimizer now behaves correctly

Generated at Thu Feb 08 08:34:19 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.