[MDEV-5917] EITS: different order of predicates in IN (...) causes different estimates Created: 2014-03-20  Updated: 2014-03-21  Resolved: 2014-03-20

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 10.0.9
Fix Version/s: 10.0.10

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: eits


 Description   

Create the dataset:

create table ten(a int);
insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
create table one_k(a int);
insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C;
create table t1 (col1 int);
insert into t1 select a from one_k;
insert into t1 select 2000+a from one_k;
insert into t1 select 2000+a from one_k;
insert into t1 select 2000+a from one_k;
insert into t1 select 2000+a from one_k;
insert into t1 select 2000+a from one_k;
insert into t1 select 2000+a from one_k;
insert into t1 select 2000+a from one_k;
insert into t1 select 2000+a from one_k;
insert into t1 select 2000+a from one_k;
insert into t1 select 2000+a from one_k;
set use_stat_tables = 'preferably';
set optimizer_use_condition_selectivity=3;
set optimizer_use_condition_selectivity=4;
set histogram_size=100;
analyze table t1 persistent for all; 

Then run:

mysql> explain extended select * from t1 where col1 in (2990, 10);
+------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | filtered | Extra       |
+------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
|    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL | 11000 |     0.03 | Using where |
+------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+

And then run:

mysql> explain extended select * from t1 where col1 in (10,2990);
+------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | filtered | Extra       |
+------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
|    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL | 11000 |     0.25 | Using where |
+------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+

Note that IN predicate is essentailly the same, but "filtered" changes.



 Comments   
Comment by Sergei Petrunia [ 2014-03-20 ]

If I set a breakpoint at Field::pos_in_interval_val_real (...)

It hits here:

  #0  Field::pos_in_interval_val_real (...)
  #1  0x000000000083ac5d in Field_num::pos_in_interval (...)
  #2  0x00000000006f0dd0 in get_column_range_cardinality (...)
  #3  0x000000000095e40f in records_in_column_ranges (...)
  #4  0x000000000095e82d in calculate_cond_selectivity_for_table (...)
  #5  0x000000000069485c in make_join_statistics (...)
  #6  0x000000000068b57f in JOIN::optimize_inner (...)
  #7  0x000000000068a4ee in JOIN::optimize (...)

When I'm running this query:
explain extended select * from t1 where col1 in (10,2990);

I observe

  Breakpoint 6, Field::pos_in_interval_val_real (this=0x7fffd0890ca8, min=0x7fffd0891258, max=0x7fffd0891318) at /home/psergey/dev2/10.0-cp/sql/field.cc:1155
(gdb) p val_real()
  $177 = 2990
(gdb) c
  Continuing.
  
  Breakpoint 6, Field::pos_in_interval_val_real (this=0x7fffd0890ca8, min=0x7fffd0891258, max=0x7fffd0891318) at /home/psergey/dev2/10.0-cp/sql/field.cc:1155
(gdb) p val_real()
  $178 = 2990

On the other hand, if I run the query
explain extended select * from t1 where col1 in (2990, 10);
I see

  Breakpoint 6, Field::pos_in_interval_val_real (this=0x7fffd0890ca8, min=0x7fffd0891258, max=0x7fffd0891318) at /home/psergey/dev2/10.0-cp/sql/field.cc:1155
(gdb) p val_real()
  $180 = 10
(gdb) c
  Continuing.
  
  Breakpoint 6, Field::pos_in_interval_val_real (this=0x7fffd0890ca8, min=0x7fffd0891258, max=0x7fffd0891318) at /home/psergey/dev2/10.0-cp/sql/field.cc:1155
(gdb) p val_real()
  $181 = 10

Comment by Sergei Petrunia [ 2014-03-20 ]

Apparently, records_in_column_ranges() walks over ranges but doesn't save their values into table->record[0]. We always calculate Field_num::pos_in_interval() for whatever was the last value considered by get_mm_tree().

Generated at Thu Feb 08 07:08:01 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.