Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-5917

EITS: different order of predicates in IN (...) causes different estimates

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 10.0.9
    • 10.0.10
    • None

    Description

      Create the dataset:

      create table ten(a int);
      insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
      create table one_k(a int);
      insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C;
      create table t1 (col1 int);
      insert into t1 select a from one_k;
      insert into t1 select 2000+a from one_k;
      insert into t1 select 2000+a from one_k;
      insert into t1 select 2000+a from one_k;
      insert into t1 select 2000+a from one_k;
      insert into t1 select 2000+a from one_k;
      insert into t1 select 2000+a from one_k;
      insert into t1 select 2000+a from one_k;
      insert into t1 select 2000+a from one_k;
      insert into t1 select 2000+a from one_k;
      insert into t1 select 2000+a from one_k;
      set use_stat_tables = 'preferably';
      set optimizer_use_condition_selectivity=3;
      set optimizer_use_condition_selectivity=4;
      set histogram_size=100;
      analyze table t1 persistent for all; 

      Then run:

      mysql> explain extended select * from t1 where col1 in (2990, 10);
      +------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | filtered | Extra       |
      +------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
      |    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL | 11000 |     0.03 | Using where |
      +------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+

      And then run:

      mysql> explain extended select * from t1 where col1 in (10,2990);
      +------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | filtered | Extra       |
      +------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
      |    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL | 11000 |     0.25 | Using where |
      +------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+

      Note that IN predicate is essentailly the same, but "filtered" changes.

      Attachments

        Activity

          If I set a breakpoint at Field::pos_in_interval_val_real (...)

          It hits here:

            #0  Field::pos_in_interval_val_real (...)
            #1  0x000000000083ac5d in Field_num::pos_in_interval (...)
            #2  0x00000000006f0dd0 in get_column_range_cardinality (...)
            #3  0x000000000095e40f in records_in_column_ranges (...)
            #4  0x000000000095e82d in calculate_cond_selectivity_for_table (...)
            #5  0x000000000069485c in make_join_statistics (...)
            #6  0x000000000068b57f in JOIN::optimize_inner (...)
            #7  0x000000000068a4ee in JOIN::optimize (...)

          When I'm running this query:
          explain extended select * from t1 where col1 in (10,2990);

          I observe

            Breakpoint 6, Field::pos_in_interval_val_real (this=0x7fffd0890ca8, min=0x7fffd0891258, max=0x7fffd0891318) at /home/psergey/dev2/10.0-cp/sql/field.cc:1155
          (gdb) p val_real()
            $177 = 2990
          (gdb) c
            Continuing.
            
            Breakpoint 6, Field::pos_in_interval_val_real (this=0x7fffd0890ca8, min=0x7fffd0891258, max=0x7fffd0891318) at /home/psergey/dev2/10.0-cp/sql/field.cc:1155
          (gdb) p val_real()
            $178 = 2990

          On the other hand, if I run the query
          explain extended select * from t1 where col1 in (2990, 10);
          I see

            Breakpoint 6, Field::pos_in_interval_val_real (this=0x7fffd0890ca8, min=0x7fffd0891258, max=0x7fffd0891318) at /home/psergey/dev2/10.0-cp/sql/field.cc:1155
          (gdb) p val_real()
            $180 = 10
          (gdb) c
            Continuing.
            
            Breakpoint 6, Field::pos_in_interval_val_real (this=0x7fffd0890ca8, min=0x7fffd0891258, max=0x7fffd0891318) at /home/psergey/dev2/10.0-cp/sql/field.cc:1155
          (gdb) p val_real()
            $181 = 10

          psergei Sergei Petrunia added a comment - If I set a breakpoint at Field::pos_in_interval_val_real (...) It hits here: #0 Field::pos_in_interval_val_real (...) #1 0x000000000083ac5d in Field_num::pos_in_interval (...) #2 0x00000000006f0dd0 in get_column_range_cardinality (...) #3 0x000000000095e40f in records_in_column_ranges (...) #4 0x000000000095e82d in calculate_cond_selectivity_for_table (...) #5 0x000000000069485c in make_join_statistics (...) #6 0x000000000068b57f in JOIN::optimize_inner (...) #7 0x000000000068a4ee in JOIN::optimize (...) When I'm running this query: explain extended select * from t1 where col1 in (10,2990); I observe Breakpoint 6, Field::pos_in_interval_val_real (this=0x7fffd0890ca8, min=0x7fffd0891258, max=0x7fffd0891318) at /home/psergey/dev2/10.0-cp/sql/field.cc:1155 (gdb) p val_real() $177 = 2990 (gdb) c Continuing. Breakpoint 6, Field::pos_in_interval_val_real (this=0x7fffd0890ca8, min=0x7fffd0891258, max=0x7fffd0891318) at /home/psergey/dev2/10.0-cp/sql/field.cc:1155 (gdb) p val_real() $178 = 2990 On the other hand, if I run the query explain extended select * from t1 where col1 in (2990, 10); I see Breakpoint 6, Field::pos_in_interval_val_real (this=0x7fffd0890ca8, min=0x7fffd0891258, max=0x7fffd0891318) at /home/psergey/dev2/10.0-cp/sql/field.cc:1155 (gdb) p val_real() $180 = 10 (gdb) c Continuing. Breakpoint 6, Field::pos_in_interval_val_real (this=0x7fffd0890ca8, min=0x7fffd0891258, max=0x7fffd0891318) at /home/psergey/dev2/10.0-cp/sql/field.cc:1155 (gdb) p val_real() $181 = 10

          Apparently, records_in_column_ranges() walks over ranges but doesn't save their values into table->record[0]. We always calculate Field_num::pos_in_interval() for whatever was the last value considered by get_mm_tree().

          psergei Sergei Petrunia added a comment - Apparently, records_in_column_ranges() walks over ranges but doesn't save their values into table->record [0] . We always calculate Field_num::pos_in_interval() for whatever was the last value considered by get_mm_tree().

          People

            psergei Sergei Petrunia
            psergei Sergei Petrunia
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.