MariaDB Server / MDEV-6003

EITS: ref access, keypart2=const vs keypart2=expr - inconsistent filtered% value

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.0.9
    • Fix Version/s: 10.0.11
    • Component/s: None

    Description

      Prepare the dataset:

      create table ten(a int);
      insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
      create table t1 (
        kp1 int, kp2 int, 
        filler1 char(100),
        filler2 char(100),
        key(kp1, kp2)
      );
       
      insert into t1 
      select 
        A.a,
        B.a,
        'filler-data-1',
        'filler-data-2'
      from ten A, ten B, ten C;
      set histogram_size=100;
      set use_stat_tables='preferably';
      set optimizer_use_condition_selectivity=4;
      analyze table t1 persistent for all;

      Now, let's try ref access. Let's start without ref(const), that is, with kp2 compared to an expression:

      explain extended select * from ten, t1 where t1.kp1=ten.a and t1.kp2=ten.a+1;
      +------+-------------+-------+------+---------------+------+---------+----------------+------+----------+-----------------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref            | rows | filtered | Extra                 |
      +------+-------------+-------+------+---------------+------+---------+----------------+------+----------+-----------------------+
      |    1 | SIMPLE      | ten   | ALL  | NULL          | NULL | NULL    | NULL           |   10 |   100.00 | Using where           |
      |    1 | SIMPLE      | t1    | ref  | kp1           | kp1  | 10      | j19.ten.a,func |   10 |   100.00 | Using index condition |
      +------+-------------+-------+------+---------------+------+---------+----------------+------+----------+-----------------------+

      So, ref access will give us 10 rows (on every index lookup). Ok. Now the same join with kp2=const:

      explain extended select * from ten, t1 where t1.kp1=ten.a and t1.kp2=4;
      +------+-------------+-------+------+---------------+------+---------+-----------------+------+----------+-------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref             | rows | filtered | Extra       |
      +------+-------------+-------+------+---------------+------+---------+-----------------+------+----------+-------------+
      |    1 | SIMPLE      | ten   | ALL  | NULL          | NULL | NULL    | NULL            |   10 |   100.00 | Using where |
      |    1 | SIMPLE      | t1    | ref  | kp1           | kp1  | 10      | j19.ten.a,const |   10 |     9.90 |             |
      +------+-------------+-------+------+---------------+------+---------+-----------------+------+----------+-------------+

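      This one seems to be wrong. ref access still produces 10 rows per lookup, and the condition kp2=4 is already applied by the ref lookup itself, so filtered should be 100%. Instead we get filtered=9.90%, which is what the selectivity would be if we weren't using ref access.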

      Indeed, if we disable ref access:

      explain extended select * from ten, t1 ignore index(kp1) where t1.kp1=ten.a and t1.kp2=4;
      +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                           |
      +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
      |    1 | SIMPLE      | ten   | ALL  | NULL          | NULL | NULL    | NULL |   10 |   100.00 |                                                 |
      |    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL | 1000 |     9.90 | Using where; Using join buffer (flat, BNL join) |
      +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+

      we get the same value, filtered=9.90%.

        Activity

          psergei Sergei Petrunia created issue -
          psergei Sergei Petrunia made changes -
          Field | Original Value | New Value
          Labels | | eits
          psergei Sergei Petrunia made changes -
          Assignee | | Igor Babaev [ igor ]
          psergei Sergei Petrunia made changes -
          Assignee | Igor Babaev [ igor ] | Sergei Petrunia [ psergey ]
          psergei Sergei Petrunia made changes -
          Status | Open [ 1 ] | In Progress [ 3 ]
          psergei Sergei Petrunia made changes -
          Fix Version/s | | 10.0.11 [ 15200 ]

          psergei Sergei Petrunia added a comment - Patch submitted for review

          psergei Sergei Petrunia added a comment -

          For this particular example, these lines in table_cond_selectivity()

                      if (keyparts == keyuse->keypart &&
                          !(~(keyuse->val->used_tables()) & pos->ref_depend_map) &&
                          !(found_part_ref_or_null & keyuse->optimize))

          are incorrect.

          psergei Sergei Petrunia added a comment -

          The other place in that function that is wrong (it becomes apparent after you fix the first one) is:

                      if (keyuse->val->const_item())
                        sel*= table->field[fldno]->cond_selectivity;

          Here we should divide, not multiply.

          psergei Sergei Petrunia added a comment - Committed another patch for review.
          psergei Sergei Petrunia made changes -
          Status | In Progress [ 3 ] | Stalled [ 10000 ]
          psergei Sergei Petrunia made changes -
          Resolution | | Fixed [ 1 ]
          Status | Stalled [ 10000 ] | Closed [ 6 ]

          People

            Assignee: psergei Sergei Petrunia
            Reporter: psergei Sergei Petrunia