Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-5985

EITS: selectivity estimates look illogical for join and non-key equalities

Details

    • Bug
    • Status: Closed (View Workflow)
    • Critical
    • Resolution: Fixed
    • 10.0.9
    • 10.0.12
    • None

    Description

      Create a dataset:

      create table ten(a int);
      insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
      create table one_k(a int);
      insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C;
      create table one_k2 as select * from one_k;
      set histogram_size=100;
      set use_stat_tables='preferably';
      set optimizer_use_condition_selectivity=4;
      analyze table one_k persistent for all;
      analyze table one_k2 persistent for all;

      Let's see what histograms give us

      explain extended select * from one_k A where A.a < 40;
      +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
      +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
      |    1 | SIMPLE      | A     | ALL  | NULL          | NULL | NULL    | NULL | 1000 |     4.95 | Using where |
      +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+

      explain extended select * from one_k2 B where B.a < 100;
      +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
      +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
      |    1 | SIMPLE      | B     | ALL  | NULL          | NULL | NULL    | NULL | 1000 |     9.90 | Using where |
      +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+

      The numbers looks ok.

      Now, let's try a join:

      explain extended select * from one_k A, one_k2 B where A.a < 40 and B.a < 100;
      +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                           |
      +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
      |    1 | SIMPLE      | A     | ALL  | NULL          | NULL | NULL    | NULL | 1000 |     4.95 | Using where                                     |
      |    1 | SIMPLE      | B     | ALL  | NULL          | NULL | NULL    | NULL | 1000 |     9.90 | Using where; Using join buffer (flat, BNL join) |
      +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+

      Looks ok. Now, let's add a condition.

      explain extended select * from one_k A, one_k2 B where A.a < 40 and B.a < 100 and B.a=A.a;
      +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                           |
      +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
      |    1 | SIMPLE      | A     | ALL  | NULL          | NULL | NULL    | NULL | 1000 |     4.95 | Using where                                     |
      |    1 | SIMPLE      | B     | ALL  | NULL          | NULL | NULL    | NULL | 1000 |   100.00 | Using where; Using join buffer (flat, BNL join) |
      +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+

      And filtered% becomes 100%. It used to be 9.90%, we have added another condition into WHERE and now the optimizer expects the condition to be less selective! This looks wrong.

      Attachments

        Issue Links

          Activity

            psergei Sergei Petrunia created issue -
            psergei Sergei Petrunia made changes -
            Field Original Value New Value
            Description Create a dataset:
            {noformat}
            create table ten(a int);
            insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
            create table one_k(a int);
            insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C;
            create table one_k2 as select * from one_k;
            set histogram_size=100;
            set use_stat_tables='preferably';
            set optimizer_use_condition_selectivity=4;
            analyze table one_k persistent for all;
            analyze table one_k2 persistent for all;
            {noformat}

            Let's see what histograms give us
            }
            explain extended select * from one_k A where A.a < 40;
            +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
            +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
            | 1 | SIMPLE | A | ALL | NULL | NULL | NULL | NULL | 1000 | 4.95 | Using where |
            +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
            {noformat}

            {noformat}
            explain extended select * from one_k2 B where B.a < 100;
            +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
            +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
            | 1 | SIMPLE | B | ALL | NULL | NULL | NULL | NULL | 1000 | 9.90 | Using where |
            +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
            {noformat}

            {noformat}
            explain extended select * from one_k A, one_k2 B where A.a < 40 and B.a < 100;
            +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
            +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
            | 1 | SIMPLE | A | ALL | NULL | NULL | NULL | NULL | 1000 | 4.95 | Using where |
            | 1 | SIMPLE | B | ALL | NULL | NULL | NULL | NULL | 1000 | 9.90 | Using where; Using join buffer (flat, BNL join) |
            +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
            {noformat}

            {noformat}
            explain extended select * from one_k A, one_k2 B where A.a < 40 and B.a < 100 and B.a=A.a;
            +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
            +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
            | 1 | SIMPLE | A | ALL | NULL | NULL | NULL | NULL | 1000 | 4.95 | Using where |
            | 1 | SIMPLE | B | ALL | NULL | NULL | NULL | NULL | 1000 | 100.00 | Using where; Using join buffer (flat, BNL join) |
            +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
            {noformat}
            Create a dataset:
            {noformat}
            create table ten(a int);
            insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
            create table one_k(a int);
            insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C;
            create table one_k2 as select * from one_k;
            set histogram_size=100;
            set use_stat_tables='preferably';
            set optimizer_use_condition_selectivity=4;
            analyze table one_k persistent for all;
            analyze table one_k2 persistent for all;
            {noformat}

            Let's see what histograms give us
            {noformat}
            explain extended select * from one_k A where A.a < 40;
            +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
            +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
            | 1 | SIMPLE | A | ALL | NULL | NULL | NULL | NULL | 1000 | 4.95 | Using where |
            +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
            {noformat}

            {noformat}
            explain extended select * from one_k2 B where B.a < 100;
            +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
            +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
            | 1 | SIMPLE | B | ALL | NULL | NULL | NULL | NULL | 1000 | 9.90 | Using where |
            +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
            {noformat}
            The numbers looks ok.

            Now, let's try a join:
            {noformat}
            explain extended select * from one_k A, one_k2 B where A.a < 40 and B.a < 100;
            +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
            +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
            | 1 | SIMPLE | A | ALL | NULL | NULL | NULL | NULL | 1000 | 4.95 | Using where |
            | 1 | SIMPLE | B | ALL | NULL | NULL | NULL | NULL | 1000 | 9.90 | Using where; Using join buffer (flat, BNL join) |
            +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
            {noformat}
            Looks ok. Now, let's add a condition.

            {noformat}
            explain extended select * from one_k A, one_k2 B where A.a < 40 and B.a < 100 and B.a=A.a;
            +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
            +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
            | 1 | SIMPLE | A | ALL | NULL | NULL | NULL | NULL | 1000 | 4.95 | Using where |
            | 1 | SIMPLE | B | ALL | NULL | NULL | NULL | NULL | 1000 | 100.00 | Using where; Using join buffer (flat, BNL join) |
            +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
            {noformat}

            And filtered% becomes 100%. It used to be 9.90%, we have added another condition into WHERE and now the optimizer expects the condition to be less selective! This looks wrong.
            psergei Sergei Petrunia made changes -
            Assignee Igor Babaev [ igor ]
            psergei Sergei Petrunia made changes -
            Assignee Igor Babaev [ igor ] Sergei Petrunia [ psergey ]
            psergei Sergei Petrunia made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            psergei Sergei Petrunia made changes -
            Fix Version/s 10.0.11 [ 15200 ]
            serg Sergei Golubchik made changes -
            Fix Version/s 10.0.12 [ 15201 ]
            Fix Version/s 10.0.11 [ 15200 ]
            psergei Sergei Petrunia made changes -
            Priority Major [ 3 ] Critical [ 2 ]
            psergei Sergei Petrunia made changes -
            psergei Sergei Petrunia made changes -
            Status In Progress [ 3 ] Stalled [ 10000 ]
            psergei Sergei Petrunia made changes -
            Resolution Fixed [ 1 ]
            Status Stalled [ 10000 ] Closed [ 6 ]
            serg Sergei Golubchik made changes -
            Workflow defaullt [ 37900 ] MariaDB v2 [ 43827 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Workflow MariaDB v2 [ 43827 ] MariaDB v3 [ 63026 ]
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 63026 ] MariaDB v4 [ 147723 ]

            People

              psergei Sergei Petrunia
              psergei Sergei Petrunia
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.