Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-6081

ORDER BY+ref(const): selectivity is very incorrect (MySQL Bug#14338686)

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed (View Workflow)
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.0.10
    • Fix Version/s: 10.0.11
    • Component/s: None
    • Labels:
      None

      Description

      The optimizer may make a very wrong choice in select ... where key1=const ORDER BY key2.

      This bug originates from me analyzing the testcase from MDEV-6041, but it is orthogonal to MDEV-6041.

      Create the test dataset:

      create table t1 (a int);
      insert into t1 select * from test.one_k;
       
      create table tsubq(
        id int primary key,
        key1 int,
        col1 int,
        key(key1)
      ) engine=innodb;
       
      insert into tsubq 
        select A.a + B.a*1000, A.a, 123456 from test.one_k A, test.one_k B;
      alter table tsubq add key2 int;
      update tsubq set key2=key1;
      alter table tsubq add key(key2);

      Then run the query:

      explain select
           (SELECT
              concat(id, '-', key1, '-', col1)
           FROM tsubq
            WHERE tsubq.key1 = t1.a
           ORDER BY tsubq.key2 ASC LIMIT 1)
      from t1;

      Query plan in MariaDB:

      +------+--------------------+-------+-------+---------------+------+---------+------+------+-------------+
      | id   | select_type        | table | type  | possible_keys | key  | key_len | ref  | rows | Extra       |
      +------+--------------------+-------+-------+---------------+------+---------+------+------+-------------+
      |    1 | PRIMARY            | t1    | ALL   | NULL          | NULL | NULL    | NULL | 1000 |             |
      |    2 | DEPENDENT SUBQUERY | tsubq | index | key1          | key2 | 5       | NULL |    1 | Using where |
      +------+--------------------+-------+-------+---------------+------+---------+------+------+-------------+

      This is a very inefficient plan. Assume key1 and key2 are not correlated. The plan shows that we will use full index scan on "key2". However, we know that we need to find a record that matches condition "tsubq.key1 = t1.a", which is very selective. We will have to scan a lot of rows before finding a match. It is much cheaper to use ref access on key1 and then use filesort.

      MySQL has fixed this in 5.6. Starting from 5.6, one gets:

      +----+--------------------+-------+------+---------------+------+---------+---------+------+-----------------------------+
      | id | select_type        | table | type | possible_keys | key  | key_len | ref     | rows | Extra                       |
      +----+--------------------+-------+------+---------------+------+---------+---------+------+-----------------------------+
      |  1 | PRIMARY            | t1    | ALL  | NULL          | NULL | NULL    | NULL    |    1 | NULL                        |
      |  2 | DEPENDENT SUBQUERY | tsubq | ref  | key1          | key1 | 5       | j5.t1.a |  532 | Using where; Using filesort |
      +----+--------------------+-------+------+---------------+------+---------+---------+------+-----------------------------+

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              psergei Sergei Petrunia
              Reporter:
              psergei Sergei Petrunia
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:

                  Git Integration