[MDEV-6041] ORDER BY+subqueries: subquery_table.key=outer_table.col is not recongized as binding Created: 2014-04-07  Updated: 2014-04-12  Resolved: 2014-04-12

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 5.3.12, 5.5.36, 10.0.10
Fix Version/s: 10.0.11

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: upstream, upstream-5.5

Issue Links:
Relates
relates to MDEV-6081 ORDER BY+ref(const): selectivity is v... Closed

 Description   

Create a test dataset:

create table t1 (a int);
insert into t1 select * from test.one_k;
 
create table tsubq(
  id int primary key,
  key1 int,
  col1 int,
  key(key1)
) engine=innodb;
 
insert into tsubq 
  select A.a + B.a*1000, A.a, 123456 from test.one_k A, test.one_k B;

Then, check the plan:

explain select 
   (SELECT 
      concat(id, '-', key1, '-', col1)
    FROM tsubq
    WHERE tsubq.key1 = t1.a
    ORDER BY tsubq.id ASC LIMIT 1)
from 
  t1;

+------+--------------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id   | select_type        | table | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+------+--------------------+-------+-------+---------------+---------+---------+------+------+-------------+
|    1 | PRIMARY            | t1    | ALL   | NULL          | NULL    | NULL    | NULL | 1000 |             |
|    2 | DEPENDENT SUBQUERY | tsubq | index | key1          | PRIMARY | 4       | NULL |    1 | Using where |
+------+--------------------+-------+-------+---------------+---------+---------+------+------+-------------+

The subquery uses "index" access, which is very inefficient. The estimate for #rows seems to come from the LIMIT clause and is very wrong in this case.

The table is InnoDB (with extended keys). The index KEY(key1) is actually KEY(key1, id). The query has a restriction on key1 which makes it constant (tsubq.key1 = t1.a). After that, ORDER BY tsubq.id is achieved automatically.

The problem seems to be specifically with references to outside of subquery. If I use a constant instead, the query plan is able to use key1:

explain select     (SELECT        concat(id, '-', key1, '-', col1)     FROM tsubq     WHERE tsubq.key1 = 333     ORDER BY tsubq.id ASC LIMIT 1) from    t1;
+------+-------------+-------+------+---------------+------+---------+-------+------+-------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra       |
+------+-------------+-------+------+---------------+------+---------+-------+------+-------------+
|    1 | PRIMARY     | t1    | ALL  | NULL          | NULL | NULL    | NULL  | 1000 |             |
|    2 | SUBQUERY    | tsubq | ref  | key1          | key1 | 5       | const |  999 | Using where |
+------+-------------+-------+------+---------------+------+---------+-------+------+-------------+



 Comments   
Comment by Sergei Petrunia [ 2014-04-07 ]

Initial investigation:

The optimizer fails to detect that sorting is not needed, because TABLE::const_key_parts[ key1] = 0.

I'm wondering if the fix is to make const_key_parts=1 for cases like this.

TABLE::const_key_parts is modified by

  • reset by make_join_statistics
  • set by sort_and_filter_keyuse
  • update_const_equal_items sets too
  • TABLE::update_const_key_parts() for UPDATE.

Checked by

  • test_if_order_by_key()

Suggestion: change the meaning of "const_key_parts" to be "including
OUTER_REF_TABLE_BIT".

Comment by Sergei Petrunia [ 2014-04-08 ]

Committed a fix for the above suggestion. I'll also investigate why did it pick query plan with 'index' (as opposed to query plan with 'ref' + 'Using filesort').

Comment by Sergei Petrunia [ 2014-04-09 ]

Trying the original testcase on different vesions of MySQL/MariaDB:

5.3.13-MariaDB-log

id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
1       PRIMARY t1      ALL     NULL    NULL    NULL    NULL    700
2       DEPENDENT SUBQUERY      tsubq   index   key1    PRIMARY 4       NULL    1       Using where

5.5.37-MariaDB-log

id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
1       PRIMARY t1      ALL     NULL    NULL    NULL    NULL    1351
2       DEPENDENT SUBQUERY      tsubq   index   key1    PRIMARY 4       NULL    1       Using where

5.6.16-debug-log

id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
1       PRIMARY t1      ALL     NULL    NULL    NULL    NULL    1000    NULL
2       DEPENDENT SUBQUERY      tsubq   ref     key1    key1    5       test.t1.a       493     Using where; Using filesort

MySQL 5.5.32, MySQL 5.5.37

id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
1       PRIMARY t1      ALL     NULL    NULL    NULL    NULL    1157
2       DEPENDENT SUBQUERY      tsubq   index   key1    PRIMARY 4       NULL    1       Using where

Comment by Sergei Petrunia [ 2014-04-09 ]

Looking at the above: MySQL 5.5 and MariaDB 5.3/5.5 had the same "old" query plan. MySQL 5.6 has a "new" query plan. We need to find out what has caused the difference.

Comment by Sergei Petrunia [ 2014-04-11 ]

The change in 5.6 was caused by this patch (it is for 5.7 but it was backported to 5.6):

revno: 4502
revision-id: chaithra.gopalareddy@oracle.com-20120918093140-2ouxgmh0so1j92jw
committer: Chaithra Gopalareddy <chaithra.gopalareddy@oracle.com>
branch nick: mysql-trunk
timestamp: Tue 2012-09-18 15:01:40 +0530
message:
Bug#14338686: MYSQL IS GENERATING DIFFERENT AND SLOWER
(IN NEWER VERSIONS) EXECUTION PLAN
PROBLEM:
While checking for an index to sort for the order by clause
in this query
"SELECT datestamp FROM contractStatusHistory WHERE
contract_id = contracts.id ORDER BY datestamp asc limit 1;"
...

Comment by Sergei Petrunia [ 2014-04-11 ]

The patch has no testcase but looks reasonable.

Comment by Sergei Petrunia [ 2014-04-11 ]

Will address the issue with cost calculations in MDEV-6081.

Comment by Sergei Petrunia [ 2014-04-12 ]

Fix pushed to 10.0

Generated at Thu Feb 08 07:08:55 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.