Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
5.3.12, 5.5.36, 10.0.10
-
None
Description
Create a test dataset:
create table t1 (a int);
|
insert into t1 select * from test.one_k;
|
|
create table tsubq(
|
id int primary key,
|
key1 int,
|
col1 int,
|
key(key1)
|
) engine=innodb;
|
|
insert into tsubq
|
select A.a + B.a*1000, A.a, 123456 from test.one_k A, test.one_k B;
|
Then, check the plan:
explain select
|
(SELECT
|
concat(id, '-', key1, '-', col1)
|
FROM tsubq
|
WHERE tsubq.key1 = t1.a
|
ORDER BY tsubq.id ASC LIMIT 1)
|
from
|
t1;
|
+------+--------------------+-------+-------+---------------+---------+---------+------+------+-------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+--------------------+-------+-------+---------------+---------+---------+------+------+-------------+
|
| 1 | PRIMARY | t1 | ALL | NULL | NULL | NULL | NULL | 1000 | |
|
| 2 | DEPENDENT SUBQUERY | tsubq | index | key1 | PRIMARY | 4 | NULL | 1 | Using where |
|
+------+--------------------+-------+-------+---------------+---------+---------+------+------+-------------+
|
The subquery uses "index" access, which is very inefficient. The estimate for #rows seems to come from the LIMIT clause and is very wrong in this case.
The table is InnoDB (with extended keys). The index KEY(key1) is actually KEY(key1, id). The query has a restriction on key1 which makes it constant (tsubq.key1 = t1.a). After that, ORDER BY tsubq.id is achieved automatically.
The problem seems to be specifically with references to outside of subquery. If I use a constant instead, the query plan is able to use key1:
explain select (SELECT concat(id, '-', key1, '-', col1) FROM tsubq WHERE tsubq.key1 = 333 ORDER BY tsubq.id ASC LIMIT 1) from t1;
|
+------+-------------+-------+------+---------------+------+---------+-------+------+-------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+-------+------+---------------+------+---------+-------+------+-------------+
|
| 1 | PRIMARY | t1 | ALL | NULL | NULL | NULL | NULL | 1000 | |
|
| 2 | SUBQUERY | tsubq | ref | key1 | key1 | 5 | const | 999 | Using where |
|
+------+-------------+-------+------+---------------+------+---------+-------+------+-------------+
|
Attachments
Issue Links
- relates to
-
MDEV-6081 ORDER BY+ref(const): selectivity is very incorrect (MySQL Bug#14338686)
-
- Closed
-
Activity
Field | Original Value | New Value |
---|---|---|
Assignee | Sergei Petrunia [ psergey ] |
Description |
{noformat} create table t1 (a int); insert into t1 select * from test.one_k; create table tsubq( id int primary key, key1 int, col1 int, key(key1) ); insert into tsubq select A.a + B.a*1000, A.a, 123456 from test.one_k A, test.one_k B; {noformat} {noformat} explain select (SELECT concat(id, '-', key1, '-', col1) FROM tsubq WHERE tsubq.key1 = t1.a ORDER BY tsubq.id ASC LIMIT 1) from t1; {noformat} {noformat} +------+--------------------+-------+-------+---------------+---------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+--------------------+-------+-------+---------------+---------+---------+------+------+-------------+ | 1 | PRIMARY | t1 | ALL | NULL | NULL | NULL | NULL | 1000 | | | 2 | DEPENDENT SUBQUERY | tsubq | index | key1 | PRIMARY | 4 | NULL | 1 | Using where | +------+--------------------+-------+-------+---------------+---------+---------+------+------+-------------+ {noformat} The above is wrong. Note also a very wrong estimate for #rows. If I change outer reference to constant, the EXPLAIN is more reasonable: {noformat} explain select (SELECT concat(id, '-', key1, '-', col1) FROM tsubq WHERE tsubq.key1 = 333 ORDER BY tsubq.id ASC LIMIT 1) from t1; +------+-------------+-------+------+---------------+------+---------+-------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+-------+------+-------------+ | 1 | PRIMARY | t1 | ALL | NULL | NULL | NULL | NULL | 1000 | | | 2 | SUBQUERY | tsubq | ref | key1 | key1 | 5 | const | 999 | Using where | +------+-------------+-------+------+---------------+------+---------+-------+------+-------------+ {noformat} |
Create a test dataset: {noformat} create table t1 (a int); insert into t1 select * from test.one_k; create table tsubq( id int primary key, key1 int, col1 int, key(key1) ); insert into tsubq select A.a + B.a*1000, A.a, 123456 from test.one_k A, test.one_k B; {noformat} Then, check the plan: {noformat} explain select (SELECT concat(id, '-', key1, '-', col1) FROM tsubq WHERE tsubq.key1 = t1.a ORDER BY tsubq.id ASC LIMIT 1) from t1; {noformat} {noformat} +------+--------------------+-------+-------+---------------+---------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+--------------------+-------+-------+---------------+---------+---------+------+------+-------------+ | 1 | PRIMARY | t1 | ALL | NULL | NULL | NULL | NULL | 1000 | | | 2 | DEPENDENT SUBQUERY | tsubq | index | key1 | PRIMARY | 4 | NULL | 1 | Using where | +------+--------------------+-------+-------+---------------+---------+---------+------+------+-------------+ {noformat} The subquery uses "index" access, which is very inefficient. The estimate for #rows seems to come from the LIMIT clause and is very wrong in this case. The query is not using a plan If I change outer reference to constant, the EXPLAIN is more reasonable: {noformat} explain select (SELECT concat(id, '-', key1, '-', col1) FROM tsubq WHERE tsubq.key1 = 333 ORDER BY tsubq.id ASC LIMIT 1) from t1; +------+-------------+-------+------+---------------+------+---------+-------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+-------+------+-------------+ | 1 | PRIMARY | t1 | ALL | NULL | NULL | NULL | NULL | 1000 | | | 2 | SUBQUERY | tsubq | ref | key1 | key1 | 5 | const | 999 | Using where | +------+-------------+-------+------+---------------+------+---------+-------+------+-------------+ {noformat} |
Description |
Create a test dataset: {noformat} create table t1 (a int); insert into t1 select * from test.one_k; create table tsubq( id int primary key, key1 int, col1 int, key(key1) ); insert into tsubq select A.a + B.a*1000, A.a, 123456 from test.one_k A, test.one_k B; {noformat} Then, check the plan: {noformat} explain select (SELECT concat(id, '-', key1, '-', col1) FROM tsubq WHERE tsubq.key1 = t1.a ORDER BY tsubq.id ASC LIMIT 1) from t1; {noformat} {noformat} +------+--------------------+-------+-------+---------------+---------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+--------------------+-------+-------+---------------+---------+---------+------+------+-------------+ | 1 | PRIMARY | t1 | ALL | NULL | NULL | NULL | NULL | 1000 | | | 2 | DEPENDENT SUBQUERY | tsubq | index | key1 | PRIMARY | 4 | NULL | 1 | Using where | +------+--------------------+-------+-------+---------------+---------+---------+------+------+-------------+ {noformat} The subquery uses "index" access, which is very inefficient. The estimate for #rows seems to come from the LIMIT clause and is very wrong in this case. The query is not using a plan If I change outer reference to constant, the EXPLAIN is more reasonable: {noformat} explain select (SELECT concat(id, '-', key1, '-', col1) FROM tsubq WHERE tsubq.key1 = 333 ORDER BY tsubq.id ASC LIMIT 1) from t1; +------+-------------+-------+------+---------------+------+---------+-------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+-------+------+-------------+ | 1 | PRIMARY | t1 | ALL | NULL | NULL | NULL | NULL | 1000 | | | 2 | SUBQUERY | tsubq | ref | key1 | key1 | 5 | const | 999 | Using where | +------+-------------+-------+------+---------------+------+---------+-------+------+-------------+ {noformat} |
Create a test dataset: {noformat} create table t1 (a int); insert into t1 select * from test.one_k; create table tsubq( id int primary key, key1 int, col1 int, key(key1) ) engine=innodb; insert into tsubq select A.a + B.a*1000, A.a, 123456 from test.one_k A, test.one_k B; {noformat} Then, check the plan: {noformat} explain select (SELECT concat(id, '-', key1, '-', col1) FROM tsubq WHERE tsubq.key1 = t1.a ORDER BY tsubq.id ASC LIMIT 1) from t1; {noformat} {noformat} +------+--------------------+-------+-------+---------------+---------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+--------------------+-------+-------+---------------+---------+---------+------+------+-------------+ | 1 | PRIMARY | t1 | ALL | NULL | NULL | NULL | NULL | 1000 | | | 2 | DEPENDENT SUBQUERY | tsubq | index | key1 | PRIMARY | 4 | NULL | 1 | Using where | +------+--------------------+-------+-------+---------------+---------+---------+------+------+-------------+ {noformat} The subquery uses "index" access, which is very inefficient. The estimate for #rows seems to come from the LIMIT clause and is very wrong in this case. The table is InnoDB (with extended keys). The index KEY(key1) is actually KEY(key1, id). The query has a restriction on key1 which makes it constant (tsubq.key1 = t1.a). After that, ORDER BY tsubq.id is achieved automatically. The problem seems to be specifically with references to outside of subquery. If I use a constant instead, the query plan is able to use key1: {noformat} explain select (SELECT concat(id, '-', key1, '-', col1) FROM tsubq WHERE tsubq.key1 = 333 ORDER BY tsubq.id ASC LIMIT 1) from t1; +------+-------------+-------+------+---------------+------+---------+-------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+-------+------+-------------+ | 1 | PRIMARY | t1 | ALL | NULL | NULL | NULL | NULL | 1000 | | | 2 | SUBQUERY | tsubq | ref | key1 | key1 | 5 | const | 999 | Using where | +------+-------------+-------+------+---------------+------+---------+-------+------+-------------+ {noformat} |
Affects Version/s | 5.3.12 [ 12000 ] | |
Affects Version/s | 5.5.36 [ 14600 ] |
Fix Version/s | 10.0.11 [ 15200 ] |
Labels | upstream, upstream-5.5 |
Labels | upstream, upstream-5.5 | upstream upstream-5.5 |
Status | Open [ 1 ] | In Progress [ 3 ] |
Resolution | Fixed [ 1 ] | |
Status | In Progress [ 3 ] | Closed [ 6 ] |
Workflow | defaullt [ 38410 ] | MariaDB v2 [ 43782 ] |
Workflow | MariaDB v2 [ 43782 ] | MariaDB v3 [ 62972 ] |
Workflow | MariaDB v3 [ 62972 ] | MariaDB v4 [ 147757 ] |
Initial investigation:
The optimizer fails to detect that sorting is not needed, because TABLE::const_key_parts[ key1] = 0.
I'm wondering if the fix is to make const_key_parts=1 for cases like this.
TABLE::const_key_parts is modified by
Checked by
Suggestion: change the meaning of "const_key_parts" to be "including
OUTER_REF_TABLE_BIT".