|
Thx Elena for pointing this out.
This means, I don't simplified the issue good enough.
I add a mysqldump with 4 tables (a 250000 rows)
MariaDB [dbs2]> explain
|
-> SELECT `A`.`ID` AS `ID`,`A`.`REF_NO` AS `REF_NO` FROM
|
->
|
->
|
-> (
|
-> SELECT `RD`.`ID` AS `ID`,`RD`.`REF_NO` AS `REF_NO` FROM (((`t4` `RD`
|
-> inner JOIN `t3` `RB` ON(`RD`.`ID` = `RB`.`ID` AND `RD`.`REF_NO` = `RB`.`REF_NO`))
|
-> inner JOIN `t2` `RRB` ON(`RD`.`ID` = `RRB`.`ID` AND `RD`.`REF_NO` = `RRB`.`REF_NO`))
|
-> inner JOIN (
|
-> SELECT * FROM `t1`
|
-> GROUP BY `t1`.`REF_NO`) `RS` ON(`RD`.`ID` = `RS`.`ID` AND `RD`.`REF_NO` = `RS`.`REF_NO`))) `A`
|
->
|
-> where ref_no = 'Jn5tEw';
|
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+--------------------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+--------------------------+
|
| 1 | PRIMARY | RD | ref | PRIMARY,t4_I1 | t4_I1 | 434 | const | 1 | Using where; Using index |
|
| 1 | PRIMARY | RB | eq_ref | PRIMARY,t3_I1 | PRIMARY | 496 | dbs2.RD.ID,dbs2.RD.REF_NO | 1 | Using index |
|
| 1 | PRIMARY | RRB | ref | t2_I1 | t2_I1 | 174 | dbs2.RD.ID,const | 1 | Using where; Using index |
|
| 1 | PRIMARY | <derived3> | ref | key0 | key0 | 496 | dbs2.RD.REF_NO,dbs2.RD.ID | 10 | |
|
| 3 | DERIVED | t1 | index | NULL | t1_I1 | 498 | NULL | 241890 | Using index |
|
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+--------------------------+
|
|
But the optimizer should do something like adding the key in the group by clause like this:(if I not oversee something)
MariaDB [dbs2]> explain
|
-> SELECT `A`.`ID` AS `ID`,`A`.`REF_NO` AS `REF_NO` FROM
|
->
|
->
|
-> (
|
-> SELECT `RD`.`ID` AS `ID`,`RD`.`REF_NO` AS `REF_NO` FROM (((`t4` `RD`
|
-> inner JOIN `t3` `RB` ON(`RD`.`ID` = `RB`.`ID` AND `RD`.`REF_NO` = `RB`.`REF_NO`))
|
-> inner JOIN `t2` `RRB` ON(`RD`.`ID` = `RRB`.`ID` AND `RD`.`REF_NO` = `RRB`.`REF_NO`))
|
-> inner JOIN (
|
-> SELECT * FROM `t1` where ref_no = 'Jn5tEw'
|
-> GROUP BY `t1`.`REF_NO`) `RS` ON(`RD`.`ID` = `RS`.`ID` AND `RD`.`REF_NO` = `RS`.`REF_NO`))) `A`
|
->
|
-> where ref_no = 'Jn5tEw';
|
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+------+--------------------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+------+--------------------------+
|
| 1 | PRIMARY | RD | ref | PRIMARY,t4_I1 | t4_I1 | 434 | const | 1 | Using where; Using index |
|
| 1 | PRIMARY | RB | eq_ref | PRIMARY,t3_I1 | PRIMARY | 496 | dbs2.RD.ID,dbs2.RD.REF_NO | 1 | Using index |
|
| 1 | PRIMARY | RRB | ref | t2_I1 | t2_I1 | 174 | dbs2.RD.ID,const | 1 | Using where; Using index |
|
| 1 | PRIMARY | <derived3> | ref | key0 | key0 | 496 | dbs2.RD.REF_NO,dbs2.RD.ID | 2 | |
|
| 3 | DERIVED | t1 | ref | t1_I1 | t1_I1 | 434 | const | 1 | Using where; Using index |
|
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+------+--------------------------+
|
|
|
|
Reproducible as described with the attached data sample and with these queries (the difference is in the highlighted line):
explain
|
SELECT `A`.`ID` AS `ID`,`A`.`REF_NO` AS `REF_NO` FROM
|
(
|
SELECT `RD`.`ID` AS `ID`,`RD`.`REF_NO` AS `REF_NO` FROM (((`t4` `RD`
|
inner JOIN `t3` `RB` ON(`RD`.`ID` = `RB`.`ID` AND `RD`.`REF_NO` = `RB`.`REF_NO`))
|
inner JOIN `t2` `RRB` ON(`RD`.`ID` = `RRB`.`ID` AND `RD`.`REF_NO` = `RRB`.`REF_NO`))
|
inner JOIN (
|
SELECT * FROM `t1`
|
GROUP BY `t1`.`REF_NO`) `RS` ON(`RD`.`ID` = `RS`.`ID` AND `RD`.`REF_NO` = `RS`.`REF_NO`))) `A`
|
where ref_no = 'Jn5tEw';
|
explain
|
SELECT `A`.`ID` AS `ID`,`A`.`REF_NO` AS `REF_NO` FROM
|
(
|
SELECT `RD`.`ID` AS `ID`,`RD`.`REF_NO` AS `REF_NO` FROM (((`t4` `RD`
|
inner JOIN `t3` `RB` ON(`RD`.`ID` = `RB`.`ID` AND `RD`.`REF_NO` = `RB`.`REF_NO`))
|
inner JOIN `t2` `RRB` ON(`RD`.`ID` = `RRB`.`ID` AND `RD`.`REF_NO` = `RRB`.`REF_NO`))
|
inner JOIN (
|
SELECT * FROM `t1` where ref_no = 'Jn5tEw'
|
GROUP BY `t1`.`REF_NO`) `RS` ON(`RD`.`ID` = `RS`.`ID` AND `RD`.`REF_NO` = `RS`.`REF_NO`))) `A`
|
where ref_no = 'Jn5tEw';
|
Plans (note sufficiently different number of rows):
|
First plan
|
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+--------------------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+--------------------------+
|
| 1 | PRIMARY | RD | ref | PRIMARY,t4_I1 | t4_I1 | 434 | const | 1 | Using where; Using index |
|
| 1 | PRIMARY | RB | eq_ref | PRIMARY,t3_I1 | PRIMARY | 496 | dbs2.RD.ID,dbs2.RD.REF_NO | 1 | Using index |
|
| 1 | PRIMARY | RRB | ref | t2_I1 | t2_I1 | 174 | dbs2.RD.ID,dbs2.RD.REF_NO | 1 | Using where; Using index |
|
| 1 | PRIMARY | <derived3> | ref | key0 | key0 | 496 | dbs2.RD.REF_NO,dbs2.RD.ID | 10 | |
|
| 3 | DERIVED | t1 | index | NULL | t1_I1 | 498 | NULL | 249742 | Using index |
|
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+--------------------------+
|
|
Second plan
|
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+------+-----------------------------------------------------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+------+-----------------------------------------------------------+
|
| 1 | PRIMARY | RD | ref | PRIMARY,t4_I1 | t4_I1 | 434 | const | 1 | Using where; Using index |
|
| 1 | PRIMARY | RB | eq_ref | PRIMARY,t3_I1 | PRIMARY | 496 | dbs2.RD.ID,dbs2.RD.REF_NO | 1 | Using index |
|
| 1 | PRIMARY | RRB | ref | t2_I1 | t2_I1 | 174 | dbs2.RD.ID,dbs2.RD.REF_NO | 1 | Using where; Using index |
|
| 1 | PRIMARY | <derived3> | ref | key0 | key0 | 496 | dbs2.RD.REF_NO,dbs2.RD.ID | 2 | |
|
| 3 | DERIVED | t1 | ref | t1_I1 | t1_I1 | 434 | const | 1 | Using where; Using index; Using temporary; Using filesort |
|
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+------+-----------------------------------------------------------+
|
I'll leave it to psergey or other optimizer experts to decide whether it's actually supposed to work identically or not.
|
|
I can reproduce the problem on the following simple tables:
create table t1 (a int);
|
insert into t1 values
|
(4), (8), (5), (3), (10), (2), (7);
|
|
create table t2 (b int, c int);
|
insert into t2 values
|
(2,1), (5,2), (2,2), (4,1), (4,3),
|
(5,3), (2,4), (4,6), (2,1);
|
Exclusively for readability of the queries demonstrating the problem I use a view instead of a derived table.
create view v1 as
|
select b, sum(c) as s from t2 group by b;
|
The problem appears with the query
select a
|
from
|
( select t1.a, v1.b, v1.s from t1, v1 where t1.a = v1.b ) as t
|
where a > 2;
|
MariaDB [test]> explain format=json
|
-> select a
|
-> from
|
-> ( select t1.a, v1.b, v1.s from t1, v1 where t1.a = v1.b ) as t
|
-> where a > 2;
|
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|
| EXPLAIN |
|
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|
| {
|
"query_block": {
|
"select_id": 1,
|
"table": {
|
"table_name": "t1",
|
"access_type": "ALL",
|
"rows": 7,
|
"filtered": 100,
|
"attached_condition": "t1.a > 2 and t1.a is not null"
|
},
|
"table": {
|
"table_name": "<derived3>",
|
"access_type": "ref",
|
"possible_keys": ["key0"],
|
"key": "key0",
|
"key_length": "5",
|
"used_key_parts": ["b"],
|
"ref": ["test.t1.a"],
|
"rows": 2,
|
"filtered": 100,
|
"materialized": {
|
"query_block": {
|
"select_id": 3,
|
"filesort": {
|
"sort_key": "t2.b",
|
"temporary_table": {
|
"table": {
|
"table_name": "t2",
|
"access_type": "ALL",
|
"rows": 9,
|
"filtered": 100
|
}
|
}
|
}
|
}
|
}
|
}
|
}
|
} |
|
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|
We see that no condition is pushed into the view.
At the same time if I change the query for
select b
|
from
|
( select t1.a, v1.b, v1.s from t1, v1 where t1.a = v1.b ) as t
|
where b > 2;
|
I see that the condition b > 2 is pushed into the view:
MariaDB [test]> explain format=json
|
-> select b
|
-> from
|
-> ( select t1.a, v1.b, v1.s from t1, v1 where t1.a = v1.b ) as t
|
-> where b > 2;
|
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|
| EXPLAIN |
|
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|
| {
|
"query_block": {
|
"select_id": 1,
|
"table": {
|
"table_name": "t1",
|
"access_type": "ALL",
|
"rows": 7,
|
"filtered": 100,
|
"attached_condition": "t1.a > 2 and t1.a is not null"
|
},
|
"table": {
|
"table_name": "<derived3>",
|
"access_type": "ref",
|
"possible_keys": ["key0"],
|
"key": "key0",
|
"key_length": "5",
|
"used_key_parts": ["b"],
|
"ref": ["test.t1.a"],
|
"rows": 2,
|
"filtered": 100,
|
"materialized": {
|
"query_block": {
|
"select_id": 3,
|
"filesort": {
|
"sort_key": "t2.b",
|
"temporary_table": {
|
"table": {
|
"table_name": "t2",
|
"access_type": "ALL",
|
"rows": 9,
|
"filtered": 100,
|
"attached_condition": "t2.b > 2"
|
}
|
}
|
}
|
}
|
}
|
}
|
}
|
} |
|
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|
The difference for these two queries is that for the second query the condition b > 2 can be pushed as it is, while for the second query the condition a > 2 has to be first transformed into the condition b > 2 using the equality t1.a = v1.b.
Our code is supposed to use equalities when pushing conditions into materialized derived tables / views and the results of derived_cond_pushdown.test show that equalities are really used for the queries from this test.
Apparently there is a flaw in the code that prevents using equalities when pushing conditions of a mergeable derived table.
|