Details
-
Bug
-
Status: Closed (View Workflow)
-
Minor
-
Resolution: Won't Fix
-
None
-
None
-
None
Description
Note: with the provided test case, the problem is only reproducible when optimizer_prune_level=0. If you decide that it's acceptable and isn't worth fixing, I won't object, but I'm filing it so you could check if it's expected.
The following query
SELECT a FROM t1, t2, t3 AS t3_alias
|
WHERE b <= ( SELECT SUM(a) FROM t1 )
|
AND EXISTS ( SELECT 1 FROM t3 WHERE c > t3_alias.c )
|
ORDER BY a LIMIT 1;
|
takes ~18 times longer on MDEV-193 tree comparing to main/5.5 revno 3426 (and to 3402).
Reproducible with MyISAM and Aria; with InnoDB it's sporadic.
Reproducible with the default optimizer_switch.
EXPLAIN on MDEV-193:
|
id select_type table type possible_keys key key_len ref rows filtered Extra
|
1 PRIMARY t1 index NULL a 5 NULL 40 100.00 Using index; Using temporary; Using filesort
|
1 PRIMARY t2 range b b 5 NULL 96 100.00 Using where; Using index; Using join buffer (flat, BNL join)
|
1 PRIMARY t3_alias index NULL c 5 NULL 40 100.00 Using where; Using index; Using join buffer (incremental, BNL join)
|
3 DEPENDENT SUBQUERY t3 index c c 5 NULL 40 100.00 Using where; Using index
|
2 SUBQUERY t1 index NULL a 5 NULL 40 100.00 Using index
|
Warnings:
|
Note 1276 Field or reference 'test.t3_alias.c' of SELECT #3 was resolved in SELECT #1
|
Note 1003 select `test`.`t1`.`a` AS `a` from `test`.`t1` join `test`.`t2` join `test`.`t3` `t3_alias` where ((`test`.`t2`.`b` <= (select sum(`test`.`t1`.`a`) from `test`.`t1`)) and <expr_cache><`test`.`t3_alias`.`c`>(exists(select 1 from `test`.`t3` where (`test`.`t3`.`c` > `test`.`t3_alias`.`c`)))) order by `test`.`t1`.`a` limit 1
|
EXPLAIN on maria/5.5:
|
id select_type table type possible_keys key key_len ref rows filtered Extra
|
1 PRIMARY t1 index NULL a 5 NULL 40 100.00 Using index; Using temporary; Using filesort
|
1 PRIMARY t3_alias index NULL c 5 NULL 40 100.00 Using where; Using index; Using join buffer (flat, BNL join)
|
1 PRIMARY t2 index b b 5 NULL 100 100.00 Using where; Using index; Using join buffer (incremental, BNL join)
|
3 DEPENDENT SUBQUERY t3 index c c 5 NULL 40 100.00 Using where; Using index
|
2 SUBQUERY t1 index NULL a 5 NULL 40 100.00 Using index
|
Warnings:
|
Note 1276 Field or reference 'test.t3_alias.c' of SELECT #3 was resolved in SELECT #1
|
Note 1003 select `test`.`t1`.`a` AS `a` from `test`.`t1` join `test`.`t2` join `test`.`t3` `t3_alias` where ((`test`.`t2`.`b` <= (select sum(`test`.`t1`.`a`) from `test`.`t1`)) and <expr_cache><`test`.`t3_alias`.`c`>(exists(select 1 from `test`.`t3` where (`test`.`t3`.`c` > `test`.`t3_alias`.`c`)))) order by `test`.`t1`.`a` limit 1
|
Test case:
SET optimizer_prune_level = 0;
|
|
CREATE TABLE t1 (a INT, KEY(a)) ENGINE=MyISAM;
|
INSERT INTO t1 VALUES
|
(0),(8),(1),(8),(9),(24),(6),(1),(6),(2),
|
(4),(8),(4),(4),(7),(4),(1),(9),(4),(8);
|
INSERT INTO t1 SELECT * FROM t1;
|
|
CREATE TABLE t2 (b INT, KEY(b)) ENGINE=MyISAM;
|
INSERT INTO t2 VALUES
|
(4),(8),(0),(0),(0),(7),(7),(5),(3),(188),
|
(4),(9),(6),(1),(5),(6),(2),(4),(231),(4),
|
(3),(3),(7),(6),(7),(9),(4),(4),(2),(1),(2),
|
(194),(2),(3),(8),(4),(9),(4),(5),(5),(9),
|
(3),(8),(0),(98),(3),(1),(0),(189),(8),(3),
|
(3),(9),(6),(8),(3),(9),(5),(9),(2),(2),(5),
|
(8),(6),(9),(0),(3),(6),(5),(8),(2),(120),
|
(25),(1),(3),(1),(3),(153),(5),(9),(1),(8),
|
(7),(6),(2),(4),(7),(3),(8),(4),(6),(1),(7),
|
(1),(0),(2),(7),(2),(1),(5);
|
|
CREATE TABLE t3 (c INT, KEY(c)) ENGINE=MyISAM;
|
INSERT INTO t3 VALUES
|
(7),(0),(9),(3),(4),(2),(5),(3),(1),(3),(6),
|
(7),(5),(1),(204),(224),(9),(5),(0),(3);
|
INSERT INTO t3 SELECT * FROM t3;
|
|
SELECT a FROM t1, t2, t3 AS t3_alias
|
WHERE b <= ( SELECT SUM(a) FROM t1 )
|
AND EXISTS ( SELECT 1 FROM t3 WHERE c > t3_alias.c )
|
ORDER BY a LIMIT 1;
|
Attachments
Issue Links
- relates to
-
MDEV-193 LP:944706 - Query with impossible or constant subquery in WHERE or HAVING is not precomputed and thus not part of optimization
-
- Closed
-
This is not a bug, but rather a deficiency of the current optimizer that is
completely unrelated to mdev-193. It is a regression only with respect to
MariaDB 5.5/5.3. Detailed explanation follows.
1. What happens in mdev-193
The patch for mdev-193 makes MariaDB 5.5 behave like any MySQL version, or
MariaDB 5.2, namely - it allows constant subqueries to be evaluated during
optimization, and the resulting value to be used by the optimizer.
Specifically in this test case, the subquery (SELECT SUM(a) FROM t1) is
evaluated by the range optimizer, and the final query plan uses "range"
access to table "t2".
If we set "optimizer_prune_level = 0" (that is, not pruning, consider all
plans), the optimizer determines that the optimal query plan is:
-----
------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------------------------join->best_read = 32396
If we set "optimizer_prune_level = 1" (that is prune some plans), the optimizer
prunes the above plan, and finds a worse plan with respect to its cost model:
-----
------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------------------------join->best_read = 66869 (~ two times higher cost)
However, the second plan in reality is much faster. The reason for this is that
table "t3_alias" appears in second place in the plan instead of in third, as a
result the correlated EXISTS subquery is executed fewer times (1531 times). The
first "optimal" plan puts table "t3_alias" in third place in the join, which
results in the EXISTS subquery being executed 152011 times. At the same time
the result of the constant subquery is cached, so it doesn't matter much how
many times it is evaluated.
2. Exposing the same problem in the main 5.5 branch
If we substitute manually the aggregate subquery with its constant value, we can
reproduce the problem in 5.5 via the following test case:
– better plan
SET optimizer_prune_level = 1;
EXPLAIN
FROM t1, t2, t3 AS t3_alias
SELECT count
WHERE b <= 236
AND EXISTS ( SELECT 1 FROM t3 WHERE c > t3_alias.c )
ORDER BY a LIMIT 1;
– worse plan
SET optimizer_prune_level = 0;
EXPLAIN
FROM t1, t2, t3 AS t3_alias
SELECT count
WHERE b <= 236
AND EXISTS ( SELECT 1 FROM t3 WHERE c > t3_alias.c )
ORDER BY a LIMIT 1;