Details
-
Task
-
Status: Open (View Workflow)
-
Major
-
Resolution: Unresolved
-
None
Description
Consider a CTE that is not recursive, not mergeable, and used multiple times in the query:
create table ten(a int primary key); |
insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9); |
 |
create table t10 (a int); |
insert into t10 select T.a from ten T, ten T2, ten T3; |
explain
|
with T as ( |
select distinct a from t10 |
)
|
select * from T as T1, T as T2 where T1.a+T2.a>1000; |
EXPLAIN shows that each use of the CTE is completely independent, as if one has copy-pasted its definition:
id select_type table type possible_keys key key_len ref rows Extra
|
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 1000
|
1 PRIMARY <derived3> ALL NULL NULL NULL NULL 1000 Using where; Using join buffer (flat, BNL join)
|
3 DERIVED t10 ALL NULL NULL NULL NULL 1000 Using temporary
|
2 DERIVED t10 ALL NULL NULL NULL NULL 1000 Using temporary
|
This can cause poor performance when the CTE in question costs a lot to compute.
Note that MySQL would re-use the CTE in this case:
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
|
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
|
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
|
| 1 | PRIMARY | <derived2> | NULL | ALL | NULL | NULL | NULL | NULL | 1000 | 100.00 | NULL |
|
| 1 | PRIMARY | <derived2> | NULL | ALL | NULL | NULL | NULL | NULL | 1000 | 100.00 | Using where; Using join buffer (hash join) |
|
| 2 | DERIVED | t10 | NULL | ALL | NULL | NULL | NULL | NULL | 1000 | 100.00 | Using temporary |
|
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
|
It's interesting that if one defines a VIEW instead of the CTE and uses it multiple times, both MariaDB and MySQL will materialize the view as many times as it is used.
One can say that sharing the CTE may interfere with other optimizations, like derived condition pushdown or derived_with_keys, or Split-Materialized.
A suggestion off the top of the head: if neither of these optimizations were employed, can we still re-use the CTE?