It is possible to do condition pushdown into non-mergable VIEWs or derived tables.
Example:
select ...
from
(select col1, max(col2) as max_val
from t2
group by t1 ) TBL
where
col1 !='foo' and max_val > 100
here, both parts of top-level select's WHERE clause can be pushed into derived table's HAVING clause.
Implementation challenges
Moving condition from one select into another changes the context which condition is used in.
Table condition pushdown produces strictest-possible conditions, but has a property that certain part of condition can be attached to multiple tables.
Doing the same thing here will cause trouble: if the same part of the condition is both in subquery's HAVING and in upper select's WHERE, it should e.g. automagically changes its used_tables() depending on what context we're looking at it from.
The solution to this is to do limited pushdown: only push down expressions that we can completely remove from upper query's WHERE.
Pushdown may also require adjustments to ref_pointer_array. Sanja and Igor seem to understand the details of this.
The following optimization of the queries that use nonÂmergeable views and/or derived tables will be implemented.
A condition that depends only on the columns of the view/derived table is extracted from the
WHERE condition of the given query and pushed into the the subquery that produces the rows of the view/derived table.
Generally such condition can be pushed into the HAVING clause of the subquery.
But sometimes a new condition that depends only on grouping columns can be extracted from it and pushed into the WHERE clause of the subquery.
Let's consider a view defined through a grouping query:
create view v as select a, avg(b) from t1 group by a;
Let's use this view in a query:
select * from v, t where v.a in (const1, const2) and t.a=v.a;
Apparently this query could be very fast, especially if there were an index on t1.a.
Unfortunately now in MariaDB/MySQL the query is slow because any view defined by a grouping query is first materialized in a temporary table.
If before materialization we could push the condition v.a in (const1, const2) into the view and materialized the result of the query:
select a, avg(b) from t1 group by a having a in (const1, const2);
materialization would become much faster.
At some conditions it makes sense to push a in (const1, const2) even further into the WHERE clause:
select a, avg(b) from t1 where a in (const1, const2) group by a;
(E.g. when there is an index on t1.a)
The above is a simple case. In a general case we'll consider separable and nonÂseparable conditions that can be pushed into materialized view.
1. Separable conditions
These are conjunctive conditions that depend only on view columns:
Example:
select * from v,t where P1(v) AND P2(v,t);
Here P1(v) is a predicate depending only on the columns of v, while P2(v,t) is a predicate depending on the columns of v and t.
Apparently P1(v) can be separated from the WHERE clause of the main query and moved to the HAVING clause of the view. (In some cases it can be moved to the WHERE clause of the view).
The case with separable conditions is pretty easy to implement.
2. NonÂseparable conditions
In a more general case there are no conjunctive conditions depending only on the columns of v, but still some restrictive condition depending only on v can be extracted from the WHERE clause.
Example:
select * from v,t where (P1(v) AND P2(v,t) OR P3(v) AND P4(v,t)) AND P(v,t);
Here the condition (P1(v) OR P3(v)) is implied by the WHERE condition and can be pushed into the view. Yet it's not a conjunctive condition so it cannot be separated from the WHERE condition and moved into the HAVING condition of the view.
In this case we have to build clones of the items for P1(v) and P3(v).
3. Building item clones
In a general case building a clone for an item is quite a big task (E.g. when the item contains a subquery).
But for simple predicates and functions (like inequality, addition) it's not so difficult if we use copy constructors.
4. Using equalities
We could write the first query in the following equivalent form:
select * from v, t where t.a in (const1, const2) and t.a=v.a;
The condition t.a in (const1, const2) can be pushed into the view if we take into account the conjunctive equality t.a=v.a.
When searching for pushable condition we should take into account existing equality predicates.
Galina Shalygina (Inactive)
added a comment - - edited Detailed description of the task.
The following optimization of the queries that use nonÂmergeable views and/or derived tables will be implemented.
A condition that depends only on the columns of the view/derived table is extracted from the
WHERE condition of the given query and pushed into the the subquery that produces the rows of the view/derived table.
Generally such condition can be pushed into the HAVING clause of the subquery.
But sometimes a new condition that depends only on grouping columns can be extracted from it and pushed into the WHERE clause of the subquery.
Let's consider a view defined through a grouping query:
create view v as select a, avg(b) from t1 group by a;
Let's use this view in a query:
select * from v, t where v.a in (const1, const2) and t.a=v.a;
Apparently this query could be very fast, especially if there were an index on t1.a .
Unfortunately now in MariaDB/MySQL the query is slow because any view defined by a grouping query is first materialized in a temporary table.
If before materialization we could push the condition v.a in (const1, const2) into the view and materialized the result of the query:
select a, avg(b) from t1 group by a having a in (const1, const2);
materialization would become much faster.
At some conditions it makes sense to push a in (const1, const2) even further into the WHERE clause:
select a, avg(b) from t1 where a in (const1, const2) group by a;
(E.g. when there is an index on t1.a )
The above is a simple case. In a general case we'll consider separable and nonÂseparable conditions that can be pushed into materialized view.
1. Separable conditions
These are conjunctive conditions that depend only on view columns:
Example:
select * from v,t where P1(v) AND P2(v,t);
Here P1(v) is a predicate depending only on the columns of v , while P2(v,t) is a predicate depending on the columns of v and t .
Apparently P1(v) can be separated from the WHERE clause of the main query and moved to the HAVING clause of the view. (In some cases it can be moved to the WHERE clause of the view).
The case with separable conditions is pretty easy to implement.
2. NonÂseparable conditions
In a more general case there are no conjunctive conditions depending only on the columns of v , but still some restrictive condition depending only on v can be extracted from the WHERE clause.
Example:
select * from v,t
where (P1(v) AND P2(v,t) OR P3(v) AND P4(v,t)) AND P(v,t);
Here the condition (P1(v) OR P3(v)) is implied by the WHERE condition and can be pushed into the view. Yet it's not a conjunctive condition so it cannot be separated from the WHERE condition and moved into the HAVING condition of the view.
In this case we have to build clones of the items for P1(v) and P3(v) .
3. Building item clones
In a general case building a clone for an item is quite a big task (E.g. when the item contains a subquery).
But for simple predicates and functions (like inequality, addition) it's not so difficult if we use copy constructors.
4. Using equalities
We could write the first query in the following equivalent form:
select * from v, t where t.a in (const1, const2) and t.a=v.a;
The condition t.a in (const1, const2) can be pushed into the view if we take into account the conjunctive equality t.a=v.a .
When searching for pushable condition we should take into account existing equality predicates.
1. The build_clone method was implemented for simple exressions (items)
2. Conditions depended directly only on the columns of a materialized view now pushed into the having clause of the query specifying the view.
3. Conditions depended directly only on the groupinf columns of a materialized view now pushed into the where clause of the query specifying the view.
Table t1:
select * from t1;
+------+------+------+
| a | b | c |
+------+------+------+
| 1 | 24 | 6 |
| 6 | 7 | 11 |
| 3 | 5 | 19 |
| 8 | 2 | 2 |
+------+------+------+
Table t2:
select * from t2;
+------+------+------+
| a | b | c |
+------+------+------+
| 2 | 7 | 16 |
| 8 | 4 | 5 |
| 2 | 3 | 19 |
| 11 | 1 | 9 |
+------+------+------+
View v1:
create or replace view v1 as select a, round(avg(b)) b, max(c) c from t2 group by a;
+------+------+------+
| a | b | c |
+------+------+------+
| 2 | 5 | 19 |
| 8 | 4 | 5 |
| 11 | 1 | 9 |
+------+------+------+
Queries:
1.
select * from v1, t1 where (t1.a>2) and (v1.b<3);
+------+------+------+------+------+------+
| a | b | c | a | b | c |
+------+------+------+------+------+------+
| 11 | 1 | 9 | 6 | 7 | 11 |
| 11 | 1 | 9 | 3 | 5 | 19 |
| 11 | 1 | 9 | 8 | 2 | 2 |
+------+------+------+------+------+------+
explain format=json select * from v1, t1 where (t1.a>2) and (v1.b<3);
| {
"query_block": {
"select_id": 1,
"table": {
"table_name": "<derived2>",
"access_type": "ALL",
"rows": 4,
"filtered": 100,
"materialized": {
"query_block": {
"select_id": 2,
"having_condition": "(b < 3)",
"filesort": {
"sort_key": "t2.a",
"temporary_table": {
"table": {
"table_name": "t2",
"access_type": "ALL",
"rows": 4,
"filtered": 100
}
}
}
}
}
},
"block-nl-join": {
"table": {
"table_name": "t1",
"access_type": "ALL",
"rows": 4,
"filtered": 100,
"attached_condition": "(t1.a > 2)"
},
"buffer_type": "flat",
"buffer_size": "256Kb",
"join_type": "BNL"
}
}
} |
2.
select * from v1, t1 where (t1.a>2) and (v1.a<3);
+------+------+------+------+------+------+
| a | b | c | a | b | c |
+------+------+------+------+------+------+
| 2 | 5 | 19 | 6 | 7 | 11 |
| 2 | 5 | 19 | 3 | 5 | 19 |
| 2 | 5 | 19 | 8 | 2 | 2 |
+------+------+------+------+------+------+
explain format=json select * from v1, t1 where (t1.a>2) and (v1.a<3);
| {
"query_block": {
"select_id": 1,
"table": {
"table_name": "<derived2>",
"access_type": "ALL",
"rows": 4,
"filtered": 100,
"materialized": {
"query_block": {
"select_id": 2,
"filesort": {
"sort_key": "t2.a",
"temporary_table": {
"table": {
"table_name": "t2",
"access_type": "ALL",
"rows": 4,
"filtered": 100,
"attached_condition": "(t2.a < 3)"
}
}
}
}
}
},
"block-nl-join": {
"table": {
"table_name": "t1",
"access_type": "ALL",
"rows": 4,
"filtered": 100,
"attached_condition": "(t1.a > 2)"
},
"buffer_type": "flat",
"buffer_size": "256Kb",
"join_type": "BNL"
}
}
} |
Galina Shalygina (Inactive)
added a comment - - edited The state of the development by June 24 2016:
( see the development tree here https://github.com/shagalla/server/tree/10.2-mdev9197 )
1. The build_clone method was implemented for simple exressions (items)
2. Conditions depended directly only on the columns of a materialized view now pushed into the having clause of the query specifying the view.
3. Conditions depended directly only on the groupinf columns of a materialized view now pushed into the where clause of the query specifying the view.
Table t1:
select * from t1;
+------+------+------+
| a | b | c |
+------+------+------+
| 1 | 24 | 6 |
| 6 | 7 | 11 |
| 3 | 5 | 19 |
| 8 | 2 | 2 |
+------+------+------+
Table t2:
select * from t2;
+------+------+------+
| a | b | c |
+------+------+------+
| 2 | 7 | 16 |
| 8 | 4 | 5 |
| 2 | 3 | 19 |
| 11 | 1 | 9 |
+------+------+------+
View v1:
create or replace view v1 as select a, round(avg(b)) b, max(c) c from t2 group by a;
+------+------+------+
| a | b | c |
+------+------+------+
| 2 | 5 | 19 |
| 8 | 4 | 5 |
| 11 | 1 | 9 |
+------+------+------+
Queries:
1.
select * from v1, t1 where (t1.a>2) and (v1.b<3);
+------+------+------+------+------+------+
| a | b | c | a | b | c |
+------+------+------+------+------+------+
| 11 | 1 | 9 | 6 | 7 | 11 |
| 11 | 1 | 9 | 3 | 5 | 19 |
| 11 | 1 | 9 | 8 | 2 | 2 |
+------+------+------+------+------+------+
explain format=json select * from v1, t1 where (t1.a>2) and (v1.b<3);
| {
"query_block": {
"select_id": 1,
"table": {
"table_name": "<derived2>",
"access_type": "ALL",
"rows": 4,
"filtered": 100,
"materialized": {
"query_block": {
"select_id": 2,
"having_condition": "(b < 3)",
"filesort": {
"sort_key": "t2.a",
"temporary_table": {
"table": {
"table_name": "t2",
"access_type": "ALL",
"rows": 4,
"filtered": 100
}
}
}
}
}
},
"block-nl-join": {
"table": {
"table_name": "t1",
"access_type": "ALL",
"rows": 4,
"filtered": 100,
"attached_condition": "(t1.a > 2)"
},
"buffer_type": "flat",
"buffer_size": "256Kb",
"join_type": "BNL"
}
}
} |
2.
select * from v1, t1 where (t1.a>2) and (v1.a<3);
+------+------+------+------+------+------+
| a | b | c | a | b | c |
+------+------+------+------+------+------+
| 2 | 5 | 19 | 6 | 7 | 11 |
| 2 | 5 | 19 | 3 | 5 | 19 |
| 2 | 5 | 19 | 8 | 2 | 2 |
+------+------+------+------+------+------+
explain format=json select * from v1, t1 where (t1.a>2) and (v1.a<3);
| {
"query_block": {
"select_id": 1,
"table": {
"table_name": "<derived2>",
"access_type": "ALL",
"rows": 4,
"filtered": 100,
"materialized": {
"query_block": {
"select_id": 2,
"filesort": {
"sort_key": "t2.a",
"temporary_table": {
"table": {
"table_name": "t2",
"access_type": "ALL",
"rows": 4,
"filtered": 100,
"attached_condition": "(t2.a < 3)"
}
}
}
}
}
},
"block-nl-join": {
"table": {
"table_name": "t1",
"access_type": "ALL",
"rows": 4,
"filtered": 100,
"attached_condition": "(t1.a > 2)"
},
"buffer_type": "flat",
"buffer_size": "256Kb",
"join_type": "BNL"
}
}
} |
-pushing conditions into HAVING;
Methods:
1. TABLE_LIST::check_pushable_cond_for_table:
Mark subformulas of a condition unusable for the condition pushed into table.
2. TABLE_LIST::build_pushable_cond_for_table:
Build condition extractable from the given one depended only on this table.
-pushing conditions into WHERE;
Methods:
1. st_select_lex::check_cond_extraction_for_grouping_fields:
For a condition check possibility of exraction a formula over grouping fields.
2. st_select_lex::extract_cond_for_grouping_fields:
Build condition extractable from the given one.
-pushing conditions into embedded derived tables; -using equalities to extract pushable conditions; -implementation of the case when derived table with UNION is used; -tests: derived_cond_pushdown.test; -comments to each new method;
Galina Shalygina (Inactive)
added a comment - What's I've done for final evaluation:
-building item clones;
Methods: virtual build_clone, virtual get_copy.
-pushing conditions into HAVING;
Methods:
1. TABLE_LIST::check_pushable_cond_for_table:
Mark subformulas of a condition unusable for the condition pushed into table.
2. TABLE_LIST::build_pushable_cond_for_table:
Build condition extractable from the given one depended only on this table.
-pushing conditions into WHERE;
Methods:
1. st_select_lex::check_cond_extraction_for_grouping_fields:
For a condition check possibility of exraction a formula over grouping fields.
2. st_select_lex::extract_cond_for_grouping_fields:
Build condition extractable from the given one.
-pushing conditions into embedded derived tables;
-using equalities to extract pushable conditions;
-implementation of the case when derived table with UNION is used;
-tests: derived_cond_pushdown.test;
-comments to each new method;
The code is ready to be merged into MariaDB 10.2.
Here is the list of commits on github:
https://github.com/MariaDB/server/compare/10.2...shagalla:10.2-mdev9197
And here is the commit with the consolidated patch:
https://github.com/MariaDB/server/compare/10.2...shagalla:10.2-mdev9197-cons
Detailed description of the task.
The following optimization of the queries that use nonÂmergeable views and/or derived tables will be implemented.
A condition that depends only on the columns of the view/derived table is extracted from the
WHERE condition of the given query and pushed into the the subquery that produces the rows of the view/derived table.
Generally such condition can be pushed into the HAVING clause of the subquery.
But sometimes a new condition that depends only on grouping columns can be extracted from it and pushed into the WHERE clause of the subquery.
Let's consider a view defined through a grouping query:
create view v as select a, avg(b) from t1 group by a;
Let's use this view in a query:
select * from v, t where v.a in (const1, const2) and t.a=v.a;
Apparently this query could be very fast, especially if there were an index on t1.a.
Unfortunately now in MariaDB/MySQL the query is slow because any view defined by a grouping query is first materialized in a temporary table.
If before materialization we could push the condition v.a in (const1, const2) into the view and materialized the result of the query:
select a, avg(b) from t1 group by a having a in (const1, const2);
materialization would become much faster.
At some conditions it makes sense to push a in (const1, const2) even further into the WHERE clause:
select a, avg(b) from t1 where a in (const1, const2) group by a;
(E.g. when there is an index on t1.a)
The above is a simple case. In a general case we'll consider separable and nonÂseparable conditions that can be pushed into materialized view.
1. Separable conditions
These are conjunctive conditions that depend only on view columns:
Example:
select * from v,t where P1(v) AND P2(v,t);
Here P1(v) is a predicate depending only on the columns of v, while P2(v,t) is a predicate depending on the columns of v and t.
Apparently P1(v) can be separated from the WHERE clause of the main query and moved to the HAVING clause of the view. (In some cases it can be moved to the WHERE clause of the view).
The case with separable conditions is pretty easy to implement.
2. NonÂseparable conditions
In a more general case there are no conjunctive conditions depending only on the columns of v, but still some restrictive condition depending only on v can be extracted from the WHERE clause.
Example:
select * from v,t
where (P1(v) AND P2(v,t) OR P3(v) AND P4(v,t)) AND P(v,t);
Here the condition (P1(v) OR P3(v)) is implied by the WHERE condition and can be pushed into the view. Yet it's not a conjunctive condition so it cannot be separated from the WHERE condition and moved into the HAVING condition of the view.
In this case we have to build clones of the items for P1(v) and P3(v).
3. Building item clones
In a general case building a clone for an item is quite a big task (E.g. when the item contains a subquery).
But for simple predicates and functions (like inequality, addition) it's not so difficult if we use copy constructors.
4. Using equalities
We could write the first query in the following equivalent form:
select * from v, t where t.a in (const1, const2) and t.a=v.a;
The condition t.a in (const1, const2) can be pushed into the view if we take into account the conjunctive equality t.a=v.a.
When searching for pushable condition we should take into account existing equality predicates.