Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-9197

Pushdown conditions into non-mergeable views/derived tables

Details

    Description

      It is possible to do condition pushdown into non-mergable VIEWs or derived tables.

      Example:

      select ...
      from 
        (select col1, max(col2) as max_val 
         from t2 
         group by t1 ) TBL
      where 
         col1 !='foo' and max_val > 100

      here, both parts of top-level select's WHERE clause can be pushed into derived table's HAVING clause.

      Implementation challenges

      Moving condition from one select into another changes the context which condition is used in.
      Table condition pushdown produces strictest-possible conditions, but has a property that certain part of condition can be attached to multiple tables.
      Doing the same thing here will cause trouble: if the same part of the condition is both in subquery's HAVING and in upper select's WHERE, it should e.g. automagically changes its used_tables() depending on what context we're looking at it from.
      The solution to this is to do limited pushdown: only push down expressions that we can completely remove from upper query's WHERE.

      Pushdown may also require adjustments to ref_pointer_array. Sanja and Igor seem to understand the details of this.

      Attachments

        Issue Links

          Activity

            Detailed description of the task.

            The following optimization of the queries that use non­mergeable views and/or derived tables will be implemented.

            A condition that depends only on the columns of the view/derived table is extracted from the
            WHERE condition of the given query and pushed into the the subquery that produces the rows of the view/derived table.

            Generally such condition can be pushed into the HAVING clause of the subquery.
            But sometimes a new condition that depends only on grouping columns can be extracted from it and pushed into the WHERE clause of the subquery.

            Let's consider a view defined through a grouping query:

            create view v as select a, avg(b) from t1 group by a;

            Let's use this view in a query:

            select * from v, t where v.a in (const1, const2) and t.a=v.a;

            Apparently this query could be very fast, especially if there were an index on t1.a.
            Unfortunately now in MariaDB/MySQL the query is slow because any view defined by a grouping query is first materialized in a temporary table.

            If before materialization we could push the condition v.a in (const1, const2) into the view and materialized the result of the query:

            select a, avg(b) from t1 group by a having a in (const1, const2);

            materialization would become much faster.

            At some conditions it makes sense to push a in (const1, const2) even further into the WHERE clause:

            select a, avg(b) from t1 where a in (const1, const2) group by a;

            (E.g. when there is an index on t1.a)
            The above is a simple case. In a general case we'll consider separable and non­separable conditions that can be pushed into materialized view.

            1. Separable conditions

            These are conjunctive conditions that depend only on view columns:
            Example:

            select * from v,t where P1(v) AND P2(v,t);

            Here P1(v) is a predicate depending only on the columns of v, while P2(v,t) is a predicate depending on the columns of v and t.

            Apparently P1(v) can be separated from the WHERE clause of the main query and moved to the HAVING clause of the view. (In some cases it can be moved to the WHERE clause of the view).

            The case with separable conditions is pretty easy to implement.

            2. Non­separable conditions

            In a more general case there are no conjunctive conditions depending only on the columns of v, but still some restrictive condition depending only on v can be extracted from the WHERE clause.
            Example:

            select * from v,t
            where (P1(v) AND P2(v,t) OR P3(v) AND P4(v,t)) AND P(v,t);

            Here the condition (P1(v) OR P3(v)) is implied by the WHERE condition and can be pushed into the view. Yet it's not a conjunctive condition so it cannot be separated from the WHERE condition and moved into the HAVING condition of the view.

            In this case we have to build clones of the items for P1(v) and P3(v).

            3. Building item clones

            In a general case building a clone for an item is quite a big task (E.g. when the item contains a subquery).
            But for simple predicates and functions (like inequality, addition) it's not so difficult if we use copy constructors.

            4. Using equalities

            We could write the first query in the following equivalent form:

            select * from v, t where t.a in (const1, const2) and t.a=v.a;

            The condition t.a in (const1, const2) can be pushed into the view if we take into account the conjunctive equality t.a=v.a.

            When searching for pushable condition we should take into account existing equality predicates.

            shagalla Galina Shalygina (Inactive) added a comment - - edited Detailed description of the task. The following optimization of the queries that use non­mergeable views and/or derived tables will be implemented. A condition that depends only on the columns of the view/derived table is extracted from the WHERE condition of the given query and pushed into the the subquery that produces the rows of the view/derived table. Generally such condition can be pushed into the HAVING clause of the subquery. But sometimes a new condition that depends only on grouping columns can be extracted from it and pushed into the WHERE clause of the subquery. Let's consider a view defined through a grouping query: create view v as select a, avg(b) from t1 group by a; Let's use this view in a query: select * from v, t where v.a in (const1, const2) and t.a=v.a; Apparently this query could be very fast, especially if there were an index on t1.a . Unfortunately now in MariaDB/MySQL the query is slow because any view defined by a grouping query is first materialized in a temporary table. If before materialization we could push the condition v.a in (const1, const2) into the view and materialized the result of the query: select a, avg(b) from t1 group by a having a in (const1, const2); materialization would become much faster. At some conditions it makes sense to push a in (const1, const2) even further into the WHERE clause: select a, avg(b) from t1 where a in (const1, const2) group by a; (E.g. when there is an index on t1.a ) The above is a simple case. In a general case we'll consider separable and non­separable conditions that can be pushed into materialized view. 1. Separable conditions These are conjunctive conditions that depend only on view columns: Example: select * from v,t where P1(v) AND P2(v,t); Here P1(v) is a predicate depending only on the columns of v , while P2(v,t) is a predicate depending on the columns of v and t . Apparently P1(v) can be separated from the WHERE clause of the main query and moved to the HAVING clause of the view. (In some cases it can be moved to the WHERE clause of the view). The case with separable conditions is pretty easy to implement. 2. Non­separable conditions In a more general case there are no conjunctive conditions depending only on the columns of v , but still some restrictive condition depending only on v can be extracted from the WHERE clause. Example: select * from v,t where (P1(v) AND P2(v,t) OR P3(v) AND P4(v,t)) AND P(v,t); Here the condition (P1(v) OR P3(v)) is implied by the WHERE condition and can be pushed into the view. Yet it's not a conjunctive condition so it cannot be separated from the WHERE condition and moved into the HAVING condition of the view. In this case we have to build clones of the items for P1(v) and P3(v) . 3. Building item clones In a general case building a clone for an item is quite a big task (E.g. when the item contains a subquery). But for simple predicates and functions (like inequality, addition) it's not so difficult if we use copy constructors. 4. Using equalities We could write the first query in the following equivalent form: select * from v, t where t.a in (const1, const2) and t.a=v.a; The condition t.a in (const1, const2) can be pushed into the view if we take into account the conjunctive equality t.a=v.a . When searching for pushable condition we should take into account existing equality predicates.

            The state of the development by June 24 2016:

            ( see the development tree here https://github.com/shagalla/server/tree/10.2-mdev9197 )

            1. The build_clone method was implemented for simple exressions (items)
            2. Conditions depended directly only on the columns of a materialized view now pushed into the having clause of the query specifying the view.
            3. Conditions depended directly only on the groupinf columns of a materialized view now pushed into the where clause of the query specifying the view.

            Table t1:

            select * from t1;
            +------+------+------+
            | a    | b    | c    |
            +------+------+------+
            |    1 |   24 |    6 |
            |    6 |    7 |   11 |
            |    3 |    5 |   19 |
            |    8 |    2 |    2 |
            +------+------+------+
            

            Table t2:

            select * from t2;
            +------+------+------+
            | a    | b    | c    |
            +------+------+------+
            |    2 |    7 |   16 |
            |    8 |    4 |    5 |
            |    2 |    3 |   19 |
            |   11 |    1 |    9 |
            +------+------+------+
            

            View v1:

            create or replace view v1 as select a, round(avg(b)) b, max(c) c from t2 group by a;
            +------+------+------+
            | a    | b    | c    |
            +------+------+------+
            |    2 |    5 |   19 |
            |    8 |    4 |    5 |
            |   11 |    1 |    9 |
            +------+------+------+
            

            Queries:

            1.

            select * from v1, t1 where (t1.a>2) and (v1.b<3);
            +------+------+------+------+------+------+
            | a    | b    | c    | a    | b    | c    |
            +------+------+------+------+------+------+
            |   11 |    1 |    9 |    6 |    7 |   11 |
            |   11 |    1 |    9 |    3 |    5 |   19 |
            |   11 |    1 |    9 |    8 |    2 |    2 |
            +------+------+------+------+------+------+
            

            explain format=json select * from v1, t1 where (t1.a>2) and (v1.b<3);
            | {
              "query_block": {
                "select_id": 1,
                "table": {
                  "table_name": "<derived2>",
                  "access_type": "ALL",
                  "rows": 4,
                  "filtered": 100,
                  "materialized": {
                    "query_block": {
                      "select_id": 2,
                      "having_condition": "(b < 3)",
                      "filesort": {
                        "sort_key": "t2.a",
                        "temporary_table": {
                          "table": {
                            "table_name": "t2",
                            "access_type": "ALL",
                            "rows": 4,
                            "filtered": 100
                          }
                        }
                      }
                    }
                  }
                },
                "block-nl-join": {
                  "table": {
                    "table_name": "t1",
                    "access_type": "ALL",
                    "rows": 4,
                    "filtered": 100,
                    "attached_condition": "(t1.a > 2)"
                  },
                  "buffer_type": "flat",
                  "buffer_size": "256Kb",
                  "join_type": "BNL"
                }
              }
            } |
            

            2.

            select * from v1, t1 where (t1.a>2) and (v1.a<3);
            +------+------+------+------+------+------+
            | a    | b    | c    | a    | b    | c    |
            +------+------+------+------+------+------+
            |    2 |    5 |   19 |    6 |    7 |   11 |
            |    2 |    5 |   19 |    3 |    5 |   19 |
            |    2 |    5 |   19 |    8 |    2 |    2 |
            +------+------+------+------+------+------+
            

            explain format=json select * from v1, t1 where (t1.a>2) and (v1.a<3);
            | {
              "query_block": {
                "select_id": 1,
                "table": {
                  "table_name": "<derived2>",
                  "access_type": "ALL",
                  "rows": 4,
                  "filtered": 100,
                  "materialized": {
                    "query_block": {
                      "select_id": 2,
                      "filesort": {
                        "sort_key": "t2.a",
                        "temporary_table": {
                          "table": {
                            "table_name": "t2",
                            "access_type": "ALL",
                            "rows": 4,
                            "filtered": 100,
                            "attached_condition": "(t2.a < 3)"
                          }
                        }
                      }
                    }
                  }
                },
                "block-nl-join": {
                  "table": {
                    "table_name": "t1",
                    "access_type": "ALL",
                    "rows": 4,
                    "filtered": 100,
                    "attached_condition": "(t1.a > 2)"
                  },
                  "buffer_type": "flat",
                  "buffer_size": "256Kb",
                  "join_type": "BNL"
                }
              }
            } |
            

            shagalla Galina Shalygina (Inactive) added a comment - - edited The state of the development by June 24 2016: ( see the development tree here https://github.com/shagalla/server/tree/10.2-mdev9197 ) 1. The build_clone method was implemented for simple exressions (items) 2. Conditions depended directly only on the columns of a materialized view now pushed into the having clause of the query specifying the view. 3. Conditions depended directly only on the groupinf columns of a materialized view now pushed into the where clause of the query specifying the view. Table t1: select * from t1; +------+------+------+ | a | b | c | +------+------+------+ | 1 | 24 | 6 | | 6 | 7 | 11 | | 3 | 5 | 19 | | 8 | 2 | 2 | +------+------+------+ Table t2: select * from t2; +------+------+------+ | a | b | c | +------+------+------+ | 2 | 7 | 16 | | 8 | 4 | 5 | | 2 | 3 | 19 | | 11 | 1 | 9 | +------+------+------+ View v1: create or replace view v1 as select a, round(avg(b)) b, max(c) c from t2 group by a; +------+------+------+ | a | b | c | +------+------+------+ | 2 | 5 | 19 | | 8 | 4 | 5 | | 11 | 1 | 9 | +------+------+------+ Queries: 1. select * from v1, t1 where (t1.a>2) and (v1.b<3); +------+------+------+------+------+------+ | a | b | c | a | b | c | +------+------+------+------+------+------+ | 11 | 1 | 9 | 6 | 7 | 11 | | 11 | 1 | 9 | 3 | 5 | 19 | | 11 | 1 | 9 | 8 | 2 | 2 | +------+------+------+------+------+------+ explain format=json select * from v1, t1 where (t1.a>2) and (v1.b<3); | { "query_block": { "select_id": 1, "table": { "table_name": "<derived2>", "access_type": "ALL", "rows": 4, "filtered": 100, "materialized": { "query_block": { "select_id": 2, "having_condition": "(b < 3)", "filesort": { "sort_key": "t2.a", "temporary_table": { "table": { "table_name": "t2", "access_type": "ALL", "rows": 4, "filtered": 100 } } } } } }, "block-nl-join": { "table": { "table_name": "t1", "access_type": "ALL", "rows": 4, "filtered": 100, "attached_condition": "(t1.a > 2)" }, "buffer_type": "flat", "buffer_size": "256Kb", "join_type": "BNL" } } } | 2. select * from v1, t1 where (t1.a>2) and (v1.a<3); +------+------+------+------+------+------+ | a | b | c | a | b | c | +------+------+------+------+------+------+ | 2 | 5 | 19 | 6 | 7 | 11 | | 2 | 5 | 19 | 3 | 5 | 19 | | 2 | 5 | 19 | 8 | 2 | 2 | +------+------+------+------+------+------+ explain format=json select * from v1, t1 where (t1.a>2) and (v1.a<3); | { "query_block": { "select_id": 1, "table": { "table_name": "<derived2>", "access_type": "ALL", "rows": 4, "filtered": 100, "materialized": { "query_block": { "select_id": 2, "filesort": { "sort_key": "t2.a", "temporary_table": { "table": { "table_name": "t2", "access_type": "ALL", "rows": 4, "filtered": 100, "attached_condition": "(t2.a < 3)" } } } } } }, "block-nl-join": { "table": { "table_name": "t1", "access_type": "ALL", "rows": 4, "filtered": 100, "attached_condition": "(t1.a > 2)" }, "buffer_type": "flat", "buffer_size": "256Kb", "join_type": "BNL" } } } |

            What's I've done for final evaluation:

            -building item clones;
            Methods: virtual build_clone, virtual get_copy.

            -pushing conditions into HAVING;
            Methods:
            1. TABLE_LIST::check_pushable_cond_for_table:
            Mark subformulas of a condition unusable for the condition pushed into table.
            2. TABLE_LIST::build_pushable_cond_for_table:
            Build condition extractable from the given one depended only on this table.

            -pushing conditions into WHERE;
            Methods:
            1. st_select_lex::check_cond_extraction_for_grouping_fields:
            For a condition check possibility of exraction a formula over grouping fields.
            2. st_select_lex::extract_cond_for_grouping_fields:
            Build condition extractable from the given one.

            -pushing conditions into embedded derived tables;
            -using equalities to extract pushable conditions;
            -implementation of the case when derived table with UNION is used;
            -tests: derived_cond_pushdown.test;
            -comments to each new method;

            The code is ready to be merged into MariaDB 10.2.

            Here is the list of commits on github:

            https://github.com/MariaDB/server/compare/10.2...shagalla:10.2-mdev9197

            And here is the commit with the consolidated patch:

            https://github.com/MariaDB/server/compare/10.2...shagalla:10.2-mdev9197-cons

            shagalla Galina Shalygina (Inactive) added a comment - What's I've done for final evaluation: -building item clones; Methods: virtual build_clone, virtual get_copy. -pushing conditions into HAVING; Methods: 1. TABLE_LIST::check_pushable_cond_for_table: Mark subformulas of a condition unusable for the condition pushed into table. 2. TABLE_LIST::build_pushable_cond_for_table: Build condition extractable from the given one depended only on this table. -pushing conditions into WHERE; Methods: 1. st_select_lex::check_cond_extraction_for_grouping_fields: For a condition check possibility of exraction a formula over grouping fields. 2. st_select_lex::extract_cond_for_grouping_fields: Build condition extractable from the given one. -pushing conditions into embedded derived tables; -using equalities to extract pushable conditions; -implementation of the case when derived table with UNION is used; -tests: derived_cond_pushdown.test; -comments to each new method; The code is ready to be merged into MariaDB 10.2. Here is the list of commits on github: https://github.com/MariaDB/server/compare/10.2...shagalla:10.2-mdev9197 And here is the commit with the consolidated patch: https://github.com/MariaDB/server/compare/10.2...shagalla:10.2-mdev9197-cons

            The consolidated patch was submitted as a contribution for MariaDB server on 08-23-2016.

            shagalla Galina Shalygina (Inactive) added a comment - - edited The consolidated patch was submitted as a contribution for MariaDB server on 08-23-2016.
            igor Igor Babaev added a comment -

            The code for this task has been pushed into 10.2.2.

            igor Igor Babaev added a comment - The code for this task has been pushed into 10.2.2.
            igor Igor Babaev added a comment -

            The feature was pushed into the 10.2 tree and will appear in the 10.2.2 release.

            igor Igor Babaev added a comment - The feature was pushed into the 10.2 tree and will appear in the 10.2.2 release.
            Krishnadas Krishnadas added a comment -

            DBS test results MDEV-9197.zip

            Krishnadas Krishnadas added a comment - DBS test results MDEV-9197.zip

            People

              igor Igor Babaev
              psergei Sergei Petrunia
              Votes:
              8 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.