Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-5123

Remove duplicated conditions pushed both to join_tab->select_cond and join_tab->cache_select->cond for blocked joins.

Details

    • Task
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 10.0.5
    • None
    • None

    Description

      If there is a condition that depends only on the second table in a blocked join,
      the following blocked join algorithms (BNL, BNLH) use this condition to pre-filter records from this table prior to joining the records.

      Currently the optimizer extracts such a single table condition independently from the generic condition pushdown. As a result the same condition may be evaluated up to two times more. This is a problem if the condition is expensive. It also makes it problematic to move the condition to an optimal partial join (see MDEV-83).

      Attachments

        Issue Links

          Activity

            timour Timour Katchaounov (Inactive) created issue -
            timour Timour Katchaounov (Inactive) made changes -
            Field Original Value New Value
            timour Timour Katchaounov (Inactive) made changes -
            Status Open [ 1 ] In Progress [ 3 ]

            "Generic" pushdown conditions are attached to specific joins via the member: JOIN_TAB::select_cond.
            The filter conditions used by blocked join algorithms are attached to JOIN_TAB::cache_select::cond.

            Currently the extraction of these conditions happens in the following places in the optimizer:

            JOIN::optimize_inner calls:

            • make_join_select
              Step #2: Extract WHERE/ON parts
              ...
              tmp= make_cond_for_table(thd, cond, used_tables, current_map,
              FALSE, FALSE);
              ...
              if (tmp || !cond || tab->type == JT_REF || tab->type == JT_REF_OR_NULL ||
              tab->type == JT_EQ_REF || first_inner_tab)
              {
              ...
              else if (tab->type == JT_ALL && ! use_quick_range)
              Unknown macro: { .... if (i != join->const_tables && tab->use_quick != 2 && !tab->first_inner) { /* Read with cache */ if (tab->make_scan_filter()) DBUG_RETURN(1); } .... }

              }

            • make_join_readinfo
            • check_join_cache_usage_for_tables()
              if (!tab->hash_join_is_possible() ||
              tab->make_scan_filter())
              goto no_join_cache;
              ...
              tab->cache= new JOIN_CACHE_BNLH(...)
            timour Timour Katchaounov (Inactive) added a comment - "Generic" pushdown conditions are attached to specific joins via the member: JOIN_TAB::select_cond. The filter conditions used by blocked join algorithms are attached to JOIN_TAB::cache_select::cond. Currently the extraction of these conditions happens in the following places in the optimizer: JOIN::optimize_inner calls: make_join_select Step #2: Extract WHERE/ON parts ... tmp= make_cond_for_table(thd, cond, used_tables, current_map, FALSE, FALSE); ... if (tmp || !cond || tab->type == JT_REF || tab->type == JT_REF_OR_NULL || tab->type == JT_EQ_REF || first_inner_tab) { ... else if (tab->type == JT_ALL && ! use_quick_range) Unknown macro: { .... if (i != join->const_tables && tab->use_quick != 2 && !tab->first_inner) { /* Read with cache */ if (tab->make_scan_filter()) DBUG_RETURN(1); } .... } } make_join_readinfo check_join_cache_usage_for_tables() if (!tab->hash_join_is_possible() || tab->make_scan_filter()) goto no_join_cache; ... tab->cache= new JOIN_CACHE_BNLH(...)
            timour Timour Katchaounov (Inactive) made changes -
            Description If there is a condition that depends only on the second table in a blocked join,
            the following blocked join algorithms (BNL, BNLH, BKAH) use this condition to pre-filter records from this table prior to joining the records.

            Currently the optimizer extracts such a single table condition independently from the generic condition pushdown. As a result the same condition may be evaluated up to two times more. This is a problem if the condition is expensive. It also makes it problematic to move the condition to an optimal partial join (see MDEV-83).
            If there is a condition that depends only on the second table in a blocked join,
            the following blocked join algorithms (BNL, BNLH) use this condition to pre-filter records from this table prior to joining the records.

            Currently the optimizer extracts such a single table condition independently from the generic condition pushdown. As a result the same condition may be evaluated up to two times more. This is a problem if the condition is expensive. It also makes it problematic to move the condition to an optimal partial join (see MDEV-83).

            Low-level design:

            1. Split JOIN_TAB::select_cond into two conditions:

            • select_cond_for_table
            • select_cond

            2. Instead of extracting the whole pushdown condition, make_join_select extracts the above two conditions.
            The current call:
            tmp= make_cond_for_table(thd, cond, used_tables, current_map,
            FALSE, FALSE);
            is substituted by two calls:
            select_cond_for_table= make_cond_for_table(thd, cond, (const_table_map | table->map), table->map, FALSE, TRUE);
            select_cond= make_cond_for_table(thd, cond, used_tables, current_map, FALSE, FALSE);

            3. The calls to make_scan_filter() will not extract the single table condition they used to.
            Instead they will use the already extracted condition JOIN_TAB::select_cond_for_table.
            Once select_cond_for_table is used, set it to NULL.

            4. In the end of make_join_read_info set the final pushdown condition to:
            (select_cond_for_table AND select_cond).

            timour Timour Katchaounov (Inactive) added a comment - Low-level design: 1. Split JOIN_TAB::select_cond into two conditions: select_cond_for_table select_cond 2. Instead of extracting the whole pushdown condition, make_join_select extracts the above two conditions. The current call: tmp= make_cond_for_table(thd, cond, used_tables, current_map, FALSE, FALSE); is substituted by two calls: select_cond_for_table= make_cond_for_table(thd, cond, (const_table_map | table->map), table->map, FALSE, TRUE); select_cond= make_cond_for_table(thd, cond, used_tables, current_map, FALSE, FALSE); 3. The calls to make_scan_filter() will not extract the single table condition they used to. Instead they will use the already extracted condition JOIN_TAB::select_cond_for_table. Once select_cond_for_table is used, set it to NULL. 4. In the end of make_join_read_info set the final pushdown condition to: (select_cond_for_table AND select_cond).

            So, after this task, instead of extracting one condition:

            table_n_cond = make_cond_for_table(....)

            we will always extract two:

            table_n_cond_for_table = ...
            table_n_select_cond =...

            and will then use

            table_n_cond_for_table AND table_n_select_cond

            this approach may cause a problem:

            "table_n_cond" is not always the same as "table_n_cond_for_table AND table_n_select_cond"

            psergei Sergei Petrunia added a comment - So, after this task, instead of extracting one condition: table_n_cond = make_cond_for_table(....) we will always extract two: table_n_cond_for_table = ... table_n_select_cond =... and will then use table_n_cond_for_table AND table_n_select_cond this approach may cause a problem: "table_n_cond" is not always the same as "table_n_cond_for_table AND table_n_select_cond"
            timour Timour Katchaounov (Inactive) made changes -
            Attachment md5123.1.diff [ 23902 ]

            The attached diff implements the above incomplete idea.

            timour Timour Katchaounov (Inactive) added a comment - The attached diff implements the above incomplete idea.

            The task will be implemented according the following idea porposed by Sergey Petrunia:

            for each top-level AND-item in select_cond
            {
            if (item->used_tables() is covered by current_table | const_tables)
            remove item from select_cond
            }

            These conjuncts can be removed from select_cond because they are false
            if cache_cond->select is false. This is so, because cache_cond->select is the
            maximal condition that depends only on (current_table | const_tables).

            timour Timour Katchaounov (Inactive) added a comment - - edited The task will be implemented according the following idea porposed by Sergey Petrunia: for each top-level AND-item in select_cond { if (item->used_tables() is covered by current_table | const_tables) remove item from select_cond } These conjuncts can be removed from select_cond because they are false if cache_cond->select is false. This is so, because cache_cond->select is the maximal condition that depends only on (current_table | const_tables).
            timour Timour Katchaounov (Inactive) made changes -
            Status In Progress [ 3 ] Stalled [ 10000 ]

            Had a review discussion on the latest variant of the patch. Approved, after requested changes are implemented.

            psergei Sergei Petrunia added a comment - Had a review discussion on the latest variant of the patch. Approved, after requested changes are implemented.

            The task is pushed to 5.5-timour (because this is my only tree that runs under buildbot).

            • All review comments implemented,
            • All tests pass,
            • Submitted for testing to Elena.
            timour Timour Katchaounov (Inactive) added a comment - The task is pushed to 5.5-timour (because this is my only tree that runs under buildbot). All review comments implemented, All tests pass, Submitted for testing to Elena.
            • reviewed by Sergey Petrunia
            • all review comments implemented
            • tested by Elena
            • pushed to 10.0-base (10.0.5)
            timour Timour Katchaounov (Inactive) added a comment - reviewed by Sergey Petrunia all review comments implemented tested by Elena pushed to 10.0-base (10.0.5)
            timour Timour Katchaounov (Inactive) made changes -
            Fix Version/s 10.0.5 [ 13201 ]
            Fix Version/s 10.0.6 [ 13202 ]
            Resolution Fixed [ 1 ]
            Status Stalled [ 10000 ] Closed [ 6 ]
            serg Sergei Golubchik made changes -
            Workflow defaullt [ 29249 ] MariaDB v2 [ 43423 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Workflow MariaDB v2 [ 43423 ] MariaDB v3 [ 62518 ]
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 62518 ] MariaDB v4 [ 132204 ]

            People

              timour Timour Katchaounov (Inactive)
              timour Timour Katchaounov (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.