Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-31356

Range cost calculations does not take into acount join_buffer

Details

    • Bug
    • Status: Closed (View Workflow)
    • Critical
    • Resolution: Fixed
    • 11.0(EOL)
    • 11.0.3, 11.1.2
    • Optimizer
    • None

    Description

      Range cost cost calculations does not take into account join_buffer.

      When 11.0 optimizer was coded, the assumption was that RANGE queries will not use the join buffer (and join buffer flag was not set for these).
      The optimizer would however decide later to use join buffer, which means that the cost
      calculations for ranges did not match reality, which could mean that the optimizer could
      use a worse plan.

      The task is to fix that range + join_buffer cost is calculated correctly.

      This is a bit complex to do as the 'range total cost' that best_access_plan() is not specific
      enough to allow correct calculation of range + join_buffer.

      Attachments

        Issue Links

          Activity

            MDEV-31356: Range cost calculations does not take into account join_buffer

            This patch also fixes
            MDEV-31391 Assertion `((best.records_out) == 0.0 ... failed

            Cost changes caused by this change:

            • range queries with join buffer now have a notable smaller cost.
            • range ranges are bit more expensive as the MULTI_RANGE_COST is now
              properly applied to it in all cases (this extra cost is equal to a
              key lookup).
            • table scan cost is slight smaller as we now assume data is cached in
              the engine after the first scan pass. (We did this before for range
              scans and other access methods).
            • partition tables had wrong values for max_row_blocks and
              max_index_blocks. Correcting this, causes range access on
              partitioned tables to have slightly higher cost because of the
              increased estimated IO.
            • Using first match + join buffer caused 'filtered' to be calcualted
              wrong. (Only affected EXPLAIN, not query costs).
            • Added cost_without_join_buffer to optimizer_trace.
            • check_quick_select() adjusted the number of rows according to persistent
              statistics, but did not adjust cost. Now fixed.

            The big change in the patch are:

            • In best_access_path(), where we now are using storing the cost in
              'ALL_READ_COST cost' and only converting it to a double at the end.
              This allows us to more exactly calculate the effect of the join_cache.
            • In JOIN_TAB::estimate_scan_time(), store the cost also in a
              ALL_READ_COST object.

            One of effect if this change is that when joining very small tables:

            t1 some_access_method
            t2 range
            t3 ALL Use join buffer

            This is swiched to

            t1 some_access_method
            t3 ALL
            t2 range use join buffer

            Both plans has the same cost, but as table scan in this case has less
            cost than rang, the table scan will be considered first and thus have
            precidence.

            Test case changes:

            • optimizer_trace - Addition of cost_without_join_buffer
            • subselect_mat_cost_bugs - Small tables and scan versus range
            • range & range_mrr_icp - Range + join_cache is faster than ref
            • optimizer_trace - cost_without_join_buffer, smaller scan cost,
              range setup cost.
            • mrr
            monty Michael Widenius added a comment - MDEV-31356 : Range cost calculations does not take into account join_buffer This patch also fixes MDEV-31391 Assertion `((best.records_out) == 0.0 ... failed Cost changes caused by this change: range queries with join buffer now have a notable smaller cost. range ranges are bit more expensive as the MULTI_RANGE_COST is now properly applied to it in all cases (this extra cost is equal to a key lookup). table scan cost is slight smaller as we now assume data is cached in the engine after the first scan pass. (We did this before for range scans and other access methods). partition tables had wrong values for max_row_blocks and max_index_blocks. Correcting this, causes range access on partitioned tables to have slightly higher cost because of the increased estimated IO. Using first match + join buffer caused 'filtered' to be calcualted wrong. (Only affected EXPLAIN, not query costs). Added cost_without_join_buffer to optimizer_trace. check_quick_select() adjusted the number of rows according to persistent statistics, but did not adjust cost. Now fixed. The big change in the patch are: In best_access_path(), where we now are using storing the cost in 'ALL_READ_COST cost' and only converting it to a double at the end. This allows us to more exactly calculate the effect of the join_cache. In JOIN_TAB::estimate_scan_time(), store the cost also in a ALL_READ_COST object. One of effect if this change is that when joining very small tables: t1 some_access_method t2 range t3 ALL Use join buffer This is swiched to t1 some_access_method t3 ALL t2 range use join buffer Both plans has the same cost, but as table scan in this case has less cost than rang, the table scan will be considered first and thus have precidence. Test case changes: optimizer_trace - Addition of cost_without_join_buffer subselect_mat_cost_bugs - Small tables and scan versus range range & range_mrr_icp - Range + join_cache is faster than ref optimizer_trace - cost_without_join_buffer, smaller scan cost, range setup cost. mrr

            People

              monty Michael Widenius
              monty Michael Widenius
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.