Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-34720

Poor plan choice for large JOIN with ORDER BY and small LIMIT

    XMLWordPrintable

Details

    Description

      Task setting

      This is a subset of MDEV-8306 that hits the customer.
      They have

      select 
      ...
      from 
         big_table
         join small_table1 on small_table1.key=big_table.key
         join small_table2 on small_table2.key=big_table.key
         ... and so forth ...
      order by bug_table.key limit N
      

      for a very small value of N.
      The optimizer sets

        join->sort_by_table = (big_table);
      

      so join optimization adds extra costs for query plans that do not start from big_table (denote this TMP-TABLE-COST).

      But this doesn't help in all cases. Join orders of

         big_table, small_table1, small_table_2 ....
      

      and

         small_table1, big_table (using ref access), small_table2, ...
      

      have close costs. And indeed, the execution times for the two join orders (with ORDER BY ... LIMIT removed) are close.
      TMP-TABLE-COST is a small fraction of join cost and doesn't make a difference (this seems to match the reality, too).

      What does make a difference is short-cutting of execution due to LIMIT, but this is not taken into account.

      Doing it in general case requires everything from MDEV-8306.

      This MDEV is to have a fix for the join optimizer which adjusts the join order cost for cases with very small limit.
      The new behavior is to be controlled by a setting.
      If one runs OLTP-mostly workloads, enabling new behavior is an obvious choice.

      Implementation

      First, identify sort_by_table - the table that allows to use LIMIT short-cutting.
      (Due to multiple equalities there can be multiple but we can assume it's just one table)

      Do adjustment of cost/cardinality only for full join orders

      Let's assume we've built a complete join order (ignoring ORDER BY ... LIMIT).
      Having done that, we can compare join_output_size with the LIMIT number and see which "fraction" of join output we need and adjust the cost accordingly.

      (An alternative to this: try to account for ORDER BY ...LIMIT earlier, while building join prefixes. We don't try to do this).

      Adjusting join order cost and cardinality

      Do it as follows:
      1: If the join output is already produced in the required order, we can assume that $FRACT of join output will take $FRACT of the cost to produce. (Note: this is incorrect when using e.g. semi-join materialization or materialized derived tables. We ignore this fact for now).

      Otherwise, we have two options:

      2A: Change the access method for sort_by_table so that we do get rows in the desired order. After that, we can take a fraction of the cost like described above.

      2B. Pass sort_by_table to filesort. We will need to spend the entire cost of reading sort_by_table but then we will be able to short-cut joining with subsequent tables.

      Interplay with join optimizer pruning

      So, we will adjust the costs only for complete join orders. The adjustment will reduce the cost and output-records.

      Join pruning compares costs of prefix with full join order costs. Comparing un-adjusted costs with adjusted can produce wrong results.

      • we can prune un-adjusted prefix that will be adjusted

      Hooking into the join optimizer - variant 1

      First, run the join optimization as usual. This produces QUERY_PLAN1 and a tight upper bound of how many records the join would produce. (If it's smaller than LIMIT, short-cutting will probably not be beneficial).

      Save join->best_read and join->best_positions.

      Then, start the optimization again:
      set join->best_read= DBL_MAX,
      Run the join optimization with first table=sort_by_table.
      This produces QUERY_PLAN2 query plan that can take advantage of short-cutting.
      Normally it's more expensive than QUERY_PLAN1.

      Now, account for LIMIT short-cutting as specified in Adjusting join order's cost and cardinality section. This produces cost of QUERY_PLAN2-WITH-SHORTCUTTING. We can compare it with the cost of QUERY_PLAN1.

      Hooking into the join optimizer - variant 2

      Make the join optimizer first start with sort_table as the first table.
      Once we have constructed a full join order, perform #rows/cost adjustments for LIMIT.
      We can set the adjusted cost as join->best_read.

      Then, proceed to run the join optimizer as usual.

      Account for risks

      QUERY_PLAN2-WITH-SHORTCUTTING is inherently more "risky". For example, if the join has a condition with selectivity=0.5 that the optimizer is not aware of, the actual cost of producing #LIMIT rows will be 1/0.5=2 times higher.

      To account for this, we will add a multiplier, e.g. use QUERY_PLAN2-WITH-SHORTCUTTING only if promises a 100x speedup over QUERY_PLAN1.

      Attachments

        Activity

          People

            psergei Sergei Petrunia
            psergei Sergei Petrunia
            Votes:
            1 Vote for this issue
            Watchers:
            10 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.