[MDEV-8306] Complete cost-based optimization for ORDER BY with LIMIT Created: 2015-06-11 Updated: 2023-12-16 |
|
| Status: | Stalled |
| Project: | MariaDB Server |
| Component/s: | Optimizer |
| Fix Version/s: | 11.5 |
| Type: | New Feature | Priority: | Major |
| Reporter: | Sergei Petrunia | Assignee: | Sergei Petrunia |
| Resolution: | Unresolved | Votes: | 22 |
| Labels: | optimizer, optimizer-feature, order-by-optimization | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Sub-Tasks: |
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Description |
|
A long standing (and informally known) issue: Join optimizer makes its choices [almost] without regard for ORDER BY ... LIMIT clause. ORDER BY ... LIMIT optimizer is invoked when the join order is already fixed. If the picked join order doesn't allow to resolve ORDER BY ... LIMIT efficiently... then we end up with a very poor query plan. Example:
This uses filesort and takes ~8 sec.
This uses index to resolve the ORDER BY ... LIMIT and the select takes 0.01 sec to execute. Dataset:
|
| Comments |
| Comment by Sergei Petrunia [ 2015-06-11 ] | |||||||||||||||||||||||||||||||||||||||||||
|
This can be solved by making the optimizer do this: 1. Run the join optimizer (as it is currently done) Unclear parts: 3.1 may also need to include the case where one runs 3.3 is the most complex and needs to be elaborated on. | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2015-11-09 ] | |||||||||||||||||||||||||||||||||||||||||||
|
Another problematic issue is WHERE clause. It is easy to make a decision to switch to using index-based resolution to ORDER BY ... LIMIT N when you assume that the first N rows you enumerate will match the WHERE clause. If the WHERE condition is very selective, one may end up enumerating many more rows before they find the first N rows that match the WHERE. EITS gives the optimizer information about condition selectivity, but does it give enough? The way EITS works currently, it is difficult to tell whether EITS plus join operations "took into account" selectivities of all parts of WHERE clause or there were some un-handled conditions. | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2016-10-11 ] | |||||||||||||||||||||||||||||||||||||||||||
|
See also: http://bugs.mysql.com/bug.php?id=83323 | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Golubchik [ 2017-08-06 ] | |||||||||||||||||||||||||||||||||||||||||||
|
Two ideas about properly taking ORDER BY (but not LIMIT) into account in join optimizer. 1. Join optimizer is doing depth-first search. First it builds some join order, then tries other partial orders and compares costs. When a very first some join order is built, optimizer has the estimate of number of rows in the result. At that point it can estimate the cost of a filesort. Later when comparing plans (or partial plans), it should add the cost of a filesort to those plans that will end up using filesort. This way it can to a fair comparison of a plan-with-filesort and a plan that uses index and no filesort. One of the weak points of this idea, it assumes that independently from the join order, optimizer will always get the same (or similar) estimate for the number of rows in the result. | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Golubchik [ 2017-08-06 ] | |||||||||||||||||||||||||||||||||||||||||||
|
2. Another idea, calculate three plans in parallel. That is, currently, best_access_path() does
Instead it can do the following:
basically, it finds three best plans — one that uses no filesort, one that uses filesort after the first table, and one that does temptable and filesort at the end. When join optimizer is finished, we can select the best plan out of three, taking filesort cost into account. | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2019-05-03 ] | |||||||||||||||||||||||||||||||||||||||||||
|
Takeaways from browsing through some more-or-less related papers : "Interesting orders"Traditional query optimizer relies on the concept of "interesting orders". ORDER BY/GROUP BY specify "interesting orders", for example. The optimizer tries to keep (ie to not prune away) the query plan [fragments] that produce rows in interesting orders. For example, if there is a query plan [fragment] QP1 that produces an interesting order, and a query plan QP2 (or a fragment for the same part of the query) which is cheaper but doesn't produce the same interesting order, then QP1 is not pruned in favor of QP2. Peculiar property of the LIMIT-n constructConsider the process of constructing a regular join plan for "t1 JOIN t2 JOIN t3". We start with "t1 JOIN t2", and compute its cost. Now, consider "t1 JOIN t2 JOIN t3 ORDER BY ... LIMIT N". Let's assume that short-cutting is possible (execution stops after producing N What value of LIMIT should we use? It depends on the fanout of t2, t3 (which we do not know). We could get an estimate of fanout by running the join optimizer and checking how many rows are expected to be in the join output. @serg's comment above was talking about this also. Simplified approach - satisfy ORDER BY, ignore the LIMITCan we just assume that any query plan that produces the matching order is good enough? | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2019-05-07 ] | |||||||||||||||||||||||||||||||||||||||||||
|
Notes from the optimizer call: | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2019-06-05 ] | |||||||||||||||||||||||||||||||||||||||||||
|
Details about the approach to be taken, termed as Order by limit (bushy plan) General description
So an approach for the bushy plan(if chosen as the best plan) would be like
Note: the bush strategy allows to shortcut the execution for LIMIT early:
| |||||||||||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2019-06-07 ] | |||||||||||||||||||||||||||||||||||||||||||
Optimizer partTake the LIMIT part of ORDER BY LIMIT into account during join optimizationThe biggest difference in query execution for ORDER BY … LIMIT plans comes from ability (or inability) to terminate the query execution early, as soon as #LIMIT matches were found. Two issues for the optimizer here:
Suggested answer for the second: run the join optimizer, compute the total join output cardinality. Then, we can see which fraction of join output we will need and assume that we will need to do the same fraction of work for each table (TODO: our optimizer typically heavily over-estimates cardinality due to unknown selectivity of predicates. Will this cause an overly-optimistic estimates?) (TODO2: another suggestion is to Add a sort operation when join prefix allows itWe can add a sorting operation if all of the columns that are required to compute the sort key are in the join prefix. Note that the check should take into account equality propagation (See Example: WHERE For each reference to tblX.columnY in ORDER BY list, we get:
Then, when we have a join prefix, we can check if it covers all the table_map values that we have. 1.3 Cost of sorting No join buffering in ordered-join modeAfter the sorting nest has been added, the rest of the join is performed in ordered-join mode (first “table” is sorted, the join output is sorted too). Join buffering and BKA must be disabled here. Representation of the sort operation in the JOIN prefix 1.6 Enumeration of different sorting options
And then consider adding table t3, We will add it afterwards:
No overlaps with semi/outer joinsIt’s difficult to handle the situation where the sorting operation comes in the middle of outer join or semi-join processing. SELECT * FROM t1 LEFT JOIN (t2, t3) ON … It is hard to make a sort bush of {t1, t2} here. The suggestion is to not allow this. | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2019-06-10 ] | |||||||||||||||||||||||||||||||||||||||||||
Plan refinement partNo linked join buffers across the bush boundJoin buffering code should be adjusted to not do this. Create temp. tableFigure out the set of fields we need to be saved from the tables inside the sort nest, and create a temporary table with those fields. Access to tables inside the sort nest from post-sorting contextQuery execution with sort nest: When we are at Step 3 and 4, we may need to read columns of tables inside the sort nest. Possible solutions:The “Unpack” solution (aka copy-back)After we read a row from the temporary table, put the values back into the record buffers of the original tables. Advantages: no need to modify any Item objects. The “refer-to-temptable like GROUP BY does” solutionThis is used for example by GROUP BY code. When a tableX.fieldY is accessed from post-group operation context (e.g. when computing the HAVING clause), it is done as follows: The Item_field referring to tableX.fieldY is wrapped inside Item_ref object. Advantages: no need to copy The “refer to temptable but not like GROUP BY does it” solution.Is there anything that prevents us from changing Item_field objects to point to temptable directly? (This change will need to be un-done for SP re-execution). Advantages: no need to copy | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2019-06-11 ] | |||||||||||||||||||||||||||||||||||||||||||
|
The branch is 10.5-mdev8306 | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2019-06-25 ] | |||||||||||||||||||||||||||||||||||||||||||
|
While testing for the optimizer part where we enumerate the plans, I found two issues, opened sub-tasks MDEV-19853 and MDEV-19854. | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2019-07-03 ] | |||||||||||||||||||||||||||||||||||||||||||
|
Optimizer part Part 1 of the project is done, the planner now enumerates plans that would also consider adding a nest to a partial join that would satisfy the order by clause. Also I have made some progress in the second part This would like this:
2) Created the temp table with that would store the partial join | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2019-07-09 ] | |||||||||||||||||||||||||||||||||||||||||||
|
More updates
The commits are: | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2019-07-16 ] | |||||||||||||||||||||||||||||||||||||||||||
|
Updates
| |||||||||||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2019-07-23 ] | |||||||||||||||||||||||||||||||||||||||||||
|
Updates
| |||||||||||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2019-07-30 ] | |||||||||||||||||||||||||||||||||||||||||||
|
Tasks left:
| |||||||||||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2019-08-20 ] | |||||||||||||||||||||||||||||||||||||||||||
|
Issue with DEPENDENT SUBQUERY Case 1: when the subquery is attached to the tables inside the nest Query:
x refers to the item abs(t1.a) as x in the parent select The idea to fix this would be:
| |||||||||||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2019-08-23 ] | |||||||||||||||||||||||||||||||||||||||||||
|
So there was some more progress on taking into consideration of an index to achieve ordering for the first non-const table (MDEV-19854). Here is the output for the explain: | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2019-08-28 ] | |||||||||||||||||||||||||||||||||||||||||||
|
While testing with dbt3 I found an example where using sort-nest optimization is not picking the best accesses for the table
So with use_sort_nest=1 we use table scan on table nation while we could have use eq_ref access, somehow the optimizer is not picking it | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2019-08-28 ] | |||||||||||||||||||||||||||||||||||||||||||
|
The optimizer trace shows that table scan was preferred
| |||||||||||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2019-09-24 ] | |||||||||||||||||||||||||||||||||||||||||||
Implementation detailsFinding if the Query can consider use the cost based approach for ORDER BY with LIMIT
Equality propagation for Order BY ItemsPropagate the multiple equalities to all the items in the ORDER BY list, this
Find indexes that can resolve the ORDER BY clause Walk through all the usable indexes for a table and find a map of keys for
Get the cardinality estimate for the join Run the join planner to get the estimate of the cardinality of the join.
Run the join planner to get the best plan that will consider using a sort nest optimizer_prune_level=0 For each prefix if the ORDER BY clause can be resolved by the tables
c) Apply the limit selectivity. This is done because after the nest
d) Also special case when one can use indexes to resolve ORDER BY
After the best plan is picked, we create the sort-nest info structure if the best plan uses a sort-nest.
For the first non-const table that uses an index to resolve ordering setup range/index scan
Creating the temp table for the sort-nest We create a list of field items of the tables inside the nest which are needed
Join buffering is disabled as it can break ordering
Substitution of items belonging to tables inside the nest with the items of the temp table of the nest Here we substitute the items of the tables inside the sort-nest with items
Materializing the result inside the temp table of the sort-nest At the execution phase we materialize the result of the inner tables inside
| |||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2020-01-31 ] | |||||||||||||||||||||||||||||||||||||||||||
|
Before MDEV-8306, best_extension_by_limited_search had this code:
Now, I see only the comment is left:
Does this mean we can't provide the old behavior? (Yes I know it is ugly and incorrect) Still, being able to provide old behavior has value and we should discuss if we could do it, here. | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2020-02-02 ] | |||||||||||||||||||||||||||||||||||||||||||
|
The entire snippet is
So when the ORDER BY LIMIT optimization is not enabled we enter the branch
And then this cost of sorting is added to the cost of the current plan
So we still provide the same old behaviour when the ORDER BY LIMIT optimization is not enabled. | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2020-02-02 ] | |||||||||||||||||||||||||||||||||||||||||||
|
The rebased code on top of 10.5 is in the branch 10.5-mdev8306-2 | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2020-09-19 ] | |||||||||||||||||||||||||||||||||||||||||||
|
A related bug in MySQL to look at : https://bugs.mysql.com/bug.php?id=97001 | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2021-02-09 ] | |||||||||||||||||||||||||||||||||||||||||||
|
The updated branch is on 10.6-order_by_limt |