Details
- Type: Bug
- Status: Closed
- Priority: Critical
- Resolution: Fixed
- Fix Version: 10.6
Description
Task setting
This is a subset of MDEV-8306 that hits the customer.
They have a query of the form

  select ...
  from big_table
  join small_table1 on small_table1.key=big_table.key
  join small_table2 on small_table2.key=big_table.key
  ... and so forth ...
  order by big_table.key limit N

for a very small value of N.
The optimizer sets

  join->sort_by_table= (big_table);

so join optimization adds an extra cost to query plans that do not start from big_table (denote this extra cost TMP-TABLE-COST).
But this doesn't help in all cases. The join orders

  big_table, small_table1, small_table2, ...

and

  small_table1, big_table (using ref access), small_table2, ...

have close costs. And indeed, the execution times for the two join orders (with ORDER BY ... LIMIT removed) are close.
TMP-TABLE-COST is a small fraction of the join cost and doesn't make a difference (this seems to match reality, too).
What does make a difference is short-cutting the execution due to LIMIT, but this is not taken into account.
Doing this in the general case requires everything from MDEV-8306.
The goal of this MDEV is a join-optimizer fix that adjusts the join order cost for cases with a very small limit.
The new behavior is to be controlled by a setting.
For OLTP-mostly workloads, enabling the new behavior is an obvious choice.
Implementation
First, identify sort_by_table - the table that allows LIMIT short-cutting to be used.
(Due to multiple equalities there can be several candidate tables, but we can assume it's just one table.)
Adjust the cost/cardinality only for full join orders
Let's assume we've built a complete join order (ignoring ORDER BY ... LIMIT).
Having done that, we can compare join_output_size with the LIMIT number, see which "fraction" of the join output we need, and adjust the cost accordingly.
(An alternative: try to account for ORDER BY ... LIMIT earlier, while building join prefixes. We don't attempt this.)
Adjusting join order cost and cardinality
Do it as follows:
1: If the join output is already produced in the required order, we can assume that $FRACT of join output will take $FRACT of the cost to produce. (Note: this is incorrect when using e.g. semi-join materialization or materialized derived tables. We ignore this fact for now).
Otherwise, we have two options:
2A: Change the access method for sort_by_table so that we do get rows in the desired order. After that, we can take a fraction of the cost as described above.
2B: Pass sort_by_table to filesort. We will need to spend the entire cost of reading sort_by_table, but then we will be able to short-cut joining with the subsequent tables.
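The adjustment rules above can be sketched as a toy cost model. This is only an illustration of the arithmetic, not server code; the function and parameter names are invented for the example:

```python
def adjusted_cost(full_cost, join_output_rows, limit,
                  ordered, first_table_cost=None):
    """Toy model of the LIMIT cost adjustment.

    full_cost        - cost of the complete join order
    join_output_rows - estimated number of rows the join produces
    limit            - the ORDER BY ... LIMIT row count
    ordered          - True if rows already arrive in the required
                       order (case 1 / 2A)
    first_table_cost - cost of reading sort_by_table in full; only
                       needed for the filesort case (2B)
    """
    # $FRACT: the fraction of the join output we actually need.
    fract = min(1.0, limit / join_output_rows)
    if ordered:
        # Case 1 / 2A: producing $FRACT of the output is assumed to
        # take $FRACT of the cost.
        return full_cost * fract
    # Case 2B: sort_by_table is read (and sorted) in full, but joining
    # with the subsequent tables is short-cut after LIMIT rows.
    return first_table_cost + (full_cost - first_table_cost) * fract
```

For instance, with a join of cost 1000 producing 10000 rows and LIMIT 10, case 1 yields a cost of 1.0, while case 2B with a first-table cost of 400 yields 400.6 - the full read dominates, as the text says.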
Interplay with join optimizer pruning
So, we will adjust the costs only for complete join orders. The adjustment will reduce the cost and output-records.
Join pruning compares prefix costs with full join order costs. Comparing un-adjusted costs with adjusted ones can produce wrong results:
- we can prune an un-adjusted prefix whose full join order would win after the adjustment.
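A toy numeric example (all numbers invented) of how pruning against un-adjusted costs can discard the eventual winner:

```python
# Cost of the cheapest complete plan found so far.
best_full_cost = 50.0
# Un-adjusted cost of a prefix that starts with sort_by_table.
prefix_cost = 100.0

# Classic pruning: the prefix already costs more than a known full
# plan, so it would be discarded...
pruned = prefix_cost >= best_full_cost

# ...even though the LIMIT adjustment, applied once the join order is
# complete, would shrink its cost far below the current best.
adjusted_full_cost = 5.0

assert pruned and adjusted_full_cost < best_full_cost
```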
Hooking into the join optimizer - variant 1
First, run the join optimization as usual. This produces QUERY_PLAN1 and a tight upper bound on how many records the join would produce. (If it's smaller than LIMIT, short-cutting will probably not be beneficial.)
Save join->best_read and join->best_positions.
Then, start the optimization again:
- set join->best_read= DBL_MAX,
- run the join optimization with first table = sort_by_table.
This produces QUERY_PLAN2, a query plan that can take advantage of short-cutting.
Normally it's more expensive than QUERY_PLAN1.
Now, account for LIMIT short-cutting as specified in the "Adjusting join order cost and cardinality" section. This produces the cost of QUERY_PLAN2-WITH-SHORTCUTTING, which we can compare with the cost of QUERY_PLAN1.
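The two-pass driver of variant 1 can be sketched as follows. The helper names (optimize_join, shortcut_adjust) are invented stand-ins for the real optimizer entry points:

```python
def choose_plan(optimize_join, shortcut_adjust, limit):
    """Sketch of variant 1.

    optimize_join(first_table) runs the join optimizer, optionally
    forcing the first table, and returns (plan, cost, out_rows).
    shortcut_adjust(plan, cost, out_rows, limit) applies the LIMIT
    adjustment described earlier.
    """
    # Pass 1: ordinary optimization -> QUERY_PLAN1.
    plan1, cost1, rows1 = optimize_join(first_table=None)
    if rows1 <= limit:
        return plan1  # short-cutting will probably not help

    # Pass 2: restart (the server would reset join->best_read to
    # DBL_MAX here) with sort_by_table first -> QUERY_PLAN2.
    plan2, cost2, rows2 = optimize_join(first_table="sort_by_table")

    # Account for LIMIT short-cutting and compare with QUERY_PLAN1.
    cost2_adj = shortcut_adjust(plan2, cost2, rows2, limit)
    return plan2 if cost2_adj < cost1 else plan1
```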
Hooking into the join optimizer - variant 2
Make the join optimizer first start with sort_by_table as the first table.
Once we have constructed a full join order, perform #rows/cost adjustments for LIMIT.
We can set the adjusted cost as join->best_read.
Then, proceed to run the join optimizer as usual.
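Variant 2 can be sketched the same way; again the helper names are invented, and the adjusted cost plays the role of join->best_read as a pruning bound for the ordinary run:

```python
def choose_plan_v2(optimize_join_with_bound, shortcut_adjust, limit):
    """Sketch of variant 2.

    optimize_join_with_bound(first_table, cost_bound) runs the join
    optimizer, discarding any plan costlier than cost_bound, and
    returns (plan, cost, out_rows), or None if nothing beats the bound.
    """
    # Step 1: build a full join order starting from sort_by_table.
    plan2, cost2, rows2 = optimize_join_with_bound("sort_by_table",
                                                   float("inf"))
    # Step 2: apply the LIMIT adjustment and use the result as the
    # starting bound (join->best_read) for the ordinary run.
    bound = shortcut_adjust(cost2, rows2, limit)
    result = optimize_join_with_bound(None, bound)
    # Step 3: if no ordinary plan beats the adjusted bound, keep the
    # short-cutting plan.
    return result[0] if result is not None else plan2
```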
Account for risks
QUERY_PLAN2-WITH-SHORTCUTTING is inherently more "risky". For example, if the join has a condition with selectivity=0.5 that the optimizer is not aware of, the actual cost of producing #LIMIT rows will be 1/0.5=2 times higher.
To account for this, we will add a multiplier, e.g. use QUERY_PLAN2-WITH-SHORTCUTTING only if it promises a 100x speedup over QUERY_PLAN1.
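The risk guard amounts to a simple ratio test (the linked follow-up issues suggest this threshold surfaced as the optimizer_join_limit_pref_ratio setting; the helper below is just an illustration):

```python
def prefer_shortcut_plan(cost_plan1, cost_plan2_shortcut, ratio=100):
    """Use the short-cutting plan only if it promises at least a
    `ratio`-times speedup over the ordinary plan (risk multiplier)."""
    return cost_plan1 >= cost_plan2_shortcut * ratio
```

With ratio=100, a short-cutting plan costed at 5 beats an ordinary plan costed at 1000 (200x speedup), but one costed at 20 does not (only 50x).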
Issue Links
- causes:
  - MDEV-35072 Assertion with optimizer_join_limit_pref_ratio and 1-table select (Closed)
  - MDEV-35164 optimizer_join_limit_pref_ratio: assertion when the ORDER BY table becomes constant (Closed)