[MDEV-34720] Poor plan choice for large JOIN with ORDER BY and small LIMIT - Jira

Sergei Petrunia created issue - 2024-08-07 12:12

Sergei Petrunia made changes - 2024-08-07 12:19

Field	Original Value	New Value
Description	h2. Task setting This is a subset of MDEV-8306 that hits the customer. They have {code:sql} select ... from big_table join small_table1 on small_table1.key=big_table.key join small_table2 on small_table2.key=big_table.key ... and so forth ... order by bug_table.key limit N {code} for a very small value of N. The optimizer sets {code:cpp} join->sort_by_table = (big_table); {code} so join optimization adds extra costs for query plans that do not start from big_table. But this doesn't help in all cases. Join orders of {code} big_table, small_table1, small_table_2 .... {code} and {code} small_table1, big_table (using ref access), small_table2, ... {code} have very close costs. What we need is a fix that takes the LIMIT clause into account. Doing it in general case requires everything from MDEV-8306. This MDEV is to have a fix for the join optimizer which adjusts the join order cost for cases with very small limit. The new behavior is controlled by a setting. If one runs OLTP-mostly workloads, enabling new behavior is an obvious choice. h2. Implementation	h2. Task setting This is a subset of MDEV-8306 that hits the customer. They have {code:sql} select ... from big_table join small_table1 on small_table1.key=big_table.key join small_table2 on small_table2.key=big_table.key ... and so forth ... order by bug_table.key limit N {code} for a very small value of N. The optimizer sets {code:cpp} join->sort_by_table = (big_table); {code} so join optimization adds extra costs for query plans that do not start from big_table. But this doesn't help in all cases. Join orders of {code} big_table, small_table1, small_table_2 .... {code} and {code} small_table1, big_table (using ref access), small_table2, ... {code} have very close costs. What we need is a fix that takes the LIMIT clause into account. Doing it in general case requires everything from MDEV-8306. This MDEV is to have a fix for the join optimizer which adjusts the join order cost for cases with very small limit. The new behavior is controlled by a setting. If one runs OLTP-mostly workloads, enabling new behavior is an obvious choice. h2. Implementation - variant1 First, run the optimization as usual. This gives an idea about how much records the join would produce. Second, run the optimization starting from the table that could use short-cutting. Note that there are two variants: 1. Use an index that will produce rows in the desired order 2. Feed the first table to filesort(), then read the sorted result. We will still be able to short-cut joining with other tables. In any case, we will get a join order that can take advantage of short-cutting.

Sergei Petrunia made changes - 2024-08-07 12:29

Assignee

Sergei Petrunia [ psergey ]

Sergei Petrunia made changes - 2024-08-07 12:29

Affects Version/s

10.6 [ 24028 ]

Sergei Petrunia made changes - 2024-08-07 12:29

Fix Version/s

10.6 [ 24028 ]

Sergei Petrunia made changes - 2024-08-07 12:29

Labels

order-by-optimization

Sergei Petrunia made changes - 2024-08-07 12:30

Link

This issue relates to TODO-4807 [ TODO-4807 ]

Sergei Petrunia made changes - 2024-08-07 12:41

Description

h2. Task setting
This is a subset of MDEV-8306 that hits the customer.
They have
{code:sql}
select
...
from
   big_table
   join small_table1 on small_table1.key=big_table.key
   join small_table2 on small_table2.key=big_table.key
   ... and so forth ...
order by bug_table.key limit N
{code}

for a very small value of N.
The optimizer sets
{code:cpp}
  join->sort_by_table = (big_table);
{code}
so join optimization adds extra costs for query plans that do not start from big_table.

But this doesn't help in all cases. Join orders of
{code}
   big_table, small_table1, small_table_2 ....
{code}
and
{code}
   small_table1, big_table (using ref access), small_table2, ...
{code}

have very close costs.
What we need is a fix that takes the LIMIT clause into account.
Doing it in general case requires everything from MDEV-8306.

This MDEV is to have a fix for the join optimizer which adjusts the join order
cost for cases with very small limit.

The new behavior is controlled by a setting.
If one runs OLTP-mostly workloads, enabling new behavior is an obvious choice.

h2. Implementation - variant1

First, run the optimization as usual. This gives an idea about how much records the join would produce.

Second, run the optimization starting from the table that could use short-cutting. Note that there are two variants:
1. Use an index that will produce rows in the desired order
2. Feed the first table to filesort(), then read the sorted result. We will still be able to short-cut joining with other tables.

In any case, we will get a join order that can take advantage of short-cutting.

h2. Task setting
This is a subset of MDEV-8306 that hits the customer.
They have
{code:sql}
select
...
from
   big_table
   join small_table1 on small_table1.key=big_table.key
   join small_table2 on small_table2.key=big_table.key
   ... and so forth ...
order by bug_table.key limit N
{code}

for a very small value of N.
The optimizer sets
{code:cpp}
  join->sort_by_table = (big_table);
{code}
so join optimization adds extra costs for query plans that do not start from big_table.

But this doesn't help in all cases. Join orders of
{code}
   big_table, small_table1, small_table_2 ....
{code}
and
{code}
   small_table1, big_table (using ref access), small_table2, ...
{code}

have very close costs.
What we need is a fix that takes the LIMIT clause into account.
Doing it in general case requires everything from MDEV-8306.

This MDEV is to have a fix for the join optimizer which adjusts the join order
cost for cases with very small limit.

The new behavior is controlled by a setting.
If one runs OLTP-mostly workloads, enabling new behavior is an obvious choice.

h2. Implementation - variant1

First, *run the join optimization as usual*. This produces QUERY_PLAN1 and a tight upper bound of how many records the join would produce. (If it's smaller than LIMIT, short-cutting will probably not be beneficial).

Then determine sort_by_table - the table that allows to use LIMIT short-cutting.

Then, *run the join optimization with first table=sort_by_table"*.
This produces QUERY_PLAN2 query plan that can take advantage of short-cutting.
Normally it's more expensive than QUERY_PLAN2 (note that we don't account for short-cutting yet).

*Now, account for short-cutting*

1. If QUERY_PLAN2 uses an index that produces rows in the desired order, we can just take a fraction of it to see how much it would take to produce #LIMIT rows. (e.g. plan produces 100 records, we have LIMIT 20 -> multiply plan cost by 20/100=0.2).

If QUERY_PLAN2 doesn't produce rows in the desired order, then

2A: consider changing access method for sort_by_table so that we do get rows in the desired order.
After that, we can take a fraction of the cost like described above.

2B. consider passing {{sort_by_table}} to filesort. We will need to spend the entire cost of reading {{sort_by_table}} but then we will be able to short-cut the join operation.

Sergei Petrunia made changes - 2024-08-07 12:41

Description

h2. Task setting
This is a subset of MDEV-8306 that hits the customer.
They have
{code:sql}
select
...
from
   big_table
   join small_table1 on small_table1.key=big_table.key
   join small_table2 on small_table2.key=big_table.key
   ... and so forth ...
order by bug_table.key limit N
{code}

for a very small value of N.
The optimizer sets
{code:cpp}
  join->sort_by_table = (big_table);
{code}
so join optimization adds extra costs for query plans that do not start from big_table.

But this doesn't help in all cases. Join orders of
{code}
   big_table, small_table1, small_table_2 ....
{code}
and
{code}
   small_table1, big_table (using ref access), small_table2, ...
{code}

have very close costs.
What we need is a fix that takes the LIMIT clause into account.
Doing it in general case requires everything from MDEV-8306.

This MDEV is to have a fix for the join optimizer which adjusts the join order
cost for cases with very small limit.

The new behavior is controlled by a setting.
If one runs OLTP-mostly workloads, enabling new behavior is an obvious choice.

h2. Implementation - variant1

First, *run the join optimization as usual*. This produces QUERY_PLAN1 and a tight upper bound of how many records the join would produce. (If it's smaller than LIMIT, short-cutting will probably not be beneficial).

Then determine sort_by_table - the table that allows to use LIMIT short-cutting.

Then, *run the join optimization with first table=sort_by_table"*.
This produces QUERY_PLAN2 query plan that can take advantage of short-cutting.
Normally it's more expensive than QUERY_PLAN2 (note that we don't account for short-cutting yet).

*Now, account for short-cutting*

1. If QUERY_PLAN2 uses an index that produces rows in the desired order, we can just take a fraction of it to see how much it would take to produce #LIMIT rows. (e.g. plan produces 100 records, we have LIMIT 20 -> multiply plan cost by 20/100=0.2).

If QUERY_PLAN2 doesn't produce rows in the desired order, then

2A: consider changing access method for sort_by_table so that we do get rows in the desired order.
After that, we can take a fraction of the cost like described above.

2B. consider passing {{sort_by_table}} to filesort. We will need to spend the entire cost of reading {{sort_by_table}} but then we will be able to short-cut the join operation.

h2. Task setting
This is a subset of MDEV-8306 that hits the customer.
They have
{code:sql}
select
...
from
   big_table
   join small_table1 on small_table1.key=big_table.key
   join small_table2 on small_table2.key=big_table.key
   ... and so forth ...
order by bug_table.key limit N
{code}

for a very small value of N.
The optimizer sets
{code:cpp}
  join->sort_by_table = (big_table);
{code}
so join optimization adds extra costs for query plans that do not start from big_table.

But this doesn't help in all cases. Join orders of
{code}
   big_table, small_table1, small_table_2 ....
{code}
and
{code}
   small_table1, big_table (using ref access), small_table2, ...
{code}

have very close costs.
What we need is a fix that takes the LIMIT clause into account.
Doing it in general case requires everything from MDEV-8306.

This MDEV is to have a fix for the join optimizer which adjusts the join order
cost for cases with very small limit.

The new behavior is controlled by a setting.
If one runs OLTP-mostly workloads, enabling new behavior is an obvious choice.

h2. Implementation - variant1

First, *run the join optimization as usual*. This produces QUERY_PLAN1 and a tight upper bound of how many records the join would produce. (If it's smaller than LIMIT, short-cutting will probably not be beneficial).

Then determine sort_by_table - the table that allows to use LIMIT short-cutting.

Then, *run the join optimization with first table=sort_by_table*.
This produces QUERY_PLAN2 query plan that can take advantage of short-cutting.
Normally it's more expensive than QUERY_PLAN2 (note that we don't account for short-cutting yet).

*Now, account for short-cutting*

1. If QUERY_PLAN2 uses an index that produces rows in the desired order, we can just take a fraction of it to see how much it would take to produce #LIMIT rows. (e.g. plan produces 100 records, we have LIMIT 20 -> multiply plan cost by 20/100=0.2).

If QUERY_PLAN2 doesn't produce rows in the desired order, then

2A: consider changing access method for sort_by_table so that we do get rows in the desired order.
After that, we can take a fraction of the cost like described above.

2B. consider passing {{sort_by_table}} to filesort. We will need to spend the entire cost of reading {{sort_by_table}} but then we will be able to short-cut the join operation.

Sergei Petrunia made changes - 2024-08-07 12:47

Description

h2. Task setting
This is a subset of MDEV-8306 that hits the customer.
They have
{code:sql}
select
...
from
   big_table
   join small_table1 on small_table1.key=big_table.key
   join small_table2 on small_table2.key=big_table.key
   ... and so forth ...
order by bug_table.key limit N
{code}

for a very small value of N.
The optimizer sets
{code:cpp}
  join->sort_by_table = (big_table);
{code}
so join optimization adds extra costs for query plans that do not start from big_table.

But this doesn't help in all cases. Join orders of
{code}
   big_table, small_table1, small_table_2 ....
{code}
and
{code}
   small_table1, big_table (using ref access), small_table2, ...
{code}

have very close costs.
What we need is a fix that takes the LIMIT clause into account.
Doing it in general case requires everything from MDEV-8306.

This MDEV is to have a fix for the join optimizer which adjusts the join order
cost for cases with very small limit.

The new behavior is controlled by a setting.
If one runs OLTP-mostly workloads, enabling new behavior is an obvious choice.

h2. Implementation - variant1

First, *run the join optimization as usual*. This produces QUERY_PLAN1 and a tight upper bound of how many records the join would produce. (If it's smaller than LIMIT, short-cutting will probably not be beneficial).

Then determine sort_by_table - the table that allows to use LIMIT short-cutting.

Then, *run the join optimization with first table=sort_by_table*.
This produces QUERY_PLAN2 query plan that can take advantage of short-cutting.
Normally it's more expensive than QUERY_PLAN2 (note that we don't account for short-cutting yet).

*Now, account for short-cutting*

1. If QUERY_PLAN2 uses an index that produces rows in the desired order, we can just take a fraction of it to see how much it would take to produce #LIMIT rows. (e.g. plan produces 100 records, we have LIMIT 20 -> multiply plan cost by 20/100=0.2).

If QUERY_PLAN2 doesn't produce rows in the desired order, then

2A: consider changing access method for sort_by_table so that we do get rows in the desired order.
After that, we can take a fraction of the cost like described above.

2B. consider passing {{sort_by_table}} to filesort. We will need to spend the entire cost of reading {{sort_by_table}} but then we will be able to short-cut the join operation.

h2. Task setting
This is a subset of MDEV-8306 that hits the customer.
They have
{code:sql}
select
...
from
   big_table
   join small_table1 on small_table1.key=big_table.key
   join small_table2 on small_table2.key=big_table.key
   ... and so forth ...
order by bug_table.key limit N
{code}

for a very small value of N.
The optimizer sets
{code:cpp}
  join->sort_by_table = (big_table);
{code}
so join optimization adds extra costs for query plans that do not start from big_table.

But this doesn't help in all cases. Join orders of
{code}
   big_table, small_table1, small_table_2 ....
{code}
and
{code}
   small_table1, big_table (using ref access), small_table2, ...
{code}

have very close costs.
What we need is a fix that takes the LIMIT clause into account.
Doing it in general case requires everything from MDEV-8306.

This MDEV is to have a fix for the join optimizer which adjusts the join order
cost for cases with very small limit.

The new behavior is controlled by a setting.
If one runs OLTP-mostly workloads, enabling new behavior is an obvious choice.

h2. Implementation - variant1

First, *run the join optimization as usual*. This produces QUERY_PLAN1 and a tight upper bound of how many records the join would produce. (If it's smaller than LIMIT, short-cutting will probably not be beneficial).

Then determine sort_by_table - the table that allows to use LIMIT short-cutting.

Then, *run the join optimization with first table=sort_by_table*.
This produces QUERY_PLAN2 query plan that can take advantage of short-cutting.
Normally it's more expensive than QUERY_PLAN2 (note that we don't account for short-cutting yet).

*Now, account for short-cutting*

1. If QUERY_PLAN2 uses an index that produces rows in the desired order, we can just take a fraction of it to see how much it would take to produce #LIMIT rows. (e.g. plan produces 100 records, we have LIMIT 20 -> multiply plan cost by 20/100=0.2).

If QUERY_PLAN2 doesn't produce rows in the desired order, then

2A: consider changing access method for sort_by_table so that we do get rows in the desired order.
After that, we can take a fraction of the cost like described above.

2B. consider passing {{sort_by_table}} to filesort. We will need to spend the entire cost of reading {{sort_by_table}} but then we will be able to short-cut the join operation.

In any case, we produce cost of QUERY_PLAN2-WITH-SHORTCUTTING and can compare it with cost of QUERY_PLAN1

h3. Account for risks

QUERY_PLAN2-WITH-SHORTCUTTING is inherently more "risky". For example, if the join has a condition with selectivity=0.5 that the optimizer is not aware of, the actual cost of producing #LIMIT rows will be 1/0.5=2 times higher.

To account for this, we will add a multiplier, e.g. use QUERY_PLAN2-WITH-SHORTCUTTING only if promises a 100x speedup over QUERY_PLAN1.

Sergei Petrunia made changes - 2024-08-07 12:50

Summary

Poor plan choice for large JOIN and small LIMIT.

Poor plan choice for large JOIN with ORDER BY and small LIMIT

Sergei Petrunia made changes - 2024-08-07 12:50

Attachment

psergey-mdev34720-1.diff [ 73919 ]

Sergei Petrunia added a comment - 2024-08-08 07:05 - edited

bb-10.6-mdev34720 has a patch that is a work-in-progress

Sergei Petrunia added a comment - 2024-08-08 07:05 - edited bb-10.6-mdev34720 has a patch that is a work-in-progress

Sergei Petrunia added a comment - 2024-08-09 07:18

What about cost model and related changes in 11.0?

test_if_skip_sort_order() function itself is almost unchanged.
it calls test_if_cheaper_ordering(). There, there are some changes, some code seems to be moved into the get_range_limit_read_cost() calls.
- Cost of sorting is added, see cost_of_filesort().
In get_range_limit_read_cost, there are changes.

Sergei Petrunia added a comment - 2024-08-09 07:18 What about cost model and related changes in 11.0? test_if_skip_sort_order() function itself is almost unchanged. it calls test_if_cheaper_ordering() . There, there are some changes, some code seems to be moved into the get_range_limit_read_cost() calls. Cost of sorting is added, see cost_of_filesort(). In get_range_limit_read_cost , there are changes.

Sergei Petrunia added a comment - 2024-08-09 07:30 - edited

Can we just build a join order and then call test_if_skip_sort_order()?
No:
even with no_changes=true parameter, test_if_skip_sort_order(... no_changes=true...) will do this:

Access tab->ref.*, tab->type. These are only usable after join optimization has finished and get_best_combination() was called.
Change tab->quick to hold different quick select.

What about other functions used from test_if_skip_sort_order?

get_range_limit_read_cost(const JOIN_TAB *tab, ... ) CAN be used at join optimization stage. It only uses tab->worst_seeks.

test_if_cheaper_ordering(const JOIN_TAB *tab, ...) CANNOT be used at join optimization stage: it uses tab->type and tab->ref which are not yet set.
- It also does this: "uint tablenr= (uint)(tab - join->join_tab);" - this is valid only after get_best_combination re-arranged to JOIN_TABs to follow the picked join order.

Sergei Petrunia added a comment - 2024-08-09 07:30 - edited Can we just build a join order and then call test_if_skip_sort_order()? No: even with no_changes=true parameter, test_if_skip_sort_order(... no_changes=true...) will do this: Access tab->ref.*, tab->type. These are only usable after join optimization has finished and get_best_combination() was called. Change tab->quick to hold different quick select. What about other functions used from test_if_skip_sort_order? get_range_limit_read_cost(const JOIN_TAB *tab, ... ) CAN be used at join optimization stage. It only uses tab->worst_seeks. test_if_cheaper_ordering(const JOIN_TAB *tab, ...) CANNOT be used at join optimization stage: it uses tab->type and tab->ref which are not yet set. It also does this: "uint tablenr= (uint)(tab - join->join_tab);" - this is valid only after get_best_combination re-arranged to JOIN_TABs to follow the picked join order.

Sergei Petrunia added a comment - 2024-08-09 07:44

A question: what is the point of looking for a suitable key in test_if_skip_sort_order() here:

  if ((new_ref_key= test_if_subkey(order, table, ref_key, ref_key_parts,

                                   &usable_keys)) < MAX_KEY)

if then in test_if_cheaper_ordering() we try all keys anyway?

Sergei Petrunia added a comment - 2024-08-09 07:44 A question: what is the point of looking for a suitable key in test_if_skip_sort_order() here: if ((new_ref_key= test_if_subkey(order, table, ref_key, ref_key_parts, &usable_keys)) < MAX_KEY) if then in test_if_cheaper_ordering() we try all keys anyway?

Sergei Petrunia made changes - 2024-08-15 15:43

Status

Open [ 1 ]

In Progress [ 3 ]

Sergei Petrunia made changes - 2024-08-16 12:50

Description

h2. Task setting
This is a subset of MDEV-8306 that hits the customer.
They have
{code:sql}
select
...
from
   big_table
   join small_table1 on small_table1.key=big_table.key
   join small_table2 on small_table2.key=big_table.key
   ... and so forth ...
order by bug_table.key limit N
{code}

for a very small value of N.
The optimizer sets
{code:cpp}
  join->sort_by_table = (big_table);
{code}
so join optimization adds extra costs for query plans that do not start from big_table.

But this doesn't help in all cases. Join orders of
{code}
   big_table, small_table1, small_table_2 ....
{code}
and
{code}
   small_table1, big_table (using ref access), small_table2, ...
{code}

have very close costs.
What we need is a fix that takes the LIMIT clause into account.
Doing it in general case requires everything from MDEV-8306.

This MDEV is to have a fix for the join optimizer which adjusts the join order
cost for cases with very small limit.

The new behavior is controlled by a setting.
If one runs OLTP-mostly workloads, enabling new behavior is an obvious choice.

h2. Implementation - variant1

First, *run the join optimization as usual*. This produces QUERY_PLAN1 and a tight upper bound of how many records the join would produce. (If it's smaller than LIMIT, short-cutting will probably not be beneficial).

Then determine sort_by_table - the table that allows to use LIMIT short-cutting.

Then, *run the join optimization with first table=sort_by_table*.
This produces QUERY_PLAN2 query plan that can take advantage of short-cutting.
Normally it's more expensive than QUERY_PLAN2 (note that we don't account for short-cutting yet).

*Now, account for short-cutting*

1. If QUERY_PLAN2 uses an index that produces rows in the desired order, we can just take a fraction of it to see how much it would take to produce #LIMIT rows. (e.g. plan produces 100 records, we have LIMIT 20 -> multiply plan cost by 20/100=0.2).

If QUERY_PLAN2 doesn't produce rows in the desired order, then

2A: consider changing access method for sort_by_table so that we do get rows in the desired order.
After that, we can take a fraction of the cost like described above.

2B. consider passing {{sort_by_table}} to filesort. We will need to spend the entire cost of reading {{sort_by_table}} but then we will be able to short-cut the join operation.

In any case, we produce cost of QUERY_PLAN2-WITH-SHORTCUTTING and can compare it with cost of QUERY_PLAN1

h3. Account for risks

QUERY_PLAN2-WITH-SHORTCUTTING is inherently more "risky". For example, if the join has a condition with selectivity=0.5 that the optimizer is not aware of, the actual cost of producing #LIMIT rows will be 1/0.5=2 times higher.

To account for this, we will add a multiplier, e.g. use QUERY_PLAN2-WITH-SHORTCUTTING only if promises a 100x speedup over QUERY_PLAN1.

h2. Task setting
This is a subset of MDEV-8306 that hits the customer.
They have
{code:sql}
select
...
from
   big_table
   join small_table1 on small_table1.key=big_table.key
   join small_table2 on small_table2.key=big_table.key
   ... and so forth ...
order by bug_table.key limit N
{code}

for a very small value of N.
The optimizer sets
{code:cpp}
  join->sort_by_table = (big_table);
{code}
so join optimization adds extra costs for query plans that do not start from big_table (denote this TMP-TABLE-COST).

But this doesn't help in all cases. Join orders of
{code}
   big_table, small_table1, small_table_2 ....
{code}
and
{code}
   small_table1, big_table (using ref access), small_table2, ...
{code}

have close costs. And indeed, the execution times for the two join orders (with ORDER BY ... LIMIT removed) are close.
TMP-TABLE-COST is a small fraction of join cost and doesn't make a difference (this seems to match the reality, too).

What does make a difference is short-cutting of execution due to LIMIT, but this is not taken into account.

Doing it in general case requires everything from MDEV-8306.

This MDEV is to have a fix for the join optimizer which adjusts the join order cost for cases with very small limit.
The new behavior is to be controlled by a setting.
If one runs OLTP-mostly workloads, enabling new behavior is an obvious choice.

h2. Implementation - variant1

First, *run the join optimization as usual*. This produces QUERY_PLAN1 and a tight upper bound of how many records the join would produce. (If it's smaller than LIMIT, short-cutting will probably not be beneficial).

Then determine sort_by_table - the table that allows to use LIMIT short-cutting.

Then, *run the join optimization with first table=sort_by_table*.
This produces QUERY_PLAN2 query plan that can take advantage of short-cutting.
Normally it's more expensive than QUERY_PLAN2 (note that we don't account for short-cutting yet).

*Now, account for short-cutting*

1. If QUERY_PLAN2 uses an index that produces rows in the desired order, we can just take a fraction of it to see how much it would take to produce #LIMIT rows. (e.g. plan produces 100 records, we have LIMIT 20 -> multiply plan cost by 20/100=0.2).

If QUERY_PLAN2 doesn't produce rows in the desired order, then

2A: consider changing access method for sort_by_table so that we do get rows in the desired order.
After that, we can take a fraction of the cost like described above.

2B. consider passing {{sort_by_table}} to filesort. We will need to spend the entire cost of reading {{sort_by_table}} but then we will be able to short-cut the join operation.

In any case, we produce cost of QUERY_PLAN2-WITH-SHORTCUTTING and can compare it with cost of QUERY_PLAN1

h3. Account for risks

QUERY_PLAN2-WITH-SHORTCUTTING is inherently more "risky". For example, if the join has a condition with selectivity=0.5 that the optimizer is not aware of, the actual cost of producing #LIMIT rows will be 1/0.5=2 times higher.

To account for this, we will add a multiplier, e.g. use QUERY_PLAN2-WITH-SHORTCUTTING only if promises a 100x speedup over QUERY_PLAN1.

Sergei Petrunia made changes - 2024-08-16 12:53

Description

h2. Task setting
This is a subset of MDEV-8306 that hits the customer.
They have
{code:sql}
select
...
from
   big_table
   join small_table1 on small_table1.key=big_table.key
   join small_table2 on small_table2.key=big_table.key
   ... and so forth ...
order by bug_table.key limit N
{code}

for a very small value of N.
The optimizer sets
{code:cpp}
  join->sort_by_table = (big_table);
{code}
so join optimization adds extra costs for query plans that do not start from big_table (denote this TMP-TABLE-COST).

But this doesn't help in all cases. Join orders of
{code}
   big_table, small_table1, small_table_2 ....
{code}
and
{code}
   small_table1, big_table (using ref access), small_table2, ...
{code}

have close costs. And indeed, the execution times for the two join orders (with ORDER BY ... LIMIT removed) are close.
TMP-TABLE-COST is a small fraction of join cost and doesn't make a difference (this seems to match the reality, too).

What does make a difference is short-cutting of execution due to LIMIT, but this is not taken into account.

Doing it in general case requires everything from MDEV-8306.

This MDEV is to have a fix for the join optimizer which adjusts the join order cost for cases with very small limit.
The new behavior is to be controlled by a setting.
If one runs OLTP-mostly workloads, enabling new behavior is an obvious choice.

h2. Implementation - variant1

First, *run the join optimization as usual*. This produces QUERY_PLAN1 and a tight upper bound of how many records the join would produce. (If it's smaller than LIMIT, short-cutting will probably not be beneficial).

Then determine sort_by_table - the table that allows to use LIMIT short-cutting.

Then, *run the join optimization with first table=sort_by_table*.
This produces QUERY_PLAN2 query plan that can take advantage of short-cutting.
Normally it's more expensive than QUERY_PLAN2 (note that we don't account for short-cutting yet).

*Now, account for short-cutting*

1. If QUERY_PLAN2 uses an index that produces rows in the desired order, we can just take a fraction of it to see how much it would take to produce #LIMIT rows. (e.g. plan produces 100 records, we have LIMIT 20 -> multiply plan cost by 20/100=0.2).

If QUERY_PLAN2 doesn't produce rows in the desired order, then

2A: consider changing access method for sort_by_table so that we do get rows in the desired order.
After that, we can take a fraction of the cost like described above.

2B. consider passing {{sort_by_table}} to filesort. We will need to spend the entire cost of reading {{sort_by_table}} but then we will be able to short-cut the join operation.

In any case, we produce cost of QUERY_PLAN2-WITH-SHORTCUTTING and can compare it with cost of QUERY_PLAN1

h3. Account for risks

QUERY_PLAN2-WITH-SHORTCUTTING is inherently more "risky". For example, if the join has a condition with selectivity=0.5 that the optimizer is not aware of, the actual cost of producing #LIMIT rows will be 1/0.5=2 times higher.

To account for this, we will add a multiplier, e.g. use QUERY_PLAN2-WITH-SHORTCUTTING only if promises a 100x speedup over QUERY_PLAN1.

h2. Task setting
This is a subset of MDEV-8306 that hits the customer.
They have
{code:sql}
select
...
from
   big_table
   join small_table1 on small_table1.key=big_table.key
   join small_table2 on small_table2.key=big_table.key
   ... and so forth ...
order by bug_table.key limit N
{code}

for a very small value of N.
The optimizer sets
{code:cpp}
  join->sort_by_table = (big_table);
{code}
so join optimization adds extra costs for query plans that do not start from big_table (denote this TMP-TABLE-COST).

But this doesn't help in all cases. Join orders of
{code}
   big_table, small_table1, small_table_2 ....
{code}
and
{code}
   small_table1, big_table (using ref access), small_table2, ...
{code}

have close costs. And indeed, the execution times for the two join orders (with ORDER BY ... LIMIT removed) are close.
TMP-TABLE-COST is a small fraction of join cost and doesn't make a difference (this seems to match the reality, too).

What does make a difference is short-cutting of execution due to LIMIT, but this is not taken into account.

Doing it in general case requires everything from MDEV-8306.

This MDEV is to have a fix for the join optimizer which adjusts the join order cost for cases with very small limit.
The new behavior is to be controlled by a setting.
If one runs OLTP-mostly workloads, enabling new behavior is an obvious choice.

h2. Implementation - variant1

First, *run the join optimization as usual*. This produces QUERY_PLAN1 and a tight upper bound of how many records the join would produce. (If it's smaller than LIMIT, short-cutting will probably not be beneficial).

Then determine sort_by_table - the table that allows to use LIMIT short-cutting.

Then, *run the join optimization with first table=sort_by_table*.
This produces QUERY_PLAN2 query plan that can take advantage of short-cutting.
Normally it's more expensive than QUERY_PLAN1 (note that we don't account for short-cutting yet).

*Now, account for short-cutting*

1. If QUERY_PLAN2 uses an index that produces rows in the desired order, we can just take a fraction of it to see how much it would take to produce #LIMIT rows. (e.g. plan produces 100 records, we have LIMIT 20 -> multiply plan cost by 20/100=0.2).

If QUERY_PLAN2 doesn't produce rows in the desired order, then

2A: consider changing access method for sort_by_table so that we do get rows in the desired order.
After that, we can take a fraction of the cost like described above.

2B. consider passing {{sort_by_table}} to filesort. We will need to spend the entire cost of reading {{sort_by_table}} but then we will be able to short-cut the join operation.

In any case, we produce cost of QUERY_PLAN2-WITH-SHORTCUTTING and can compare it with cost of QUERY_PLAN1

h3. Account for risks

QUERY_PLAN2-WITH-SHORTCUTTING is inherently more "risky". For example, if the join has a condition with selectivity=0.5 that the optimizer is not aware of, the actual cost of producing #LIMIT rows will be 1/0.5=2 times higher.

To account for this, we will add a multiplier, e.g. use QUERY_PLAN2-WITH-SHORTCUTTING only if promises a 100x speedup over QUERY_PLAN1.

Sergei Petrunia made changes - 2024-08-16 13:21

Description

h2. Task setting
This is a subset of MDEV-8306 that hits the customer.
They have
{code:sql}
select
...
from
   big_table
   join small_table1 on small_table1.key=big_table.key
   join small_table2 on small_table2.key=big_table.key
   ... and so forth ...
order by bug_table.key limit N
{code}

for a very small value of N.
The optimizer sets
{code:cpp}
  join->sort_by_table = (big_table);
{code}
so join optimization adds extra costs for query plans that do not start from big_table (denote this TMP-TABLE-COST).

But this doesn't help in all cases. Join orders of
{code}
   big_table, small_table1, small_table_2 ....
{code}
and
{code}
   small_table1, big_table (using ref access), small_table2, ...
{code}

have close costs. And indeed, the execution times for the two join orders (with ORDER BY ... LIMIT removed) are close.
TMP-TABLE-COST is a small fraction of join cost and doesn't make a difference (this seems to match the reality, too).

What does make a difference is short-cutting of execution due to LIMIT, but this is not taken into account.

Doing it in general case requires everything from MDEV-8306.

This MDEV is to have a fix for the join optimizer which adjusts the join order cost for cases with very small limit.
The new behavior is to be controlled by a setting.
If one runs OLTP-mostly workloads, enabling new behavior is an obvious choice.

h2. Implementation - variant1

First, *run the join optimization as usual*. This produces QUERY_PLAN1 and a tight upper bound of how many records the join would produce. (If it's smaller than LIMIT, short-cutting will probably not be beneficial).

Then determine sort_by_table - the table that allows to use LIMIT short-cutting.

Then, *run the join optimization with first table=sort_by_table*.
This produces QUERY_PLAN2 query plan that can take advantage of short-cutting.
Normally it's more expensive than QUERY_PLAN1 (note that we don't account for short-cutting yet).

*Now, account for short-cutting*

1. If QUERY_PLAN2 uses an index that produces rows in the desired order, we can just take a fraction of it to see how much it would take to produce #LIMIT rows. (e.g. plan produces 100 records, we have LIMIT 20 -> multiply plan cost by 20/100=0.2).

If QUERY_PLAN2 doesn't produce rows in the desired order, then

2A: consider changing access method for sort_by_table so that we do get rows in the desired order.
After that, we can take a fraction of the cost like described above.

2B. consider passing {{sort_by_table}} to filesort. We will need to spend the entire cost of reading {{sort_by_table}} but then we will be able to short-cut the join operation.

In any case, we produce cost of QUERY_PLAN2-WITH-SHORTCUTTING and can compare it with cost of QUERY_PLAN1

h3. Account for risks

QUERY_PLAN2-WITH-SHORTCUTTING is inherently more "risky". For example, if the join has a condition with selectivity=0.5 that the optimizer is not aware of, the actual cost of producing #LIMIT rows will be 1/0.5=2 times higher.

To account for this, we will add a multiplier, e.g. use QUERY_PLAN2-WITH-SHORTCUTTING only if promises a 100x speedup over QUERY_PLAN1.

h2. Task setting
This is a subset of MDEV-8306 that hits the customer.
They have
{code:sql}
select
...
from
   big_table
   join small_table1 on small_table1.key=big_table.key
   join small_table2 on small_table2.key=big_table.key
   ... and so forth ...
order by bug_table.key limit N
{code}

for a very small value of N.
The optimizer sets
{code:cpp}
  join->sort_by_table = (big_table);
{code}
so join optimization adds extra costs for query plans that do not start from big_table (denote this TMP-TABLE-COST).

But this doesn't help in all cases. Join orders of
{code}
   big_table, small_table1, small_table_2 ....
{code}
and
{code}
   small_table1, big_table (using ref access), small_table2, ...
{code}

have close costs. And indeed, the execution times for the two join orders (with ORDER BY ... LIMIT removed) are close.
TMP-TABLE-COST is a small fraction of join cost and doesn't make a difference (this seems to match the reality, too).

What does make a difference is short-cutting of execution due to LIMIT, but this is not taken into account.

Doing it in general case requires everything from MDEV-8306.

This MDEV is to have a fix for the join optimizer which adjusts the join order cost for cases with very small limit.
The new behavior is to be controlled by a setting.
If one runs OLTP-mostly workloads, enabling new behavior is an obvious choice.

h2. Implementation

First, identify {{sort_by_table}} - the table that allows to use LIMIT short-cutting.
(Due to multiple equalities there can be multiple but we can assume it's just one table)

h3. Do adjustment of cost/cardinality only for full join orders

Let's assume we've built a *complete* join order (ignoring ORDER BY ... LIMIT).
Having done that, we can compare join_output_size with the LIMIT number and see which "fraction" of join output we need and adjust the cost accordingly.

(An alternative to this: try to account for ORDER BY ...LIMIT earlier, while building join prefixes. We don't try to do this).

h3. Adjusting join order cost and cardinality

Do it as follows:
1: If the join output is already produced in the required order, we can assume that $FRACT of join output will take $FRACT of the cost to produce. (Note: this is incorrect when using e.g. semi-join materialization or materialized derived tables. We ignore this fact for now).

Otherwise, we have two options:

2A: Change the access method for {{sort_by_table}} so that we do get rows in the desired order. After that, we can take a fraction of the cost like described above.

2B. Pass {{sort_by_table}} to filesort. We will need to spend the entire cost of reading {{sort_by_table}} but then we will be able to short-cut joining with subsequent tables.

h3. Join optimizer pruning

TODO

h3. Hooking into the join optimizer - variant1

First, *run the join optimization as usual*. This produces QUERY_PLAN1 and a tight upper bound of how many records the join would produce. (If it's smaller than LIMIT, short-cutting will probably not be beneficial).

Save {{join->best_read}} and {{join->best_positions}}.

Then, start the optimization again:
set join->best_read= DBL_MAX,
*Run the join optimization with first table=sort_by_table*.
This produces QUERY_PLAN2 query plan that can take advantage of short-cutting.
Normally it's more expensive than QUERY_PLAN1.

*Now, account for LIMIT short-cutting* as specified in _Adjusting join order's cost and cardinality_ section. This produces cost of QUERY_PLAN2-WITH-SHORTCUTTING. We can compare it with the cost of QUERY_PLAN1.

h3. Account for risks

QUERY_PLAN2-WITH-SHORTCUTTING is inherently more "risky". For example, if the join has a condition with selectivity=0.5 that the optimizer is not aware of, the actual cost of producing #LIMIT rows will be 1/0.5=2 times higher.

To account for this, we will add a multiplier, e.g. use QUERY_PLAN2-WITH-SHORTCUTTING only if promises a 100x speedup over QUERY_PLAN1.

Sergei Petrunia made changes - 2024-08-16 13:34

Description

h2. Task setting
This is a subset of MDEV-8306 that hits the customer.
They have
{code:sql}
select
...
from
   big_table
   join small_table1 on small_table1.key=big_table.key
   join small_table2 on small_table2.key=big_table.key
   ... and so forth ...
order by bug_table.key limit N
{code}

for a very small value of N.
The optimizer sets
{code:cpp}
  join->sort_by_table = (big_table);
{code}
so join optimization adds extra costs for query plans that do not start from big_table (denote this TMP-TABLE-COST).

But this doesn't help in all cases. Join orders of
{code}
   big_table, small_table1, small_table_2 ....
{code}
and
{code}
   small_table1, big_table (using ref access), small_table2, ...
{code}

have close costs. And indeed, the execution times for the two join orders (with ORDER BY ... LIMIT removed) are close.
TMP-TABLE-COST is a small fraction of join cost and doesn't make a difference (this seems to match the reality, too).

What does make a difference is short-cutting of execution due to LIMIT, but this is not taken into account.

Doing it in general case requires everything from MDEV-8306.

This MDEV is to have a fix for the join optimizer which adjusts the join order cost for cases with very small limit.
The new behavior is to be controlled by a setting.
If one runs OLTP-mostly workloads, enabling new behavior is an obvious choice.

h2. Implementation

First, identify {{sort_by_table}} - the table that allows to use LIMIT short-cutting.
(Due to multiple equalities there can be multiple but we can assume it's just one table)

h3. Do adjustment of cost/cardinality only for full join orders

Let's assume we've built a *complete* join order (ignoring ORDER BY ... LIMIT).
Having done that, we can compare join_output_size with the LIMIT number and see which "fraction" of join output we need and adjust the cost accordingly.

(An alternative to this: try to account for ORDER BY ...LIMIT earlier, while building join prefixes. We don't try to do this).

h3. Adjusting join order cost and cardinality

Do it as follows:
1: If the join output is already produced in the required order, we can assume that $FRACT of join output will take $FRACT of the cost to produce. (Note: this is incorrect when using e.g. semi-join materialization or materialized derived tables. We ignore this fact for now).

Otherwise, we have two options:

2A: Change the access method for {{sort_by_table}} so that we do get rows in the desired order. After that, we can take a fraction of the cost like described above.

2B. Pass {{sort_by_table}} to filesort. We will need to spend the entire cost of reading {{sort_by_table}} but then we will be able to short-cut joining with subsequent tables.

h3. Join optimizer pruning

TODO

h3. Hooking into the join optimizer - variant1

First, *run the join optimization as usual*. This produces QUERY_PLAN1 and a tight upper bound of how many records the join would produce. (If it's smaller than LIMIT, short-cutting will probably not be beneficial).

Save {{join->best_read}} and {{join->best_positions}}.

Then, start the optimization again:
set join->best_read= DBL_MAX,
*Run the join optimization with first table=sort_by_table*.
This produces QUERY_PLAN2 query plan that can take advantage of short-cutting.
Normally it's more expensive than QUERY_PLAN1.

*Now, account for LIMIT short-cutting* as specified in _Adjusting join order's cost and cardinality_ section. This produces cost of QUERY_PLAN2-WITH-SHORTCUTTING. We can compare it with the cost of QUERY_PLAN1.

h3. Account for risks

QUERY_PLAN2-WITH-SHORTCUTTING is inherently more "risky". For example, if the join has a condition with selectivity=0.5 that the optimizer is not aware of, the actual cost of producing #LIMIT rows will be 1/0.5=2 times higher.

To account for this, we will add a multiplier, e.g. use QUERY_PLAN2-WITH-SHORTCUTTING only if promises a 100x speedup over QUERY_PLAN1.

h2. Task setting
This is a subset of MDEV-8306 that hits the customer.
They have
{code:sql}
select
...
from
   big_table
   join small_table1 on small_table1.key=big_table.key
   join small_table2 on small_table2.key=big_table.key
   ... and so forth ...
order by bug_table.key limit N
{code}

for a very small value of N.
The optimizer sets
{code:cpp}
  join->sort_by_table = (big_table);
{code}
so join optimization adds extra costs for query plans that do not start from big_table (denote this TMP-TABLE-COST).

But this doesn't help in all cases. Join orders of
{code}
   big_table, small_table1, small_table_2 ....
{code}
and
{code}
   small_table1, big_table (using ref access), small_table2, ...
{code}

have close costs. And indeed, the execution times for the two join orders (with ORDER BY ... LIMIT removed) are close.
TMP-TABLE-COST is a small fraction of join cost and doesn't make a difference (this seems to match the reality, too).

What does make a difference is short-cutting of execution due to LIMIT, but this is not taken into account.

Doing it in general case requires everything from MDEV-8306.

This MDEV is to have a fix for the join optimizer which adjusts the join order cost for cases with very small limit.
The new behavior is to be controlled by a setting.
If one runs OLTP-mostly workloads, enabling new behavior is an obvious choice.

h2. Implementation

First, identify {{sort_by_table}} - the table that allows to use LIMIT short-cutting.
(Due to multiple equalities there can be multiple but we can assume it's just one table)

h3. Do adjustment of cost/cardinality only for full join orders

Let's assume we've built a *complete* join order (ignoring ORDER BY ... LIMIT).
Having done that, we can compare join_output_size with the LIMIT number and see which "fraction" of join output we need and adjust the cost accordingly.

(An alternative to this: try to account for ORDER BY ...LIMIT earlier, while building join prefixes. We don't try to do this).

h3. Adjusting join order cost and cardinality

Do it as follows:
1: If the join output is already produced in the required order, we can assume that $FRACT of join output will take $FRACT of the cost to produce. (Note: this is incorrect when using e.g. semi-join materialization or materialized derived tables. We ignore this fact for now).

Otherwise, we have two options:

2A: Change the access method for {{sort_by_table}} so that we do get rows in the desired order. After that, we can take a fraction of the cost like described above.

2B. Pass {{sort_by_table}} to filesort. We will need to spend the entire cost of reading {{sort_by_table}} but then we will be able to short-cut joining with subsequent tables.

h3. Interplay with join optimizer pruning

So, we will adjust the costs only for complete join orders. The adjustment will reduce the cost and output-records.

Join pruning compares costs of prefix with full join order costs. Comparing un-adjusted costs with adjusted can produce wrong results.
* we can prune un-adjusted prefix that will be adjusted

h3. Hooking into the join optimizer - variant1

First, *run the join optimization as usual*. This produces QUERY_PLAN1 and a tight upper bound of how many records the join would produce. (If it's smaller than LIMIT, short-cutting will probably not be beneficial).

Save {{join->best_read}} and {{join->best_positions}}.

Then, start the optimization again:
set join->best_read= DBL_MAX,
*Run the join optimization with first table=sort_by_table*.
This produces QUERY_PLAN2 query plan that can take advantage of short-cutting.
Normally it's more expensive than QUERY_PLAN1.

*Now, account for LIMIT short-cutting* as specified in _Adjusting join order's cost and cardinality_ section. This produces cost of QUERY_PLAN2-WITH-SHORTCUTTING. We can compare it with the cost of QUERY_PLAN1.

h3. Account for risks

QUERY_PLAN2-WITH-SHORTCUTTING is inherently more "risky". For example, if the join has a condition with selectivity=0.5 that the optimizer is not aware of, the actual cost of producing #LIMIT rows will be 1/0.5=2 times higher.

To account for this, we will add a multiplier, e.g. use QUERY_PLAN2-WITH-SHORTCUTTING only if promises a 100x speedup over QUERY_PLAN1.

Sergei Petrunia made changes - 2024-08-17 10:09

Description

h2. Task setting
This is a subset of MDEV-8306 that hits the customer.
They have
{code:sql}
select
...
from
   big_table
   join small_table1 on small_table1.key=big_table.key
   join small_table2 on small_table2.key=big_table.key
   ... and so forth ...
order by bug_table.key limit N
{code}

for a very small value of N.
The optimizer sets
{code:cpp}
  join->sort_by_table = (big_table);
{code}
so join optimization adds extra costs for query plans that do not start from big_table (denote this TMP-TABLE-COST).

But this doesn't help in all cases. Join orders of
{code}
   big_table, small_table1, small_table_2 ....
{code}
and
{code}
   small_table1, big_table (using ref access), small_table2, ...
{code}

have close costs. And indeed, the execution times for the two join orders (with ORDER BY ... LIMIT removed) are close.
TMP-TABLE-COST is a small fraction of join cost and doesn't make a difference (this seems to match the reality, too).

What does make a difference is short-cutting of execution due to LIMIT, but this is not taken into account.

Doing it in general case requires everything from MDEV-8306.

This MDEV is to have a fix for the join optimizer which adjusts the join order cost for cases with very small limit.
The new behavior is to be controlled by a setting.
If one runs OLTP-mostly workloads, enabling new behavior is an obvious choice.

h2. Implementation

First, identify {{sort_by_table}} - the table that allows to use LIMIT short-cutting.
(Due to multiple equalities there can be multiple but we can assume it's just one table)

h3. Do adjustment of cost/cardinality only for full join orders

Let's assume we've built a *complete* join order (ignoring ORDER BY ... LIMIT).
Having done that, we can compare join_output_size with the LIMIT number and see which "fraction" of join output we need and adjust the cost accordingly.

(An alternative to this: try to account for ORDER BY ...LIMIT earlier, while building join prefixes. We don't try to do this).

h3. Adjusting join order cost and cardinality

Do it as follows:
1: If the join output is already produced in the required order, we can assume that $FRACT of join output will take $FRACT of the cost to produce. (Note: this is incorrect when using e.g. semi-join materialization or materialized derived tables. We ignore this fact for now).

Otherwise, we have two options:

2A: Change the access method for {{sort_by_table}} so that we do get rows in the desired order. After that, we can take a fraction of the cost like described above.

2B. Pass {{sort_by_table}} to filesort. We will need to spend the entire cost of reading {{sort_by_table}} but then we will be able to short-cut joining with subsequent tables.

h3. Interplay with join optimizer pruning

So, we will adjust the costs only for complete join orders. The adjustment will reduce the cost and output-records.

Join pruning compares costs of prefix with full join order costs. Comparing un-adjusted costs with adjusted can produce wrong results.
* we can prune un-adjusted prefix that will be adjusted

h3. Hooking into the join optimizer - variant1

First, *run the join optimization as usual*. This produces QUERY_PLAN1 and a tight upper bound of how many records the join would produce. (If it's smaller than LIMIT, short-cutting will probably not be beneficial).

Save {{join->best_read}} and {{join->best_positions}}.

Then, start the optimization again:
set join->best_read= DBL_MAX,
*Run the join optimization with first table=sort_by_table*.
This produces QUERY_PLAN2 query plan that can take advantage of short-cutting.
Normally it's more expensive than QUERY_PLAN1.

*Now, account for LIMIT short-cutting* as specified in _Adjusting join order's cost and cardinality_ section. This produces cost of QUERY_PLAN2-WITH-SHORTCUTTING. We can compare it with the cost of QUERY_PLAN1.

h3. Account for risks

QUERY_PLAN2-WITH-SHORTCUTTING is inherently more "risky". For example, if the join has a condition with selectivity=0.5 that the optimizer is not aware of, the actual cost of producing #LIMIT rows will be 1/0.5=2 times higher.

To account for this, we will add a multiplier, e.g. use QUERY_PLAN2-WITH-SHORTCUTTING only if promises a 100x speedup over QUERY_PLAN1.

h2. Task setting
This is a subset of MDEV-8306 that hits the customer.
They have
{code:sql}
select
...
from
   big_table
   join small_table1 on small_table1.key=big_table.key
   join small_table2 on small_table2.key=big_table.key
   ... and so forth ...
order by bug_table.key limit N
{code}

for a very small value of N.
The optimizer sets
{code:cpp}
  join->sort_by_table = (big_table);
{code}
so join optimization adds extra costs for query plans that do not start from big_table (denote this TMP-TABLE-COST).

But this doesn't help in all cases. Join orders of
{code}
   big_table, small_table1, small_table_2 ....
{code}
and
{code}
   small_table1, big_table (using ref access), small_table2, ...
{code}

have close costs. And indeed, the execution times for the two join orders (with ORDER BY ... LIMIT removed) are close.
TMP-TABLE-COST is a small fraction of join cost and doesn't make a difference (this seems to match the reality, too).

What does make a difference is short-cutting of execution due to LIMIT, but this is not taken into account.

Doing it in general case requires everything from MDEV-8306.

This MDEV is to have a fix for the join optimizer which adjusts the join order cost for cases with very small limit.
The new behavior is to be controlled by a setting.
If one runs OLTP-mostly workloads, enabling new behavior is an obvious choice.

h2. Implementation

First, identify {{sort_by_table}} - the table that allows to use LIMIT short-cutting.
(Due to multiple equalities there can be multiple but we can assume it's just one table)

h3. Do adjustment of cost/cardinality only for full join orders

Let's assume we've built a *complete* join order (ignoring ORDER BY ... LIMIT).
Having done that, we can compare join_output_size with the LIMIT number and see which "fraction" of join output we need and adjust the cost accordingly.

(An alternative to this: try to account for ORDER BY ...LIMIT earlier, while building join prefixes. We don't try to do this).

h3. Adjusting join order cost and cardinality

Do it as follows:
1: If the join output is already produced in the required order, we can assume that $FRACT of join output will take $FRACT of the cost to produce. (Note: this is incorrect when using e.g. semi-join materialization or materialized derived tables. We ignore this fact for now).

Otherwise, we have two options:

2A: Change the access method for {{sort_by_table}} so that we do get rows in the desired order. After that, we can take a fraction of the cost like described above.

2B. Pass {{sort_by_table}} to filesort. We will need to spend the entire cost of reading {{sort_by_table}} but then we will be able to short-cut joining with subsequent tables.

h3. Interplay with join optimizer pruning

So, we will adjust the costs only for complete join orders. The adjustment will reduce the cost and output-records.

Join pruning compares costs of prefix with full join order costs. Comparing un-adjusted costs with adjusted can produce wrong results.
* we can prune un-adjusted prefix that will be adjusted

h3. Hooking into the join optimizer - variant 1

First, *run the join optimization as usual*. This produces QUERY_PLAN1 and a tight upper bound of how many records the join would produce. (If it's smaller than LIMIT, short-cutting will probably not be beneficial).

Save {{join->best_read}} and {{join->best_positions}}.

Then, start the optimization again:
set join->best_read= DBL_MAX,
*Run the join optimization with first table=sort_by_table*.
This produces QUERY_PLAN2 query plan that can take advantage of short-cutting.
Normally it's more expensive than QUERY_PLAN1.

*Now, account for LIMIT short-cutting* as specified in _Adjusting join order's cost and cardinality_ section. This produces cost of QUERY_PLAN2-WITH-SHORTCUTTING. We can compare it with the cost of QUERY_PLAN1.

h3. Hooking into the join optimizer - variant 2
Make the join optimizer first start with {{sort_table}} as the first table.
Once we have constructed a full join order, perform #rows/cost adjustments for LIMIT.
We can set the adjusted cost as {{join->best_read}}.

Then, proceed to run the join optimizer as usual.

h3. Account for risks

QUERY_PLAN2-WITH-SHORTCUTTING is inherently more "risky". For example, if the join has a condition with selectivity=0.5 that the optimizer is not aware of, the actual cost of producing #LIMIT rows will be 1/0.5=2 times higher.

To account for this, we will add a multiplier, e.g. use QUERY_PLAN2-WITH-SHORTCUTTING only if promises a 100x speedup over QUERY_PLAN1.

Sergei Petrunia added a comment - 2024-08-18 17:15

The latest variant is here: bb-10.6-mdev34720-v2

Sergei Petrunia added a comment - 2024-08-18 17:15 The latest variant is here: bb-10.6-mdev34720-v2

Julien Fritsch made changes - 2024-08-19 07:28

Link

This issue is part of TODO-4816 [ TODO-4816 ]

Ralf Gebhardt made changes - 2024-08-19 11:03

Link

This issue blocks TODO-4816 [ TODO-4816 ]

Ralf Gebhardt made changes - 2024-08-19 11:04

Link

This issue is part of TODO-4816 [ TODO-4816 ]

Ralf Gebhardt made changes - 2024-08-19 11:04

Priority

Major [ 3 ]

Critical [ 2 ]

Michael Widenius added a comment - 2024-08-21 20:26

Review done. One small change requested. Implementation looks fine. Looks almost done.

Michael Widenius added a comment - 2024-08-21 20:26 Review done. One small change requested. Implementation looks fine. Looks almost done.

Sergei Petrunia added a comment - 2024-08-22 08:11 - edited

On the suggestion to not make two calls to greedy_search():

Exhaustive search

This works well when choose_plan is doing exhaustive search:

In choose_plan():

Put the sort_by_table first into join->best_ref;

the rest of join->best_ref() is sorted according to jtab_sort_func

Then, exhaustive search will enumerate prefixes starting from sort_table first.

Add a hook at the end of the main loop in best_extension_by_limited_search():

  if (idx == join->const_tables &&            // we're changing first table

      join->positions[idx] == sort_by_table)  // currently it's sort_by_table

    // We have finished considering join prefixes starting with sort_table

    DBUG_ASSERT(join->best_positions[idx].table == sort_by_table);

    join_limit_shortcut_adjust_cost();

Note: What about commit 515b9ad05 for ~~MDEV-28073~~ which introduces sorting of tables and get_costs_for_tables()?
The sorting done after get_costs_for_tables() call may put some other table ahead of sort_table and the requirement that "join orders starting with sort_table are considered before the any other join orders" will be broken.

Limited lookahead

What if greedy optimization uses a limited lookahead?

Produce a limited prefix starting with sort_table;

Extend it until we have build a full join order;

Adjust the cost with join_limit_shortcut_adjust_cost();

Then we need to restart the process. (this cannot be done from inside best_extension_by_limited_search).

We need to

Build a limited prefix from scratch;  // with no limitation on the first table

Extend it until we have build a full join order;

The easiest way to achieve this is to call choose_plan twice... Everything else seems to be too error-prone.

Sergei Petrunia added a comment - 2024-08-22 08:11 - edited On the suggestion to not make two calls to greedy_search(): Exhaustive search This works well when choose_plan is doing exhaustive search: In choose_plan(): Put the sort_by_table first into join->best_ref; the rest of join->best_ref() is sorted according to jtab_sort_func Then, exhaustive search will enumerate prefixes starting from sort_table first. Add a hook at the end of the main loop in best_extension_by_limited_search(): if (idx == join->const_tables && // we're changing first table join->positions[idx] == sort_by_table) // currently it's sort_by_table { // We have finished considering join prefixes starting with sort_table DBUG_ASSERT(join->best_positions[idx].table == sort_by_table); join_limit_shortcut_adjust_cost(); } Note: What about commit 515b9ad05 for MDEV-28073 which introduces sorting of tables and get_costs_for_tables()? The sorting done after get_costs_for_tables() call may put some other table ahead of sort_table and the requirement that "join orders starting with sort_table are considered before the any other join orders" will be broken. Limited lookahead What if greedy optimization uses a limited lookahead? Produce a limited prefix starting with sort_table; Extend it until we have build a full join order; Adjust the cost with join_limit_shortcut_adjust_cost(); Then we need to restart the process. (this cannot be done from inside best_extension_by_limited_search). We need to Build a limited prefix from scratch; // with no limitation on the first table Extend it until we have build a full join order; The easiest way to achieve this is to call choose_plan twice... Everything else seems to be too error-prone.

Michael Widenius added a comment - 2024-08-22 08:40

A few ways to solve this:

Don't allow limited lookahead when the limit optimization in enabled and we can use the optimization.
Same as above, but do not do limit lookhead for the order-by-first table (don't know if this is possible)
Use your old code to handle this case. This code still can benefit is having the order-by table first.

Michael Widenius added a comment - 2024-08-22 08:40 A few ways to solve this: Don't allow limited lookahead when the limit optimization in enabled and we can use the optimization. Same as above, but do not do limit lookhead for the order-by-first table (don't know if this is possible) Use your old code to handle this case. This code still can benefit is having the order-by table first.

Sergei Petrunia added a comment - 2024-08-22 10:39

There is a problem with the current patch and greedy (aka limited-lookahead ) optimization.

The first greedy_search() call results in a complete plan in join->best_positions and its cost in join->best_read.

But then the second greedy_search() call is made.
We build a join prefix of seaarch_depth size:

  greedy_search()

    best_extension_by_limited_search(...search_depth=N...)

      best_extension_by_limited_search(...search_depth=N-1...)

...

          best_extension_by_limited_search(...search_depth=1...)

Here, best_extension_by_limited_search() will do this:

        if (current_read_time < join->best_read)

          memcpy((uchar*) join->best_positions, (uchar*) join->positions,

                 sizeof(POSITION) * (idx + 1));

          join->join_record_count= partial_join_cardinality;

          join->best_read= current_read_time - 0.001;

Here,

join->best_read is the cost of entire join order (adjusted for LIMIT-shortcut)
current_read_time is the cost of join prefix.

It is clear that these two are not comparable.

Sergei Petrunia added a comment - 2024-08-22 10:39 There is a problem with the current patch and greedy (aka limited-lookahead ) optimization. The first greedy_search() call results in a complete plan in join->best_positions and its cost in join->best_read . But then the second greedy_search() call is made. We build a join prefix of seaarch_depth size: greedy_search() best_extension_by_limited_search(...search_depth=N...) best_extension_by_limited_search(...search_depth=N-1...) ... best_extension_by_limited_search(...search_depth=1...) Here, best_extension_by_limited_search() will do this: if (current_read_time < join->best_read) { memcpy ((uchar*) join->best_positions, (uchar*) join->positions, sizeof (POSITION) * (idx + 1)); join->join_record_count= partial_join_cardinality; join->best_read= current_read_time - 0.001; } Here, join->best_read is the cost of entire join order (adjusted for LIMIT-shortcut) current_read_time is the cost of join prefix. It is clear that these two are not comparable.

Sergei Petrunia added a comment - 2024-08-23 08:00

Fixed the above

commit 4c00931dafbb87edeac6afe44b2f6eeb3c6afc61 (HEAD -> bb-10.6-mdev34720-v2, origin/bb-10.6-mdev34720-v2)

Author: Sergei Petrunia <sergey@mariadb.com>

Date:   Sun Aug 18 20:10:19 2024 +0300

    MDEV-34720: Poor plan choice for large JOIN with ORDER BY and small LIMIT

    (Variant 2b: call greedy_search() twice, correct handling for limited

     search_depth)

Sergei Petrunia added a comment - 2024-08-23 08:00 Fixed the above commit 4c00931dafbb87edeac6afe44b2f6eeb3c6afc61 (HEAD -> bb-10.6-mdev34720-v2, origin/bb-10.6-mdev34720-v2) Author: Sergei Petrunia <sergey@mariadb.com> Date: Sun Aug 18 20:10:19 2024 +0300 MDEV-34720: Poor plan choice for large JOIN with ORDER BY and small LIMIT (Variant 2b: call greedy_search() twice, correct handling for limited search_depth)

Sergei Petrunia made changes - 2024-08-23 11:00

Assignee	Sergei Petrunia [ psergey ]	Michael Widenius [ monty ]
Status	In Progress [ 3 ]	In Review [ 10002 ]

Michael Widenius added a comment - 2024-08-23 14:33

Reviewed

Michael Widenius added a comment - 2024-08-23 14:33 Reviewed

Michael Widenius added a comment - 2024-08-23 14:34

Only one comment about code indentation.

Michael Widenius added a comment - 2024-08-23 14:34 Only one comment about code indentation.

Michael Widenius made changes - 2024-08-23 14:34

Status

In Review [ 10002 ]

Stalled [ 10000 ]

Sergei Petrunia added a comment - 2024-08-24 15:25 - edited

More details about test_if_cheaper_ordering using rec_per_key and not actual_rec_per_key()

Test case from subselect_innodb shows:

explain select

   (SELECT

      concat(id, '-', key1, '-', col1)

    FROM t2

    WHERE t2.key1 = t1.a

    ORDER BY t2.key2 ASC LIMIT 1)

from

t1;

In best_access_path(s=t2)

              /* Check if we have statistic about the distribution */

              if ((records= keyinfo->actual_rec_per_key(max_key_part-1)))

records=100.
Then, in test_if_cheaper_ordering(tab=t2) rec_per_key is used:

/*

    Calculate the selectivity of the ref_key for REF_ACCESS. For

    RANGE_ACCESS we use table->opt_range_condition_rows.

*/

  if (ref_key >= 0 && ref_key != MAX_KEY && tab->type == JT_REF)

...

    else

      const KEY *ref_keyinfo= table->key_info + ref_key;

      refkey_rows_estimate= ref_keyinfo->rec_per_key[tab->ref.key_parts - 1];

    set_if_bigger(refkey_rows_estimate, 1);

This shows refkey_rows_estimate=50

Sergei Petrunia added a comment - 2024-08-24 15:25 - edited More details about test_if_cheaper_ordering using rec_per_key and not actual_rec_per_key() Test case from subselect_innodb shows: explain select ( SELECT concat(id, '-' , key1, '-' , col1) FROM t2 WHERE t2.key1 = t1.a ORDER BY t2.key2 ASC LIMIT 1) from t1; In best_access_path(s=t2) /* Check if we have statistic about the distribution */ if ((records= keyinfo->actual_rec_per_key(max_key_part-1))) records=100. Then, in test_if_cheaper_ordering(tab=t2) rec_per_key is used: /* Calculate the selectivity of the ref_key for REF_ACCESS. For RANGE_ACCESS we use table->opt_range_condition_rows. */ if (ref_key >= 0 && ref_key != MAX_KEY && tab->type == JT_REF) { ... else { const KEY *ref_keyinfo= table->key_info + ref_key; refkey_rows_estimate= ref_keyinfo->rec_per_key[tab-> ref .key_parts - 1]; } set_if_bigger(refkey_rows_estimate, 1); This shows refkey_rows_estimate=50

Sergei Petrunia added a comment - 2024-08-25 14:14 - edited

For the above, pushed

commit 9020baf126c7d4068b40c57bc2d7dcb747b4b341 (HEAD -> 10.6)

Author: Sergei Petrunia <sergey@mariadb.com>

Date:   Sat Aug 24 19:01:06 2024 +0300

    Trivial fix: Make test_if_cheaper_ordering() use actual_rec_per_key()

    Discovered this while working on MDEV-34720: test_if_cheaper_ordering()

    uses rec_per_key, while the original estimate for the access method

    is produced in best_access_path() by using actual_rec_per_key().

    Make test_if_cheaper_ordering() also use actual_rec_per_key().

    Also make several getter function "const" to make this compile.

    Also adjusted the testcase to handle this (the change backported from

    11.0)

(This is not necessary for the patch but will be useful when getting the patch to the main tree)

Sergei Petrunia added a comment - 2024-08-25 14:14 - edited For the above, pushed commit 9020baf126c7d4068b40c57bc2d7dcb747b4b341 (HEAD -> 10.6) Author: Sergei Petrunia <sergey@mariadb.com> Date: Sat Aug 24 19:01:06 2024 +0300 Trivial fix: Make test_if_cheaper_ordering() use actual_rec_per_key() Discovered this while working on MDEV-34720: test_if_cheaper_ordering() uses rec_per_key, while the original estimate for the access method is produced in best_access_path() by using actual_rec_per_key(). Make test_if_cheaper_ordering() also use actual_rec_per_key(). Also make several getter function "const" to make this compile. Also adjusted the testcase to handle this (the change backported from 11.0) (This is not necessary for the patch but will be useful when getting the patch to the main tree)

Sergei Petrunia added a comment - 2024-08-25 15:09

The patch for the issue itself, latest revision:

commit 4c00931dafbb87edeac6afe44b2f6eeb3c6afc61 (HEAD -> bb-10.6-mdev34720-v2, origin/bb-10.6-mdev34720-v2)

Author: Sergei Petrunia <sergey@mariadb.com>

Date:   Sun Aug 18 20:10:19 2024 +0300

    MDEV-34720: Poor plan choice for large JOIN with ORDER BY and small LIMIT

    (Variant 2b: call greedy_search() twice, correct handling for limited

     search_depth)

    Modify the join optimizer to specifically try to produce join orders that

    can short-cut their execution for ORDER BY..LIMIT clause.

    The optimization is controlled by @@optimizer_join_limit_pref_ratio.

    Default value 0 means don't construct short-cutting join orders.

    Other value means construct short-cutting join order, and prefer it only

    if it promises speedup of more than #value times.

    In Optimizer Trace, look for these names:

    * join_limit_shortcut_is_applicable

    * join_limit_shortcut_plan_search

    * join_limit_shortcut_choice

Sergei Petrunia added a comment - 2024-08-25 15:09 The patch for the issue itself, latest revision: commit 4c00931dafbb87edeac6afe44b2f6eeb3c6afc61 (HEAD -> bb-10.6-mdev34720-v2, origin/bb-10.6-mdev34720-v2) Author: Sergei Petrunia <sergey@mariadb.com> Date: Sun Aug 18 20:10:19 2024 +0300 MDEV-34720: Poor plan choice for large JOIN with ORDER BY and small LIMIT (Variant 2b: call greedy_search() twice, correct handling for limited search_depth) Modify the join optimizer to specifically try to produce join orders that can short-cut their execution for ORDER BY..LIMIT clause. The optimization is controlled by @@optimizer_join_limit_pref_ratio. Default value 0 means don't construct short-cutting join orders. Other value means construct short-cutting join order, and prefer it only if it promises speedup of more than #value times. In Optimizer Trace, look for these names: * join_limit_shortcut_is_applicable * join_limit_shortcut_plan_search * join_limit_shortcut_choice

Lena Startseva added a comment - 2024-08-27 16:48

Testing done on branch bb-10.6-mdev34720-v2. If there are no new changes it is ok to push

Lena Startseva added a comment - 2024-08-27 16:48 Testing done on branch bb-10.6-mdev34720-v2. If there are no new changes it is ok to push

Julien Fritsch made changes - 2024-08-28 11:58

Link

This issue blocks MENT-2124 [ MENT-2124 ]

Sergei Petrunia added a comment - 2024-08-28 19:28

Documentation: https://mariadb.com/kb/en/optimizer_join_limit_pref_ratio-optimization/#providing-guidelines-to-the-optimizer

Sergei Petrunia added a comment - 2024-08-28 19:28 Documentation: https://mariadb.com/kb/en/optimizer_join_limit_pref_ratio-optimization/#providing-guidelines-to-the-optimizer

Marko Mäkelä added a comment - 2024-09-03 12:08

psergei, was a 10.11 version of this implemented, tested and reviewed? There are conflicts when attempting to merge this to 10.11.

Marko Mäkelä added a comment - 2024-09-03 12:08 psergei , was a 10.11 version of this implemented, tested and reviewed? There are conflicts when attempting to merge this to 10.11.

Ralf Gebhardt made changes - 2024-09-03 12:33

Assignee

Michael Widenius [ monty ]

Sergei Petrunia [ psergey ]

Sergei Petrunia made changes - 2024-09-03 13:14

Fix Version/s		10.6.20 [ 29903 ]
Fix Version/s		10.11.10 [ 29904 ]
Fix Version/s		11.2.6 [ 29906 ]
Fix Version/s		11.4.4 [ 29907 ]
Fix Version/s		11.6.2 [ 29908 ]
Fix Version/s	10.6 [ 24028 ]
Resolution		Fixed [ 1 ]
Status	Stalled [ 10000 ]	Closed [ 6 ]

Sergei Petrunia added a comment - 2024-09-03 13:16

marko it should be similar. let me look.

Sergei Petrunia added a comment - 2024-09-03 13:16 marko it should be similar. let me look.

Yuchen Pei added a comment - 2024-09-06 04:52 - edited

marko and psergei: I also noticed some non-trivial conflict when trying to merge 10.6->10.11 today, mainly with the following commits:

commit b3c74bdc1f09b2b8babd7f2bd9d52df2749ddcc3

Author: Monty <monty@mariadb.org>

Date:   Tue May 31 17:36:32 2022 +0300

    Improve pruning in greedy_search by sorting tables during search

    MDEV-28073 Slow query performance in MariaDB when using many tables

...

commit 8c2faad576d6a77314e92755a389de2c41e21242 (github/bb-10.10-optimizer-features)

Author: Sergei Petrunia <sergey@mariadb.com>

Date:   Tue Jul 19 14:13:17 2022 +0300

    MDEV-28929: Plan selection takes forever with MDEV-28852 ...

    Part #2: Extend heuristic pruning to use multiple tables as the

    "Model tables".

...

The nontrivial conflict is this one in sql_select.cc:

@@@ -10738,98 -10853,31 +11096,109 @@@ best_extension_by_limited_search(JOI

    DBUG_EXECUTE("opt", print_plan(join, idx, record_count, read_time, read_time,

                                  "part_plan"););

# [... 32 lines elided]

<<<<<<< A: HEAD

# [... 57 lines elided]

  for (SORT_POSITION *pos= sort ; pos < sort_end ; pos++)

    s= *pos->join_tab;

    if (!(found_eq_ref_tables & s->table->map) &&

        !check_interleaving_with_nj(s))

||||||| Ancestor

    if ((allowed_tables & real_table_bit) &&

        !(remaining_tables & s->dependent) &&

        !check_interleaving_with_nj(s))

=======

    if ((allowed_tables & real_table_bit) &&

        !(remaining_tables & s->dependent) &&

        !check_interleaving_with_nj(s) &&

        join_limit_shortcut_allows_table(join, idx, s))

>>>>>>> B: Incoming

 +      table_map real_table_bit= s->table->map;

        double current_record_count, current_read_time;

        double partial_join_cardinality;

 -      POSITION *position= join->positions + idx;

 -      POSITION loose_scan_pos;

 +      POSITION *position= join->positions + idx, *loose_scan_pos;

        Json_writer_object trace_one_table(thd);

Can you advise how to resolve it, psergei?

Yuchen Pei added a comment - 2024-09-06 04:52 - edited marko and psergei : I also noticed some non-trivial conflict when trying to merge 10.6->10.11 today, mainly with the following commits: commit b3c74bdc1f09b2b8babd7f2bd9d52df2749ddcc3 Author: Monty <monty@mariadb.org> Date: Tue May 31 17:36:32 2022 +0300 Improve pruning in greedy_search by sorting tables during search MDEV-28073 Slow query performance in MariaDB when using many tables ... commit 8c2faad576d6a77314e92755a389de2c41e21242 (github/bb-10.10-optimizer-features) Author: Sergei Petrunia <sergey@mariadb.com> Date: Tue Jul 19 14:13:17 2022 +0300 MDEV-28929: Plan selection takes forever with MDEV-28852 ... Part #2: Extend heuristic pruning to use multiple tables as the "Model tables". ... The nontrivial conflict is this one in sql_select.cc: @@@ -10738,98 -10853,31 +11096,109 @@@ best_extension_by_limited_search(JOI DBUG_EXECUTE("opt", print_plan(join, idx, record_count, read_time, read_time, "part_plan");); # [... 32 lines elided] <<<<<<< A: HEAD # [... 57 lines elided] for (SORT_POSITION *pos= sort ; pos < sort_end ; pos++) { s= *pos->join_tab; if (!(found_eq_ref_tables & s->table->map) && !check_interleaving_with_nj(s)) ||||||| Ancestor if ((allowed_tables & real_table_bit) && !(remaining_tables & s->dependent) && !check_interleaving_with_nj(s)) ======= if ((allowed_tables & real_table_bit) && !(remaining_tables & s->dependent) && !check_interleaving_with_nj(s) && join_limit_shortcut_allows_table(join, idx, s)) >>>>>>> B: Incoming { + table_map real_table_bit= s->table->map; double current_record_count, current_read_time; double partial_join_cardinality; - POSITION *position= join->positions + idx; - POSITION loose_scan_pos; + POSITION *position= join->positions + idx, *loose_scan_pos; Json_writer_object trace_one_table(thd); Can you advise how to resolve it, psergei ?

Marko Mäkelä added a comment - 2024-09-06 12:21

I just pushed a merge of this to 10.11 using this for reference. There will be the following conflicts left for a merge to 11.2:

	both modified:   mysql-test/main/mysqld--help.result

	both modified:   mysql-test/suite/sys_vars/r/sysvars_server_embedded.result

	both modified:   mysql-test/suite/sys_vars/r/sysvars_server_notembedded.result

	both modified:   sql/sql_select.cc

I didn’t check if there would be further conflicts for a merge to 11.4.

Marko Mäkelä added a comment - 2024-09-06 12:21 I just pushed a merge of this to 10.11 using this for reference. There will be the following conflicts left for a merge to 11.2: both modified: mysql-test/main/mysqld--help.result both modified: mysql-test/suite/sys_vars/r/sysvars_server_embedded.result both modified: mysql-test/suite/sys_vars/r/sysvars_server_notembedded.result both modified: sql/sql_select.cc I didn’t check if there would be further conflicts for a merge to 11.4.

Marko Mäkelä added a comment - 2024-09-06 12:24

ycp, I see that you must have been trying to merge 10.6→10.11 in order to resolve conflicts that your recently pushed changes introduced. You should be able to continue with that now.

Marko Mäkelä added a comment - 2024-09-06 12:24 ycp , I see that you must have been trying to merge 10.6→10.11 in order to resolve conflicts that your recently pushed changes introduced. You should be able to continue with that now.

Ramesh Sivaraman made changes - 2024-10-04 07:12

Link

This issue causes ~~MDEV-35072~~ [ ~~MDEV-35072~~ ]

Sergei Petrunia made changes - 2024-10-18 13:37

Link

This issue causes ~~MDEV-35164~~ [ ~~MDEV-35164~~ ]

Sergei Petrunia made changes - 2025-02-27 13:52

Link

This issue relates to MDEV-8306 [ MDEV-8306 ]

MariaDB Server

Poor plan choice for large JOIN with ORDER BY and small LIMIT

Details

Description

Task setting

Implementation

Do adjustment of cost/cardinality only for full join orders

Adjusting join order cost and cardinality

Interplay with join optimizer pruning

Hooking into the join optimizer - variant 1

Hooking into the join optimizer - variant 2

Account for risks

Attachments

Attachments

Issue Links

Activity

Exhaustive search

Limited lookahead

People

Dates

Git Integration