[MDEV-36055] Optimise reorderable LEFT JOINs - Jira

XML

Word

Printable

Details

Type: Task
Status: In Testing (View Workflow)
Priority: Critical
Resolution: Unresolved
Fix Version/s: 12.3
Component/s: Optimizer
Labels:
- optimizer-12.1-sprint
- optimizer-feature

Sprint:
Q4/2025 Server Development

Description

Consider a query starting with table t1 and doing two LEFT JOINs to it:

set optimizer_trace=1;

explain

select *

from

t1

  left join t3 on t3.key1=t1.a

  left join t2 on t2.key1=t1.a;

+------+-------------+-------+------+-----------------+---------+---------+---------+------+-------------+

| id   | select_type | table | type | possible_keys   | key     | key_len | ref     | rows | Extra       |

+------+-------------+-------+------+-----------------+---------+---------+---------+------+-------------+

|    1 | SIMPLE      | t1    | ALL  | NULL            | NULL    | NULL    | NULL    | 10   |             |

|    1 | SIMPLE      | t3    | ref  | t3_key2         | t3_key2 | 5       | j1.t1.a | 100  | Using where |

|    1 | SIMPLE      | t2    | ref  | t2_key1,t3_key1 | t2_key1 | 5       | j1.t1.a | 10   | Using where |

+------+-------------+-------+------+-----------------+---------+---------+---------+------+-------------+

The optimizer trace shows that t1-t3-t2 is the only acceptable join order:

            "table_dependencies": [

                "table": "t1",

                "row_may_be_null": false,

                "map_bit": 0,

                "depends_on_map_bits": []

},

                "table": "t3",

                "row_may_be_null": true,

                "map_bit": 1,

                "depends_on_map_bits": ["0"]

},

                "table": "t2",

                "row_may_be_null": true,

                "map_bit": 2,

                "depends_on_map_bits": ["0", "1"]

But is t1-t2-t3 order also allowed?

Notes

MySQL also has this limitation, and it seems so does PostgreSQL.

Rephrasing the task

t1

  left join t3 on t3.key1=t1.a

  left join t2 on t2.key1=t1.a

equivalent to

t1

  left join t2 on t2.key1=t1.a

  left join t3 on t3.key1=t1.a

Q: More generally, given a JOIN, is it possible to change the optimizer such that it picks an equivalent reordered JOIN that is executed faster?
A: yes, it seems to be easily possible. But when we do that, "pruned_by_heuristic" pruning starts to prune away good join orders in favor of ones with higher cost.

Other details

If this changes the way join optimization is done, we'll need a way to provide old behavior.

Formally speaking

A function P that preprocesses a join J, including noting down the table dependencies. It produces a join_tab, which is passed to greedy search G to find a good join order:

G(P(J))

Denote G1 and P1 to be the algorithm in the base. The basic patch under testing in MDEV-37340 corrects the table dependencies, so we now have a new preprocessor P2 such that

G1(P2(J) is better than G1(P1(J)) for some J.

The problem is that there exists join J such that

G1(P2(J)) is worse than G1(P1(J)).

Let S( n ) be the set of joins of at most n tables and S be the set of all joins. Apparently we could patch the greedy search to G2 such that for all J in S( n ) for small n, by disabling heuristic pruning for such J,

G2(P2(J)) is no worse than G1(P1(J))

The main question is now:

Find a new G3 such that G3(P2(J)) is no worse than G1(P1(J)) for any join J in S, and still better than G1(P1(J)) for some J.

More specifically, P2 expands the search space over P1. A greedy search G attempts to find

argmin_{r in R(P(J))} cost(r),

where R(P(J)) is the set of join orders allowed by P(J). By removing table dependencies, R(P1(J)) is a subset of R(P2(J)). All the permissable join orders are leaves of the search tree for the greedy search. So P2 adds branches and leaves compared to P1. The known problem is that for certain J's some "bad" branches are added, and G1 erroneously thinks these branches are better, and prunes "good" branches in R(P1(J)) instead.

One idea could be to find a characterisation of the added "bad" branches, and make G remove them based on these characteristics.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

mdev-36055-testcase1.sql
0.5 kB
2025-07-13 16:02
mdev-36055-testcase3-same-fanout.sql
0.8 kB
2025-07-13 16:15

Issue Links

includes

MDEV-37274 Re-order independent LEFT JOINs

Closed

relates to

MDEV-13817 add support for oracle outer join syntax - the ( + )

Closed

split to

MDEV-37340 Testing for MDEV-36055: Optimise reorderable LEFT JOINs

Open

MDEV-37346 Join optimizer prunes good join orders

Confirmed

MDEV-37824 Testing of MDEV-36055

Open

Activity

People

Assignee:: Pavithra Pandith

Reporter:: Sergei Petrunia

Votes:: 1 Vote for this issue

Watchers:: 7 Start watching this issue

Dates

Created:: 2025-02-10 09:16

Updated:: 2025-10-08 09:58

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.