[MDEV-38728] Improve join size estimation for `ref` access - Jira

XML

Word

Printable

Details

Type: Technical task
Status: Open (View Workflow)
Priority: Major
Resolution: Unresolved
Affects Version/s: 11.4
Fix Version/s: None
Component/s: Optimizer
Labels:
None

Sprint:
Q1/2026 Server Development, Q1/2026 Server Maintenance

Description

When estimating number of rows produced by a join after `ref` access (i.e, lookups into a non-unique index), the optimizer assumes all driving table ( * ) values will find matches in the inner table ( ** ). This causes overestimation when the driving table has more distinct values than the inner table's key.

( * ) driving table is the one preceding in the join order
( ** ) inner table is the one following the driving table in the join order

Example

CREATE TABLE t_driving (a INT);  -- 100 rows, 100 distinct values

CREATE TABLE t_inner (a INT, KEY(a));  -- 300 rows, 3 distinct values (0,1,2)

INSERT INTO t_driving SELECT seq FROM seq_1_to_100;

INSERT INTO t_inner SELECT seq % 3 FROM seq_1_to_300;

SELECT * FROM t_driving JOIN t_inner ON t_driving.a = t_inner.a;

rec_per_key for KEY(t_inner.a) == 100.
Current estimate: 100 records in `t_driving` * 100 `rec_per_key` = 10,000 rows.
Actual: Only values 0,1,2 match => 300 rows.

Attachments

Activity

People

Assignee:: Oleg Smirnov

Reporter:: Oleg Smirnov

Votes:: 0 Vote for this issue

Watchers:: 2 Start watching this issue

Dates

Created:: 2026-02-02 12:33

Updated:: 2026-02-02 15:05

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.