[MDEV-24740] Selectivity for equi-join predicates not involed in ref access is not taken into account for join cardinality estimation Created: 2021-01-30 Updated: 2023-11-06 |
|
| Status: | Stalled |
| Project: | MariaDB Server |
| Component/s: | Optimizer |
| Affects Version/s: | 10.2, 10.3, 10.4, 10.5, 10.6 |
| Fix Version/s: | 10.6 |
| Type: | Bug | Priority: | Major |
| Reporter: | Varun Gupta (Inactive) | Assignee: | Sergei Petrunia |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Attachments: |
|
| Description |
|
Dataset:
Run ANALYZE for the tables, so the EITS are collected
Just setting optimizer_use_condition_selectivity to 3, so we use EITS as there are no indexed columns
So here we have an equi-join non-indexed condition t1.a = t2.a. The filtered for table t2 shows 100% that means the estimates are such that we would expect all rows to match this condition. But for such predicates the selectivity is calculated in table_multi_eq_cond_selectivity as:
here fi is an iterator over an Item_equal object (t1.a=t2.a). So lets say our join order is t1,t2 for table t2 when the function table_multi_eq_cond_selectivity is invoked, then we state the the selectivity for the predicate (t1.a=t2.a) is get_column_avg_frequency(t1.a) / #rows(t1) In this case: get_column_avg_frequency(t1.a)= (Number of non-NULL values of t1.a / Distinct values for t1.a) which is 2 here |
| Comments |
| Comment by Varun Gupta (Inactive) [ 2021-02-02 ] |
|
The branch for the changes is 10.6-selectivity |
| Comment by Varun Gupta (Inactive) [ 2021-02-05 ] |
|
Also another imporatant takeaway from the optimizer call was that the function which calculates selectivity for the non-indexes columns should be used only when the ORDER BY LIMIT optimization is enabled. |