Details
-
Bug
-
Status: Stalled (View Workflow)
-
Major
-
Resolution: Unresolved
-
10.2, 10.3, 10.4, 10.5, 10.6
-
None
Description
Dataset:
CREATE TABLE t0(a INT); |
INSERT INTO t0 VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9); |
CREATE TABLE t1(a INT, b INT); |
INSERT INTO t1 SELECT a, a FROM t0; |
INSERT INTO t1 SELECT a, a FROM t0; |
 |
CREATE TABLE t2(a INT, b INT); |
INSERT INTO t2 SELECT a, a FROM t0; |
INSERT INTO t2 SELECT a, a FROM t0; |
Run ANALYZE for the tables, so the EITS are collected
ANALYZE TABLE t1 PERSISTENT FOR ALL; |
ANALYZE TABLE t2 PERSISTENT FOR ALL |
Just setting optimizer_use_condition_selectivity to 3, so we use EITS as there are no indexed columns
set optimizer_use_condition_selectivity=3; |
MariaDB [test]> EXPLAIN EXTENDED SELECT * FROM t1,t2 WHERE t1.a=t2.a;
|
+------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
|
+------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
|
| 1 | SIMPLE | t1 | ALL | NULL | NULL | NULL | NULL | 20 | 100.00 | |
|
| 1 | SIMPLE | t2 | ALL | NULL | NULL | NULL | NULL | 20 | 100.00 | Using where; Using join buffer (flat, BNL join) |
|
+------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
|
2 rows in set, 1 warning (0.006 sec)
|
So here we have an equi-join non-indexed condition t1.a = t2.a. The filtered for table t2 shows 100% that means the estimates are such that we would expect all rows to match this condition.
But for such predicates the selectivity is calculated in table_multi_eq_cond_selectivity as:
while ((fi++)) |
{
|
double curr_eq_fld_sel; |
Field *fld= fi.get_curr_field();
|
if (!(fld->table->map & ~(table_bit | rem_tables))) |
continue; |
curr_eq_fld_sel= get_column_avg_frequency(fld) /
|
fld->table->stat_records();
|
if (curr_eq_fld_sel < 1.0) |
set_if_bigger(eq_fld_sel, curr_eq_fld_sel);
|
}
|
here fi is an iterator over an Item_equal object (t1.a=t2.a).
So lets say our join order is t1,t2
for table t2 when the function table_multi_eq_cond_selectivity is invoked, then we state the the selectivity for the predicate (t1.a=t2.a) is
get_column_avg_frequency(t1.a) / #rows(t1)
In this case: get_column_avg_frequency(t1.a)= (Number of non-NULL values of t1.a / Distinct values for t1.a) which is 2 here
so the selectivity would be 2/20 = 0.1
so the filtered should show 10% here for table t2