Details
-
Bug
-
Status: Open (View Workflow)
-
Major
-
Resolution: Unresolved
-
10.2(EOL), 10.3(EOL), 10.4(EOL)
Description
calculate_cond_selectivity_for_table has the code to compute table->cond_selectivity from selectivities of potential range accesses.
Selectivities of range accesses are multiplied, but there is code to handle the situation where range accesses use restrictions on the same column. In that case, the code has additional multiplier to account for this fact (and avoid counting the same selectivity twice).
This bug is about that code not working for a fairly basic example where I think it should work.
The testcase (not necessarily minimal):
create table ten(a int primary key); |
insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9); |
|
create table one_k(a int primary key); |
insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C; |
|
create table t1 ( |
col1 int not null, |
col2 int not null, |
useless_1 int not null, |
useless_2 int not null, |
index i0(useless_1, useless_2, col2), |
INDEX i1(col1, col2) |
) engine=myisam;
|
|
insert into t1 |
select |
mod(A.a+1000*B.a, 100),
|
A.a+1000*B.a,
|
1,
|
2
|
from
|
one_k A,
|
one_k B;
|
|
# Just MyISAM's ANALYZE, we don't need EITS data. |
analyze table t1; |
The query:
explain extended
|
select * from t1 |
where |
col2 >= 500000 and col1 >= 50 and useless_1=1 and useless_2=2; |
Checking the optimizer trace, I can see that this will create potential range accesses:
"range_scan_alternatives": [
|
{
|
"index": "i0",
|
"ranges": ["(1,2,500000) <= (useless_1,useless_2,col2)"],
|
"rowid_ordered": false,
|
"using_mrr": false,
|
"index_only": false,
|
"rows": 538797,
|
"cost": 695451,
|
"chosen": false,
|
"cause": "cost"
|
},
|
{
|
"index": "i1",
|
"ranges": ["(50,500000) <= (col1,col2)"],
|
"rowid_ordered": false,
|
"using_mrr": false,
|
"index_only": false,
|
"rows": 507454,
|
"cost": 650303,
|
"chosen": false,
|
"cause": "cost"
|
}
|
],
|
Both range accesses use column col2, both have selectivity around 0.5
But when I step through calculate_cond_selectivity_for_table() function, I can see that
- table->cond_selectivity is assigned the value of 1
- table->cond_selectivity is multiplied by first range access selectivity
- table->cond_selectivity is multiplied by second range access selectivity
I don't see any adjustments to the selectivity that are due to the fact that both potential range acccesses use the same column. I think this is not what was intended.
Attachments
Issue Links
- blocks
-
MDEV-20595 Assertion `0 < sel && sel <= 2.0' failed in table_cond_selectivity
- Stalled
- relates to
-
MDEV-23707 Fix condition selectivity computation for join prefixes
- Stalled