Details

Type: Bug
Status: In Progress
Priority: Major
Resolution: Unresolved
Affects Version/s: 11.4
Fix Version/s: None
Description
This comes from customer case analysis in TODO-5687.
Consider a table that defines many long VARCHAR columns whose values are much smaller than the maximum length (this is fairly common):
create table t1 (
  pk1 int not null,
  pk2 int not null,
  filler1 varchar(500),
  filler2 varchar(500),
  filler3 varchar(500),
  filler4 varchar(500),
  filler5 varchar(500),
  filler6 varchar(500),
  filler7 varchar(500),
  filler8 varchar(500),
  filler9 varchar(500),
  fillerA varchar(500),
  fillerB varchar(500),
  fillerC varchar(500),
  fillerD varchar(500),
  fillerE varchar(500),
  fillerF varchar(500),
  primary key(pk1, pk2)
) collate utf8mb4_general_ci engine=innodb;

insert into t1 (pk1, pk2)
select mod(seq, 8), seq
from seq_1_to_10000;

analyze table t1;
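As a sanity check of the data distribution: pk1 = mod(seq, 8) takes the values 0..7, each exactly 1,250 times, which can be confirmed with a query along these lines:

select pk1, count(*) from t1 group by pk1 order by pk1;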
The table has 10K rows; let's use a condition on the PK prefix to select 12.5% of them:
explain select * from t1 where pk1=1;
Surprisingly, this picks a full table scan:
id  select_type  table  type  possible_keys  key   key_len  ref   rows   Extra
1   SIMPLE       t1     ALL   PRIMARY        NULL  NULL     NULL  10000  Using where
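For comparison, forcing the primary key should make the optimizer take the range plan it rejects here on cost (a hypothetical cross-check, output omitted):

explain select * from t1 force index(primary) where pk1=1;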
The optimizer trace shows that:
- the estimated row counts are OK for both access methods
- the estimated cost of the full table scan is OK
- the estimated cost of the range scan is very high:
"range_analysis": {"table_scan": {"rows": 10000,"cost": 1.6673536},"potential_range_indexes": [{"index": "PRIMARY","usable": true,"key_parts": ["pk1", "pk2"]}],"setup_range_conditions": [],"analyzing_range_alternatives": {"range_scan_alternatives": [{"index": "PRIMARY","ranges": ["(1) <= (pk1) <= (1)"],"rowid_ordered": true,"using_mrr": false,"index_only": false,"rows": 1250,"cost": 2.82014029,"chosen": false,"cause": "cost"}],
The reason is that the cost of range access over the clustered PK is computed as the cost of reading N index tuples of TABLE_SHARE::stored_rec_length bytes each. That number is a worst-case, upper-bound estimate and is very far from reality in this case.

The cost of the full table scan, on the other hand, is computed from the handler->stats.data_file_length estimate, which does account for the values being much smaller than their maximum possible size.
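To get a feel for the gap, assuming stored_rec_length is close to the maximum in-memory record size: for t1 that is about 16 * 500 * 4 bytes, i.e. roughly 32 KB per row, while the actual average row length can be checked with:

select data_length, table_rows, avg_row_length
from information_schema.tables
where table_schema = database() and table_name = 't1';

Since all the filler columns are NULL here, the actual rows are orders of magnitude smaller than the 32 KB worst case, which is why the per-row size assumed for the range estimate is so far from what the full-scan estimate (based on data_file_length) sees.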