Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
None
-
None
Description
Create the dataset:
create table t5 (col1 int);
|
set @a=-1;
|
create table one_k (a int) select (@a:=@a+1) as a from information_schema.session_variables A, information_schema.session_variables B limit 1000;
|
insert into t5 select A.a from one_k A, one_k B where A.a < 100 and B.a < 100;
|
set histogram_size=100;
|
analyze table t5 persistent for all;
|
select *, hex(histogram) from mysql.column_stats where table_name='t5'\G
|
*************************** 1. row ***************************
|
db_name: j10
|
table_name: t5
|
column_name: col1
|
min_value: 0
|
max_value: 99
|
nulls_ratio: 0.0000
|
avg_length: 4.0000
|
avg_frequency: 100.0000
|
hist_size: 100
|
hist_type:
|
histogram: (100 bytes here)
|
Ok, so we've got a table with 100 rows of 0, 100 rows of 1, and so forth up to 99.
Let's see how estimating works:
MariaDB [j10]> explain extended select * from t5 where col1 in (1,2,3);
|
+------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
|
+------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
|
| 1 | SIMPLE | t5 | ALL | NULL | NULL | NULL | NULL | 10000 | 3.79 | Using where |
|
+------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
|
1 row in set, 1 warning (10.64 sec)
|
The real selectivity is 3%, we've got 3.79. Good.
Now, let's try values that are certainly not in the table:
MariaDB [j10]> explain extended select * from t5 where col1 in (-1,-2,-3);
|
+------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
|
+------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
|
| 1 | SIMPLE | t5 | ALL | NULL | NULL | NULL | NULL | 10000 | 3.79 | Using where |
|
+------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
|
OOps, again 3.79%.
Let's see what non-equality range shows:
MariaDB [j10]> explain extended select * from t5 where col1<=-1;
|
+------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
|
+------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
|
| 1 | SIMPLE | t5 | ALL | NULL | NULL | NULL | NULL | 10000 | 0.99 | Using where |
|
+------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
|
1%. It's better.
I consider selecitivity obtained for "where col1 in (-1,-2,-3)" to be a bug.
Attachments
Issue Links
- relates to
-
MDEV-4145 Take into account the selectivity of single-table range predicates on non-indexed columns when searching for the best execution plan
- Closed