Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Not a Bug
-
None
-
None
Description
Create the dataset as specified by MDEV-4363.
MariaDB [j10]> set use_stat_tables='preferably';
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [j10]> set optimizer_use_condition_selectivity=4;
|
Query OK, 0 rows affected, 1 warning (0.00 sec)
|
The point of this test is to check skewed data distributions.
The value of 178 is very frequent:
MariaDB [j10]> select (select count(*) from t5 where col2 = 178 ) /(select count(*) from t5);
|
+--------------------------------------------------------------------------+
|
| (select count(*) from t5 where col2 = 178 ) /(select count(*) from t5) |
|
+--------------------------------------------------------------------------+
|
| 0.3301 |
|
+--------------------------------------------------------------------------+
|
1 row in set (0.18 sec)
|
What does EXPLAIN think:
MariaDB [j10]> explain extended select count(*) from t5 where col2 = 178 ;
|
+------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
|
+------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
|
| 1 | SIMPLE | t5 | ALL | NULL | NULL | NULL | NULL | 10000 | 3.60 | Using where |
|
+------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
|
1 row in set, 1 warning (0.00 sec)
|
it thinks that selectivity is 3.6%.
If i try it with a regular, rare value:
|
MariaDB [j10]> select (select count(*) from t5 where col2 = 179 ) /(select count(*) from t5);
|
+--------------------------------------------------------------------------+
|
| (select count(*) from t5 where col2 = 179 ) /(select count(*) from t5) |
|
+--------------------------------------------------------------------------+
|
| 0.0001 |
|
+--------------------------------------------------------------------------+
|
1 row in set (0.20 sec)
|
|
MariaDB [j10]> explain extended select count(*) from t5 where col2 = 179 ;
|
+------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
|
+------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
|
| 1 | SIMPLE | t5 | ALL | NULL | NULL | NULL | NULL | 10000 | 3.60 | Using where |
|
+------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
|
1 row in set, 1 warning (0.00 sec)
|
It looks like histograms do not allow to distinguish between a frequent and a rare value?
Attachments
Issue Links
- relates to
-
MDEV-4145 Take into account the selectivity of single-table range predicates on non-indexed columns when searching for the best execution plan
-
- Closed
-
One can have an example with non "RQG data", also:
MariaDB [dbt3sf10]> select c_mktsegment, count(*)/@total from customer group by c_mktsegment;
+--------------+-----------------+
| c_mktsegment | count(*)/@total |
+--------------+-----------------+
| AUTOMOBILE | 0.2000 |
| BUILDING | 0.2002 |
| FURNITURE | 0.1997 |
| HOUSEHOLD | 0.1998 |
| MACHINERY | 0.2003 |
+--------------+-----------------+
5 rows in set (0.64 sec)
MariaDB [dbt3sf10]> explain extended select count(*) from customer where c_mktsegment ='AUTOMOBILE';
+------+-------------+----------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+------+-------------+----------+------+---------------+------+---------+------+---------+----------+-------------+
| 1 | SIMPLE | customer | ALL | NULL | NULL | NULL | NULL | 1500000 | 100.00 | Using where |
+------+-------------+----------+------+---------------+------+---------+------+---------+----------+-------------+
MariaDB [dbt3sf10]> explain extended select count(*) from customer where c_mktsegment ='BUILDING';
+------+-------------+----------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+------+-------------+----------+------+---------------+------+---------+------+---------+----------+-------------+
| 1 | SIMPLE | customer | ALL | NULL | NULL | NULL | NULL | 1500000 | 48.33 | Using where |
+------+-------------+----------+------+---------------+------+---------+------+---------+----------+-------------+
MariaDB [dbt3sf10]> explain extended select count(*) from customer where c_mktsegment ='FURNITURE';
+------+-------------+----------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+------+-------------+----------+------+---------------+------+---------+------+---------+----------+-------------+
| 1 | SIMPLE | customer | ALL | NULL | NULL | NULL | NULL | 1500000 | 11.94 | Using where |
+------+-------------+----------+------+---------------+------+---------+------+---------+----------+-------------+
MariaDB [dbt3sf10]> explain extended select count(*) from customer where c_mktsegment ='HOUSEHOLD';
+------+-------------+----------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+------+-------------+----------+------+---------------+------+---------+------+---------+----------+-------------+
| 1 | SIMPLE | customer | ALL | NULL | NULL | NULL | NULL | 1500000 | 23.60 | Using where |
+------+-------------+----------+------+---------------+------+---------+------+---------+----------+-------------+
MariaDB [dbt3sf10]> explain extended select count(*) from customer where c_mktsegment ='MACHINERY';
+------+-------------+----------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+------+-------------+----------+------+---------------+------+---------+------+---------+----------+-------------+
| 1 | SIMPLE | customer | ALL | NULL | NULL | NULL | NULL | 1500000 | 9.57 | Using where |
+------+-------------+----------+------+---------------+------+---------+------+---------+----------+-------------+
MariaDB [dbt3sf10]> select *, hex(histogram) from mysql.column_stats where column_name='c_mktsegment' and table_name='customer'\G
*************************** 1. row ***************************
db_name: dbt3sf10
table_name: customer
column_name: c_mktsegment
min_value: AUTOMOBILE
max_value: MACHINERY
nulls_ratio: 0.0000
avg_length: 8.9998
avg_frequency: 300000.0000
hist_size: 200
hist_type:
histogram: ����������������������������������������jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj��������������������������������������������������������������������������������
hex(histogram): 00000000000000000000000000000000000000000000000000000000000000000000000000000000151515151515151515151515151515151515151515151515151515151515151515151515151515156A6A6A6A6A6A6A6A6A6A6A6A6A6A6A6A6A6A6A6A6A6A6A6A6A6A6A6A6A6A6A6A6A6A6A6A6A6A6A6A95959595959595959595959595959595959595959595959595959595959595959595959595959595FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF