Details
-
Bug
-
Status: Confirmed (View Workflow)
-
Minor
-
Resolution: Unresolved
-
10.0.22, 10.0(EOL), 10.1(EOL)
-
Ubuntu 12.04, Centos 6.6
Description
When there is a tinyint column with 0/1 values and only handful of rows contain one of them, optimizer computes the rows extimate for using index on that column as 50% of the rows instead of 95%+ and moreover it actually uses the index instead of a table scan even when there are additional conditions on a non-indexed column (so PK ref access will be needed anyway).
Selecting the "rare" value gives much better estimate (and using index is the right choice in that case).
I tested it between 15k - 1M rows and did observe it in production on 3.5M table. It seemed to behave correctly (optimizer prefering tablescan) somewhere under 15k rows.
I tested it on 10.0.22 an 10.1.6 (debug build I had at hand). Not using histogram-based stats. set global innodb_stats_traditional=off; did not make a difference.
I reported this to MySQL too as I found 5.6 doing the same (did not test 5.7) - http://bugs.mysql.com/bug.php?id=79271
Testcase attached, my results here:
MariaDB [test]> EXPLAIN SELECT COUNT(1) FROM skewed_bool WHERE val > 1 AND flag = 1; |
+------+-------------+-------------+------+---------------+------+---------+-------+------+-------------+ |
| id | select_type | TABLE | TYPE | possible_keys | KEY | key_len | REF | ROWS | Extra | |
+------+-------------+-------------+------+---------------+------+---------+-------+------+-------------+ |
| 1 | SIMPLE | skewed_bool | REF | flag | flag | 1 | const | 12 | USING WHERE | |
+------+-------------+-------------+------+---------------+------+---------+-------+------+-------------+ |
1 ROW IN SET (0.00 sec) |
|
MariaDB [test]> EXPLAIN SELECT COUNT(1) FROM skewed_bool WHERE val > 1 AND flag = 0; |
+------+-------------+-------------+------+---------------+------+---------+-------+-------+-------------+ |
| id | select_type | TABLE | TYPE | possible_keys | KEY | key_len | REF | ROWS | Extra | |
+------+-------------+-------------+------+---------------+------+---------+-------+-------+-------------+ |
| 1 | SIMPLE | skewed_bool | REF | flag | flag | 1 | const | 18768 | USING WHERE | |
+------+-------------+-------------+------+---------------+------+---------+-------+-------+-------------+ |
1 ROW IN SET (0.00 sec) |
|
MariaDB [test]> SELECT COUNT(1), flag FROM skewed_bool GROUP BY flag; |
+----------+------+ |
| COUNT(1) | flag | |
+----------+------+ |
| 36850 | 0 |
|
| 13 | 1 |
|
+----------+------+ |
2 ROWS IN SET (0.01 sec) |