[MDEV-9130] InnoDB uses non-covering index on bool column when estimate is 50% of the table Created: 2015-11-13  Updated: 2016-01-18

Status: Confirmed
Project: MariaDB Server
Component/s: Optimizer, Storage Engine - InnoDB
Affects Version/s: 10.0.22, 10.0, 10.1
Fix Version/s: 10.2

Type: Bug Priority: Minor
Reporter: Jiri Kavalik Assignee: Sergei Petrunia
Resolution: Unresolved Votes: 2
Labels: upstream-wontfix
Environment:

Ubuntu 12.04, Centos 6.6


Attachments: File test_skewed_bool_plan.sql    

 Description   

When there is a tinyint column with 0/1 values and only handful of rows contain one of them, optimizer computes the rows extimate for using index on that column as 50% of the rows instead of 95%+ and moreover it actually uses the index instead of a table scan even when there are additional conditions on a non-indexed column (so PK ref access will be needed anyway).

Selecting the "rare" value gives much better estimate (and using index is the right choice in that case).

I tested it between 15k - 1M rows and did observe it in production on 3.5M table. It seemed to behave correctly (optimizer prefering tablescan) somewhere under 15k rows.

I tested it on 10.0.22 an 10.1.6 (debug build I had at hand). Not using histogram-based stats. set global innodb_stats_traditional=off; did not make a difference.

I reported this to MySQL too as I found 5.6 doing the same (did not test 5.7) - http://bugs.mysql.com/bug.php?id=79271

Testcase attached, my results here:

MariaDB [test]> EXPLAIN SELECT COUNT(1) FROM skewed_bool WHERE val > 1 AND flag = 1;
+------+-------------+-------------+------+---------------+------+---------+-------+------+-------------+
| id   | select_type | TABLE       | TYPE | possible_keys | KEY  | key_len | REF   | ROWS | Extra       |
+------+-------------+-------------+------+---------------+------+---------+-------+------+-------------+
|    1 | SIMPLE      | skewed_bool | REF  | flag          | flag | 1       | const |   12 | USING WHERE |
+------+-------------+-------------+------+---------------+------+---------+-------+------+-------------+
1 ROW IN SET (0.00 sec)
 
MariaDB [test]> EXPLAIN SELECT COUNT(1) FROM skewed_bool WHERE val > 1 AND flag = 0;
+------+-------------+-------------+------+---------------+------+---------+-------+-------+-------------+
| id   | select_type | TABLE       | TYPE | possible_keys | KEY  | key_len | REF   | ROWS  | Extra       |
+------+-------------+-------------+------+---------------+------+---------+-------+-------+-------------+
|    1 | SIMPLE      | skewed_bool | REF  | flag          | flag | 1       | const | 18768 | USING WHERE |
+------+-------------+-------------+------+---------------+------+---------+-------+-------+-------------+
1 ROW IN SET (0.00 sec)
 
MariaDB [test]> SELECT COUNT(1), flag FROM skewed_bool GROUP BY flag;
+----------+------+
| COUNT(1) | flag |
+----------+------+
|    36850 |    0 |
|       13 |    1 |
+----------+------+
2 ROWS IN SET (0.01 sec)



 Comments   
Comment by Jiri Kavalik [ 2015-12-08 ]

Was marked as "Not a Bug" by MySQL..

Comment by Elena Stepanova [ 2016-01-08 ]

Assigning to psergey for further consideration.

Generated at Thu Feb 08 07:32:21 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.