[MDEV-19600] The optimizer should be able to produce rows=1 estimate for unique index with NULLable columns Created: 2019-05-26  Updated: 2020-01-15  Resolved: 2019-06-05

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 5.5, 10.0, 10.1, 10.2, 10.3
Fix Version/s: 10.3.16, 10.4.6

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Fixed Votes: 1
Labels: None

Issue Links:
Relates
relates to MDEV-21480 Unique key using ref access though eq... Closed

 Description   

When the optimizer considers a ref access for a full PRIMARY key, it will use an estimate of rows=1. The same will happen for a UNIQUE key, but only if the columns are not NULL-able. If any of the columns are NULLable, the estimate from index statistics is used.

The reason is possible NULL values. A unique index with NULL-able columns may have multiple rows with NULL values.

However, the most common type of join uses null-rejecting predicates, "keypart = ...". For those, NULLable UNIQUE index is the same as non-NULLable, and the optimizer should be able to figure out that it should use an estimate of rows=1.

Testcase:

create table ten(a int);
insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
create table t1 (
  pk int not null primary key auto_increment,
  a int,
  b int,
  unique key(a)
);
# 10K of null values
insert into t1 (a,b) select null, 12345 from ten A, ten B, ten C;
# 10 non-null values 
insert into t1 (a,b) select a,a from ten;
analyze table t1;

mysql> explain select * from ten,t1 where ten.a=t1.a;
+------+-------------+-------+------+---------------+------+---------+----------+------+-------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref      | rows | Extra       |
+------+-------------+-------+------+---------------+------+---------+----------+------+-------------+
|    1 | SIMPLE      | ten   | ALL  | NULL          | NULL | NULL    | NULL     |   10 | Using where |
|    1 | SIMPLE      | t1    | ref  | a             | a    | 5       | j5.ten.a |   45 |             |
+------+-------------+-------+------+---------------+------+---------+----------+------+-------------+

note, t1.rows=45 here.
Even if we use a null-rejecting ref access on a UNIQUE key, and will get at most 1 row.

Index statistics:

mysql> show keys from t1;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| t1    |          0 | PRIMARY  |            1 | pk          | A         |        1010 |     NULL | NULL   |      | BTREE      |         |               |
| t1    |          0 | a        |            1 | a           | A         |          22 |     NULL | NULL   | YES  | BTREE      |         |               |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

(If you want to suggest to fiddle with innodb_stats_method, check out MDEV-19574 - persistent innodb stats will always use nulls_equal)
(If you want to suggest to use EITS, note that collecting index stats with EITS currently requires a full index scan, so we can't assume EITS is always there ATM).



 Comments   
Comment by Sergei Petrunia [ 2019-05-27 ]

Patch: http://lists.askmonty.org/pipermail/commits/2019-May/013803.html

Comment by Sergei Petrunia [ 2019-05-27 ]

Igor, please review.

Comment by Igor Babaev [ 2019-05-30 ]

OK to push this patch into 5.5 (Here I'm confused: the description says that it affects 5.5 and
all later versions, but the commit was prepared against 10.3).

Comment by Varun Gupta (Inactive) [ 2020-01-15 ]

According to psergey the reason for pushing into 10.3 was that in the older versions the users are already fine with whatever plans they have.

Generated at Thu Feb 08 08:52:55 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.