[MDEV-6843] EITS: invalid estimates for column IS NULL condition Created: 2014-10-06  Updated: 2014-11-04  Resolved: 2014-11-04

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.0.14
Fix Version/s: 10.0.15

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Igor Babaev
Resolution: Fixed Votes: 0
Labels: eits, optimizer


 Description   

Create a dataset:

create table ten(a int);
insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
create table one_k(a int);
insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C;
create table t1 (a int, b int);
insert into t1 select NULL, a from one_k;
set optimizer_use_condition_selectivity=4;
SET use_stat_tables = PREFERABLY;
set histogram_size=100;
analyze table t1 persistent for all;

Ok, so we've got EITS statistics for t1.A. t1.a has only NULL values. Let's see how selectivity estimates work

MariaDB [test]> explain extended select * from t1 A straight_join t1 B where A.a < 5;
+------+-------------+-------+------+---------------+------+---------+------+------+----------+------------------------------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                              |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+------------------------------------+
|    1 | SIMPLE      | A     | ALL  | NULL          | NULL | NULL    | NULL | 1000 |     0.00 | Using where                        |
|    1 | SIMPLE      | B     | ALL  | NULL          | NULL | NULL    | NULL | 1000 |   100.00 | Using join buffer (flat, BNL join) |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+------------------------------------+

So far, good. Now, let's try an "IS NULL":

MariaDB [test]> explain extended select * from t1 A straight_join t1 B where A.a is null;
+------+-------------+-------+------+---------------+------+---------+------+------+----------+------------------------------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                              |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+------------------------------------+
|    1 | SIMPLE      | A     | ALL  | NULL          | NULL | NULL    | NULL | 1000 |     0.00 | Using where                        |
|    1 | SIMPLE      | B     | ALL  | NULL          | NULL | NULL    | NULL | 1000 |   100.00 | Using join buffer (flat, BNL join) |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+------------------------------------+

We get filtered=0 even if it should have been 100.



 Comments   
Comment by Sergei Petrunia [ 2014-10-06 ]

The reason is this code in get_column_range_cardinality:

  if (col_non_nulls < 1)
    res= 0;

it basically means: 'if there are sufficiently many NULLs in the table, assume #matching_rows=0'. It ignores the possibility that the predicate is satisfied by NULL values.

Comment by Sergei Petrunia [ 2014-11-04 ]

Igor pushed the fix to 10.0 already.

Generated at Thu Feb 08 07:15:00 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.