[MDEV-5980] EITS: if condition is used for REF access, its selectivity is still in filtered% Created: 2014-03-29  Updated: 2014-04-28  Resolved: 2014-04-28

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 10.0.9
Fix Version/s: 10.0.11

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: eits


 Description   

create table ten(a int);
insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
create table t1(key1 int, col1 int, key(key1));
insert into t1 select A.a, A.a from ten A, ten B, ten C;
 
set histogram_size=100;
set use_stat_tables='preferably';
set optimizer_use_condition_selectivity=4;
analyze table t1 persistent for all;

mysql> explain extended select * from t1 where col1=2;
+------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
|    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL | 1000 |     9.90 | Using where |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+

This is ok.

mysql> explain extended select * from t1 where key1=2;
+------+-------------+-------+------+---------------+------+---------+-------+------+----------+-------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra |
+------+-------------+-------+------+---------------+------+---------+-------+------+----------+-------+
|    1 | SIMPLE      | t1    | ref  | key1          | key1 | 5       | const |  100 |    10.00 |       |
+------+-------------+-------+------+---------------+------+---------+-------+------+----------+-------+

This is not ok. "key1=2" is used for ref access. However, its selectivity is still present in "filtered%". Ref access is employed when a condition is very selective, counting its selectivity twice will cause gross under-estimation of output cardinality.



 Comments   
Comment by Sergei Petrunia [ 2014-03-31 ]
  • so it is ref(const).
  • matching_candidates_in_table() is not called.

We should recongize it as an equality that was bound by ref and so discount its
selectivity (the selectivity comes from ... range... how do we figure out that
ref-bound equality has contributed as a possible range scan?)

Idea: keep array of induced selectivity

table->added_selectivity[field_no] = 0.123456

and if $field_no is used for ref access, than it's not contributing to
selectivity, and we remove its selectivity from the product.

Q: what to do about range accesses that span multiple key parts? It's not
possible to remove one key part from selectivity calculations...

Comment by Sergei Petrunia [ 2014-04-01 ]

Modified the testcase a little bit so that data distribution is not uniform:

create table t1a like t1;
insert into t1a select * from t1;
update t1a set key1=9 where key1=5 limit 50;
update t1a set col1=9 where col1=5 limit 50;
analyze table t1a persistent for all;

explain extended select * from t1a where key1=5;
+------+-------------+-------+------+---------------+------+---------+-------+------+----------+-------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra |
+------+-------------+-------+------+---------------+------+---------+-------+------+----------+-------+
|    1 | SIMPLE      | t1a   | ref  | key1          | key1 | 5       | const |   50 |     5.00 |       |
+------+-------------+-------+------+---------------+------+---------+-------+------+----------+-------+
 
explain extended select * from t1a where key1=9;
+------+-------------+-------+------+---------------+------+---------+-------+------+----------+-------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra |
+------+-------------+-------+------+---------------+------+---------+-------+------+----------+-------+
|    1 | SIMPLE      | t1a   | ref  | key1          | key1 | 5       | const |  150 |    15.00 |       |
+------+-------------+-------+------+---------------+------+---------+-------+------+----------+-------+

Now, will debug this query:

explain extended select * from t1a where key1=5;
+------+-------------+-------+------+---------------+------+---------+-------+------+----------+-------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra |
+------+-------------+-------+------+---------------+------+---------+-------+------+----------+-------+
|    1 | SIMPLE      | t1a   | ref  | key1          | key1 | 5       | const |   50 |     5.00 |       |
+------+-------------+-------+------+---------------+------+---------+-------+------+----------+-------+

== calculate_cond_selectivity_for_table() ==
finds quick_cond_selectivity=0.05
sets it as table->cond_selectivity
never attempts to perform records_in_column_ranges()-based analysis (this is
the expected behavior)

== table_cond_selectivity ==
Starts with

double sel= s->table->cond_selectivity; // = 0.05

Eventually it runs this code:

      if (keyuse->val->const_item())
        sel*= table->field[fldno]->cond_selectivity; 

However, field->cond_selectivity=1 there (because we've used range access
estimates, not records_in_column_ranges() estimates).

The rest of the function (including table_multi_eq_cond_selectivity() call) doesn't modify 'sel'. sel remains 0.05
We return to best_extension_by_limited_search() and see

(gdb) p partial_join_cardinality
  $8 = 2.5
(gdb) print current_record_count
  $9 = 50
(gdb) print pushdown_cond_selectivity
  $10 = 0.050000000000000003
(gdb) print partial_join_cardinality
  $11 = 2.5

Comment by Sergei Petrunia [ 2014-04-21 ]

Patch submitted for review

Generated at Thu Feb 08 07:08:28 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.