[MDEV-5992] EITS: Selectivity of non-indexed condition is counted twice in table's fanout Created: 2014-03-31  Updated: 2014-04-02  Resolved: 2014-04-02

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 10.0.10
Fix Version/s: 10.0.11

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: eits


 Description   

create table ten(a int);
insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
create table t1 (key1 int, col1 int, filler1 varchar(100), filler varchar(100), key(key1));
insert into t1 
select A.a + B.a* 10 + C.a * 100, 
       A.a + B.a* 10 + C.a * 100, 
       'filler1-data', 'filler2-data' 
from 
  ten A, ten B, ten C;
set histogram_size=100;
set use_stat_tables='preferably';
set optimizer_use_condition_selectivity=4;
analyze table t1 persistent for all;

Let's check the data distribution. First, we have a non-indexed condition with
50% selectivity.

explain extended select * from t1 where col1 < 500;
+------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
|    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL | 1000 |    50.50 | Using where |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+

Second, "key1<10" produces a range access which selects 10/1000= 1/100th of the
table:

explain extended select * from t1 where key1<10 ;
+------+-------------+-------+-------+---------------+------+---------+------+------+----------+-----------------------+
| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                 |
+------+-------------+-------+-------+---------------+------+---------+------+------+----------+-----------------------+
|    1 | SIMPLE      | t1    | range | key1          | key1 | 5       | NULL |    9 |   100.00 | Using index condition |
+------+-------------+-------+-------+---------------+------+---------+------+------+----------+-----------------------+

Now, let's use both conditions:

explain extended select * from t1 where key1<10 and col1< 500;
+------+-------------+-------+-------+---------------+------+---------+------+------+----------+------------------------------------+
| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                              |
+------+-------------+-------+-------+---------------+------+---------+------+------+----------+------------------------------------+
|    1 | SIMPLE      | t1    | range | key1          | key1 | 5       | NULL |    9 |    50.50 | Using index condition; Using where |
+------+-------------+-------+-------+---------------+------+---------+------+------+----------+------------------------------------+

EXPLAIN looks ok. However, in debugger, one can see:

  Breakpoint 31, matching_candidates_in_table (...
(gdb) fini
  Run till exit from #0  matching_candidates_in_table (...
  Value returned is $279 = 4.5445544554455441

4.5 records is 9 records * 0.5 selectivity. Ok so far.

Breakpoint 32, table_cond_selectivity (...
(gdb) fini
  0x000000000069e43e in best_extension_by_limited_search (...
  Value returned is $281 = 0.50495049504950495
(gdb) list
  7597	      double partial_join_cardinality= current_record_count *
  7598	                                        pushdown_cond_selectivity;
(gdb) print partial_join_cardinality
  $283 = 2.294775022056661

Ooops. Selectivity=0.5 has been applied for the second time, and now we've got
9 * 0.5 * 0.5 = 2.25 for estimate.

Let's check if it matters..

create table t2 as select * from t1;
explain extended select * from t1,t2 where t1.key1<10 and t1.col1< 500;

For the last query, put a breakpoint in best_access_path().

  Breakpoint 33, best_access_path (join=0x7fffca5a4088, s=0x7fffca5a5038, remaining_tables=3, idx=0, disable_jbuf=false, record_count=1, pos=0x7fffca5a58b0, loose_scan_pos=0x7ffff7e9fac0) at /home/psergey/dev2/10.0/sql/sql_select.cc:5642
(gdb) p s->table->alias.Ptr
  $288 = 0x7fffca4215f0 "t1"
(gdb) c
  Continuing.
  
  Breakpoint 33, best_access_path (join=0x7fffca5a4088, s=0x7fffca5a5360, remaining_tables=2, idx=1, disable_jbuf=false, record_count=2.294775022056661, pos=0x7fffca5a59b8, loose_scan_pos=0x7ffff7e9f8b0) at /home/psergey/dev2/10.0/sql/sql_select.cc:5642
(gdb) p s->table->alias.Ptr
  $289 = 0x7fffca5a7270 "t2"

See: table t2 has got record_count=2.29. Half of what should be.



 Comments   
Comment by Sergei Petrunia [ 2014-03-31 ]

So, matching_candidates_in_table() takes into account that
selectivity(t1.col1<500)=0.5. This is correct.

The wrong action is that table_cond_selectivity() returns 1.

The problem is in table_cond_selectivity(). It should return 1.

Comment by Sergei Petrunia [ 2014-03-31 ]

Committed a patch. Need review.

Generated at Thu Feb 08 07:08:34 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.