Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-5992

EITS: Selectivity of non-indexed condition is counted twice in table's fanout

    XMLWordPrintable

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 10.0.10
    • 10.0.11
    • None

    Description

      create table ten(a int);
      insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
      create table t1 (key1 int, col1 int, filler1 varchar(100), filler varchar(100), key(key1));
      insert into t1 
      select A.a + B.a* 10 + C.a * 100, 
             A.a + B.a* 10 + C.a * 100, 
             'filler1-data', 'filler2-data' 
      from 
        ten A, ten B, ten C;
      set histogram_size=100;
      set use_stat_tables='preferably';
      set optimizer_use_condition_selectivity=4;
      analyze table t1 persistent for all;

      Let's check the data distribution. First, we have a non-indexed condition with
      50% selectivity.

      explain extended select * from t1 where col1 < 500;
      +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
      +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
      |    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL | 1000 |    50.50 | Using where |
      +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+

      Second, "key1<10" produces a range access which selects 10/1000= 1/100th of the
      table:

      explain extended select * from t1 where key1<10 ;
      +------+-------------+-------+-------+---------------+------+---------+------+------+----------+-----------------------+
      | id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                 |
      +------+-------------+-------+-------+---------------+------+---------+------+------+----------+-----------------------+
      |    1 | SIMPLE      | t1    | range | key1          | key1 | 5       | NULL |    9 |   100.00 | Using index condition |
      +------+-------------+-------+-------+---------------+------+---------+------+------+----------+-----------------------+

      Now, let's use both conditions:

      explain extended select * from t1 where key1<10 and col1< 500;
      +------+-------------+-------+-------+---------------+------+---------+------+------+----------+------------------------------------+
      | id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                              |
      +------+-------------+-------+-------+---------------+------+---------+------+------+----------+------------------------------------+
      |    1 | SIMPLE      | t1    | range | key1          | key1 | 5       | NULL |    9 |    50.50 | Using index condition; Using where |
      +------+-------------+-------+-------+---------------+------+---------+------+------+----------+------------------------------------+

      EXPLAIN looks ok. However, in debugger, one can see:

        Breakpoint 31, matching_candidates_in_table (...
      (gdb) fini
        Run till exit from #0  matching_candidates_in_table (...
        Value returned is $279 = 4.5445544554455441

      4.5 records is 9 records * 0.5 selectivity. Ok so far.

      Breakpoint 32, table_cond_selectivity (...
      (gdb) fini
        0x000000000069e43e in best_extension_by_limited_search (...
        Value returned is $281 = 0.50495049504950495
      (gdb) list
        7597	      double partial_join_cardinality= current_record_count *
        7598	                                        pushdown_cond_selectivity;
      (gdb) print partial_join_cardinality
        $283 = 2.294775022056661

      Ooops. Selectivity=0.5 has been applied for the second time, and now we've got
      9 * 0.5 * 0.5 = 2.25 for estimate.

      Let's check if it matters..

      create table t2 as select * from t1;
      explain extended select * from t1,t2 where t1.key1<10 and t1.col1< 500;

      For the last query, put a breakpoint in best_access_path().

        Breakpoint 33, best_access_path (join=0x7fffca5a4088, s=0x7fffca5a5038, remaining_tables=3, idx=0, disable_jbuf=false, record_count=1, pos=0x7fffca5a58b0, loose_scan_pos=0x7ffff7e9fac0) at /home/psergey/dev2/10.0/sql/sql_select.cc:5642
      (gdb) p s->table->alias.Ptr
        $288 = 0x7fffca4215f0 "t1"
      (gdb) c
        Continuing.
        
        Breakpoint 33, best_access_path (join=0x7fffca5a4088, s=0x7fffca5a5360, remaining_tables=2, idx=1, disable_jbuf=false, record_count=2.294775022056661, pos=0x7fffca5a59b8, loose_scan_pos=0x7ffff7e9f8b0) at /home/psergey/dev2/10.0/sql/sql_select.cc:5642
      (gdb) p s->table->alias.Ptr
        $289 = 0x7fffca5a7270 "t2"

      See: table t2 has got record_count=2.29. Half of what should be.

      Attachments

        Activity

          People

            psergei Sergei Petrunia
            psergei Sergei Petrunia
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.