Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-29682

Incorrect #rows computations in DuplicateWeedout strategy choice

    XMLWordPrintable

Details

    • Bug
    • Status: Open (View Workflow)
    • Major
    • Resolution: Unresolved
    • 10.7(EOL), 10.11
    • 10.11
    • Optimizer
    • None

    Description

      Monty has pointed out that this example performs incorrect computations in DuplicateWeedout

      create table t1(a int, b int);
      insert into t1 select seq,seq from seq_0_to_3;
      create table t2 (p int, a int);
      insert into t2 select seq,seq from seq_1_to_10;
      insert into t2 select seq,seq from seq_10_to_100;
      explain extended select * from t1 where a in (select p from t2);
      

      Reproducible both on 10.7-selectivity and clean 10.11.

      The problem occurs when considering this query plan:

      +------+-------------+-------+------+---------------+------+---------+------+------+----------+---------------------------------------------------------------------------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                                                           |
      +------+-------------+-------+------+---------------+------+---------+------+------+----------+---------------------------------------------------------------------------------+
      |    1 | PRIMARY     | t1    | ALL  | NULL          | NULL | NULL    | NULL | 4    |   100.00 |                                                                                 |
      |    1 | PRIMARY     | t2    | ALL  | NULL          | NULL | NULL    | NULL | 101  |   100.00 | Using where; Start temporary; End temporary; Using join buffer (flat, BNL join) |
      +------+-------------+-------+------+---------------+------+---------+------+------+----------+---------------------------------------------------------------------------------+
      

      Computations in Duplicate_weedout_picker::check_qep go as follows:

      (gdb) p first_tab
        $12 = 1
      (gdb) p join->positions[first_tab].prefix_record_count
        $13 = 404
      

      (gdb) print sj_outer_fanout
        $15 = 1
      (gdb) print sj_inner_fanout
        $16 = 101
      

      Then

            double full_lookup_cost=
                     COST_MULT(join->positions[first_tab].prefix_record_count,
                               COST_MULT(sj_outer_fanout,
                                         sj_inner_fanout * one_lookup_cost));
      

      That is, we are multiplying:

      • join->positions[first_tab].prefix_record_count = 404
      • sj_outer_fanout=1
      • sj_inner_fanout=101 – Multiplying by 101 again!
      • one_lookup_cost = 0.05 – seems ok

      Attachments

        Activity

          People

            psergei Sergei Petrunia
            psergei Sergei Petrunia
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.