Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-16337

Setting join_cache_level=4 changes efficient ref access plan to an inefficient hash join

    XMLWordPrintable

Details

    • Bug
    • Status: Open (View Workflow)
    • Major
    • Resolution: Unresolved
    • 5.5, 10.0, 10.1, 10.2, 10.3
    • 10.4
    • Optimizer
    • None

    Description

      Creating the dataset

      create table ten(a int);
      insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
      create table t1 (a int, b int, c int, key(a));
      create table t10 like t1;
      insert into t10 select A.a +1000*B.a, A.a +1000*B.a,A.a +1000*B.a from one_k A, one_k B;
      

      analyze table ten;
      analyze table t10;
      

      Good query plan

      MariaDB [test]>  set join_cache_level=2;
      Query OK, 0 rows affected (0.000 sec)
       
      MariaDB [test]> 
      MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a;
      +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref        | rows | Extra       |
      +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
      |    1 | SIMPLE      | ten   | ALL  | NULL          | NULL | NULL    | NULL       |   10 | Using where |
      |    1 | SIMPLE      | t10   | ref  | a             | a    | 5       | test.ten.a |    1 |             |
      +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
      2 rows in set (0.001 sec)
      

      These are default settings.
      In this case

      • we read 10 rows from table ten
      • and for each we make an index lookup into t10 where we expect to find 1 row.

      Bad query plan

      MariaDB [test]>  set join_cache_level=4;
      Query OK, 0 rows affected (0.000 sec)
       
      MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a;
      +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
      | id   | select_type | table | type     | possible_keys | key     | key_len | ref        | rows   | Extra                               |
      +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
      |    1 | SIMPLE      | ten   | ALL      | NULL          | NULL    | NULL    | NULL       |     10 | Using where                         |
      |    1 | SIMPLE      | t10   | hash_ALL | a             | #hash#a | 5       | test.ten.a | 997980 | Using join buffer (flat, BNLH join) |
      +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
      2 rows in set (0.001 sec)
      

      For this case

      • it wants to read 10 rows from table ten
      • put them into a buffer
      • create a hash index on the buffer
      • but then do a full table scan on t10 and read 1M rows

      This is obviously very inefficient.

      Attachments

        Issue Links

          Activity

            People

              psergei Sergei Petrunia
              varun Varun Gupta (Inactive)
              Votes:
              2 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.