Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-16337

Setting join_cache_level=4 changes efficient ref access plan to an inefficient hash join

Details

    • Bug
    • Status: Open (View Workflow)
    • Major
    • Resolution: Unresolved
    • 5.5(EOL), 10.0(EOL), 10.1(EOL), 10.2(EOL), 10.3(EOL)
    • 10.4(EOL)
    • Optimizer
    • None

    Description

      Creating the dataset

      create table ten(a int);
      insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
      create table one_k(a int primary key);
      insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C;
       
      create table t1 (a int, b int, c int, key(a));
      create table t10 like t1;
      insert into t10 select A.a +1000*B.a, A.a +1000*B.a,A.a +1000*B.a from one_k A, one_k B;
      

      analyze table ten;
      analyze table t10;
      

      Good query plan

      MariaDB [test]>  set join_cache_level=2;
      Query OK, 0 rows affected (0.000 sec)
       
      MariaDB [test]> 
      MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a;
      +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref        | rows | Extra       |
      +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
      |    1 | SIMPLE      | ten   | ALL  | NULL          | NULL | NULL    | NULL       |   10 | Using where |
      |    1 | SIMPLE      | t10   | ref  | a             | a    | 5       | test.ten.a |    1 |             |
      +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
      2 rows in set (0.001 sec)
      

      These are default settings.
      In this case

      • we read 10 rows from table ten
      • and for each we make an index lookup into t10 where we expect to find 1 row.

      Bad query plan

      MariaDB [test]>  set join_cache_level=4;
      Query OK, 0 rows affected (0.000 sec)
       
      MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a;
      +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
      | id   | select_type | table | type     | possible_keys | key     | key_len | ref        | rows   | Extra                               |
      +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
      |    1 | SIMPLE      | ten   | ALL      | NULL          | NULL    | NULL    | NULL       |     10 | Using where                         |
      |    1 | SIMPLE      | t10   | hash_ALL | a             | #hash#a | 5       | test.ten.a | 997980 | Using join buffer (flat, BNLH join) |
      +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
      2 rows in set (0.001 sec)
      

      For this case

      • it wants to read 10 rows from table ten
      • put them into a buffer
      • create a hash index on the buffer
      • but then do a full table scan on t10 and read 1M rows

      This is obviously very inefficient.

      Attachments

        Issue Links

          Activity

            varun Varun Gupta (Inactive) created issue -
            varun Varun Gupta (Inactive) made changes -
            Field Original Value New Value
            Fix Version/s 5.5 [ 15800 ]
            Fix Version/s 10.0 [ 16000 ]
            Fix Version/s 10.1 [ 16100 ]
            Fix Version/s 10.2 [ 14601 ]
            Fix Version/s 10.3 [ 22126 ]
            varun Varun Gupta (Inactive) made changes -
            Description Creating the dataset

            {code:sql}
            create table ten(a int);
            insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
            create table t1 (a int, b int, c int, key(a));
            create table t10 like t1;
            insert into t10 select A.a +1000*B.a, A.a +1000*B.a,A.a +1000*B.a from one_k A, one_k B;
            {code}


            {code:sql}
            analyze table ten;
            analyze table t10;
            {code}

            Good query plan
            {noformat}
            MariaDB [test]> set join_cache_level=2;
            Query OK, 0 rows affected (0.000 sec)

            MariaDB [test]>
            MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a;
            +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
            +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
            | 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | Using where |
            | 1 | SIMPLE | t10 | ref | a | a | 5 | test.ten.a | 1 | |
            +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
            2 rows in set (0.001 sec)

            These are default settings, we read 10 rows from table ten
            and for each we make an index lookup into t10 where we expect to find 1 row.
            {noformat}


            Bad query plan
            {noformat}
            MariaDB [test]> set join_cache_level=4;
            Query OK, 0 rows affected (0.000 sec)

            MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a;
            +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
            +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
            | 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | Using where |
            | 1 | SIMPLE | t10 | hash_ALL | a | #hash#a | 5 | test.ten.a | 997980 | Using join buffer (flat, BNLH join) |
            +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
            2 rows in set (0.001 sec)

            For this case
            * it wants to read 10 rows from table ten;
            * put them into a buffer
            * create a hash index on the buffer
            * but then do a full table scan on t10 and read 1M rows.

            This is obviously very inefficient.
            {noformat}



            varun Varun Gupta (Inactive) made changes -
            Description Creating the dataset

            {code:sql}
            create table ten(a int);
            insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
            create table t1 (a int, b int, c int, key(a));
            create table t10 like t1;
            insert into t10 select A.a +1000*B.a, A.a +1000*B.a,A.a +1000*B.a from one_k A, one_k B;
            {code}


            {code:sql}
            analyze table ten;
            analyze table t10;
            {code}

            Good query plan
            {noformat}
            MariaDB [test]> set join_cache_level=2;
            Query OK, 0 rows affected (0.000 sec)

            MariaDB [test]>
            MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a;
            +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
            +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
            | 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | Using where |
            | 1 | SIMPLE | t10 | ref | a | a | 5 | test.ten.a | 1 | |
            +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
            2 rows in set (0.001 sec)

            These are default settings, we read 10 rows from table ten
            and for each we make an index lookup into t10 where we expect to find 1 row.
            {noformat}


            Bad query plan
            {noformat}
            MariaDB [test]> set join_cache_level=4;
            Query OK, 0 rows affected (0.000 sec)

            MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a;
            +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
            +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
            | 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | Using where |
            | 1 | SIMPLE | t10 | hash_ALL | a | #hash#a | 5 | test.ten.a | 997980 | Using join buffer (flat, BNLH join) |
            +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
            2 rows in set (0.001 sec)

            For this case
            * it wants to read 10 rows from table ten;
            * put them into a buffer
            * create a hash index on the buffer
            * but then do a full table scan on t10 and read 1M rows.

            This is obviously very inefficient.
            {noformat}



            Creating the dataset

            {code:sql}
            create table ten(a int);
            insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
            create table t1 (a int, b int, c int, key(a));
            create table t10 like t1;
            insert into t10 select A.a +1000*B.a, A.a +1000*B.a,A.a +1000*B.a from one_k A, one_k B;
            {code}


            {code:sql}
            analyze table ten;
            analyze table t10;
            {code}

            Good query plan
            {noformat}
            MariaDB [test]> set join_cache_level=2;
            Query OK, 0 rows affected (0.000 sec)

            MariaDB [test]>
            MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a;
            +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
            +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
            | 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | Using where |
            | 1 | SIMPLE | t10 | ref | a | a | 5 | test.ten.a | 1 | |
            +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
            2 rows in set (0.001 sec)
            {noformat}

            These are default settings.
            In this case
            * we read 10 rows from table ten
            * and for each we make an index lookup into t10 where we expect to find 1 row.


            Bad query plan
            {noformat}
            MariaDB [test]> set join_cache_level=4;
            Query OK, 0 rows affected (0.000 sec)

            MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a;
            +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
            +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
            | 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | Using where |
            | 1 | SIMPLE | t10 | hash_ALL | a | #hash#a | 5 | test.ten.a | 997980 | Using join buffer (flat, BNLH join) |
            +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
            2 rows in set (0.001 sec)
            {noformat}
            For this case
            * it wants to read 10 rows from table ten
            * put them into a buffer
            * create a hash index on the buffer
            * but then do a full table scan on t10 and read 1M rows

            This is obviously very inefficient.




            varun Varun Gupta (Inactive) made changes -
            varun Varun Gupta (Inactive) made changes -
            varun Varun Gupta (Inactive) made changes -
            Summary Picking an efficient plan by changing ref access to hash join Setting join_cache_level=4 changes efficient ref access plan to an inefficient hash join
            varun Varun Gupta (Inactive) made changes -
            serg Sergei Golubchik made changes -
            Fix Version/s 10.4 [ 22408 ]
            psergei Sergei Petrunia made changes -
            julien.fritsch Julien Fritsch made changes -
            Fix Version/s 5.5 [ 15800 ]
            julien.fritsch Julien Fritsch made changes -
            Fix Version/s 10.0 [ 16000 ]
            julien.fritsch Julien Fritsch made changes -
            Fix Version/s 10.1 [ 16100 ]
            julien.fritsch Julien Fritsch made changes -
            Assignee Varun Gupta [ varun ] Sergei Petrunia [ psergey ]
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 87573 ] MariaDB v4 [ 140804 ]
            ralf.gebhardt Ralf Gebhardt made changes -
            Fix Version/s 10.2 [ 14601 ]
            julien.fritsch Julien Fritsch made changes -
            Fix Version/s 10.3 [ 22126 ]
            psergei Sergei Petrunia made changes -
            Description Creating the dataset

            {code:sql}
            create table ten(a int);
            insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
            create table t1 (a int, b int, c int, key(a));
            create table t10 like t1;
            insert into t10 select A.a +1000*B.a, A.a +1000*B.a,A.a +1000*B.a from one_k A, one_k B;
            {code}


            {code:sql}
            analyze table ten;
            analyze table t10;
            {code}

            Good query plan
            {noformat}
            MariaDB [test]> set join_cache_level=2;
            Query OK, 0 rows affected (0.000 sec)

            MariaDB [test]>
            MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a;
            +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
            +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
            | 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | Using where |
            | 1 | SIMPLE | t10 | ref | a | a | 5 | test.ten.a | 1 | |
            +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
            2 rows in set (0.001 sec)
            {noformat}

            These are default settings.
            In this case
            * we read 10 rows from table ten
            * and for each we make an index lookup into t10 where we expect to find 1 row.


            Bad query plan
            {noformat}
            MariaDB [test]> set join_cache_level=4;
            Query OK, 0 rows affected (0.000 sec)

            MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a;
            +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
            +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
            | 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | Using where |
            | 1 | SIMPLE | t10 | hash_ALL | a | #hash#a | 5 | test.ten.a | 997980 | Using join buffer (flat, BNLH join) |
            +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
            2 rows in set (0.001 sec)
            {noformat}
            For this case
            * it wants to read 10 rows from table ten
            * put them into a buffer
            * create a hash index on the buffer
            * but then do a full table scan on t10 and read 1M rows

            This is obviously very inefficient.




            Creating the dataset

            {code:sql}
            create table ten(a int);
            insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
            create table one_k(a int primary key);
            insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C;

            create table t1 (a int, b int, c int, key(a));
            create table t10 like t1;
            insert into t10 select A.a +1000*B.a, A.a +1000*B.a,A.a +1000*B.a from one_k A, one_k B;
            {code}


            {code:sql}
            analyze table ten;
            analyze table t10;
            {code}

            Good query plan
            {noformat}
            MariaDB [test]> set join_cache_level=2;
            Query OK, 0 rows affected (0.000 sec)

            MariaDB [test]>
            MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a;
            +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
            +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
            | 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | Using where |
            | 1 | SIMPLE | t10 | ref | a | a | 5 | test.ten.a | 1 | |
            +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
            2 rows in set (0.001 sec)
            {noformat}

            These are default settings.
            In this case
            * we read 10 rows from table ten
            * and for each we make an index lookup into t10 where we expect to find 1 row.


            Bad query plan
            {noformat}
            MariaDB [test]> set join_cache_level=4;
            Query OK, 0 rows affected (0.000 sec)

            MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a;
            +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
            +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
            | 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | Using where |
            | 1 | SIMPLE | t10 | hash_ALL | a | #hash#a | 5 | test.ten.a | 997980 | Using join buffer (flat, BNLH join) |
            +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
            2 rows in set (0.001 sec)
            {noformat}
            For this case
            * it wants to read 10 rows from table ten
            * put them into a buffer
            * create a hash index on the buffer
            * but then do a full table scan on t10 and read 1M rows

            This is obviously very inefficient.




            psergei Sergei Petrunia made changes -

            People

              psergei Sergei Petrunia
              varun Varun Gupta (Inactive)
              Votes:
              2 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.