Details
-
Bug
-
Status: Open (View Workflow)
-
Major
-
Resolution: Unresolved
-
5.5(EOL), 10.0(EOL), 10.1(EOL), 10.2(EOL), 10.3(EOL)
-
None
Description
Creating the dataset
create table ten(a int); |
insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9); |
create table one_k(a int primary key); |
insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C; |
|
create table t1 (a int, b int, c int, key(a)); |
create table t10 like t1; |
insert into t10 select A.a +1000*B.a, A.a +1000*B.a,A.a +1000*B.a from one_k A, one_k B; |
analyze table ten; |
analyze table t10; |
Good query plan
MariaDB [test]> set join_cache_level=2;
|
Query OK, 0 rows affected (0.000 sec)
|
|
MariaDB [test]>
|
MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a;
|
+------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
|
| 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | Using where |
|
| 1 | SIMPLE | t10 | ref | a | a | 5 | test.ten.a | 1 | |
|
+------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
|
2 rows in set (0.001 sec)
|
These are default settings.
In this case
- we read 10 rows from table ten
- and for each we make an index lookup into t10 where we expect to find 1 row.
Bad query plan
MariaDB [test]> set join_cache_level=4;
|
Query OK, 0 rows affected (0.000 sec)
|
|
MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a;
|
+------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
|
| 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | Using where |
|
| 1 | SIMPLE | t10 | hash_ALL | a | #hash#a | 5 | test.ten.a | 997980 | Using join buffer (flat, BNLH join) |
|
+------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
|
2 rows in set (0.001 sec)
|
For this case
- it wants to read 10 rows from table ten
- put them into a buffer
- create a hash index on the buffer
- but then do a full table scan on t10 and read 1M rows
This is obviously very inefficient.
Attachments
Issue Links
- relates to
-
MDEV-16307 Incorrect results when using BNLH join instead of BNL join with views
-
- Closed
-
-
MDEV-22383 Use Block Nested Loops Hash Join by default when appropriate
-
- Stalled
-
-
MDEV-35855 Make it possible to enable BNL-H join without hitting regressions
-
- Open
-
Activity
Field | Original Value | New Value |
---|---|---|
Fix Version/s | 5.5 [ 15800 ] | |
Fix Version/s | 10.0 [ 16000 ] | |
Fix Version/s | 10.1 [ 16100 ] | |
Fix Version/s | 10.2 [ 14601 ] | |
Fix Version/s | 10.3 [ 22126 ] |
Description |
Creating the dataset
{code:sql} create table ten(a int); insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9); create table t1 (a int, b int, c int, key(a)); create table t10 like t1; insert into t10 select A.a +1000*B.a, A.a +1000*B.a,A.a +1000*B.a from one_k A, one_k B; {code} {code:sql} analyze table ten; analyze table t10; {code} Good query plan {noformat} MariaDB [test]> set join_cache_level=2; Query OK, 0 rows affected (0.000 sec) MariaDB [test]> MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a; +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+ | 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | Using where | | 1 | SIMPLE | t10 | ref | a | a | 5 | test.ten.a | 1 | | +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+ 2 rows in set (0.001 sec) These are default settings, we read 10 rows from table ten and for each we make an index lookup into t10 where we expect to find 1 row. {noformat} Bad query plan {noformat} MariaDB [test]> set join_cache_level=4; Query OK, 0 rows affected (0.000 sec) MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a; +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+ | 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | Using where | | 1 | SIMPLE | t10 | hash_ALL | a | #hash#a | 5 | test.ten.a | 997980 | Using join buffer (flat, BNLH join) | +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+ 2 rows in set (0.001 sec) For this case * it wants to read 10 rows from table ten; * put them into a buffer * create a hash index on the buffer * but then do a full table scan on t10 and read 1M rows. This is obviously very inefficient. {noformat} |
Description |
Creating the dataset
{code:sql} create table ten(a int); insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9); create table t1 (a int, b int, c int, key(a)); create table t10 like t1; insert into t10 select A.a +1000*B.a, A.a +1000*B.a,A.a +1000*B.a from one_k A, one_k B; {code} {code:sql} analyze table ten; analyze table t10; {code} Good query plan {noformat} MariaDB [test]> set join_cache_level=2; Query OK, 0 rows affected (0.000 sec) MariaDB [test]> MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a; +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+ | 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | Using where | | 1 | SIMPLE | t10 | ref | a | a | 5 | test.ten.a | 1 | | +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+ 2 rows in set (0.001 sec) These are default settings, we read 10 rows from table ten and for each we make an index lookup into t10 where we expect to find 1 row. {noformat} Bad query plan {noformat} MariaDB [test]> set join_cache_level=4; Query OK, 0 rows affected (0.000 sec) MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a; +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+ | 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | Using where | | 1 | SIMPLE | t10 | hash_ALL | a | #hash#a | 5 | test.ten.a | 997980 | Using join buffer (flat, BNLH join) | +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+ 2 rows in set (0.001 sec) For this case * it wants to read 10 rows from table ten; * put them into a buffer * create a hash index on the buffer * but then do a full table scan on t10 and read 1M rows. This is obviously very inefficient. {noformat} |
Creating the dataset
{code:sql} create table ten(a int); insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9); create table t1 (a int, b int, c int, key(a)); create table t10 like t1; insert into t10 select A.a +1000*B.a, A.a +1000*B.a,A.a +1000*B.a from one_k A, one_k B; {code} {code:sql} analyze table ten; analyze table t10; {code} Good query plan {noformat} MariaDB [test]> set join_cache_level=2; Query OK, 0 rows affected (0.000 sec) MariaDB [test]> MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a; +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+ | 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | Using where | | 1 | SIMPLE | t10 | ref | a | a | 5 | test.ten.a | 1 | | +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+ 2 rows in set (0.001 sec) {noformat} These are default settings. In this case * we read 10 rows from table ten * and for each we make an index lookup into t10 where we expect to find 1 row. Bad query plan {noformat} MariaDB [test]> set join_cache_level=4; Query OK, 0 rows affected (0.000 sec) MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a; +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+ | 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | Using where | | 1 | SIMPLE | t10 | hash_ALL | a | #hash#a | 5 | test.ten.a | 997980 | Using join buffer (flat, BNLH join) | +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+ 2 rows in set (0.001 sec) {noformat} For this case * it wants to read 10 rows from table ten * put them into a buffer * create a hash index on the buffer * but then do a full table scan on t10 and read 1M rows This is obviously very inefficient. |
Link |
This issue blocks |
Link |
This issue relates to |
Summary | Picking an efficient plan by changing ref access to hash join | Setting join_cache_level=4 changes efficient ref access plan to an inefficient hash join |
Link |
This issue blocks |
Fix Version/s | 10.4 [ 22408 ] |
Link | This issue relates to MDEV-22383 [ MDEV-22383 ] |
Fix Version/s | 5.5 [ 15800 ] |
Fix Version/s | 10.0 [ 16000 ] |
Fix Version/s | 10.1 [ 16100 ] |
Assignee | Varun Gupta [ varun ] | Sergei Petrunia [ psergey ] |
Workflow | MariaDB v3 [ 87573 ] | MariaDB v4 [ 140804 ] |
Fix Version/s | 10.2 [ 14601 ] |
Fix Version/s | 10.3 [ 22126 ] |
Description |
Creating the dataset
{code:sql} create table ten(a int); insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9); create table t1 (a int, b int, c int, key(a)); create table t10 like t1; insert into t10 select A.a +1000*B.a, A.a +1000*B.a,A.a +1000*B.a from one_k A, one_k B; {code} {code:sql} analyze table ten; analyze table t10; {code} Good query plan {noformat} MariaDB [test]> set join_cache_level=2; Query OK, 0 rows affected (0.000 sec) MariaDB [test]> MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a; +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+ | 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | Using where | | 1 | SIMPLE | t10 | ref | a | a | 5 | test.ten.a | 1 | | +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+ 2 rows in set (0.001 sec) {noformat} These are default settings. In this case * we read 10 rows from table ten * and for each we make an index lookup into t10 where we expect to find 1 row. Bad query plan {noformat} MariaDB [test]> set join_cache_level=4; Query OK, 0 rows affected (0.000 sec) MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a; +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+ | 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | Using where | | 1 | SIMPLE | t10 | hash_ALL | a | #hash#a | 5 | test.ten.a | 997980 | Using join buffer (flat, BNLH join) | +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+ 2 rows in set (0.001 sec) {noformat} For this case * it wants to read 10 rows from table ten * put them into a buffer * create a hash index on the buffer * but then do a full table scan on t10 and read 1M rows This is obviously very inefficient. |
Creating the dataset
{code:sql} create table ten(a int); insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9); create table one_k(a int primary key); insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C; create table t1 (a int, b int, c int, key(a)); create table t10 like t1; insert into t10 select A.a +1000*B.a, A.a +1000*B.a,A.a +1000*B.a from one_k A, one_k B; {code} {code:sql} analyze table ten; analyze table t10; {code} Good query plan {noformat} MariaDB [test]> set join_cache_level=2; Query OK, 0 rows affected (0.000 sec) MariaDB [test]> MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a; +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+ | 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | Using where | | 1 | SIMPLE | t10 | ref | a | a | 5 | test.ten.a | 1 | | +------+-------------+-------+------+---------------+------+---------+------------+------+-------------+ 2 rows in set (0.001 sec) {noformat} These are default settings. In this case * we read 10 rows from table ten * and for each we make an index lookup into t10 where we expect to find 1 row. Bad query plan {noformat} MariaDB [test]> set join_cache_level=4; Query OK, 0 rows affected (0.000 sec) MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a; +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+ | 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | Using where | | 1 | SIMPLE | t10 | hash_ALL | a | #hash#a | 5 | test.ten.a | 997980 | Using join buffer (flat, BNLH join) | +------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+ 2 rows in set (0.001 sec) {noformat} For this case * it wants to read 10 rows from table ten * put them into a buffer * create a hash index on the buffer * but then do a full table scan on t10 and read 1M rows This is obviously very inefficient. |
Link | This issue relates to MDEV-35855 [ MDEV-35855 ] |