[MDEV-16337] Setting join_cache_level=4 changes efficient ref access plan to an inefficient hash join Created: 2018-05-30  Updated: 2023-04-27

Status: Open
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 5.5, 10.0, 10.1, 10.2, 10.3
Fix Version/s: 10.4

Type: Bug Priority: Major
Reporter: Varun Gupta (Inactive) Assignee: Sergei Petrunia
Resolution: Unresolved Votes: 2
Labels: None

Issue Links:
Relates
relates to MDEV-16307 Incorrect results when using BNLH joi... Closed
relates to MDEV-22383 Use Block Nested Loops Hash Join by d... Stalled

 Description   

Creating the dataset

create table ten(a int);
insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
create table t1 (a int, b int, c int, key(a));
create table t10 like t1;
insert into t10 select A.a +1000*B.a, A.a +1000*B.a,A.a +1000*B.a from one_k A, one_k B;

analyze table ten;
analyze table t10;

Good query plan

MariaDB [test]>  set join_cache_level=2;
Query OK, 0 rows affected (0.000 sec)
 
MariaDB [test]> 
MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a;
+------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref        | rows | Extra       |
+------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
|    1 | SIMPLE      | ten   | ALL  | NULL          | NULL | NULL    | NULL       |   10 | Using where |
|    1 | SIMPLE      | t10   | ref  | a             | a    | 5       | test.ten.a |    1 |             |
+------+-------------+-------+------+---------------+------+---------+------------+------+-------------+
2 rows in set (0.001 sec)

These are default settings.
In this case

  • we read 10 rows from table ten
  • and for each we make an index lookup into t10 where we expect to find 1 row.

Bad query plan

MariaDB [test]>  set join_cache_level=4;
Query OK, 0 rows affected (0.000 sec)
 
MariaDB [test]> explain select * from ten, t10 where t10.a=ten.a;
+------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
| id   | select_type | table | type     | possible_keys | key     | key_len | ref        | rows   | Extra                               |
+------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
|    1 | SIMPLE      | ten   | ALL      | NULL          | NULL    | NULL    | NULL       |     10 | Using where                         |
|    1 | SIMPLE      | t10   | hash_ALL | a             | #hash#a | 5       | test.ten.a | 997980 | Using join buffer (flat, BNLH join) |
+------+-------------+-------+----------+---------------+---------+---------+------------+--------+-------------------------------------+
2 rows in set (0.001 sec)

For this case

  • it wants to read 10 rows from table ten
  • put them into a buffer
  • create a hash index on the buffer
  • but then do a full table scan on t10 and read 1M rows

This is obviously very inefficient.


Generated at Thu Feb 08 08:28:09 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.