[MDEV-18354] Semi-join optimization is not used inside outer join nest Created: 2019-01-23  Updated: 2019-01-23

Status: Open
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 5.5
Fix Version/s: 5.5

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Unresolved Votes: 0
Labels: outer-join, semi-join, subquery


 Description   

Test dataset:

create table ten(a int);
insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
create table ten2 as select * from ten;
 
create table t1(a int, b int, key(a));
insert into t1 
select 
  A.a + B.a* 10 + C.a * 100,
  A.a + B.a* 10 + C.a * 100
from 
  ten A, ten B, ten C;
 
create table t2 like t1;
insert into t2 select * from t1;

The query is constructed as follows:

  • a non-reducible nested outer join (multiple tables on the inner side)
  • the nested outer join has a semi-join inside it
  • the data distribution is such that an inner-to-outer join order inside the
    join nest would be beneficial: putting the subquery first would allow to use
    a good index on t2.a, and read 10*1=10 rows from there as opposed to 1K rows
    it has in total

explain
select * 
from 
  ten left join
  (t1, t2) on (t2.a in (select ten2.a from ten2) and 
               t2.b=t1.b and 
               t1.b=ten.a);

Explain:

+------+--------------+-------+------+---------------+------+---------+------+------+--------------------------------------------------------+
| id   | select_type  | table | type | possible_keys | key  | key_len | ref  | rows | Extra                                                  |
+------+--------------+-------+------+---------------+------+---------+------+------+--------------------------------------------------------+
|    1 | PRIMARY      | ten   | ALL  | NULL          | NULL | NULL    | NULL |   10 |                                                        |
|    1 | PRIMARY      | t1    | ALL  | NULL          | NULL | NULL    | NULL |  760 | Using where; Using join buffer (flat, BNL join)        |
|    1 | PRIMARY      | t2    | ALL  | NULL          | NULL | NULL    | NULL | 1088 | Using where; Using join buffer (incremental, BNL join) |
|    2 | MATERIALIZED | ten2  | ALL  | NULL          | NULL | NULL    | NULL |   10 |                                                        |
+------+--------------+-------+------+---------------+------+---------+------+------+--------------------------------------------------------+

We can see that ten2.id=2, that is, the subquery was not converted into a semi-join.

EXPLAIN in mysql 8:

explain select *  from    ten left join   (t1, t2) on (t2.a in (select ten2.a from ten2) and t1.b=t2.b and t1.b=ten.a);
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref       | rows | filtered | Extra                        |
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+------------------------------+
|  1 | SIMPLE      | ten   | NULL       | ALL  | NULL          | NULL | NULL    | NULL      |   10 |   100.00 | NULL                         |
|  1 | SIMPLE      | ten2  | NULL       | ALL  | NULL          | NULL | NULL    | NULL      |   10 |   100.00 | Using where; Start temporary |
|  1 | SIMPLE      | t2    | NULL       | ref  | a             | a    | 5       | j2.ten2.a |    1 |   100.00 | Using where; End temporary   |
|  1 | SIMPLE      | t1    | NULL       | ALL  | NULL          | NULL | NULL    | NULL      | 1000 |   100.00 | Using where                  |
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+------------------------------+

here ten2.id=1, and it is put before the table t2, which has t2.type=ref with rows=1 (efficient!)



 Comments   
Comment by Sergei Golubchik [ 2019-01-23 ]

See also MySQL commit 4538867195

Generated at Thu Feb 08 08:43:28 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.