[MDEV-30022] Lateral Derived optimization is not applied when it should Created: 2022-11-16  Updated: 2022-11-22

Status: Open
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.11
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: optimizer, optimizer-easy


 Description   

Consider this testcase:

create table user (
  user_id int primary key,
  user_name varchar(32)
);
 
insert into user select 
seq, concat('user-', seq) from seq_1_to_10000;
 
create table shipping_address(
  user_id int primary key,
  address text
);
 
insert into shipping_address select 
seq, concat('Address ', seq) from seq_1_to_10000;
 
create view user_info as
select U.user_id, user_name, address
from 
  user U
  left join shipping_address ADDR on U.user_id=ADDR.user_id;
 
create table orders(
  order_id int primary key,
  user_id int,
  amount double,
  key(user_id)
);
 
insert into orders select 
  seq, seq/100, 123 from seq_1_to_20000;
 
 
create view user_info2 as 
select U.user_id, user_name, ORD_TOTAL
from
  user U
  left join (select
               user_id, sum(amount) as ORD_TOTAL
             from orders
             group by user_id
            ) ORD_TOTALS
     on ORD_TOTALS.user_id=U.user_id;
 

Then:

MariaDB [test]> explain select user_name, ORD_TOTAL from user_info2 where user_id in (1,2);
+------+-----------------+------------+-------+---------------+---------+---------+----------------+------+-------------+
| id   | select_type     | table      | type  | possible_keys | key     | key_len | ref            | rows | Extra       |
+------+-----------------+------------+-------+---------------+---------+---------+----------------+------+-------------+
|    1 | PRIMARY         | u          | range | PRIMARY       | PRIMARY | 4       | NULL           | 2    | Using where |
|    1 | PRIMARY         | <derived3> | ref   | key0          | key0    | 5       | test.u.user_id | 4    |             |
|    3 | LATERAL DERIVED | orders     | ref   | user_id       | user_id | 5       | test.u.user_id | 49   |             |
+------+-----------------+------------+-------+---------------+---------+---------+----------------+------+-------------+

Good.
But what if we just want the data for user_id=1?
A much more expensive plan is used:

MariaDB [test]> explain select user_name, ORD_TOTAL from user_info2 where user_id=1;
+------+-------------+------------+-------+---------------+---------+---------+-------+-------+-------------+
| id   | select_type | table      | type  | possible_keys | key     | key_len | ref   | rows  | Extra       |
+------+-------------+------------+-------+---------------+---------+---------+-------+-------+-------------+
|    1 | PRIMARY     | u          | const | PRIMARY       | PRIMARY | 4       | const | 1     |             |
|    1 | PRIMARY     | <derived3> | ALL   | NULL          | NULL    | NULL    | NULL  | 19982 | Using where |
|    3 | DERIVED     | orders     | index | NULL          | user_id | 5       | NULL  | 19982 |             |
+------+-------------+------------+-------+---------------+---------+---------+-------+-------+-------------+
3 rows in set (0.001 sec)


Generated at Thu Feb 08 10:13:03 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.