[MDEV-17493] Partition pruning doesn't work for nested outer joins Created: 2018-10-18  Updated: 2018-11-06  Resolved: 2018-11-06

Status: Closed
Project: MariaDB Server
Component/s: Optimizer, Partitioning
Affects Version/s: 5.5, 10.0, 10.1, 10.2
Fix Version/s: 10.3.11

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: upstream

Issue Links:
Relates
relates to MDEV-10946 Partition pruning not working with LE... Stalled
relates to MDEV-17518 Range optimization doesn't use ON exp... Closed

 Description   

Consider this example

create table ten(a int);
insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
create table t1 (a int, b int, c int);
insert into t1 select a,a,a from ten;
create table t2 (a int, b int, c int);
insert into t2 select a,a,a from ten;
 
create table tp (
  part_id int,
  a int
) partition by list (part_id) (
  partition p0 values in (0),
  partition p1 values in (1),
  partition p2 values in (2),
  partition p3 values in (3),
  partition p4 values in (4)
);
insert into tp select mod(a,5), a from ten;

Let's put the partitioned table on the inner side of the outer join, and let the ON expression have a condition that's usable by partition pruning:

 explain partitions  select * from t1 left join tp on (t1.a=tp.a and tp.part_id=3);
+------+-------------+-------+------------+------+---------------+------+---------+------+------+-------------------------------------------------+
| id   | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | Extra                                           |
+------+-------------+-------+------------+------+---------------+------+---------+------+------+-------------------------------------------------+
|    1 | SIMPLE      | t1    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |   10 |                                                 |
|    1 | SIMPLE      | tp    | p3         | ALL  | NULL          | NULL | NULL    | NULL |    2 | Using where; Using join buffer (flat, BNL join) |
+------+-------------+-------+------------+------+---------------+------+---------+------+------+-------------------------------------------------+

tp.part_id=3 gets us tp.partitions=p3. Good so far.

Now, let's change the outer join to have two tables:

mysql> explain partitions
    -> select * from t1 left join (tp join t2) on (t1.a=tp.a and tp.a=t2.b and tp.part_id=3);
+------+-------------+-------+----------------+------+---------------+------+---------+------+------+--------------------------------------------------------+
| id   | select_type | table | partitions     | type | possible_keys | key  | key_len | ref  | rows | Extra                                                  |
+------+-------------+-------+----------------+------+---------------+------+---------+------+------+--------------------------------------------------------+
|    1 | SIMPLE      | t1    | NULL           | ALL  | NULL          | NULL | NULL    | NULL |   10 |                                                        |
|    1 | SIMPLE      | tp    | p0,p1,p2,p3,p4 | ALL  | NULL          | NULL | NULL    | NULL |   10 | Using where; Using join buffer (flat, BNL join)        |
|    1 | SIMPLE      | t2    | NULL           | ALL  | NULL          | NULL | NULL    | NULL |   10 | Using where; Using join buffer (incremental, BNL join) |
+------+-------------+-------+----------------+------+---------------+------+---------+------+------+--------------------------------------------------------+

now, tp.partitions includes all partitions even if the ON expression has a condition that restricts it to one.



 Comments   
Comment by Sergei Petrunia [ 2018-10-19 ]

Just to be 100% sure, let's see that the regular range optimizer is able to use the ON expression in this case:

mysql> alter table tp add key(part_id);
Query OK, 0 rows affected (0.28 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> explain partitions select * from t1 left join (tp force index(part_id) join t2) on (t1.a=tp.a and tp.a=t2.b and tp.part_id=3);
+------+-------------+-------+----------------+------+---------------+---------+---------+-------+------+-------------+
| id   | select_type | table | partitions     | type | possible_keys | key     | key_len | ref   | rows | Extra       |
+------+-------------+-------+----------------+------+---------------+---------+---------+-------+------+-------------+
|    1 | SIMPLE      | t1    | NULL           | ALL  | NULL          | NULL    | NULL    | NULL  |   10 |             |
|    1 | SIMPLE      | tp    | p0,p1,p2,p3,p4 | ref  | part_id       | part_id | 5       | const |    1 | Using where |
|    1 | SIMPLE      | t2    | NULL           | ALL  | NULL          | NULL    | NULL    | NULL  |   10 | Using where |
+------+-------------+-------+----------------+------+---------------+---------+---------+-------+------+-------------+
3 rows in set (0.01 sec)

Indeed, table tp now uses ref access on part_id.

Comment by Sergei Petrunia [ 2018-10-19 ]

... Well, actually it's not the range optimizer. The execution does not enter SQL_SELECT::test_quick_select.
If I change the query to be unusable by ref access, table tp will use full table scan:

explain partitions 
select * 
from 
  t1 left join (tp force index (part_id) join t2) 
      on (t1.a=tp.a and tp.a=t2.b and tp.part_id in (2,3));
+------+-------------+-------+----------------+------+---------------+------+---------+------+------+--------------------------------------------------------+
| id   | select_type | table | partitions     | type | possible_keys | key  | key_len | ref  | rows | Extra                                                  |
+------+-------------+-------+----------------+------+---------------+------+---------+------+------+--------------------------------------------------------+
|    1 | SIMPLE      | t1    | NULL           | ALL  | NULL          | NULL | NULL    | NULL |   10 |                                                        |
|    1 | SIMPLE      | tp    | p0,p1,p2,p3,p4 | ALL  | part_id       | NULL | NULL    | NULL |   10 | Using where; Using join buffer (flat, BNL join)        |
|    1 | SIMPLE      | t2    | NULL           | ALL  | NULL          | NULL | NULL    | NULL |   10 | Using where; Using join buffer (incremental, BNL join) |
+------+-------------+-------+----------------+------+---------------+------+---------+------+------+--------------------------------------------------------+

tp.possible_keys has part_id, though.

Generated at Thu Feb 08 08:36:52 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.