[MCOL-4715] Mixed inner and outer joins with "null filter" for the table which is not involved into the outer join produces wrong results. Created: 2021-05-11  Updated: 2023-11-14  Resolved: 2022-09-06

Status: Closed
Project: MariaDB ColumnStore
Component/s: None
Affects Version/s: None
Fix Version/s: 22.08.1

Type: New Feature Priority: Blocker
Reporter: Gagan Goel (Inactive) Assignee: Denis Khalikov
Resolution: Fixed Votes: 0
Labels: None

Assigned for Review: Roman Roman
Assigned for Testing: Daniel Lee Daniel Lee (Inactive)

 Description   

create table t1 (a int)engine=columnstore;
create table t2 (a int)engine=columnstore;
create table t3 (a int)engine=columnstore;
insert into t1 values (1), (2), (3);
insert into t2 values (2), (3);
insert into t3 values (3);

The following query produces expected result:

MariaDB [test]> select * from t1 left join t2 left join t3 on t2.a=t3.a on t1.a=t3.a;
+------+------+------+
| a    | a    | a    |
+------+------+------+
|    1 | NULL | NULL |
|    2 | NULL | NULL |
|    3 |    3 |    3 |
+------+------+------+

However, if we apply an 'is null' WHERE predicate to the above query, we get incorrect result:

MariaDB [test]> select * from t1 left join t2 left join t3 on t2.a=t3.a on t1.a=t3.a where t2.a is null;
+------+------+------+
| a    | a    | a    |
+------+------+------+
|    1 | NULL | NULL |
|    2 | NULL | NULL |
|    3 | NULL | NULL |
+------+------+------+

The expected result is the below (here, i1, i2, i3 are equivalent InnoDB tables):

MariaDB [test]> select * from i1 left join i2 left join i3 on i2.a=i3.a on i1.a=i3.a where i2.a is null;
+------+------+------+
| a    | a    | a    |
+------+------+------+
|    1 | NULL | NULL |
|    2 | NULL | NULL |
+------+------+------+



 Comments   
Comment by Gagan Goel (Inactive) [ 2021-05-18 ]

Here are 2 additional cases that produce incorrect results:

create table t1 (a int)engine=columnstore;
create table t2 (a int)engine=columnstore;
create table t3 (a int)engine=columnstore;
create table t4 (a int)engine=columnstore;
insert into t1 values (1), (2), (3), (4);
insert into t2 values (2), (3), (4);
insert into t3 values (3), (4);
insert into t4 values (4);

Bad result:

MariaDB [test]> select * from t1 left join t2 on t1.a=t2.a left join t3 left join t4 on t3.a=t4.a on t1.a=t4.a where t3.a is null;
+------+------+------+------+
| a    | a    | a    | a    |
+------+------+------+------+
|    1 | NULL | NULL | NULL |
|    2 |    2 | NULL | NULL |
|    3 |    3 | NULL | NULL |
|    4 |    4 | NULL | NULL |
+------+------+------+------+

Expected result:

MariaDB [inno]> select * from t1 left join t2 on t1.a=t2.a left join t3 left join t4 on t3.a=t4.a on t1.a=t4.a where t3.a is null;
+------+------+------+------+
| a    | a    | a    | a    |
+------+------+------+------+
|    2 |    2 | NULL | NULL |
|    3 |    3 | NULL | NULL |
|    1 | NULL | NULL | NULL |
+------+------+------+------+

Bad result:

MariaDB [test]> select * from t1 left join t2 on t1.a=t2.a left join t3 left join t4 on t3.a=t4.a on t2.a=t4.a where t3.a is null;
+------+------+------+------+
| a    | a    | a    | a    |
+------+------+------+------+
|    1 | NULL | NULL | NULL |
|    2 |    2 | NULL | NULL |
|    3 |    3 | NULL | NULL |
|    4 |    4 | NULL | NULL |
+------+------+------+------+

Expected result:

MariaDB [inno]> select * from t1 left join t2 on t1.a=t2.a left join t3 left join t4 on t3.a=t4.a on t2.a=t4.a where t3.a is null;
+------+------+------+------+
| a    | a    | a    | a    |
+------+------+------+------+
|    2 |    2 | NULL | NULL |
|    3 |    3 | NULL | NULL |
|    1 | NULL | NULL | NULL |
+------+------+------+------+

Comment by alexey vorovich (Inactive) [ 2022-08-05 ]

toddstoffel and tntnatbry
it is true that results are strange and Denis is looking at this . However , the query has a very strange JOIN order

If we correct it to a rational form we are good

// Some comments here
 
 
MariaDB [d1]> select * from t1 left join t2 on t1.a=t2.a left join t3 on t2.a=t3.a where t2.a is null;
+------+------+------+
| a    | a    | a    |
+------+------+------+
|    1 | NULL | NULL |
+------+------+------+
1 row in set (0.023 sec)
 
MariaDB [d1]> select * from t1 left join t2 on t1.a=t2.a left join t3 on t2.a=t3.a 
    -> ;
+------+------+------+
| a    | a    | a    |
+------+------+------+
|    1 | NULL | NULL |
|    2 |    2 | NULL |
|    3 |    3 |    3 |
+------+------+------+
3 rows in set (0.028 sec)

I wonder if this is urgent ?? ??

Comment by Gagan Goel (Inactive) [ 2022-08-05 ]

alexey.vorovich In the original query:

select * from t1 left join t2 left join t3 on t2.a=t3.a on t1.a=t3.a;

Here, t2 left join t3 on t2.a=t3.a join is performed first. The results of this join are then used to perform the left join with t1 using the on clause t1.a=t3.a. So this is not a cartesian join.

Vs in your query:

select * from t1 left join t2 on t1.a=t2.a left join t3 on t2.a=t3.a;

Here, t1 left join t2 on t1.a=t2.a is performed first. The results of this join are then used to perform the left join with t3 on t2.a=t3.a.

These are just two different queries with different join orders, they should both individually give correct results.

Comment by Daniel Lee (Inactive) [ 2022-09-06 ]

Build verified: 22.08 ($5531)

Verified mentioned test case

Generated at Thu Feb 08 02:52:26 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.