[MDEV-16064] Wrong result set from query with in subquery that uses window function Created: 2018-04-30  Updated: 2023-11-28

Status: Confirmed
Project: MariaDB Server
Component/s: Optimizer - Window functions
Affects Version/s: 10.2.14, 10.2, 10.3, 10.4, 10.5, 10.6, 10.7, 10.8, 10.9, 10.10
Fix Version/s: 10.4, 10.5, 10.6

Type: Bug Priority: Major
Reporter: Igor Babaev Assignee: Sergei Petrunia
Resolution: Unresolved Votes: 0
Labels: 11.0-sel

Issue Links:
Duplicate
is duplicated by MDEV-12356 Wrong query result when window functi... Closed
Relates
relates to MDEV-12356 Wrong query result when window functi... Closed

 Description   

The following test case demonstrates the problem:

create table t1 (a int, b int, c int, d int);
create table t2 (e int, f int, g int);
insert into t1 values
(1,1,18,1), (2,1,25,1), (1,3,40,1), (2,1,15,4),
(4,2,24,4), (3,2,23,1), (1,2,40,2), (3,4,17,2),
(5,5,65,1), (2,3,70,3),  (1,4,35,3), (2,3,25,3);
insert into t2 values
(1,2,38), (2,3,15), (1,3,40), (1,4,35),
(2,2,70), (3,4,23), (5,5,12), (5,4,17),
(3,3,17), (4,2,24), (2,5,25), (5,1,65);
 
select * from t1 where (a>1) and
  (a) in
  (
    select sum(e) over (partition by f)
    from t2
    where e<5
  )
;

The output for the above query is:

MariaDB [test]> select * from t1 where (a>1) and
    ->   (a) in
    ->   (
    ->     select sum(e) over (partition by f)
    ->     from t2
    ->     where e<5
    ->   )
    -> ;
Empty set (0.00 sec)

This not correct as we have:

ariaDB [test]>     select sum(e) over (partition by f)
    ->     from t2
    ->     where e<5
    -> ;
+------------------------------+
| sum(e) over (partition by f) |
+------------------------------+
| 7                            |
| 6                            |
| 6                            |
| 4                            |
| 7                            |
| 4                            |
| 6                            |
| 7                            |
| 2                            |
+------------------------------+
 
MariaDB [test]> select a from t1 where (a>1);
+------+
| a    |
+------+
|    2 |
|    2 |
|    4 |
|    3 |
|    3 |
|    5 |
|    2 |
|    2 |
+------+



 Comments   
Comment by Igor Babaev [ 2018-05-03 ]

mysql-8.0.11 cannot handle this query:

mysql> select * from t1 where (a>1) and
    -> (a) in                                                                                                                 
    -> (
    -> select sum(e) over (partition by f)
    -> from t2
    -> where e<5
    -> )
    -> ;
ERROR 1235 (42000): This version of MySQL doesn't yet support 'the combination of this ALL/ANY/SOME/IN subquery with this comparison operator and with contained window functions'

Comment by Igor Babaev [ 2018-05-03 ]

EXPLAIN EXTENDED for this query shows that in-to-exists transformation is applied to the
IN subquery predicate:

MariaDB [test]> explain extended
    -> select * from t1 where (a>1) and
    ->   (a) in
    ->   (
    ->     select sum(e) over (partition by f)
    ->     from t2
    ->     where e<5
    ->   )
    -> ;
+------+--------------------+-------+------+---------------+------+---------+------+------+----------+------------------------------+
| id   | select_type        | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                        |
+------+--------------------+-------+------+---------------+------+---------+------+------+----------+------------------------------+
|    1 | PRIMARY            | t1    | ALL  | NULL          | NULL | NULL    | NULL |   12 |   100.00 | Using where                  |
|    2 | DEPENDENT SUBQUERY | t2    | ALL  | NULL          | NULL | NULL    | NULL |   12 |   100.00 | Using where; Using temporary |
+------+--------------------+-------+------+---------------+------+---------+------+------+----------+------------------------------+
2 rows in set, 1 warning (0.00 sec)
 
MariaDB [test]> show warnings;
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                                                                                                                                                                                                |
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | select `test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b`,`test`.`t1`.`c` AS `c`,`test`.`t1`.`d` AS `d` from `test`.`t1` where `test`.`t1`.`a` > 1 and <expr_cache><`test`.`t1`.`a`>(<in_optimizer>(`test`.`t1`.`a`,<exists>(select sum(`test`.`t2`.`e`) over ( partition by `test`.`t2`.`f`) from `test`.`t2` where `test`.`t2`.`e` < 5))) |
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

The transformation is applied because the left part of the IN predicate is of the INTEGER type while the right part is of the DECIMAL part.
However using in-to-exists transformation for IN subqueries with window functions is not valid.

Comment by Igor Babaev [ 2018-05-03 ]

Before using in-to-exists transformation here we have to wrap the subquery into a derived table.
Such wrapping will work for non-correlated IN subqueries.
In general case with a correlated IN subquery it will lead to a correlated derived table that still can be materialized for different values of external references in the same way as derived tables built in the split optimization that is used for non-mergeable derived tables.

Generated at Thu Feb 08 08:26:06 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.