[MDEV-7599] Wrong result with ALL subquery returning NULL Created: 2015-02-17  Updated: 2017-06-07  Resolved: 2017-05-16

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 5.5, 10.0, 10.1
Fix Version/s: 5.5.57

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Igor Babaev
Resolution: Fixed Votes: 0
Labels: upstream, verified


 Description   

CREATE TABLE t1 (a INT, KEY(a)) ENGINE=MyISAM;
INSERT INTO t1 VALUES (1),(2);
 
CREATE TABLE t2 (b INT, c INT) ENGINE=MyISAM;
INSERT INTO t2 VALUES (1,6),(2,4);
 
SELECT * FROM t1 WHERE a != ALL ( SELECT MIN(a) FROM t1, t2 t2x, t2 t2y WHERE t2y.c = t2x.b ) ;
EXPLAIN EXTENDED
SELECT * FROM t1 WHERE a != ALL ( SELECT MIN(a) FROM t1, t2 t2x, t2 t2y WHERE t2y.c = t2x.b ) ;
 
SET SESSION optimizer_switch = "materialization=off";
SELECT * FROM t1 WHERE a != ALL ( SELECT MIN(a) FROM t1, t2 t2x, t2 t2y WHERE t2y.c = t2x.b ) ;
EXPLAIN EXTENDED
SELECT * FROM t1 WHERE a != ALL ( SELECT MIN(a) FROM t1, t2 t2x, t2 t2y WHERE t2y.c = t2x.b ) ;

MariaDB [test]> SELECT * FROM t1 WHERE a != ALL ( SELECT MIN(a) FROM t1, t2 t2x, t2 t2y WHERE t2y.c = t2x.b ) ;
Empty set (0.00 sec)
 
MariaDB [test]> EXPLAIN EXTENDED
    -> SELECT * FROM t1 WHERE a != ALL ( SELECT MIN(a) FROM t1, t2 t2x, t2 t2y WHERE t2y.c = t2x.b ) ;
+------+--------------+-------+-------+---------------+------+---------+------+------+----------+--------------------------------------------------------+
| id   | select_type  | table | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                                  |
+------+--------------+-------+-------+---------------+------+---------+------+------+----------+--------------------------------------------------------+
|    1 | PRIMARY      | t1    | index | NULL          | a    | 5       | NULL |    2 |   100.00 | Using where; Using index                               |
|    2 | MATERIALIZED | t1    | index | NULL          | a    | 5       | NULL |    2 |   100.00 | Using index                                            |
|    2 | MATERIALIZED | t2x   | ALL   | NULL          | NULL | NULL    | NULL |    2 |   100.00 | Using join buffer (flat, BNL join)                     |
|    2 | MATERIALIZED | t2y   | ALL   | NULL          | NULL | NULL    | NULL |    2 |   100.00 | Using where; Using join buffer (incremental, BNL join) |
+------+--------------+-------+-------+---------------+------+---------+------+------+----------+--------------------------------------------------------+
4 rows in set, 1 warning (0.00 sec)
 
MariaDB [test]> SHOW WARNINGS;
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                                                                                                                                                                                                                                                                                                   |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | select `test`.`t1`.`a` AS `a` from `test`.`t1` where (not(<expr_cache><`test`.`t1`.`a`>(<in_optimizer>(`test`.`t1`.`a`,`test`.`t1`.`a` in ( <materialize> (select min(`test`.`t1`.`a`) from `test`.`t1` join `test`.`t2` `t2x` join `test`.`t2` `t2y` where (`test`.`t2y`.`c` = `test`.`t2x`.`b`) ), <primary_index_lookup>(`test`.`t1`.`a` in <temporary table> on distinct_key where ((`test`.`t1`.`a` = `<subquery2>`.`MIN(a)`)))))))) |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

MariaDB [test]> SET SESSION optimizer_switch = "materialization=off";
Query OK, 0 rows affected (0.00 sec)
 
MariaDB [test]> SELECT * FROM t1 WHERE a != ALL ( SELECT MIN(a) FROM t1, t2 t2x, t2 t2y WHERE t2y.c = t2x.b ) ;
+------+
| a    |
+------+
|    2 |
+------+
1 row in set (0.00 sec)
 
MariaDB [test]> EXPLAIN EXTENDED
    -> SELECT * FROM t1 WHERE a != ALL ( SELECT MIN(a) FROM t1, t2 t2x, t2 t2y WHERE t2y.c = t2x.b ) ;
+------+--------------------+-------+-------+---------------+------+---------+------+------+----------+--------------------------------------------------------+
| id   | select_type        | table | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                                  |
+------+--------------------+-------+-------+---------------+------+---------+------+------+----------+--------------------------------------------------------+
|    1 | PRIMARY            | t1    | index | NULL          | a    | 5       | NULL |    2 |   100.00 | Using where; Using index                               |
|    2 | DEPENDENT SUBQUERY | t1    | index | NULL          | a    | 5       | NULL |    2 |   100.00 | Using index                                            |
|    2 | DEPENDENT SUBQUERY | t2x   | ALL   | NULL          | NULL | NULL    | NULL |    2 |   100.00 | Using join buffer (flat, BNL join)                     |
|    2 | DEPENDENT SUBQUERY | t2y   | ALL   | NULL          | NULL | NULL    | NULL |    2 |   100.00 | Using where; Using join buffer (incremental, BNL join) |
+------+--------------------+-------+-------+---------------+------+---------+------+------+----------+--------------------------------------------------------+
4 rows in set, 1 warning (0.00 sec)
 
MariaDB [test]> SHOW WARNINGS;
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                                                                                                                                                                                                                                    |
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | select `test`.`t1`.`a` AS `a` from `test`.`t1` where (not(<expr_cache><`test`.`t1`.`a`>(<in_optimizer>(`test`.`t1`.`a`,<exists>(select min(`test`.`t1`.`a`) from `test`.`t1` join `test`.`t2` `t2x` join `test`.`t2` `t2y` where (`test`.`t2y`.`c` = `test`.`t2x`.`b`) having trigcond((<cache>(`test`.`t1`.`a`) = <cache>(<ref_null_helper>(min(`test`.`t1`.`a`)))))))))) |
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

5.5 commit fdd6c111c254c5044cd9b6c2f7e4d0c74f427a79.
Also reproducible on previous 5.5 releases, 10.0, MySQL 5.6, 5.7 (only MySQL returns 2 rows where MariaDB returns 1).



 Comments   
Comment by Elena Stepanova [ 2015-02-19 ]

After fixing, also check test case from MDEV-7602. It is not simplified and hence ugly, but it should be obvious whether it's fixed or not. If not, please re-open MDEV-7602 and assign it to me.

Comment by Elena Stepanova [ 2016-01-25 ]

Another case, which does not require switching off materialization:

CREATE TABLE t1 (f1 varchar(10));
INSERT INTO t1 VALUES ('foo'),('bar');
 
CREATE TABLE t2 (f2 varchar(10), key(f2));
INSERT INTO t2 VALUES ('baz'),('qux');
 
CREATE TABLE t3 (f3 varchar(10));
INSERT INTO t3 VALUES ('abc'),('def');
 
# Join returns no rows (obviously)
SELECT * FROM t2 AS t2a INNER JOIN t2 t2b INNER JOIN t3 ON (f3 = t2b.f2) ;
 
# Max returns NULL (correctly)
SELECT MAX(t2a.f2) FROM t2 AS t2a INNER JOIN t2 t2b INNER JOIN t3 ON (f3 = t2b.f2);
 
# But query with ALL returns something
SELECT * FROM t1 WHERE f1 = ALL( 
  SELECT MAX(t2a.f2)
  FROM t2 AS t2a INNER JOIN t2 t2b INNER JOIN t3 ON (f3 = t2b.f2) 
);
 
EXPLAIN EXTENDED
SELECT * FROM t1 WHERE f1 = ALL( 
  SELECT MAX(t2a.f2)
  FROM t2 AS t2a INNER JOIN t2 t2b INNER JOIN t3 ON (f3 = t2b.f2) 
);

Actual result

MariaDB [test]> SELECT * FROM t1 WHERE f1 = ALL( 
    ->   SELECT MAX(t2a.f2)
    ->   FROM t2 AS t2a INNER JOIN t2 t2b INNER JOIN t3 ON (f3 = t2b.f2) 
    -> );
+------+
| f1   |
+------+
| bar  |
+------+
1 row in set (0.00 sec)
 
MariaDB [test]> 
MariaDB [test]> EXPLAIN EXTENDED
    -> SELECT * FROM t1 WHERE f1 = ALL( 
    ->   SELECT MAX(t2a.f2)
    ->   FROM t2 AS t2a INNER JOIN t2 t2b INNER JOIN t3 ON (f3 = t2b.f2) 
    -> );
+------+--------------------+-------+-------+---------------+------+---------+------+------+----------+--------------------------------------------------------+
| id   | select_type        | table | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                                  |
+------+--------------------+-------+-------+---------------+------+---------+------+------+----------+--------------------------------------------------------+
|    1 | PRIMARY            | t1    | ALL   | NULL          | NULL | NULL    | NULL |    2 |   100.00 | Using where                                            |
|    2 | DEPENDENT SUBQUERY | t2a   | index | NULL          | f2   | 13      | NULL |    2 |   100.00 | Using index                                            |
|    2 | DEPENDENT SUBQUERY | t2b   | index | f2            | f2   | 13      | NULL |    2 |   100.00 | Using index; Using join buffer (flat, BNL join)        |
|    2 | DEPENDENT SUBQUERY | t3    | ALL   | NULL          | NULL | NULL    | NULL |    2 |   100.00 | Using where; Using join buffer (incremental, BNL join) |
+------+--------------------+-------+-------+---------------+------+---------+------+------+----------+--------------------------------------------------------+

According to documentation, the last condition with MAX should return NULL, hence no rows should be selected.

Comment by Igor Babaev [ 2017-05-12 ]

With materialization=off the following query returns a wrong result set:

ariaDB [test]> SELECT * FROM t2 WHERE b != ALL ( SELECT MIN(a) FROM t1, t2 WHERE t2.c = t2.b ) ;
+------+------+
| b    | c    |
+------+------+
|    2 |    4 |
+------+------+

Using join_cache_level=0 doesn't help

 
MariaDB [test]> set join_cache_level=0;      
Query OK, 0 rows affected (0.00 sec)
 
MariaDB [test]> SELECT * FROM t2 WHERE b != ALL ( SELECT MIN(a) FROM t1, t2 WHERE t2.c = t2.b ) ;                 
+------+------+
| b    | c    |
+------+------+
|    2 |    4 |
+------+------+
1 row in set (0.00 sec)
 
MariaDB [test]> EXPLAIN EXTENDED SELECT * FROM t2 WHERE b != ALL ( SELECT MIN(a) FROM t1, t2 WHERE t2.c = t2.b ) ;
+------+--------------------+-------+-------+---------------+------+---------+------+------+----------+-------------+
| id   | select_type        | table | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+------+--------------------+-------+-------+---------------+------+---------+------+------+----------+-------------+
|    1 | PRIMARY            | t2    | ALL   | NULL          | NULL | NULL    | NULL |    2 |   100.00 | Using where |
|    2 | DEPENDENT SUBQUERY | t2    | ALL   | NULL          | NULL | NULL    | NULL |    2 |   100.00 | Using where |
|    2 | DEPENDENT SUBQUERY | t1    | index | NULL          | a    | 5       | NULL |    4 |   100.00 | Using index |
+------+--------------------+-------+-------+---------------+------+---------+------+------+----------+-------------+
3 rows in set, 1 warning (0.00 sec)
 
MariaDB [test]> SHOW WARNINGS;
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                                                                                                                                                                                                                            |
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | select `test`.`t2`.`b` AS `b`,`test`.`t2`.`c` AS `c` from `test`.`t2` where (not(<expr_cache><`test`.`t2`.`b`>(<in_optimizer>(`test`.`t2`.`b`,<exists>(select min(`test`.`t1`.`a`) from `test`.`t1` join `test`.`t2` where (`test`.`t2`.`c` = `test`.`t2`.`b`) having trigcond((<cache>(`test`.`t2`.`b`) = <cache>(<ref_null_helper>(min(`test`.`t1`.`a`)))))))))) |
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

At the same time if we prohibit using the index t1(a) the returned result set becomes correct:

MariaDB [test]> SELECT * FROM t2 WHERE b != ALL ( SELECT MIN(a) FROM t1 USE INDEX(), t2 WHERE t2.c = t2.b ) ;
Empty set (0.00 sec)
 
MariaDB [test]> EXPLAIN EXTENDED SELECT * FROM t2 WHERE b != ALL ( SELECT MIN(a) FROM t1 USE INDEX(), t2 WHERE t2.c = t2.b ) ;
+------+--------------------+-------+------+---------------+------+---------+------+------+----------+-------------+
| id   | select_type        | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+------+--------------------+-------+------+---------------+------+---------+------+------+----------+-------------+
|    1 | PRIMARY            | t2    | ALL  | NULL          | NULL | NULL    | NULL |    2 |   100.00 | Using where |
|    2 | DEPENDENT SUBQUERY | t2    | ALL  | NULL          | NULL | NULL    | NULL |    2 |   100.00 | Using where |
|    2 | DEPENDENT SUBQUERY | t1    | ALL  | NULL          | NULL | NULL    | NULL |    4 |   100.00 |             |
+------+--------------------+-------+------+---------------+------+---------+------+------+----------+-------------+
3 rows in set, 1 warning (0.00 sec)
 
MariaDB [test]> SHOW WARNINGS;
+-------+------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                                                                                                                                                                                                                                |
+-------+------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | select `test`.`t2`.`b` AS `b`,`test`.`t2`.`c` AS `c` from `test`.`t2` where (not(<expr_cache><`test`.`t2`.`b`>(<in_optimizer>(`test`.`t2`.`b`,<exists>(select min(`test`.`t1`.`a`) from `test`.`t1` USE INDEX () join `test`.`t2` where (`test`.`t2`.`c` = `test`.`t2`.`b`) having trigcond((<cache>(`test`.`t2`.`b`) = <ref_null_helper>(min(`test`.`t1`.`a`))))))))) |
+-------+------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

Comment by Igor Babaev [ 2017-05-12 ]

We see the difference here:

having trigcond((<cache>(`test`.`t2`.`b`) = <cache>(<ref_null_helper>(min(`test`.`t1`.`a`)))))) // wrong result
having trigcond((<cache>(`test`.`t2`.`b`) = <ref_null_helper>(min(`test`.`t1`.`a`))))))    // correct result

Let's find out why caching <ref_null_helper>(min(`test`.`t1`.`a`))) matters.

Here's the explanation.
When <ref_null_helper>(min(`test`.`t1`.`a`)) is <ref_null_helper>(min(`test`.`t1`.`a`)) evaluated it sets the was_null flag of Item_in_subselect to true. When the value of <ref_null_helper>(min(`test`.`t1`.`a`)) is taken from cache nobody set was_null to true. So Item_in_subselec thinks that its subquery does not return any rows.
If we add an additional row with two different components to t2 this additional row will appear in the result set:

MariaDB [test]> INSERT INTO t2 VALUES (8,9);
Query OK, 1 row affected (0.00 sec)
 
MariaDB [test]> SELECT * FROM t2 WHERE b != ALL ( SELECT MIN(a) FROM t1, t2 WHERE t2.c = t2.b ) ;
+------+------+
| b    | c    |
+------+------+
|    2 |    4 |
|    8 |    9 |
+------+------+

Comment by Igor Babaev [ 2017-05-13 ]

Now let's find out why for the query that does not use indexes the value of <ref_null_helper>(min(`test`.`t1`.`a`)) is not cached.

Comment by Igor Babaev [ 2017-05-13 ]

Ok, here's what happens.
With indexes activated the function opt_sum_query() applies the MIN/MAX
optimization to MIN(a) and substitutes it for a constant item. Later
the Item_in_subselect with the subquery

(SELECT MIN(a) FROM t1, t2 WHERE t2.c = t2.b)

is converted into an EXISTS subquery. This conversion injects a new condition into the subquery. The subquery becomes correlated and this invalidates the applied
MIN/MAX optimization. So any references to MIN(a) cannot be considered as references
to a constant and should not be cached.

This bug appeared in 5.3 when the code for the cost base choice between
materialization and in-to-exists transformation of non-correlated
IN subqueries was introduced. Before this code in-to-exists
transformations were always performed before the call of opt_sum_query().

Comment by Igor Babaev [ 2017-05-16 ]

The fix for this bug was pushed into the 5.5 tree.
It should be applied upstream as it is.

Generated at Thu Feb 08 07:20:50 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.