[MDEV-11078] NULL NOT IN (non-empty subquery) should never return results Created: 2016-10-18  Updated: 2017-03-05  Resolved: 2017-03-05

Status: Closed
Project: MariaDB Server
Component/s: Data Manipulation - Subquery
Affects Version/s: 10.1.18, 5.5, 10.0, 10.1, 10.2
Fix Version/s: 5.5.55

Type: Bug Priority: Major
Reporter: Roel Harbers Assignee: Varun Gupta (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Environment:

OS X El Capitan, MariaDB installed through Homebrew


Sprint: 10.0.30

 Description   

I'm confused by the behaviour of `NULL` in a `NOT IN` expression. (MariaDB 10.1.18):

    MariaDB [(none)]> select 1 from dual where null not in (1);
    Empty set (0.00 sec)
 
    MariaDB [(none)]> select 1 from dual where null not in (select 1);
    Empty set (0.00 sec)
 
    MariaDB [(none)]> select 1 from dual where null not in (select 1 from dual);
    Empty set (0.00 sec)

This is as I expected, since `NULL` is not equal or unequal to anything.

    MariaDB [(none)]> select 1 from dual where null not in (select 1 from mysql.user);
    +---+
    | 1 |
    +---+
    | 1 |
    +---+
    1 row in set (0.00 sec)

Huh? How can that be? Maybe it's because I can't read the mysql.user table?

    MariaDB [(none)]> select distinct 1 from mysql.user;
    +---+
    | 1 |
    +---+
    | 1 |
    +---+
    1 row in set (0.00 sec)

No, it returns a value. When selecting the one from a actual table, not `dual`, or an inline list expression, the behaviour is different.

Is this a bug in MariaDB? An old MySQL server (5.5.40) I tried this with gave the expected empty set for all queries.



 Comments   
Comment by Elena Stepanova [ 2016-10-18 ]

Thanks for the report and test case.

Comment by Varun Gupta (Inactive) [ 2017-02-19 ]

create table t1(a int);
insert into t1 value (1);
 
EXPLAIN
select 1 from t1 where  null not in (select 1 from t1);
id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	PRIMARY	t1	system	NULL	NULL	NULL	NULL	1	
2	SUBQUERY	NULL	NULL	NULL	NULL	NULL	NULL	NULL	Impossible WHERE noticed after reading const tables

Comment by Sergei Petrunia [ 2017-02-22 ]

Results of investigation discussion:
We get to the wrong query result, because the optimizer gets Impossible where after reading const tables for the subquery.

This should not happen (as the subquery produces rows). However it does happen: IN->EXISTS rewrite changes the subquery to be

exists (select 1 from t1 where trig_cond(null=1))

The trig_cond is "enabled" at optimization phase, the optimizer sees it as constant, evaluates it, finds it to be NULL, and concludes that the subquery has "Impossible WHERE".

Comment by Sergei Petrunia [ 2017-02-22 ]

That is,

Subquery's left expression is NULL. This means that trig_cond should be disabled when we evaluate the subquery. However, subquery optimization code assumes it to be enabled, and produces a wrong query result.

Comment by Sergei Petrunia [ 2017-02-22 ]

More details:

I stop here in the subquery:

  Breakpoint 5, make_join_select (join=0x7fffcdcf15c0,...)
(gdb) p join->select_lex->select_number
  $45 = 2

(gdb) p dbug_print_item(cond)
  $48 = 0x15674c0 <dbug_item_print_buf> "trigcond(<cache>((NULL = 1)))"
(gdb) p cond->val_int()
  $49 = 0
(gdb) p cond->const_item()
  $50 = false
(gdb) p cond->used_tables()
  $51 = 0

Then, the trigcond is moved into join->exec_const_cond, and we reach this code:

        if (join->exec_const_cond && !join->exec_const_cond->is_expensive() &&
            !join->exec_const_cond->val_int())
        {
          DBUG_PRINT("info",("Found impossible WHERE condition"));

where is_expensive()=false, and cond->val_int()==0.

Comment by Sergei Petrunia [ 2017-02-22 ]

So, for trigcond(const_expr), we have

1. const_item() = FALSE
2. used_tables() = 0

#1 seems to be correct (as the value may change during the statement execution, depending on whether trigger is ON or OFF)

#2 seems to be incorrect. Should it be OUTER_REF_TABLE_BIT instead?

Comment by Varun Gupta (Inactive) [ 2017-02-22 ]

According to documentation:
trigcond(X) := TRUE when the "linked" outer expression oe_i is NULL

For our query the above condition holds "SELECT 1 from dual where NULL in (select t1 from t1)"
but the condition join->exec_const_cond->val_int() return 0 which is wrong.

Generated at Thu Feb 08 07:47:08 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.