Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-11078

NULL NOT IN (non-empty subquery) should never return results

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 10.1.18, 5.5(EOL), 10.0(EOL), 10.1(EOL), 10.2(EOL)
    • 5.5.55
    • None
    • OS X El Capitan, MariaDB installed through Homebrew
    • 10.0.30

    Description

      I'm confused by the behaviour of `NULL` in a `NOT IN` expression. (MariaDB 10.1.18):

          MariaDB [(none)]> select 1 from dual where null not in (1);
          Empty set (0.00 sec)
       
          MariaDB [(none)]> select 1 from dual where null not in (select 1);
          Empty set (0.00 sec)
       
          MariaDB [(none)]> select 1 from dual where null not in (select 1 from dual);
          Empty set (0.00 sec)
      

      This is as I expected, since `NULL` is not equal or unequal to anything.

          MariaDB [(none)]> select 1 from dual where null not in (select 1 from mysql.user);
          +---+
          | 1 |
          +---+
          | 1 |
          +---+
          1 row in set (0.00 sec)
      

      Huh? How can that be? Maybe it's because I can't read the mysql.user table?

          MariaDB [(none)]> select distinct 1 from mysql.user;
          +---+
          | 1 |
          +---+
          | 1 |
          +---+
          1 row in set (0.00 sec)
      

      No, it returns a value. When selecting the one from a actual table, not `dual`, or an inline list expression, the behaviour is different.

      Is this a bug in MariaDB? An old MySQL server (5.5.40) I tried this with gave the expected empty set for all queries.

      Attachments

        Activity

          ComaVN Roel Harbers created issue -
          ComaVN Roel Harbers made changes -
          Field Original Value New Value
          Description I'm confused by the behaviour of `NULL` in a `NOT IN` expression. (MariaDB 10.1.18):

              MariaDB [(none)]> select 1 from dual where null not in (1);
              Empty set (0.00 sec)

              MariaDB [(none)]> select 1 from dual where null not in (select 1);
              Empty set (0.00 sec)

              MariaDB [(none)]> select 1 from dual where null not in (select 1 from dual);
              Empty set (0.00 sec)

          This is as I expected, since `NULL` is not equal or unequal to anything.

              MariaDB [(none)]> select 1 from dual where null not in (select 1 from mysql.user);
              +---+
              | 1 |
              +---+
              | 1 |
              +---+
              1 row in set (0.00 sec)

          Huh? How can that be? Maybe it's because there are several rows with value `1` in the subquery:

              MariaDB [(none)]> select 1 from dual where null not in (select distinct 1 from mysql.user);
              +---+
              | 1 |
              +---+
              | 1 |
              +---+
              1 row in set (0.00 sec)

          No, doesn't matter. When selecting the one from a actual table, not `dual`, the behaviour is different.

          Is this a bug in MariaDB? An old MySQL server (5.5.40) I tried this with gave the expected empty set for all queries.
          I'm confused by the behaviour of `NULL` in a `NOT IN` expression. (MariaDB 10.1.18):


          {noformat}
              MariaDB [(none)]> select 1 from dual where null not in (1);
              Empty set (0.00 sec)

              MariaDB [(none)]> select 1 from dual where null not in (select 1);
              Empty set (0.00 sec)

              MariaDB [(none)]> select 1 from dual where null not in (select 1 from dual);
              Empty set (0.00 sec)
          {noformat}

          This is as I expected, since `NULL` is not equal or unequal to anything.

          {noformat}
              MariaDB [(none)]> select 1 from dual where null not in (select 1 from mysql.user);
              +---+
              | 1 |
              +---+
              | 1 |
              +---+
              1 row in set (0.00 sec)
          {noformat}

          Huh? How can that be? Maybe it's because there are several rows with value `1` in the subquery:

          {noformat}
              MariaDB [(none)]> select 1 from dual where null not in (select distinct 1 from mysql.user);
              +---+
              | 1 |
              +---+
              | 1 |
              +---+
              1 row in set (0.00 sec)
          {noformat}

          No, doesn't matter. When selecting the one from a actual table, not `dual`, the behaviour is different.

          Is this a bug in MariaDB? An old MySQL server (5.5.40) I tried this with gave the expected empty set for all queries.
          ComaVN Roel Harbers made changes -
          Description I'm confused by the behaviour of `NULL` in a `NOT IN` expression. (MariaDB 10.1.18):


          {noformat}
              MariaDB [(none)]> select 1 from dual where null not in (1);
              Empty set (0.00 sec)

              MariaDB [(none)]> select 1 from dual where null not in (select 1);
              Empty set (0.00 sec)

              MariaDB [(none)]> select 1 from dual where null not in (select 1 from dual);
              Empty set (0.00 sec)
          {noformat}

          This is as I expected, since `NULL` is not equal or unequal to anything.

          {noformat}
              MariaDB [(none)]> select 1 from dual where null not in (select 1 from mysql.user);
              +---+
              | 1 |
              +---+
              | 1 |
              +---+
              1 row in set (0.00 sec)
          {noformat}

          Huh? How can that be? Maybe it's because there are several rows with value `1` in the subquery:

          {noformat}
              MariaDB [(none)]> select 1 from dual where null not in (select distinct 1 from mysql.user);
              +---+
              | 1 |
              +---+
              | 1 |
              +---+
              1 row in set (0.00 sec)
          {noformat}

          No, doesn't matter. When selecting the one from a actual table, not `dual`, the behaviour is different.

          Is this a bug in MariaDB? An old MySQL server (5.5.40) I tried this with gave the expected empty set for all queries.
          I'm confused by the behaviour of `NULL` in a `NOT IN` expression. (MariaDB 10.1.18):


          {noformat}
              MariaDB [(none)]> select 1 from dual where null not in (1);
              Empty set (0.00 sec)

              MariaDB [(none)]> select 1 from dual where null not in (select 1);
              Empty set (0.00 sec)

              MariaDB [(none)]> select 1 from dual where null not in (select 1 from dual);
              Empty set (0.00 sec)
          {noformat}

          This is as I expected, since `NULL` is not equal or unequal to anything.

          {noformat}
              MariaDB [(none)]> select 1 from dual where null not in (select 1 from mysql.user);
              +---+
              | 1 |
              +---+
              | 1 |
              +---+
              1 row in set (0.00 sec)
          {noformat}

          Huh? How can that be? Maybe it's because I can't read the mysql.user table?

          {noformat}
              MariaDB [(none)]> select distinct 1 from mysql.user;
              +---+
              | 1 |
              +---+
              | 1 |
              +---+
              1 row in set (0.00 sec)
          {noformat}

          No, it returns a value. When selecting the one from a actual table, not `dual`, or an inline list expression, the behaviour is different.

          Is this a bug in MariaDB? An old MySQL server (5.5.40) I tried this with gave the expected empty set for all queries.
          elenst Elena Stepanova made changes -
          Status Open [ 1 ] Confirmed [ 10101 ]

          Thanks for the report and test case.

          elenst Elena Stepanova added a comment - Thanks for the report and test case.
          elenst Elena Stepanova made changes -
          Fix Version/s 5.5 [ 15800 ]
          Fix Version/s 10.0 [ 16000 ]
          Fix Version/s 10.1 [ 16100 ]
          Fix Version/s 10.2 [ 14601 ]
          Affects Version/s 5.5 [ 15800 ]
          Affects Version/s 10.0 [ 16000 ]
          Affects Version/s 10.1 [ 16100 ]
          Affects Version/s 10.2 [ 14601 ]
          Assignee Sergei Petrunia [ psergey ]
          psergei Sergei Petrunia made changes -
          Assignee Sergei Petrunia [ psergey ] Varun Gupta [ varun ]
          varun Varun Gupta (Inactive) made changes -
          Status Confirmed [ 10101 ] In Progress [ 3 ]
          varun Varun Gupta (Inactive) added a comment - - edited

          create table t1(a int);
          insert into t1 value (1);
           
          EXPLAIN
          select 1 from t1 where  null not in (select 1 from t1);
          id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
          1	PRIMARY	t1	system	NULL	NULL	NULL	NULL	1	
          2	SUBQUERY	NULL	NULL	NULL	NULL	NULL	NULL	NULL	Impossible WHERE noticed after reading const tables
          
          

          varun Varun Gupta (Inactive) added a comment - - edited create table t1(a int ); insert into t1 value (1);   EXPLAIN select 1 from t1 where null not in ( select 1 from t1); id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY t1 system NULL NULL NULL NULL 1 2 SUBQUERY NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables
          varun Varun Gupta (Inactive) made changes -
          Sprint 10.0.30 [ 140 ]
          varun Varun Gupta (Inactive) made changes -
          Priority Minor [ 4 ] Major [ 3 ]

          Results of investigation discussion:
          We get to the wrong query result, because the optimizer gets Impossible where after reading const tables for the subquery.

          This should not happen (as the subquery produces rows). However it does happen: IN->EXISTS rewrite changes the subquery to be

          exists (select 1 from t1 where trig_cond(null=1))
          

          The trig_cond is "enabled" at optimization phase, the optimizer sees it as constant, evaluates it, finds it to be NULL, and concludes that the subquery has "Impossible WHERE".

          psergei Sergei Petrunia added a comment - Results of investigation discussion: We get to the wrong query result, because the optimizer gets Impossible where after reading const tables for the subquery. This should not happen (as the subquery produces rows). However it does happen: IN->EXISTS rewrite changes the subquery to be exists (select 1 from t1 where trig_cond(null=1)) The trig_cond is "enabled" at optimization phase, the optimizer sees it as constant, evaluates it, finds it to be NULL, and concludes that the subquery has "Impossible WHERE".

          That is,

          Subquery's left expression is NULL. This means that trig_cond should be disabled when we evaluate the subquery. However, subquery optimization code assumes it to be enabled, and produces a wrong query result.

          psergei Sergei Petrunia added a comment - That is, Subquery's left expression is NULL. This means that trig_cond should be disabled when we evaluate the subquery. However, subquery optimization code assumes it to be enabled, and produces a wrong query result.

          More details:

          I stop here in the subquery:

            Breakpoint 5, make_join_select (join=0x7fffcdcf15c0,...)
          (gdb) p join->select_lex->select_number
            $45 = 2
          

          (gdb) p dbug_print_item(cond)
            $48 = 0x15674c0 <dbug_item_print_buf> "trigcond(<cache>((NULL = 1)))"
          (gdb) p cond->val_int()
            $49 = 0
          (gdb) p cond->const_item()
            $50 = false
          (gdb) p cond->used_tables()
            $51 = 0
          

          Then, the trigcond is moved into join->exec_const_cond, and we reach this code:

                  if (join->exec_const_cond && !join->exec_const_cond->is_expensive() &&
                      !join->exec_const_cond->val_int())
                  {
                    DBUG_PRINT("info",("Found impossible WHERE condition"));
          

          where is_expensive()=false, and cond->val_int()==0.

          psergei Sergei Petrunia added a comment - More details: I stop here in the subquery: Breakpoint 5, make_join_select (join=0x7fffcdcf15c0,...) (gdb) p join->select_lex->select_number $45 = 2 (gdb) p dbug_print_item(cond) $48 = 0x15674c0 <dbug_item_print_buf> "trigcond(<cache>((NULL = 1)))" (gdb) p cond->val_int() $49 = 0 (gdb) p cond->const_item() $50 = false (gdb) p cond->used_tables() $51 = 0 Then, the trigcond is moved into join->exec_const_cond, and we reach this code: if (join->exec_const_cond && !join->exec_const_cond->is_expensive() && !join->exec_const_cond->val_int()) { DBUG_PRINT("info",("Found impossible WHERE condition")); where is_expensive()=false, and cond->val_int()==0.

          So, for trigcond(const_expr), we have

          1. const_item() = FALSE
          2. used_tables() = 0

          #1 seems to be correct (as the value may change during the statement execution, depending on whether trigger is ON or OFF)

          #2 seems to be incorrect. Should it be OUTER_REF_TABLE_BIT instead?

          psergei Sergei Petrunia added a comment - So, for trigcond(const_expr) , we have 1. const_item() = FALSE 2. used_tables() = 0 #1 seems to be correct (as the value may change during the statement execution, depending on whether trigger is ON or OFF) #2 seems to be incorrect. Should it be OUTER_REF_TABLE_BIT instead?

          According to documentation:
          trigcond(X) := TRUE when the "linked" outer expression oe_i is NULL

          For our query the above condition holds "SELECT 1 from dual where NULL in (select t1 from t1)"
          but the condition join->exec_const_cond->val_int() return 0 which is wrong.

          varun Varun Gupta (Inactive) added a comment - According to documentation: trigcond(X) := TRUE when the "linked" outer expression oe_i is NULL For our query the above condition holds "SELECT 1 from dual where NULL in (select t1 from t1)" but the condition join->exec_const_cond->val_int() return 0 which is wrong.
          varun Varun Gupta (Inactive) made changes -
          Assignee Varun Gupta [ varun ] Sergei Petrunia [ psergey ]
          Status In Progress [ 3 ] In Review [ 10002 ]
          varun Varun Gupta (Inactive) made changes -
          Assignee Sergei Petrunia [ psergey ] Varun Gupta [ varun ]
          varun Varun Gupta (Inactive) made changes -
          Assignee Varun Gupta [ varun ] Sergei Petrunia [ psergey ]
          varun Varun Gupta (Inactive) made changes -
          Assignee Sergei Petrunia [ psergey ] Varun Gupta [ varun ]
          varun Varun Gupta (Inactive) made changes -
          Status In Review [ 10002 ] Stalled [ 10000 ]
          varun Varun Gupta (Inactive) made changes -
          Status Stalled [ 10000 ] In Progress [ 3 ]
          varun Varun Gupta (Inactive) made changes -
          Assignee Varun Gupta [ varun ] Sergei Petrunia [ psergey ]
          Status In Progress [ 3 ] In Review [ 10002 ]
          varun Varun Gupta (Inactive) made changes -
          Fix Version/s 5.5.55 [ 22311 ]
          Fix Version/s 10.2 [ 14601 ]
          Fix Version/s 5.5 [ 15800 ]
          Fix Version/s 10.0 [ 16000 ]
          Fix Version/s 10.1 [ 16100 ]
          Assignee Sergei Petrunia [ psergey ] Varun Gupta [ varun ]
          Resolution Fixed [ 1 ]
          Status In Review [ 10002 ] Closed [ 6 ]
          serg Sergei Golubchik made changes -
          Workflow MariaDB v3 [ 77893 ] MariaDB v4 [ 151084 ]

          People

            varun Varun Gupta (Inactive)
            ComaVN Roel Harbers
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.