[MDEV-4836] Wrong result on <not null date column> IS NULL (old documented hack stopped working) Created: 2013-08-02  Updated: 2013-09-03  Resolved: 2013-09-03

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: None
Fix Version/s: 5.5.33

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: optimizer

Issue Links:
PartOf
is part of MDEV-4817 Optimizer fails to optimize expressio... Closed
Relates
relates to MDEV-4817 Optimizer fails to optimize expressio... Closed

 Description   

Due to the fix for MDEV-4817, the ancient bugfeature related to DATE/DATETIME and IS NULL stopped working.

Here is the story: http://bugs.mysql.com/bug.php?id=940
Here is the doc: http://dev.mysql.com/doc/refman/5.5/en/comparison-operators.html

For DATE and DATETIME columns that are declared as NOT NULL, you can find the special date '0000-00-00' by using a statement like this:

SELECT * FROM tbl_name WHERE date_column IS NULL
This is needed to get some ODBC applications to work because ODBC does not support a '0000-00-00' date value.

Here is how it looks:

CREATE TABLE t1 (id INT, d DATE NOT NULL);
INSERT INTO t1 VALUES (1,'0000-00-00'),(2,'0000-00-00');
 
SELECT * FROM t1 WHERE d IS NULL;
# +------+------------+
# | id   | d          |
# +------+------------+
# |    1 | 0000-00-00 |
# |    2 | 0000-00-00 |
# +------+------------+
# 2 rows in set (0.01 sec)

But it doesn't work any longer for scenarios affected by MDEV-4817 fix:

CREATE TABLE t1 (id INT, d DATE NOT NULL);
INSERT INTO t1 VALUES (1,'0000-00-00'),(2,'0000-00-00');
CREATE TABLE t2 (i INT);
SELECT * FROM t1 LEFT JOIN t2 ON (id=i) WHERE NULL OR d IS NULL;
# Empty set (0.01 sec)

EXPLAIN EXTENDED SELECT * FROM t1 LEFT JOIN t2 ON (id=i) WHERE NULL OR d IS NULL;
+------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                           |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
|    1 | SIMPLE      | t2    | ALL  | NULL          | NULL | NULL    | NULL |    1 |   100.00 |                                                 |
|    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL |    2 |   100.00 | Using where; Using join buffer (flat, BNL join) |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                                                |
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | select `test`.`t1`.`id` AS `id`,`test`.`t1`.`d` AS `d`,`test`.`t2`.`i` AS `i` from `test`.`t1` join `test`.`t2` where ((`test`.`t1`.`id` = `test`.`t2`.`i`) and (`test`.`t1`.`d` = 0)) |
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+



 Comments   
Comment by Elena Stepanova [ 2013-08-08 ]

5.5.32 is not affected, the regression was introduced later (I mistakenly put 5.5.32 as a fix version earlier, thanks for correcting it).

Comment by Sergei Petrunia [ 2013-08-21 ]

As Igor has pointed out:
the part of code responsible for replacing "not_null_date_col IS NULL" with " = '0000-00-00' " is located in sql_select.cc, internal_remove_eq_conds() , search for " fix to replace 'NULL' dates with '0' ..."

It seems, the server has this implicit convention:

do not evaluate parts of WHERE/ON condition before optimize_cond() has been run on it.

I'm wondering whether that also applies to HAVING, select list, and other parts where one could have conditions.

Comment by Sergei Petrunia [ 2013-08-21 ]

The IS NULL -> 0000-00-00 conversion is done for WHERE clause, but not for select_list:

MariaDB [j3]> SELECT *, d IS NULL FROM t1 WHERE d IS NULL;
-------------------------

id d d IS NULL

-------------------------

1 0000-00-00 0
2 0000-00-00 0

-------------------------
2 rows in set (0.00 sec)

Comment by Sergei Petrunia [ 2013-08-23 ]

Committed a patch

Comment by Elena Stepanova [ 2013-08-26 ]

The following test case still fails on /~maria-captains/maria/5.5-test1/ revno 3862:

CREATE TABLE t1 (i1 INT, d1 DATE NOT NULL);
INSERT INTO t1 VALUES (1,'2012-12-21'),(2,'0000-00-00');

CREATE TABLE t2 (i2 INT, j2 INT);
INSERT INTO t2 VALUES (1,10),(2,20);

SELECT * FROM t1 LEFT JOIN t2 ON i1 = j2 WHERE d1 IS NULL AND 1 OR i1 = i2;

It produces an empty result, while there should be a row
i1 d1 i2 j2
2 0000-00-00 NULL NULL

Comment by Sergei Petrunia [ 2013-08-26 ]

I think I have figured out why the last testcase doesn't work. In the patch, I wrote:

if (item->const_item())
{
// psergey-todo: here: walk in search of convertible items...
if (!item->is_expensive() && !cond_is_datetime_is_null(item) &&
item->val_int() == 0)

In the testcase we see that "item" is "d1 IS NULL AND 1". That is, cond_is_datetime_is_null(item) returns FALSE, and the item is evaluated and removed.

We should not evaluate items that contain not_null_datecol IS NULL, at all.

Comment by Sergei Petrunia [ 2013-08-26 ]

> We should not evaluate items that contain not_null_datecol IS NULL, at all.

This is not 100% correct. datecol IS NULL substitution only works for items that are in the WHERE clause, and are directly reachable from root condition by descending the AND/OR tree. Here's how it works:

explain extended
SELECT * FROM t1 WHERE d1 IS NULL;
id select_type table type possible_keys key key_len ref rows filtered Extra
1 SIMPLE t1 ALL NULL NULL NULL NULL 2 100.00 Using where
Warnings:
Note 1003 select `test`.`t1`.`i1` AS `i1`,`test`.`t1`.`d1` AS `d1` from `test`.`t1` where (`test`.`t1`.`d1` = 0)

Now, let's wrap d1 IS NULL into a function:

explain extended
SELECT * FROM t1 WHERE IF((d1 IS NULL), 1,2);
id select_type table type possible_keys key key_len ref rows filtered Extra
1 SIMPLE t1 ALL NULL NULL NULL NULL 2 100.00
Warnings:
Note 1003 select `test`.`t1`.`i1` AS `i1`,`test`.`t1`.`d1` AS `d1` from `test`.`t1` where 1

We don't see IF (...) in the WHERE because d1 IS NULL was not substituted. Then. optimize_cond removed it as a constant part of WHERE.

Let's try the same on a nullable INT column:

explain extended
SELECT * FROM t1 WHERE IF((i1 IS NULL), 1,2);
id select_type table type possible_keys key key_len ref rows filtered Extra
1 SIMPLE t1 ALL NULL NULL NULL NULL 2 100.00 Using where
Warnings:
Note 1003 select `test`.`t1`.`i1` AS `i1`,`test`.`t1`.`d1` AS `d1` from `test`.`t1` where if(isnull(`test`.`t1`.`i1`),1,2)

Here, IF(...) was not removed.

Comment by Sergei Petrunia [ 2013-08-26 ]

Committed a fix for this, pushed into bzr+ssh://bazaar.launchpad.net/~maria-captains/maria/5.5-test1

Comment by Elena Stepanova [ 2013-08-26 ]

The following test case returns 2 rows on the work tree, revno 3863, while 5.3, 5.5.32, 5.6.13 all return empty result set, which I think is a correct result. When the view is not used, the work tree also returns an empty result set.

CREATE TABLE t1 (i1 INT) ENGINE=MyISAM;
INSERT INTO t1 VALUES (NULL);

CREATE TABLE t2 (i2 INT, a INT, b INT) ENGINE=MyISAM;
CREATE ALGORITHM=TEMPTABLE VIEW v2 AS SELECT * FROM t2;

INSERT INTO t2 VALUES (NULL,1,2),(NULL,2,3);
SELECT * FROM t1 LEFT JOIN v2 ON i1 = i2 WHERE a < b;

Comment by Sergei Petrunia [ 2013-08-27 ]

The EXPLAINs are:

MariaDB [j11]> explain extended SELECT * FROM t1 LEFT JOIN t2 ON i1 = i2 WHERE a < b;
-------------------------------------------------------------------------------

id select_type table type possible_keys key key_len ref rows filtered Extra

-------------------------------------------------------------------------------

1 SIMPLE t1 system NULL NULL NULL NULL 1 100.00  
1 SIMPLE t2 ALL NULL NULL NULL NULL 2 100.00 Using where

-------------------------------------------------------------------------------
Message: select NULL AS i1,t2.i2 AS i2,t2.a AS a,t2.b AS b from t2 where ((t2.i2 = NULL) and (t2.a < t2.b))

MariaDB [j12]> explain extended SELECT * FROM t1 LEFT JOIN v2 ON i1 = i2 WHERE a < b;
-------------------------------------------------------------------------------------

id select_type table type possible_keys key key_len ref rows filtered Extra

-------------------------------------------------------------------------------------

1 PRIMARY t1 system NULL NULL NULL NULL 1 100.00  
1 PRIMARY <derived2> ref key0 key0 5 const 0 0.00 Using where
2 DERIVED t2 ALL NULL NULL NULL NULL 2 100.00  

-------------------------------------------------------------------------------------
3 rows in set, 1 warning (0.01 sec)
Message: select NULL AS i1,v2.i2 AS i2,v2.a AS a,v2.b AS b from v2 where ((v2.i2 = NULL) and (v2.a < v2.b))

EXPLAIN EXTENDED line is the same with/without the VIEW.

Comment by Sergei Petrunia [ 2013-08-27 ]

Debugging, I find the difference to be in the first
sub_select/evaluate_join_record call:

(gdb) p dbug_print_item(select_cond)
$107 = 0x14b53a0 "((`j11`.`t2`.`i2` = NULL) and (`j11`.`t2`.`a` < `j11`.`t2`.`b`))"

(gdb) p dbug_print_item(select_cond)
$16 = 0x14b43a0 "(`v2`.`a` < `v2`.`b`)"

When the VIEW is used, "t2.i2 = NULL" is not present.

Comment by Sergei Petrunia [ 2013-08-27 ]

I am looking at the trees from:

psergey@askmonty.org-20130826173804-fl2mdyqvottvjvr7 (rev 3863, latest
5.5-test2)
psergey@askmonty.org-20130731093701-10tmxhe668f3u1lx (revno 3842, before any
outer join fixes)

and observe the result difference with inner join:
new:
MariaDB [j11]> SELECT * FROM t1 JOIN v2 ON i1 = i2 WHERE a < b;
------------------+

i1 i2 a b

------------------+

NULL NULL 2 3
NULL NULL 1 2

------------------+
2 rows in set (0.01 sec)

old:
MariaDB [j12]> SELECT * FROM t1 JOIN v2 ON i1 = i2 WHERE a < b;
Empty set (0.00 sec)

Comment by Sergei Petrunia [ 2013-08-28 ]

Branching off this problem to MDEV-4959

Comment by Sergei Petrunia [ 2013-08-28 ]

The problem in MDEV-4959 was fixed. I've merged the fix and pushed to 5.5-test1 tree. I need another round of testing.

Generated at Thu Feb 08 06:59:34 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.