[MDEV-30254] Value changed after adding DISTINCT [10.1.10, 10.11.1] Created: 2022-12-17  Updated: 2023-11-28

Status: Open
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.1.10, 10.11.1
Fix Version/s: 10.4, 10.5, 10.6, 10.11

Type: Bug Priority: Major
Reporter: qaqcatz Assignee: Alexander Barkov
Resolution: Unresolved Votes: 0
Labels: distinct-optimization, join
Environment:

ubuntu 18.04


Issue Links:
Relates
relates to MDEV-30250 Value changed after relaxing ON condi... Confirmed

 Description   

Description:
In theory, the result of sql2(DISTINCT) ⊆ the result of sql1:

SELECT (1^`f5`) AS `f3` FROM (SELECT c1 AS `f4` FROM t) AS `t1` JOIN (SELECT (DATE_ADD(c1, INTERVAL 1 DAY_SECOND)) AS `f5` FROM t) AS `t3` ON (FROM_DAYS(1)) NOT IN (`f4`,`f5`); -- sql1
SELECT (1^`f5`) AS `f3` FROM (SELECT c1 AS `f4` FROM t) AS `t1` JOIN (SELECT DISTINCT (DATE_ADD(c1, INTERVAL 1 DAY_SECOND)) AS `f5` FROM t) AS `t3` ON (FROM_DAYS(1)) NOT IN (`f4`,`f5`); -- sql2

However, the value 20000120000000 changed to 2001 after adding DISTINCT, seems like a logical bug:

mysql> select version();
+-----------------------------------------+
| version()                               |
+-----------------------------------------+
| 10.11.1-MariaDB-1:10.11.1+maria~ubu2204 |
+-----------------------------------------+
1 row in set (0.00 sec)
 
mysql> SELECT (1^`f5`) AS `f3` FROM (SELECT c1 AS `f4` FROM t) AS `t1` JOIN (SELECT (DATE_ADD(c1, INTERVAL 1 DAY_SECOND)) AS `f5` FROM t) AS `t3` ON (FROM_DAYS(1)) NOT IN (`f4`,`f5`);
+----------------+
| f3             |
+----------------+
| 20000120000000 |
| 20000120000000 |
+----------------+
2 rows in set, 4 warnings (0.00 sec)
 
mysql> SELECT (1^`f5`) AS `f3` FROM (SELECT c1 AS `f4` FROM t) AS `t1` JOIN (SELECT DISTINCT (DATE_ADD(c1, INTERVAL 1 DAY_SECOND)) AS `f5` FROM t) AS `t3` ON (FROM_DAYS(1)) NOT IN (`f4`,`f5`);
+------+
| f3   |
+------+
| 2001 |
| 2001 |
+------+
2 rows in set, 4 warnings (0.00 sec)

How to repeat:

drop table if exists t;
CREATE TABLE t (c1 DECIMAL(40,20));
INSERT INTO t VALUES (-0),(120);
 
SELECT (1^`f5`) AS `f3` FROM (SELECT c1 AS `f4` FROM t) AS `t1` JOIN (SELECT (DATE_ADD(c1, INTERVAL 1 DAY_SECOND)) AS `f5` FROM t) AS `t3` ON (FROM_DAYS(1)) NOT IN (`f4`,`f5`);
SELECT (1^`f5`) AS `f3` FROM (SELECT c1 AS `f4` FROM t) AS `t1` JOIN (SELECT DISTINCT (DATE_ADD(c1, INTERVAL 1 DAY_SECOND)) AS `f5` FROM t) AS `t3` ON (FROM_DAYS(1)) NOT IN (`f4`,`f5`);

Hope these can be helpful for your debugging:
We look for the first version of the bug in all docker images (with format x.x.x) of https://hub.docker.com/_/mariadb/tags
We found that the bug first occurred in mariadb:10.1.10, it cannot be reproduced in mariadb:10.1.2 (we can't find more intermediate versions in dockerhub):

MariaDB [TEST]> select version();
+--------------------------+
| version()                |
+--------------------------+
| 10.1.10-MariaDB-1~jessie |
+--------------------------+
1 row in set (0.00 sec)
 
MariaDB [TEST]> SELECT (1^`f5`) AS `f3` FROM (SELECT c1 AS `f4` FROM t) AS `t1` JOIN (SELECT (DATE_ADD(c1, INTERVAL 1 DAY_SECOND)) AS `f5` FROM t) AS `t3` ON (FROM_DAYS(1)) NOT IN (`f4`,`f5`);
+----------------+
| f3             |
+----------------+
| 20000120000000 |
+----------------+
1 row in set, 1 warning (0.00 sec)
 
MariaDB [TEST]> SELECT (1^`f5`) AS `f3` FROM (SELECT c1 AS `f4` FROM t) AS `t1` JOIN (SELECT DISTINCT (DATE_ADD(c1, INTERVAL 1 DAY_SECOND)) AS `f5` FROM t) AS `t3` ON (FROM_DAYS(1)) NOT IN (`f4`,`f5`);
+------+
| f3   |
+------+
| 2001 |
+------+
1 row in set, 2 warnings (0.00 sec)
 
MariaDB [TEST]> select version();
+-----------------------------------+
| version()                         |
+-----------------------------------+
| 10.1.2-MariaDB-1~wheezy-wsrep-log |
+-----------------------------------+
1 row in set (0.00 sec)
 
MariaDB [TEST]> SELECT (1^`f5`) AS `f3` FROM (SELECT c1 AS `f4` FROM t) AS `t1` JOIN (SELECT (DATE_ADD(c1, INTERVAL 1 DAY_SECOND)) AS `f5` FROM t) AS `t3` ON (FROM_DAYS(1)) NOT IN (`f4`,`f5`);
Empty set, 8 warnings (0.00 sec)
 
MariaDB [TEST]> SELECT (1^`f5`) AS `f3` FROM (SELECT c1 AS `f4` FROM t) AS `t1` JOIN (SELECT DISTINCT (DATE_ADD(c1, INTERVAL 1 DAY_SECOND)) AS `f5` FROM t) AS `t3` ON (FROM_DAYS(1)) NOT IN (`f4`,`f5`);
Empty set, 4 warnings (0.00 sec)



 Comments   
Comment by Sergei Golubchik [ 2023-01-09 ]

Likely the same as MDEV-30250, let's retest after MDEV-30250 is fixed

Generated at Thu Feb 08 10:14:51 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.