[MDEV-4219] A simple select query returns random data (upstream bug#68473) Created: 2013-03-01  Updated: 2013-07-17  Resolved: 2013-07-17

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 10.0.1, 5.5.29, 5.1.67, 5.2.14, 5.3.12
Fix Version/s: 5.5.32

Type: Bug Priority: Major
Reporter: Patryk Pomykalski Assignee: Timour Katchaounov (Inactive)
Resolution: Fixed Votes: 0
Labels: optimizer, upstream


 Description   

Originally filed at http://bugs.mysql.com/bug.php?id=68473 by Balázs Hinel:

Here's the SQL query to create and fill the table:

CREATE TABLE `faulty` (
`a` int(11) unsigned NOT NULL AUTO_INCREMENT,
`b` int(11) unsigned NOT NULL,
`c` datetime NOT NULL,
PRIMARY KEY (`a`),
UNIQUE KEY `b_and_c` (`b`,`c`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 ;
 
INSERT INTO `faulty` (`b`, `c`) VALUES
(1801, '2013-02-15 09:00:00'),
(1802, '2013-02-28 09:00:00'),
(1802, '2013-03-01 09:00:00'),
(5,    '1990-02-15 09:00:00'),
(5,    '2013-02-15 09:00:00'),
(5,    '2009-02-15 17:00:00');

After this is done, try running this query:

SELECT DISTINCT b, c FROM faulty WHERE b='1802' ORDER BY c;

It should return the rows where the b attribute is 1802, but it returns random data instead:

mysql> SELECT DISTINCT b, c FROM faulty WHERE b='1802' ORDER BY c;
+------------+---------------------+
| b          | c                   |
+------------+---------------------+
| 2795961176 | 0000-42-94 96:72:97 |
| 2795961176 | 0000-42-94 96:72:97 |
+------------+---------------------+
2 rows in set (0.00 sec)

If you change the number 1802 to number 5 in the query, it is working properly:

mysql> SELECT DISTINCT b, c FROM faulty WHERE b='5' ORDER BY c;
+---+---------------------+
| b | c                   |
+---+---------------------+
| 5 | 1990-02-15 09:00:00 |
| 5 | 2009-02-15 17:00:00 |
| 5 | 2013-02-15 09:00:00 |
+---+---------------------+
3 rows in set (0.00 sec)

Some more info...
I tested various versions including mysql 5.5.0, 5.5.8, 5.0.92, mariadb 5.1, 5.2, 10.0.1 and all have some form of this bug. (sometimes random data, sometimes it's repeated row:

+------+---------------------+
| b    | c                   |
+------+---------------------+
| 1802 | 2013-03-01 09:00:00 |
| 1802 | 2013-03-01 09:00:00 |
+------+---------------------+

Similar queries and 3 different explains:

mysql [localhost] {msandbox} (test) > explain SELECT DISTINCT b, c FROM faulty WHERE b='1802' ORDER BY c;
+------+-------------+--------+-------+---------------+---------+---------+------+------+-------------------------------------------------------+
| id   | select_type | table  | type  | possible_keys | key     | key_len | ref  | rows | Extra                                                 |
+------+-------------+--------+-------+---------------+---------+---------+------+------+-------------------------------------------------------+
|    1 | SIMPLE      | faulty | range | b_and_c       | b_and_c | 12      | NULL |    2 | Using where; Using index for group-by; Using filesort |
+------+-------------+--------+-------+---------------+---------+---------+------+------+-------------------------------------------------------+
1 row in set (0.00 sec)
 
mysql [localhost] {msandbox} (test) > explain SELECT DISTINCT b, c FROM faulty WHERE b='5' ORDER BY c;
+------+-------------+--------+------+---------------+---------+---------+-------+------+--------------------------+
| id   | select_type | table  | type | possible_keys | key     | key_len | ref   | rows | Extra                    |
+------+-------------+--------+------+---------------+---------+---------+-------+------+--------------------------+
|    1 | SIMPLE      | faulty | ref  | b_and_c       | b_and_c | 4       | const |    3 | Using where; Using index |
+------+-------------+--------+------+---------------+---------+---------+-------+------+--------------------------+
1 row in set (0.00 sec)
 
mysql [localhost] {msandbox} (test) > explain SELECT DISTINCT b, c FROM faulty WHERE b=1802 ORDER BY c;
+------+-------------+--------+-------+---------------+---------+---------+------+------+---------------------------------------+
| id   | select_type | table  | type  | possible_keys | key     | key_len | ref  | rows | Extra                                 |
+------+-------------+--------+-------+---------------+---------+---------+------+------+---------------------------------------+
|    1 | SIMPLE      | faulty | range | b_and_c       | b_and_c | 12      | NULL |    2 | Using where; Using index for group-by |
+------+-------------+--------+-------+---------------+---------+---------+------+------+---------------------------------------+
1 row in set (0.00 sec)
 
mysql [localhost] {msandbox} (test) > explain SELECT DISTINCT b, c FROM faulty WHERE b=5 ORDER BY c;
+------+-------------+--------+------+---------------+---------+---------+-------+------+--------------------------+
| id   | select_type | table  | type | possible_keys | key     | key_len | ref   | rows | Extra                    |
+------+-------------+--------+------+---------------+---------+---------+-------+------+--------------------------+
|    1 | SIMPLE      | faulty | ref  | b_and_c       | b_and_c | 4       | const |    3 | Using where; Using index |
+------+-------------+--------+------+---------------+---------+---------+-------+------+--------------------------+
1 row in set (0.00 sec)

With order by c desc yet another plan and the result is ok:

mysql [localhost] {msandbox} (test) > explain SELECT DISTINCT b, c FROM faulty WHERE b='1802' ORDER BY c DESC;
+------+-------------+--------+-------+---------------+---------+---------+------+------+------------------------------------------------------------------------+
| id   | select_type | table  | type  | possible_keys | key     | key_len | ref  | rows | Extra                                                                  |
+------+-------------+--------+-------+---------------+---------+---------+------+------+------------------------------------------------------------------------+
|    1 | SIMPLE      | faulty | range | b_and_c       | b_and_c | 12      | NULL |    2 | Using where; Using index for group-by; Using temporary; Using filesort |
+------+-------------+--------+-------+---------------+---------+---------+------+------+------------------------------------------------------------------------+
1 row in set (0.00 sec)



 Comments   
Comment by Patryk Pomykalski [ 2013-03-01 ]

The problem is with QUICK_GROUP_MIN_MAX_SELECT and only when condition is a string. In that case b = '1802' isn't removed from group_by by remove_const (comparison in test_if_equality_quarantees_uniqueness fails: item_cmp_type(l->cmp_type(), r->cmp_type()) == l->cmp_type()). item_cmp_type(INT_RESULT, STRING_RESULT) returns REAL_RESULT.
I think test_if_equality_quarantees_uniqueness could be optimized to return true if right item result is a string: (item_cmp_type(l->cmp_type(), r->cmp_type()) == l->cmp_type() || r->cmp_type() == STRING_RESULT).

Comment by Timour Katchaounov (Inactive) [ 2013-07-16 ]

Add the test case from MDEV-4790 (marked as duplicate of this bug):

CREATE TABLE t1 (a INT, b INT);
INSERT INTO t1 (a, b) VALUES (1,1), (1,2), (1,3);
INSERT INTO t1 SELECT a + 1, b FROM t1;
INSERT INTO t1 SELECT a + 2, b FROM t1;

CREATE INDEX break_it ON t1 (a, b);

SELECT distinct a, b FROM t1 where a = '3' ORDER BY b;

Generated at Thu Feb 08 06:54:42 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.