[MDEV-6263] Wrong result when using IN subquery with order by Created: 2014-05-22  Updated: 2014-05-28  Resolved: 2014-05-28

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 5.5.37, 10.0.11
Fix Version/s: 5.5.38, 10.0.12

Type: Bug Priority: Major
Reporter: jose antonio Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: None
Environment:

Ubuntu Precise, also Ubuntu 14.04


Attachments: File prueba.sql    

 Description   

I got a wrong result in two different machines

I tried after uploading the prueba.sql file the following query:

SELECT * FROM agentes WHERE id in (select distinct id_agente from otra) ORDER BY nombre ASC;

I get:

+----+--------+
| id | nombre |
+----+--------+
|  1 | row 1  |
|  1 | row 1  |
|  1 | row 1  |
|  2 | row 2  |
|  2 | row 2  |
|  3 | row 3  |
+----+--------+

but if I remove the order by clause I get:

+----+--------+
| id | nombre |
+----+--------+
|  1 | row 1  |
|  2 | row 2  |
|  3 | row 3  |
+----+--------+

I don't put much more information i think it's very reproductible since i got it in two different machines.



 Comments   
Comment by Elena Stepanova [ 2014-05-22 ]

Hi,

Could you please provide SHOW CREATE TABLE results for `agentes` and `otra` tables?

Also,

EXPLAIN EXTENDED SELECT * FROM agentes WHERE id in (select distinct id_agente from otra) ORDER BY nombre ASC

could be useful.

Thanks.

Comment by jose antonio [ 2014-05-22 ]

MariaDB [test]> show create table agentes;
+---------+-------------------------------------------------------------------------------------------------------------------------------------------------+
| Table   | Create Table                                                                                                                                    |
+---------+-------------------------------------------------------------------------------------------------------------------------------------------------+
| agentes | CREATE TABLE `agentes` (
  `id` int(11) NOT NULL,
  `nombre` varchar(255) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+---------+-------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
 
MariaDB [test]> show create table otra;
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                         |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| otra  | CREATE TABLE `otra` (
  `id_algo` int(11) NOT NULL,
  `id_agente` int(11) NOT NULL,
  PRIMARY KEY (`id_algo`,`id_agente`),
  KEY `another_data` (`id_agente`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
 
MariaDB [test]> EXPLAIN EXTENDED SELECT * FROM agentes WHERE id in (select distinct id_agente from otra) ORDER BY nombre ASC;
+------+-------------+---------+--------+---------------+---------+---------+---------------------+------+----------+---------------------------------------------------------+
| id   | select_type | table   | type   | possible_keys | key     | key_len | ref                 | rows | filtered | Extra                                                   |
+------+-------------+---------+--------+---------------+---------+---------+---------------------+------+----------+---------------------------------------------------------+
|    1 | PRIMARY     | otra    | index  | another_data  | PRIMARY | 8       | NULL                |    6 |   100.00 | Using index; Using temporary; Using filesort; LooseScan |
|    1 | PRIMARY     | agentes | eq_ref | PRIMARY       | PRIMARY | 4       | test.otra.id_agente |    1 |   100.00 |                                                         |
+------+-------------+---------+--------+---------------+---------+---------+---------------------+------+----------+---------------------------------------------------------+
2 rows in set, 1 warning (0.00 sec)
 

Comment by Elena Stepanova [ 2014-05-24 ]

Hi Jose,

Thank you.
As a workaround, please try to set optimizer_switch='semijoin=off' or optimizer_switch='loosescan=off'

Comment by jose antonio [ 2014-05-26 ]

Neither helps.

MariaDB [test]> set optimizer_switch='semijoin=off';
Query OK, 0 rows affected (0.00 sec)
 
MariaDB [test]> set optimizer_switch='loosescan=off';
Query OK, 0 rows affected (0.00 sec)
 
MariaDB [test]> explain extended SELECT * FROM agentes WHERE id in (select distinct id_agente from otra) ORDER BY nombre ASC;
+------+--------------+---------+-------+---------------+---------+---------+------+------+----------+-----------------------------+
| id   | select_type  | table   | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                       |
+------+--------------+---------+-------+---------------+---------+---------+------+------+----------+-----------------------------+
|    1 | PRIMARY      | agentes | ALL   | NULL          | NULL    | NULL    | NULL |    6 |   100.00 | Using where; Using filesort |
|    2 | MATERIALIZED | otra    | index | another_data  | PRIMARY | 8       | NULL |    6 |   100.00 | Using index                 |
+------+--------------+---------+-------+---------------+---------+---------+------+------+----------+-----------------------------+
2 rows in set, 1 warning (0.00 sec)
 
MariaDB [test]> SELECT * FROM agentes WHERE id in (select distinct id_agente from otra) ORDER BY nombre ASC;
+----+--------+
| id | nombre |
+----+--------+
|  1 | row 1  |
|  1 | row 1  |
|  1 | row 1  |
|  2 | row 2  |
|  2 | row 2  |
|  3 | row 3  |
+----+--------+
6 rows in set (0.00 sec)

Comment by Elena Stepanova [ 2014-05-26 ]

Hi Jose,

I suppose you have query_cache enabled. If so, after you had set semijoin=off or loosescan=off, either run the query as SELECT SQL_NO_CACHE * FROM ..., or execute FLUSH TABLES, or disable the query cache (set global query_cache_size=0).

Comment by Elena Stepanova [ 2014-05-26 ]

Test case (exactly the same as attached + query from the description, just all together in the form suitable for MTR):

--source include/have_innodb.inc
 
CREATE TABLE IF NOT EXISTS `agentes` (
  `id` int(11) NOT NULL,
  `nombre` varchar(255) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
 
INSERT INTO `agentes` (`id`, `nombre`) VALUES
(1, 'row 1'),(2, 'row 2'),(3, 'row 3'),
(4, 'row 4'),(5, 'row 5'),(6, 'row 6');
 
CREATE TABLE IF NOT EXISTS `otra` (
  `id_algo` int(11) NOT NULL,
  `id_agente` int(11) NOT NULL,
  PRIMARY KEY (`id_algo`,`id_agente`), 
  KEY `another_data` (`id_agente`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
 
INSERT INTO `otra` (`id_algo`, `id_agente`) VALUES
(1, 1),(1, 2),(2, 1),(2, 2),(2, 3),(3, 1);
 
SELECT * FROM agentes WHERE id in (select distinct id_agente from otra) ORDER BY nombre ASC;
 
SELECT * FROM agentes WHERE id in (select distinct id_agente from otra);

Actual result:

SELECT * FROM agentes WHERE id in (select distinct id_agente from otra) ORDER BY nombre ASC;
id	nombre
1	row 1
1	row 1
1	row 1
2	row 2
2	row 2
3	row 3
SELECT * FROM agentes WHERE id in (select distinct id_agente from otra);
id	nombre
1	row 1
2	row 2
3	row 3

Expected result:

SELECT * FROM agentes WHERE id in (select distinct id_agente from otra) ORDER BY nombre ASC;
id	nombre
1	row 1
2	row 2
3	row 3
SELECT * FROM agentes WHERE id in (select distinct id_agente from otra);
id	nombre
1	row 1
2	row 2
3	row 3

Comment by jose antonio [ 2014-05-27 ]

The problem seems to be semijoin, with it disabled and SQL_NO_CACHE It got the right result.

Thanks.

Comment by Sergei Petrunia [ 2014-05-28 ]

Query plan without ORDER BY:

MariaDB [j21]> explain select * FROM agentes WHERE id in (select distinct id_agente from otra);
+------+--------------+-------------+--------+---------------+--------------+---------+------+------+-------------+
| id   | select_type  | table       | type   | possible_keys | key          | key_len | ref  | rows | Extra       |
+------+--------------+-------------+--------+---------------+--------------+---------+------+------+-------------+
|    1 | PRIMARY      | agentes     | ALL    | PRIMARY       | NULL         | NULL    | NULL |    6 |             |
|    1 | PRIMARY      | <subquery2> | eq_ref | distinct_key  | distinct_key | 4       | func |    1 |             |
|    2 | MATERIALIZED | otra        | index  | another_data  | PRIMARY      | 8       | NULL |    6 | Using index |
+------+--------------+-------------+--------+---------------+--------------+---------+------+------+-------------+

This seems ok.

Query plan with ORDER BY:

MariaDB [j21]> explain SELECT * FROM agentes WHERE id in (select distinct id_agente from otra) ORDER BY nombre ASC;
+------+-------------+---------+--------+---------------+---------+---------+--------------------+------+---------------------------------------------------------+
| id   | select_type | table   | type   | possible_keys | key     | key_len | ref                | rows | Extra                                                   |
+------+-------------+---------+--------+---------------+---------+---------+--------------------+------+---------------------------------------------------------+
|    1 | PRIMARY     | otra    | index  | another_data  | PRIMARY | 8       | NULL               |    6 | Using index; Using temporary; Using filesort; LooseScan |
|    1 | PRIMARY     | agentes | eq_ref | PRIMARY       | PRIMARY | 4       | j21.otra.id_agente |    1 |                                                         |
+------+-------------+---------+--------+---------------+---------+---------+--------------------+------+---------------------------------------------------------+

This seems to be wrong. otra.PRIMARY is defined as

  PRIMARY KEY (`id_algo`,`id_agente`), 

We need to produce a distinct stream of otra.id_agente. It is not possible to do with LooseScan - there are different values of id_algo, and index scan will produce data ordered by id_algo, id_agente.

Somehow, the presence of ORDER BY causes invalid query plan to be chosen. Will investigate further.

Generated at Thu Feb 08 07:10:34 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.