[MDEV-17419] Subquery with group by returns wrong results Created: 2018-10-10  Updated: 2018-10-17  Resolved: 2018-10-17

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.3.9
Fix Version/s: 10.3.11

Type: Bug Priority: Major
Reporter: Stijn Assignee: Igor Babaev
Resolution: Fixed Votes: 0
Labels: None
Environment:

Tested on Windows (laptop) and CentOS (server). I don't think this bug has anything to do with the platform/hardware.


Attachments: File my.ini     File server-centos.cnf    

 Description   

In MariaDB 10.3.9, while using a subquery with group by, we get incorrect join results. The same query works fine on MariaDB 10.1.32. It also works fine without the group by.

We can reproduce this with:

CREATE TABLE `user` (
	`id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
	`username` VARCHAR(50) NULL DEFAULT '0',
	PRIMARY KEY (`id`)
)
COLLATE='utf8_general_ci'
;
 
CREATE TABLE `audit` (
	`id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
	`userid` INT UNSIGNED NOT NULL,
	`logindate` DATETIME NOT NULL,
	PRIMARY KEY (`id`)
)
COLLATE='utf8_general_ci'
;
 
insert into user (id, username) VALUES (1,"user1"), (2, "user2");
insert into audit (id, userid, logindate) VALUES (1,1,"2015-06-19 12:17:02.828"),
  (2,1,"2016-06-19 12:17:02.828"),
  (3,2,"2017-06-19 12:17:02.828"),
  (4,2,"2018-06-19 12:17:02.828");
  
select *
from user u
left join (select * from audit au group by au.userid) as auditlastlogin on u.id=auditlastlogin.userid;

Results in 10.1.32 (correct):

"id"	"username"	"id"	"userid"	"logindate"
"1"	"user1"	"1"	"1"	"2015-06-19 12:17:02"
"2"	"user2"	"3"	"2"	"2017-06-19 12:17:02"

On 10.3.9 (wrong):

"id"	"username"	"id"	"userid"	"logindate"
"1"	"user1"	"1"	"1"	"2015-06-19 12:17:02"
"1"	"user1"	"3"	"2"	"2017-06-19 12:17:02"
"2"	"user2"	"1"	"1"	"2015-06-19 12:17:02"
"2"	"user2"	"3"	"2"	"2017-06-19 12:17:02"

The same thing happens for:

select *
from user u, (select * from audit au group by au.userid) as auditlastlogin
where u.id=auditlastlogin.userid;



 Comments   
Comment by Alice Sherepa [ 2018-10-10 ]

Please add your .cnf file(s), I could not reproduce it with the default settings

MariaDB [test]> CREATE TABLE `user` (
    -> `id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
    -> `username` VARCHAR(50) NULL DEFAULT '0',
    -> PRIMARY KEY (`id`)
    -> )
    -> COLLATE='utf8_general_ci'
    -> ;
Query OK, 0 rows affected (0.029 sec)
 
MariaDB [test]> CREATE TABLE `audit` (
    -> `id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
    -> `userid` INT UNSIGNED NOT NULL,
    -> `logindate` DATETIME NOT NULL,
    -> PRIMARY KEY (`id`)
    -> )
    -> COLLATE='utf8_general_ci'
    -> ;
Query OK, 0 rows affected (0.027 sec)
 
MariaDB [test]> insert into user (id, username) VALUES (1,"user1"), (2, "user2");
Query OK, 2 rows affected (0.006 sec)
Records: 2  Duplicates: 0  Warnings: 0
 
MariaDB [test]> insert into audit (id, userid, logindate) VALUES (1,1,"2015-06-19 12:17:02.828"),
    -> (2,1,"2016-06-19 12:17:02.828"),
    -> (3,2,"2017-06-19 12:17:02.828"),
    -> (4,2,"2018-06-19 12:17:02.828");
Query OK, 4 rows affected (0.008 sec)
Records: 4  Duplicates: 0  Warnings: 0
 
MariaDB [test]> select *
    -> from user u
    -> left join (select * from audit au group by au.userid) as auditlastlogin on u.id=auditlastlogin.userid;
+----+----------+------+--------+---------------------+
| id | username | id   | userid | logindate           |
+----+----------+------+--------+---------------------+
|  1 | user1    |    1 |      1 | 2015-06-19 12:17:02 |
|  2 | user2    |    3 |      2 | 2017-06-19 12:17:02 |
+----+----------+------+--------+---------------------+
2 rows in set (0.001 sec)
 
MariaDB [test]> select version();
+----------------+
| version()      |
+----------------+
| 10.3.9-MariaDB |
+----------------+
1 row in set (0.000 sec)
 
MariaDB [test]> select *
    -> from user u, (select * from audit au group by au.userid) as auditlastlogin
    -> where u.id=auditlastlogin.userid;
+----+----------+----+--------+---------------------+
| id | username | id | userid | logindate           |
+----+----------+----+--------+---------------------+
|  1 | user1    |  1 |      1 | 2015-06-19 12:17:02 |
|  2 | user2    |  3 |      2 | 2017-06-19 12:17:02 |
+----+----------+----+--------+---------------------+
2 rows in set (0.001 sec)

Comment by Alice Sherepa [ 2018-10-11 ]

I can reproduce it in MariaDB 10.3.9, 10.2.18 with "set join_cache_level=8; set join_buffer_size = 6291456;",
fixed by 1eca49577e979220f3ab663a7e46e0eb70d728c4 commit by Igor Babaev

commit 1eca49577e979220f3ab663a7e46e0eb70d728c4
Author: Igor Babaev <igor@askmonty.org>
Date:   Sun Oct 7 10:19:19 2018 -0700
 
    MDEV-17382 Hash join algorithm should not be used to join materialized
               derived table / view by equality
    
    Now rows of a materialized derived table are always put into a
    temporary table before join operation. If BNLH is used to join this
    table with the result of a partial join then both operands of the
    join are actually put into main memory. In most cases this is not
    efficient.
    We could avoid this by sending the rows of the derived table directly
    to the join operation. However this kind of data flow is not supported
    yet.
    Fixed by not allowing usage of hash join algorithm to join a materialized
    derived table if it's joined by an equality predicate of the form
    f=e where f is a field of the derived table.

Comment by Igor Babaev [ 2018-10-17 ]

The test case for this bug was pushed into 10.3.
The bug was fixed by the patch for mdev-17382.

Generated at Thu Feb 08 08:36:19 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.