Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-17419

Subquery with group by returns wrong results

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 10.3.9
    • 10.3.11
    • Optimizer
    • None
    • Tested on Windows (laptop) and CentOS (server). I don't think this bug has anything to do with the platform/hardware.

    Description

      In MariaDB 10.3.9, while using a subquery with group by, we get incorrect join results. The same query works fine on MariaDB 10.1.32. It also works fine without the group by.

      We can reproduce this with:

      CREATE TABLE `user` (
      	`id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
      	`username` VARCHAR(50) NULL DEFAULT '0',
      	PRIMARY KEY (`id`)
      )
      COLLATE='utf8_general_ci'
      ;
       
      CREATE TABLE `audit` (
      	`id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
      	`userid` INT UNSIGNED NOT NULL,
      	`logindate` DATETIME NOT NULL,
      	PRIMARY KEY (`id`)
      )
      COLLATE='utf8_general_ci'
      ;
       
      insert into user (id, username) VALUES (1,"user1"), (2, "user2");
      insert into audit (id, userid, logindate) VALUES (1,1,"2015-06-19 12:17:02.828"),
        (2,1,"2016-06-19 12:17:02.828"),
        (3,2,"2017-06-19 12:17:02.828"),
        (4,2,"2018-06-19 12:17:02.828");
        
      select *
      from user u
      left join (select * from audit au group by au.userid) as auditlastlogin on u.id=auditlastlogin.userid;
      
      

      Results in 10.1.32 (correct):

      "id"	"username"	"id"	"userid"	"logindate"
      "1"	"user1"	"1"	"1"	"2015-06-19 12:17:02"
      "2"	"user2"	"3"	"2"	"2017-06-19 12:17:02"
      

      On 10.3.9 (wrong):

      "id"	"username"	"id"	"userid"	"logindate"
      "1"	"user1"	"1"	"1"	"2015-06-19 12:17:02"
      "1"	"user1"	"3"	"2"	"2017-06-19 12:17:02"
      "2"	"user2"	"1"	"1"	"2015-06-19 12:17:02"
      "2"	"user2"	"3"	"2"	"2017-06-19 12:17:02"
      

      The same thing happens for:

      select *
      from user u, (select * from audit au group by au.userid) as auditlastlogin
      where u.id=auditlastlogin.userid;
      

      Attachments

        Activity

          alice Alice Sherepa added a comment -

          Please add your .cnf file(s), I could not reproduce it with the default settings

          MariaDB [test]> CREATE TABLE `user` (
              -> `id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
              -> `username` VARCHAR(50) NULL DEFAULT '0',
              -> PRIMARY KEY (`id`)
              -> )
              -> COLLATE='utf8_general_ci'
              -> ;
          Query OK, 0 rows affected (0.029 sec)
           
          MariaDB [test]> CREATE TABLE `audit` (
              -> `id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
              -> `userid` INT UNSIGNED NOT NULL,
              -> `logindate` DATETIME NOT NULL,
              -> PRIMARY KEY (`id`)
              -> )
              -> COLLATE='utf8_general_ci'
              -> ;
          Query OK, 0 rows affected (0.027 sec)
           
          MariaDB [test]> insert into user (id, username) VALUES (1,"user1"), (2, "user2");
          Query OK, 2 rows affected (0.006 sec)
          Records: 2  Duplicates: 0  Warnings: 0
           
          MariaDB [test]> insert into audit (id, userid, logindate) VALUES (1,1,"2015-06-19 12:17:02.828"),
              -> (2,1,"2016-06-19 12:17:02.828"),
              -> (3,2,"2017-06-19 12:17:02.828"),
              -> (4,2,"2018-06-19 12:17:02.828");
          Query OK, 4 rows affected (0.008 sec)
          Records: 4  Duplicates: 0  Warnings: 0
           
          MariaDB [test]> select *
              -> from user u
              -> left join (select * from audit au group by au.userid) as auditlastlogin on u.id=auditlastlogin.userid;
          +----+----------+------+--------+---------------------+
          | id | username | id   | userid | logindate           |
          +----+----------+------+--------+---------------------+
          |  1 | user1    |    1 |      1 | 2015-06-19 12:17:02 |
          |  2 | user2    |    3 |      2 | 2017-06-19 12:17:02 |
          +----+----------+------+--------+---------------------+
          2 rows in set (0.001 sec)
           
          MariaDB [test]> select version();
          +----------------+
          | version()      |
          +----------------+
          | 10.3.9-MariaDB |
          +----------------+
          1 row in set (0.000 sec)
           
          MariaDB [test]> select *
              -> from user u, (select * from audit au group by au.userid) as auditlastlogin
              -> where u.id=auditlastlogin.userid;
          +----+----------+----+--------+---------------------+
          | id | username | id | userid | logindate           |
          +----+----------+----+--------+---------------------+
          |  1 | user1    |  1 |      1 | 2015-06-19 12:17:02 |
          |  2 | user2    |  3 |      2 | 2017-06-19 12:17:02 |
          +----+----------+----+--------+---------------------+
          2 rows in set (0.001 sec)
          

          alice Alice Sherepa added a comment - Please add your .cnf file(s), I could not reproduce it with the default settings MariaDB [test]> CREATE TABLE `user` ( -> `id` INT UNSIGNED NOT NULL AUTO_INCREMENT, -> `username` VARCHAR(50) NULL DEFAULT '0', -> PRIMARY KEY (`id`) -> ) -> COLLATE='utf8_general_ci' -> ; Query OK, 0 rows affected (0.029 sec)   MariaDB [test]> CREATE TABLE `audit` ( -> `id` INT UNSIGNED NOT NULL AUTO_INCREMENT, -> `userid` INT UNSIGNED NOT NULL, -> `logindate` DATETIME NOT NULL, -> PRIMARY KEY (`id`) -> ) -> COLLATE='utf8_general_ci' -> ; Query OK, 0 rows affected (0.027 sec)   MariaDB [test]> insert into user (id, username) VALUES (1,"user1"), (2, "user2"); Query OK, 2 rows affected (0.006 sec) Records: 2 Duplicates: 0 Warnings: 0   MariaDB [test]> insert into audit (id, userid, logindate) VALUES (1,1,"2015-06-19 12:17:02.828"), -> (2,1,"2016-06-19 12:17:02.828"), -> (3,2,"2017-06-19 12:17:02.828"), -> (4,2,"2018-06-19 12:17:02.828"); Query OK, 4 rows affected (0.008 sec) Records: 4 Duplicates: 0 Warnings: 0   MariaDB [test]> select * -> from user u -> left join (select * from audit au group by au.userid) as auditlastlogin on u.id=auditlastlogin.userid; +----+----------+------+--------+---------------------+ | id | username | id | userid | logindate | +----+----------+------+--------+---------------------+ | 1 | user1 | 1 | 1 | 2015-06-19 12:17:02 | | 2 | user2 | 3 | 2 | 2017-06-19 12:17:02 | +----+----------+------+--------+---------------------+ 2 rows in set (0.001 sec)   MariaDB [test]> select version(); +----------------+ | version() | +----------------+ | 10.3.9-MariaDB | +----------------+ 1 row in set (0.000 sec)   MariaDB [test]> select * -> from user u, (select * from audit au group by au.userid) as auditlastlogin -> where u.id=auditlastlogin.userid; +----+----------+----+--------+---------------------+ | id | username | id | userid | logindate | +----+----------+----+--------+---------------------+ | 1 | user1 | 1 | 1 | 2015-06-19 12:17:02 | | 2 | user2 | 3 | 2 | 2017-06-19 12:17:02 | +----+----------+----+--------+---------------------+ 2 rows in set (0.001 sec)
          alice Alice Sherepa added a comment -

          I can reproduce it in MariaDB 10.3.9, 10.2.18 with "set join_cache_level=8; set join_buffer_size = 6291456;",
          fixed by 1eca49577e979220f3ab663a7e46e0eb70d728c4 commit by Igor Babaev

          commit 1eca49577e979220f3ab663a7e46e0eb70d728c4
          Author: Igor Babaev <igor@askmonty.org>
          Date:   Sun Oct 7 10:19:19 2018 -0700
           
              MDEV-17382 Hash join algorithm should not be used to join materialized
                         derived table / view by equality
              
              Now rows of a materialized derived table are always put into a
              temporary table before join operation. If BNLH is used to join this
              table with the result of a partial join then both operands of the
              join are actually put into main memory. In most cases this is not
              efficient.
              We could avoid this by sending the rows of the derived table directly
              to the join operation. However this kind of data flow is not supported
              yet.
              Fixed by not allowing usage of hash join algorithm to join a materialized
              derived table if it's joined by an equality predicate of the form
              f=e where f is a field of the derived table.
          

          alice Alice Sherepa added a comment - I can reproduce it in MariaDB 10.3.9, 10.2.18 with "set join_cache_level=8; set join_buffer_size = 6291456;", fixed by 1eca49577e979220f3ab663a7e46e0eb70d728c4 commit by Igor Babaev commit 1eca49577e979220f3ab663a7e46e0eb70d728c4 Author: Igor Babaev <igor@askmonty.org> Date: Sun Oct 7 10:19:19 2018 -0700   MDEV-17382 Hash join algorithm should not be used to join materialized derived table / view by equality Now rows of a materialized derived table are always put into a temporary table before join operation. If BNLH is used to join this table with the result of a partial join then both operands of the join are actually put into main memory. In most cases this is not efficient. We could avoid this by sending the rows of the derived table directly to the join operation. However this kind of data flow is not supported yet. Fixed by not allowing usage of hash join algorithm to join a materialized derived table if it's joined by an equality predicate of the form f=e where f is a field of the derived table.

          The test case for this bug was pushed into 10.3.
          The bug was fixed by the patch for mdev-17382.

          igor Igor Babaev (Inactive) added a comment - The test case for this bug was pushed into 10.3. The bug was fixed by the patch for mdev-17382.

          People

            igor Igor Babaev (Inactive)
            stijn Stijn
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.