Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-13694

Wrong result upon GROUP BY with orderby_uses_equalities=on

Details

    • Bug
    • Status: Closed (View Workflow)
    • Critical
    • Resolution: Fixed
    • 10.1(EOL), 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5
    • 10.6.0
    • Optimizer
    • None

    Description

      Originally reported at StackOverflow

      Test case

      CREATE TABLE person (
        PersonID MEDIUMINT(8) UNSIGNED AUTO_INCREMENT,
        FullName VARCHAR(100),
        Furigana VARCHAR(100),
        Sex ENUM('M','F'),
        HouseholdID MEDIUMINT(8) UNSIGNED DEFAULT 0,
        Relation VARCHAR(6),
        Title VARCHAR(6),
        CellPhone VARCHAR(30),
        Email VARCHAR(70),
        Birtdate DATE DEFAULT '0000-00-00',
        Country VARCHAR(30),
        URL VARCHAR(150),
        Organization TINYINT(1) DEFAULT 0,
        Remarks TEXT,
        Photo TINYINT(1) DEFAULT 0,
        UpdDate DATE DEFAULT '0000-00-00',
        PRIMARY KEY (PersonID),
        INDEX(Furigana),
        INDEX(FullName),
        INDEX(Email),
        INDEX(Organization,Furigana)
      ) ;
       
      CREATE TABLE percat (
        PersonID MEDIUMINT(8) DEFAULT 0,
        CategoryID MEDIUMINT(8) DEFAULT 0,
        PRIMARY KEY (PersonID, CategoryID),
        INDEX (CategoryID)
      ) ;
       
      CREATE TABLE action (
        ActionID MEDIUMINT(8) UNSIGNED AUTO_INCREMENT,
        PersonID MEDIUMINT(8) UNSIGNED DEFAULT 0,
        ActionTypeID MEDIUMINT(8) UNSIGNED DEFAULT 0,
        ActionDate DATE DEFAULT '0000-00-00',
        Description TEXT,
        PRIMARY KEY (ActionID),
        INDEX (PersonID),
        INDEX (ActionDate),
        INDEX (ActionTypeID)
      ) ;
       
      INSERT INTO person (PersonID) VALUES 
      (58),(96),(273),(352);
       
      INSERT INTO percat VALUES 
      (58,9),(273,1),(273,9),(273,14),(352,1),(352,13);
       
      INSERT INTO action (PersonID, ActionTypeID) VALUES
      (58,3),(96,3),(273,3),(352,3);
       
      SELECT person.PersonID,
      GROUP_CONCAT(CategoryID ORDER BY CategoryID SEPARATOR ',') AS categories
      FROM person LEFT JOIN percat ON person.PersonID=percat.PersonID
      WHERE person.PersonID IN (SELECT PersonID FROM action WHERE ActionTypeID=3)
      GROUP BY person.PersonID;
       
      DROP TABLE action, percat, person;
      

      Result with orderby_uses_equalities=on

      SELECT person.PersonID,
      GROUP_CONCAT(CategoryID ORDER BY CategoryID SEPARATOR ',') AS categories
      FROM person LEFT JOIN percat ON person.PersonID=percat.PersonID
      WHERE person.PersonID IN (SELECT PersonID FROM action WHERE ActionTypeID=3)
      GROUP BY person.PersonID;
      PersonID	categories
      352	1,1,1,1,13,13,13,13
      

      Result with orderby_uses_equalities=off

      SELECT person.PersonID,
      GROUP_CONCAT(CategoryID ORDER BY CategoryID SEPARATOR ',') AS categories
      FROM person LEFT JOIN percat ON person.PersonID=percat.PersonID
      WHERE person.PersonID IN (SELECT PersonID FROM action WHERE ActionTypeID=3)
      GROUP BY person.PersonID;
      PersonID	categories
      58	9
      96	NULL
      273	1,9,14
      352	1,13
      

      The second result is correct.

      Attachments

        Issue Links

          Activity

            varun Varun Gupta (Inactive) added a comment - Patch addressing the review http://lists.askmonty.org/pipermail/commits/2019-September/014014.html

            Ok to push the last version of the patch.

            psergei Sergei Petrunia added a comment - Ok to push the last version of the patch.

            On version earlier than 10.6, the query plan is like

            MariaDB [test]> EXPLAIN SELECT t1.a
                -> FROM t1
                -> WHERE t1.a IN (SELECT a FROM t2 WHERE b=3)
                -> ORDER BY t1.a DESC;
            +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+---------------------------------+
            | id   | select_type  | table       | type   | possible_keys | key     | key_len | ref       | rows | Extra                           |
            +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+---------------------------------+
            |    1 | PRIMARY      | <subquery2> | ALL    | distinct_key  | NULL    | NULL    | NULL      |    3 | Using temporary; Using filesort |
            |    1 | PRIMARY      | t1          | eq_ref | PRIMARY       | PRIMARY | 4       | test.t2.a |    1 | Using index                     |
            |    2 | MATERIALIZED | t2          | ALL    | NULL          | NULL    | NULL    | NULL      |    3 | Using where                     |
            +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+---------------------------------+
            3 rows in set (0.00 sec)
            
            

            With 10.6 onwards, the query plan is

            MariaDB [test]> EXPLAIN SELECT t1.a
                -> FROM t1
                -> WHERE t1.a IN (SELECT a FROM t2 WHERE b=3)
                -> ORDER BY t1.a DESC;
            +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+----------------+
            | id   | select_type  | table       | type   | possible_keys | key     | key_len | ref       | rows | Extra          |
            +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+----------------+
            |    1 | PRIMARY      | <subquery2> | ALL    | distinct_key  | NULL    | NULL    | NULL      | 3    | Using filesort |
            |    1 | PRIMARY      | t1          | eq_ref | PRIMARY       | PRIMARY | 4       | test.t2.a | 1    | Using index    |
            |    2 | MATERIALIZED | t2          | ALL    | NULL          | NULL    | NULL    | NULL      | 3    | Using where    |
            +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+----------------+
            3 rows in set (0.004 sec)
            

            The sjm scan table uses filesort. There is no usage of temp table from 10.6 onwards for sorting

            varun Varun Gupta (Inactive) added a comment - On version earlier than 10.6, the query plan is like MariaDB [test]> EXPLAIN SELECT t1.a -> FROM t1 -> WHERE t1.a IN (SELECT a FROM t2 WHERE b=3) -> ORDER BY t1.a DESC; +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+---------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+---------------------------------+ | 1 | PRIMARY | <subquery2> | ALL | distinct_key | NULL | NULL | NULL | 3 | Using temporary; Using filesort | | 1 | PRIMARY | t1 | eq_ref | PRIMARY | PRIMARY | 4 | test.t2.a | 1 | Using index | | 2 | MATERIALIZED | t2 | ALL | NULL | NULL | NULL | NULL | 3 | Using where | +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+---------------------------------+ 3 rows in set (0.00 sec) With 10.6 onwards, the query plan is MariaDB [test]> EXPLAIN SELECT t1.a -> FROM t1 -> WHERE t1.a IN (SELECT a FROM t2 WHERE b=3) -> ORDER BY t1.a DESC; +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+----------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+----------------+ | 1 | PRIMARY | <subquery2> | ALL | distinct_key | NULL | NULL | NULL | 3 | Using filesort | | 1 | PRIMARY | t1 | eq_ref | PRIMARY | PRIMARY | 4 | test.t2.a | 1 | Using index | | 2 | MATERIALIZED | t2 | ALL | NULL | NULL | NULL | NULL | 3 | Using where | +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+----------------+ 3 rows in set (0.004 sec) The sjm scan table uses filesort. There is no usage of temp table from 10.6 onwards for sorting

            The initially reported wrong result was fixed in 10.2+ by the patch for MDEV-13994.

            elenst Elena Stepanova added a comment - The initially reported wrong result was fixed in 10.2+ by the patch for MDEV-13994 .

            So a bit more summary regarding the changes introduces in this mdev.
            Commit hash for MDEV-13994: dcbf2823c7d64380f06372d77d1522e97fb8f066

            On 10.2 before MDEV-13994 was pushed

            MariaDB [test]> EXPLAIN SELECT t1.a, group_concat(t1.b) FROM t1  WHERE t1.a IN (SELECT a FROM t2 WHERE b=3) GROUP BY t1.a DESC;
            +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+----------------+
            | id   | select_type  | table       | type   | possible_keys | key     | key_len | ref       | rows | Extra          |
            +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+----------------+
            |    1 | PRIMARY      | <subquery2> | ALL    | distinct_key  | NULL    | NULL    | NULL      |    3 | Using filesort |
            |    1 | PRIMARY      | t1          | eq_ref | PRIMARY       | PRIMARY | 4       | test.t2.a |    1 |                |
            |    2 | MATERIALIZED | t2          | ALL    | NULL          | NULL    | NULL    | NULL      |    3 | Using where    |
            +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+----------------+
            3 rows in set (0.00 sec)
            

            MariaDB [test]> SELECT t1.a, group_concat(t1.b) FROM t1  WHERE t1.a IN (SELECT a FROM t2 WHERE b=3) GROUP BY t1.a DESC;
            +-----+--------------------+
            | a   | group_concat(t1.b) |
            +-----+--------------------+
            | 273 | 3,3,3              |
            +-----+--------------------+
            1 row in set (0.00 sec)
            

            The plan here is using filesort on the SJM scan table.
            This returns incorrect results on 10.2 onwards.

            On 10.2 after patch of MDEV-13994

            MariaDB [test]> EXPLAIN SELECT t1.a, group_concat(t1.b) FROM t1  WHERE t1.a IN (SELECT a FROM t2 WHERE b=3) GROUP BY t1.a DESC;
            +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+---------------------------------+
            | id   | select_type  | table       | type   | possible_keys | key     | key_len | ref       | rows | Extra                           |
            +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+---------------------------------+
            |    1 | PRIMARY      | <subquery2> | ALL    | distinct_key  | NULL    | NULL    | NULL      |    3 | Using temporary; Using filesort |
            |    1 | PRIMARY      | t1          | eq_ref | PRIMARY       | PRIMARY | 4       | test.t2.a |    1 |                                 |
            |    2 | MATERIALIZED | t2          | ALL    | NULL          | NULL    | NULL    | NULL      |    3 | Using where                     |
            +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+---------------------------------+
            3 rows in set (0.03 sec)
            
            

            MariaDB [test]> SELECT t1.a, group_concat(t1.b) FROM t1  WHERE t1.a IN (SELECT a FROM t2 WHERE b=3) GROUP BY t1.a DESC;
            +-----+--------------------+
            | a   | group_concat(t1.b) |
            +-----+--------------------+
            | 273 | 3                  |
            |  96 | 2                  |
            |  58 | 1                  |
            +-----+--------------------+
            3 rows in set (0.00 sec)
            

            The plan here use temp table for filesort. So here the fix was to disable using filesort on the first table if the first table was a SJM scan table.
            This fixed the wrong results as we are using a different execution path now.

            On 10.6 after MDEV-13694 was pushed

            MariaDB [test]> EXPLAIN SELECT t1.a, group_concat(t1.b) FROM t1  WHERE t1.a IN (SELECT a FROM t2 WHERE b=3) GROUP BY t1.a DESC;
            +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+----------------+
            | id   | select_type  | table       | type   | possible_keys | key     | key_len | ref       | rows | Extra          |
            +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+----------------+
            |    1 | PRIMARY      | <subquery2> | ALL    | distinct_key  | NULL    | NULL    | NULL      | 3    | Using filesort |
            |    1 | PRIMARY      | t1          | eq_ref | PRIMARY       | PRIMARY | 4       | test.t2.a | 1    |                |
            |    2 | MATERIALIZED | t2          | ALL    | NULL          | NULL    | NULL    | NULL      | 3    | Using where    |
            +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+----------------+
            3 rows in set (0.002 sec)
            
            

            MariaDB [test]> SELECT t1.a, group_concat(t1.b) FROM t1  WHERE t1.a IN (SELECT a FROM t2 WHERE b=3) GROUP BY t1.a DESC;
            +-----+--------------------+
            | a   | group_concat(t1.b) |
            +-----+--------------------+
            | 273 | 3                  |
            |  96 | 2                  |
            |  58 | 1                  |
            +-----+--------------------+
            3 rows in set (0.003 sec)
            

            On 10.6 the plan uses filesort on the SJM scan table and also gives the correct result.
            This would ensure performance gains if the ORDER BY clause is resolved by the first table.

            varun Varun Gupta (Inactive) added a comment - So a bit more summary regarding the changes introduces in this mdev. Commit hash for MDEV-13994 : dcbf2823c7d64380f06372d77d1522e97fb8f066 On 10.2 before MDEV-13994 was pushed MariaDB [test]> EXPLAIN SELECT t1.a, group_concat(t1.b) FROM t1 WHERE t1.a IN (SELECT a FROM t2 WHERE b=3) GROUP BY t1.a DESC; +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+----------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+----------------+ | 1 | PRIMARY | <subquery2> | ALL | distinct_key | NULL | NULL | NULL | 3 | Using filesort | | 1 | PRIMARY | t1 | eq_ref | PRIMARY | PRIMARY | 4 | test.t2.a | 1 | | | 2 | MATERIALIZED | t2 | ALL | NULL | NULL | NULL | NULL | 3 | Using where | +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+----------------+ 3 rows in set (0.00 sec) MariaDB [test]> SELECT t1.a, group_concat(t1.b) FROM t1 WHERE t1.a IN (SELECT a FROM t2 WHERE b=3) GROUP BY t1.a DESC; +-----+--------------------+ | a | group_concat(t1.b) | +-----+--------------------+ | 273 | 3,3,3 | +-----+--------------------+ 1 row in set (0.00 sec) The plan here is using filesort on the SJM scan table. This returns incorrect results on 10.2 onwards. On 10.2 after patch of MDEV-13994 MariaDB [test]> EXPLAIN SELECT t1.a, group_concat(t1.b) FROM t1 WHERE t1.a IN (SELECT a FROM t2 WHERE b=3) GROUP BY t1.a DESC; +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+---------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+---------------------------------+ | 1 | PRIMARY | <subquery2> | ALL | distinct_key | NULL | NULL | NULL | 3 | Using temporary; Using filesort | | 1 | PRIMARY | t1 | eq_ref | PRIMARY | PRIMARY | 4 | test.t2.a | 1 | | | 2 | MATERIALIZED | t2 | ALL | NULL | NULL | NULL | NULL | 3 | Using where | +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+---------------------------------+ 3 rows in set (0.03 sec) MariaDB [test]> SELECT t1.a, group_concat(t1.b) FROM t1 WHERE t1.a IN (SELECT a FROM t2 WHERE b=3) GROUP BY t1.a DESC; +-----+--------------------+ | a | group_concat(t1.b) | +-----+--------------------+ | 273 | 3 | | 96 | 2 | | 58 | 1 | +-----+--------------------+ 3 rows in set (0.00 sec) The plan here use temp table for filesort. So here the fix was to disable using filesort on the first table if the first table was a SJM scan table. This fixed the wrong results as we are using a different execution path now . On 10.6 after MDEV-13694 was pushed MariaDB [test]> EXPLAIN SELECT t1.a, group_concat(t1.b) FROM t1 WHERE t1.a IN (SELECT a FROM t2 WHERE b=3) GROUP BY t1.a DESC; +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+----------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+----------------+ | 1 | PRIMARY | <subquery2> | ALL | distinct_key | NULL | NULL | NULL | 3 | Using filesort | | 1 | PRIMARY | t1 | eq_ref | PRIMARY | PRIMARY | 4 | test.t2.a | 1 | | | 2 | MATERIALIZED | t2 | ALL | NULL | NULL | NULL | NULL | 3 | Using where | +------+--------------+-------------+--------+---------------+---------+---------+-----------+------+----------------+ 3 rows in set (0.002 sec) MariaDB [test]> SELECT t1.a, group_concat(t1.b) FROM t1 WHERE t1.a IN (SELECT a FROM t2 WHERE b=3) GROUP BY t1.a DESC; +-----+--------------------+ | a | group_concat(t1.b) | +-----+--------------------+ | 273 | 3 | | 96 | 2 | | 58 | 1 | +-----+--------------------+ 3 rows in set (0.003 sec) On 10.6 the plan uses filesort on the SJM scan table and also gives the correct result. This would ensure performance gains if the ORDER BY clause is resolved by the first table.

            People

              varun Varun Gupta (Inactive)
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.