Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-3926

Wrong result with GROUP BY ... WITH ROLLUP

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Not a Bug
    • 5.5.28a
    • None
    • None
    • None
    • Windows server 2003, 64-bit

    Description

      The following query returns different result between MYSQL and MARIADB (5.5.28):

      CREATE TABLE t(a int,b int,c int, amount1 int, amount2 int);
      INSERT INTO t (a, b, c, amount1, amount2) values (1,1,1,1,1);
      INSERT INTO t (a, b, c, amount1, amount2) values (2,2,2,2,1);
      INSERT INTO t (a, b, c, amount1, amount2) values (2,3,3,3,1);
       
      SELECT *,
          IF(`t_reorder`.`b` IS NULL, @ccount1:=amount1, @ccount1) AS `a_amount1`,
          IF(`t_reorder`.`b` IS NULL, @dcount1:=amount2, @dcount1) AS `a_amount2`
      FROM (
      	SELECT *
      		FROM (
        		SELECT `a`, `b`,`c`,
      			SUM(`amount1`) AS `amount1`, SUM(`amount2`) AS `amount2`
      		FROM `t`
      		WHERE (c IN (1,2,3)) AND ((`a` IS NOT NULL) AND (`b` IS NOT NULL))
      		GROUP BY `a`, `b` WITH ROLLUP) `t_rollup`
      	ORDER BY `a`, `b`
      )`t_reorder`;
       
      mysql.exe  Ver 14.14 Distrib 5.5.28, for Win32 (x86) 
      (Correct)
      +------+------+------+---------+---------+-----------+-----------+
      | a    | b    | c    | amount1 | amount2 | a_amount1 | a_amount2 |
      +------+------+------+---------+---------+-----------+-----------+
      | NULL | NULL |    3 |       6 |       3 |         6 |         3 |
      |    1 | NULL |    1 |       1 |       1 |         1 |         1 |
      |    1 |    1 |    1 |       1 |       1 |         1 |         1 |
      |    2 | NULL |    3 |       5 |       2 |         5 |         2 |
      |    2 |    2 |    2 |       2 |       1 |         5 |         2 |
      |    2 |    3 |    3 |       3 |       1 |         5 |         2 |
      +------+------+------+---------+---------+-----------+-----------+
       
      mysql.exe  Ver 15.1 Distrib 5.5.28-MariaDB, for Win32 (x86)
      (Wrong values on a_amount1 and a_amount2)
      ORDER BY `a`, `b` does not have any effect to this qyery
      +------+------+------+---------+---------+-----------+-----------+
      | a    | b    | c    | amount1 | amount2 | a_amount1 | a_amount2 |
      +------+------+------+---------+---------+-----------+-----------+
      |    1 |    1 |    1 |       1 |       1 |         6 |         3 |
      |    1 | NULL |    1 |       1 |       1 |         1 |         1 |
      |    2 |    2 |    2 |       2 |       1 |         1 |         1 |
      |    2 |    3 |    3 |       3 |       1 |         1 |         1 |
      |    2 | NULL |    3 |       5 |       2 |         5 |         2 |
      | NULL | NULL |    3 |       6 |       3 |         6 |         3 |
      +------+------+------+---------+---------+-----------+-----------+

      MariaDB seems to ignore ORDER BY statement after subquery having GROUP BY ... WITH ROLLUP.

      Optimizer_switch settings as per default.

      Attachments

        Issue Links

          Activity

            I think this thread on MySQL illustrates our discussion :

            http://dev.mysql.com/doc/refman/5.1/en/example-maximum-column-group-row.html

            And the last post confirms what you explained : "that can break in any future update".

            abalam Yann-Gaël GAUTHERON added a comment - I think this thread on MySQL illustrates our discussion : http://dev.mysql.com/doc/refman/5.1/en/example-maximum-column-group-row.html And the last post confirms what you explained : "that can break in any future update".

            It's not hypothetical, things do change, and in MySQL too, sometimes you don't even need a major upgrade for that. Here is just one fairly recent example for you: http://bugs.mysql.com/bug.php?id=67846
            It's not quite the same query as yours, but it hits the same non-determinism of GROUP BY behavior that you currently rely on.

            elenst Elena Stepanova added a comment - It's not hypothetical, things do change, and in MySQL too, sometimes you don't even need a major upgrade for that. Here is just one fairly recent example for you: http://bugs.mysql.com/bug.php?id=67846 It's not quite the same query as yours, but it hits the same non-determinism of GROUP BY behavior that you currently rely on.

            Yes and the last post on your link says it's beginning from the 5.1.66 of MySQL.
            We are still on 5.1.62 here in production, so we'll have this problem very soon, we must work at this shortly.

            Thank you again for all,
            and keep good work.

            abalam Yann-Gaël GAUTHERON added a comment - Yes and the last post on your link says it's beginning from the 5.1.66 of MySQL. We are still on 5.1.62 here in production, so we'll have this problem very soon, we must work at this shortly. Thank you again for all, and keep good work.

            We tried a standard workaround on a concrete example :

            Method A = SELECT SQL_NO_CACHE * FROM ( SELECT * FROM t ORDER BY d DESC ) x GROUP BY id;
            Method B = SELECT SQL_NO_CACHE * FROM t WHERE d IN ( SELECT MAX(d) FROM t GROUP BY id ) GROUP BY id;

            Method A on Production machine, MySQL 5.1.62, Linux Gentoo, 400 rows, 0.0068 sec, NOT SQL Standard
            Method B on Production machine, MySQL 5.1.62, Linux Gentoo, 400 rows, 0.0675 sec
            Method A on a beta machine, MariaDB 5.5.28, Linux Gentoo, 400 rows, 0.0043 sec, NOT SQL Standard, ORDER BY ignored, NOT the result expected
            Method B on a beta machine, MariaDB 5.5.28, Linux Gentoo, 400 rows, 0.0043 sec

            Times are the average of 5 trials with SQL_NO_CACHE enabled.

            1. Is the method B the STANDARD SQL method to do the same ?
            2. Does the time found on MariaDB 5.5 illustrate correctly the optimisation you talked about ?
            3. If the method B isn't correct STANDARD SQL, can you guide us to a specific MariaDB forum or community we can find any help on this ?

            Thank you for your help !

            For information, here are the respective explains :
            Method A on Production machine, MySQL 5.1.62, Linux Gentoo, 400 rows, 0.0068 sec, NOT SQL Standard :
            id select_type table type possible_keys key key_len ref rows Extra
            1 PRIMARY <derived2> ALL NULL NULL NULL NULL 1427 Using temporary; Using filesort
            2 DERIVED t ALL NULL NULL NULL NULL 1533 Using filesort

            Method B on Production machine, MySQL 5.1.62, Linux Gentoo, 400 rows, 0.0675 sec :
            id select_type table type possible_keys key key_len ref rows Extra
            1 PRIMARY t index NULL id 4 NULL 1323 Using where
            2 DEPENDENT SUBQUERY t index NULL id 4 NULL 1

            Method A on a beta machine, MariaDB 5.5.28, Linux Gentoo, 400 rows, 0.0043 sec, NOT SQL Standard, ORDER BY ignored, NOT the result expected :
            id select_type table type possible_keys key key_len ref rows Extra
            1 SIMPLE t index NULL id 4 NULL 1234

            Method B on a beta machine, MariaDB 5.5.28, Linux Gentoo, 400 rows, 0.0043 sec :
            id select_type table type possible_keys key key_len ref rows Extra
            1 PRIMARY t ALL d NULL NULL NULL 1411 Using temporary; Using filesort
            1 PRIMARY <subquery2> eq_ref distinct_key distinct_key 4 database.t.d 1
            2 MATERIALIZED t index NULL id 4 NULL 1411

            abalam Yann-Gaël GAUTHERON added a comment - We tried a standard workaround on a concrete example : Method A = SELECT SQL_NO_CACHE * FROM ( SELECT * FROM t ORDER BY d DESC ) x GROUP BY id; Method B = SELECT SQL_NO_CACHE * FROM t WHERE d IN ( SELECT MAX(d) FROM t GROUP BY id ) GROUP BY id; Method A on Production machine, MySQL 5.1.62, Linux Gentoo, 400 rows, 0.0068 sec, NOT SQL Standard Method B on Production machine, MySQL 5.1.62, Linux Gentoo, 400 rows, 0.0675 sec Method A on a beta machine, MariaDB 5.5.28, Linux Gentoo, 400 rows, 0.0043 sec, NOT SQL Standard, ORDER BY ignored, NOT the result expected Method B on a beta machine, MariaDB 5.5.28, Linux Gentoo, 400 rows, 0.0043 sec Times are the average of 5 trials with SQL_NO_CACHE enabled. 1. Is the method B the STANDARD SQL method to do the same ? 2. Does the time found on MariaDB 5.5 illustrate correctly the optimisation you talked about ? 3. If the method B isn't correct STANDARD SQL, can you guide us to a specific MariaDB forum or community we can find any help on this ? Thank you for your help ! For information, here are the respective explains : Method A on Production machine, MySQL 5.1.62, Linux Gentoo, 400 rows, 0.0068 sec, NOT SQL Standard : id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY <derived2> ALL NULL NULL NULL NULL 1427 Using temporary; Using filesort 2 DERIVED t ALL NULL NULL NULL NULL 1533 Using filesort Method B on Production machine, MySQL 5.1.62, Linux Gentoo, 400 rows, 0.0675 sec : id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY t index NULL id 4 NULL 1323 Using where 2 DEPENDENT SUBQUERY t index NULL id 4 NULL 1 Method A on a beta machine, MariaDB 5.5.28, Linux Gentoo, 400 rows, 0.0043 sec, NOT SQL Standard, ORDER BY ignored, NOT the result expected : id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE t index NULL id 4 NULL 1234 Method B on a beta machine, MariaDB 5.5.28, Linux Gentoo, 400 rows, 0.0043 sec : id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY t ALL d NULL NULL NULL 1411 Using temporary; Using filesort 1 PRIMARY <subquery2> eq_ref distinct_key distinct_key 4 database.t.d 1 2 MATERIALIZED t index NULL id 4 NULL 1411

            google for "groupwise max" - you will see lots of different solutions to this problem.

            you can also ask your question on irc, freenode, #maria channel. Or email your question to maria-discuss@lists.launchpad.net

            serg Sergei Golubchik added a comment - google for "groupwise max" - you will see lots of different solutions to this problem. you can also ask your question on irc, freenode, #maria channel. Or email your question to maria-discuss@lists.launchpad.net

            People

              Unassigned Unassigned
              mitrpaka P Ka
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.