[MDEV-20922] Adding an order by changes the query results - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Fixed
Affects Version/s: 10.3.17
Fix Version/s: 10.1.44, 10.2.31, 10.3.22, 10.4.12, 10.5.1
Component/s: Optimizer
Labels:
None
Environment:
Ubuntu 2018

Description

CREATE TABLE revenue(id int, month int, year int, value int);

INSERT INTO revenue values (1, 1, 2000, 100), (2, 2, 2000, 200), (3, 1, 2000, 300), (4, 2, 2000, 400);

SELECT

    anon.month_and_year,

    (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue

FROM (

    SELECT

        id, value,

        concat(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year

    FROM revenue

) as anon

GROUP BY anon.month_and_year

ORDER BY average_revenue;

Produces

+----------------+-----------------+

| month_and_year | average_revenue |

+----------------+-----------------+

| 1-2000         |        100.0000 |

| 2-2000         |        200.0000 |

| 1-2000         |        300.0000 |

| 2-2000         |        400.0000 |

+----------------+-----------------+

Removing the order by clause gives:

+----------------+-----------------+

| month_and_year | average_revenue |

+----------------+-----------------+

| 1-2000         |        200.0000 |

| 2-2000         |        300.0000 |

+----------------+-----------------+

Turning off the derived_merge optimization gives:

set session optimizer_switch="derived_merge=off";

+----------------+-----------------+

| month_and_year | average_revenue |

+----------------+-----------------+

| 1-2000         |        200.0000 |

| 2-2000         |        300.0000 |

+----------------+-----------------+

The issue seems to be somewhat similar to MDEV-17775, but in my case there is no join.

Also, here are a few more interesting observations:

Removing the DISTINCT removes the duplicate rows
Removing the ORDER BY removes the duplicate rows

Interestingly, when the derived_merged optimization is on (the default), the rewritten query seems correct:

EXPLAIN EXTENDED

SELECT

    anon.month_and_year,

    (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue

FROM (

    SELECT

        id, value,

        concat(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year

    FROM revenue

) as anon

GROUP BY anon.month_and_year

ORDER BY average_revenue;

produces:

+------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+

| id   | select_type | table   | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                           |

+------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+

|    1 | SIMPLE      | revenue | ALL  | NULL          | NULL | NULL    | NULL |    8 |   100.00 | Using temporary; Using filesort |

+------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+

SHOW WARNINGS;

+-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

| Level | Code | Message                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |

+-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

| Note  | 1003 | select concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) AS `month_and_year`,sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) AS `average_revenue` from `test`.`revenue` group by concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) order by sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) |

+-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

If I run the rewritten query (the one from above), it produces the correct result:

select concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) AS `month_and_year`,sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) AS `average_revenue` from `test`.`revenue` group by concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) order by sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`)

+----------------+-----------------+

| month_and_year | average_revenue |

+----------------+-----------------+

| 1-2000         |        200.0000 |

| 2-2000         |        300.0000 |

+----------------+-----------------+

But the original query produces an incorrect result. This is particularly puzzling.

Attachments

Issue Links

relates to

MDEV-17775 Different results of query with 'derived_merge=on';

Confirmed

MDEV-20010 Equal on two RANK window functions create wrong result

Closed

MDEV-21565 main.group_by failed in buildbot with wrong result

Closed

Activity

Ascending order - Click to sort in descending order

Varun Gupta (Inactive) added a comment - 2019-12-03 13:49

Patch
http://lists.askmonty.org/pipermail/commits/2019-December/014081.html

Varun Gupta (Inactive) added a comment - 2019-12-03 13:49 Patch http://lists.askmonty.org/pipermail/commits/2019-December/014081.html

Varun Gupta (Inactive) added a comment - 2019-12-18 19:00

New Patch
http://lists.askmonty.org/pipermail/commits/2019-December/014111.html

Varun Gupta (Inactive) added a comment - 2019-12-18 19:00 New Patch http://lists.askmonty.org/pipermail/commits/2019-December/014111.html

Varun Gupta (Inactive) added a comment - 2019-12-31 05:31

Introduced val_*_result functions for Item_direct_view_ref to make sure to get the value from the item it is referring to.

Varun Gupta (Inactive) added a comment - 2019-12-31 05:31 Introduced val_*_result functions for Item_direct_view_ref to make sure to get the value from the item it is referring to.

Oleksandr Byelkin added a comment - 2020-01-03 17:05

Test is not stable because order by the same value, please fix it.

Oleksandr Byelkin added a comment - 2020-01-03 17:05 Test is not stable because order by the same value, please fix it.

Varun Gupta (Inactive) added a comment - 2020-01-07 05:59

Made the test stable in this commit https://github.com/MariaDB/server/commit/1adc559370cc53ca69e225739a942287eba1b974

Varun Gupta (Inactive) added a comment - 2020-01-07 05:59 Made the test stable in this commit https://github.com/MariaDB/server/commit/1adc559370cc53ca69e225739a942287eba1b974

People

Assignee:: Varun Gupta (Inactive)

Reporter:: Bugra Gedik

Votes:: 0 Vote for this issue

Watchers:: 4 Start watching this issue

Dates

Created:: 2019-10-29 22:30

Updated:: 2020-01-26 04:15

Resolved:: 2020-01-07 05:59

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server