Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-20922

Adding an order by changes the query results

Details

    Description

      CREATE TABLE revenue(id int, month int, year int, value int);
      INSERT INTO revenue values (1, 1, 2000, 100), (2, 2, 2000, 200), (3, 1, 2000, 300), (4, 2, 2000, 400);
      

      SELECT 
          anon.month_and_year, 
          (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue 
      FROM (
          SELECT 
              id, value, 
              concat(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year 
          FROM revenue
      ) as anon 
      GROUP BY anon.month_and_year 
      ORDER BY average_revenue;
      

      Produces

      +----------------+-----------------+
      | month_and_year | average_revenue |
      +----------------+-----------------+
      | 1-2000         |        100.0000 |
      | 2-2000         |        200.0000 |
      | 1-2000         |        300.0000 |
      | 2-2000         |        400.0000 |
      +----------------+-----------------+
      

      Removing the order by clause gives:

      +----------------+-----------------+
      | month_and_year | average_revenue |
      +----------------+-----------------+
      | 1-2000         |        200.0000 |
      | 2-2000         |        300.0000 |
      +----------------+-----------------+
      

      Turning off the derived_merge optimization gives:

      set session optimizer_switch="derived_merge=off";
      

      +----------------+-----------------+
      | month_and_year | average_revenue |
      +----------------+-----------------+
      | 1-2000         |        200.0000 |
      | 2-2000         |        300.0000 |
      +----------------+-----------------+
      

      The issue seems to be somewhat similar to MDEV-17775, but in my case there is no join.

      Also, here are a few more interesting observations:

      • Removing the DISTINCT removes the duplicate rows
      • Removing the ORDER BY removes the duplicate rows

      Interestingly, when the derived_merged optimization is on (the default), the rewritten query seems correct:

      EXPLAIN EXTENDED 
      SELECT 
          anon.month_and_year, 
          (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue 
      FROM (
          SELECT 
              id, value, 
              concat(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year 
          FROM revenue
      ) as anon 
      GROUP BY anon.month_and_year 
      ORDER BY average_revenue;
      

      produces:

      +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
      | id   | select_type | table   | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                           |
      +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
      |    1 | SIMPLE      | revenue | ALL  | NULL          | NULL | NULL    | NULL |    8 |   100.00 | Using temporary; Using filesort |
      +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
      

      SHOW WARNINGS;
      

      +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | Level | Code | Message                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
      +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | Note  | 1003 | select concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) AS `month_and_year`,sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) AS `average_revenue` from `test`.`revenue` group by concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) order by sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) |
      +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      

      If I run the rewritten query (the one from above), it produces the correct result:

      select concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) AS `month_and_year`,sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) AS `average_revenue` from `test`.`revenue` group by concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) order by sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`)
      

       
      +----------------+-----------------+
      | month_and_year | average_revenue |
      +----------------+-----------------+
      | 1-2000         |        200.0000 |
      | 2-2000         |        300.0000 |
      +----------------+-----------------+
      

      But the original query produces an incorrect result. This is particularly puzzling.

      Attachments

        Issue Links

          Activity

            bugra.gedik Bugra Gedik created issue -
            bugra.gedik Bugra Gedik made changes -
            Field Original Value New Value
            bugra.gedik Bugra Gedik made changes -
            Description {code:sql}
            CREATE TABLE revenue(id int, month int, year int, value int);
            INSERT INTO revenue values (1, 1, 2000, 100), (2, 2, 2000, 200), (3, 1, 2000, 300), (4, 2, 2000, 400);
            {code}

            {code:sql}
            SELECT
                anon.month_and_year,
                (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue
            FROM (
                SELECT
                    id, value,
                    concat(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year
                FROM revenue
            ) as anon
            GROUP BY anon.month_and_year
            ORDER BY average_revenue;
            {code}

            Produces

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 100.0000 |
            | 2-2000 | 200.0000 |
            | 1-2000 | 300.0000 |
            | 2-2000 | 400.0000 |
            +----------------+-----------------+
            {code}

            Removing the order by clause gives:
            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            Turning off {derived_merge} optimization gives:
            {code:sql}
            set session optimizer_switch="derived_merge=off";
            {code}

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            The issue seems to be somewhat similar to MDEV-17775, but in my case there is no join.

            Also, here are a few more interesting observations:
             * Removing the DISTINCT removes the duplicate rows
             * Removing the ORDER BY removes the duplicate rows
            {code:sql}
            CREATE TABLE revenue(id int, month int, year int, value int);
            INSERT INTO revenue values (1, 1, 2000, 100), (2, 2, 2000, 200), (3, 1, 2000, 300), (4, 2, 2000, 400);
            {code}

            {code:sql}
            SELECT
                anon.month_and_year,
                (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue
            FROM (
                SELECT
                    id, value,
                    concat(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year
                FROM revenue
            ) as anon
            GROUP BY anon.month_and_year
            ORDER BY average_revenue;
            {code}

            Produces

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 100.0000 |
            | 2-2000 | 200.0000 |
            | 1-2000 | 300.0000 |
            | 2-2000 | 400.0000 |
            +----------------+-----------------+
            {code}

            Removing the order by clause gives:
            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            Turning off the `derived_merge` optimization gives:
            {code:sql}
            set session optimizer_switch="derived_merge=off";
            {code}

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            The issue seems to be somewhat similar to MDEV-17775, but in my case there is no join.

            Also, here are a few more interesting observations:
             * Removing the DISTINCT removes the duplicate rows
             * Removing the ORDER BY removes the duplicate rows
            bugra.gedik Bugra Gedik made changes -
            Description {code:sql}
            CREATE TABLE revenue(id int, month int, year int, value int);
            INSERT INTO revenue values (1, 1, 2000, 100), (2, 2, 2000, 200), (3, 1, 2000, 300), (4, 2, 2000, 400);
            {code}

            {code:sql}
            SELECT
                anon.month_and_year,
                (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue
            FROM (
                SELECT
                    id, value,
                    concat(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year
                FROM revenue
            ) as anon
            GROUP BY anon.month_and_year
            ORDER BY average_revenue;
            {code}

            Produces

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 100.0000 |
            | 2-2000 | 200.0000 |
            | 1-2000 | 300.0000 |
            | 2-2000 | 400.0000 |
            +----------------+-----------------+
            {code}

            Removing the order by clause gives:
            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            Turning off the `derived_merge` optimization gives:
            {code:sql}
            set session optimizer_switch="derived_merge=off";
            {code}

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            The issue seems to be somewhat similar to MDEV-17775, but in my case there is no join.

            Also, here are a few more interesting observations:
             * Removing the DISTINCT removes the duplicate rows
             * Removing the ORDER BY removes the duplicate rows
            {code:sql}
            CREATE TABLE revenue(id int, month int, year int, value int);
            INSERT INTO revenue values (1, 1, 2000, 100), (2, 2, 2000, 200), (3, 1, 2000, 300), (4, 2, 2000, 400);
            {code}

            {code:sql}
            SELECT
                anon.month_and_year,
                (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue
            FROM (
                SELECT
                    id, value,
                    concat(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year
                FROM revenue
            ) as anon
            GROUP BY anon.month_and_year
            ORDER BY average_revenue;
            {code}

            Produces

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 100.0000 |
            | 2-2000 | 200.0000 |
            | 1-2000 | 300.0000 |
            | 2-2000 | 400.0000 |
            +----------------+-----------------+
            {code}

            Removing the order by clause gives:
            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            Turning off the {{derived_merge}} optimization gives:
            {code:sql}
            set session optimizer_switch="derived_merge=off";
            {code}

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            The issue seems to be somewhat similar to MDEV-17775, but in my case there is no join.

            Also, here are a few more interesting observations:
             * Removing the DISTINCT removes the duplicate rows
             * Removing the ORDER BY removes the duplicate rows
            bugra.gedik Bugra Gedik made changes -
            Summary Adding an order by changes query results Adding an order by changes the query results
            bugra.gedik Bugra Gedik made changes -
            Description {code:sql}
            CREATE TABLE revenue(id int, month int, year int, value int);
            INSERT INTO revenue values (1, 1, 2000, 100), (2, 2, 2000, 200), (3, 1, 2000, 300), (4, 2, 2000, 400);
            {code}

            {code:sql}
            SELECT
                anon.month_and_year,
                (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue
            FROM (
                SELECT
                    id, value,
                    concat(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year
                FROM revenue
            ) as anon
            GROUP BY anon.month_and_year
            ORDER BY average_revenue;
            {code}

            Produces

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 100.0000 |
            | 2-2000 | 200.0000 |
            | 1-2000 | 300.0000 |
            | 2-2000 | 400.0000 |
            +----------------+-----------------+
            {code}

            Removing the order by clause gives:
            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            Turning off the {{derived_merge}} optimization gives:
            {code:sql}
            set session optimizer_switch="derived_merge=off";
            {code}

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            The issue seems to be somewhat similar to MDEV-17775, but in my case there is no join.

            Also, here are a few more interesting observations:
             * Removing the DISTINCT removes the duplicate rows
             * Removing the ORDER BY removes the duplicate rows
            {code:sql}
            CREATE TABLE revenue(id int, month int, year int, value int);
            INSERT INTO revenue values (1, 1, 2000, 100), (2, 2, 2000, 200), (3, 1, 2000, 300), (4, 2, 2000, 400);
            {code}

            {code:sql}
            SELECT
                anon.month_and_year,
                (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue
            FROM (
                SELECT
                    id, value,
                    concat(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year
                FROM revenue
            ) as anon
            GROUP BY anon.month_and_year
            ORDER BY average_revenue;
            {code}

            Produces

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 100.0000 |
            | 2-2000 | 200.0000 |
            | 1-2000 | 300.0000 |
            | 2-2000 | 400.0000 |
            +----------------+-----------------+
            {code}

            Removing the order by clause gives:
            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            Turning off the {{derived_merge}} optimization gives:
            {code:sql}
            set session optimizer_switch="derived_merge=off";
            {code}

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            The issue seems to be somewhat similar to MDEV-17775, but in my case there is no join.

            Also, here are a few more interesting observations:
             * Removing the DISTINCT removes the duplicate rows
             * Removing the ORDER BY removes the duplicate rows

            Some additional details:

            {code:sql}

            {code}
            bugra.gedik Bugra Gedik made changes -
            Description {code:sql}
            CREATE TABLE revenue(id int, month int, year int, value int);
            INSERT INTO revenue values (1, 1, 2000, 100), (2, 2, 2000, 200), (3, 1, 2000, 300), (4, 2, 2000, 400);
            {code}

            {code:sql}
            SELECT
                anon.month_and_year,
                (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue
            FROM (
                SELECT
                    id, value,
                    concat(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year
                FROM revenue
            ) as anon
            GROUP BY anon.month_and_year
            ORDER BY average_revenue;
            {code}

            Produces

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 100.0000 |
            | 2-2000 | 200.0000 |
            | 1-2000 | 300.0000 |
            | 2-2000 | 400.0000 |
            +----------------+-----------------+
            {code}

            Removing the order by clause gives:
            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            Turning off the {{derived_merge}} optimization gives:
            {code:sql}
            set session optimizer_switch="derived_merge=off";
            {code}

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            The issue seems to be somewhat similar to MDEV-17775, but in my case there is no join.

            Also, here are a few more interesting observations:
             * Removing the DISTINCT removes the duplicate rows
             * Removing the ORDER BY removes the duplicate rows

            Some additional details:

            {code:sql}

            {code}
            {code:sql}
            CREATE TABLE revenue(id int, month int, year int, value int);
            INSERT INTO revenue values (1, 1, 2000, 100), (2, 2, 2000, 200), (3, 1, 2000, 300), (4, 2, 2000, 400);
            {code}

            {code:sql}
            SELECT
                anon.month_and_year,
                (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue
            FROM (
                SELECT
                    id, value,
                    concat(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year
                FROM revenue
            ) as anon
            GROUP BY anon.month_and_year
            ORDER BY average_revenue;
            {code}

            Produces

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 100.0000 |
            | 2-2000 | 200.0000 |
            | 1-2000 | 300.0000 |
            | 2-2000 | 400.0000 |
            +----------------+-----------------+
            {code}

            Removing the order by clause gives:
            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            Turning off the {{derived_merge}} optimization gives:
            {code:sql}
            set session optimizer_switch="derived_merge=off";
            {code}

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            The issue seems to be somewhat similar to MDEV-17775, but in my case there is no join.

            Also, here are a few more interesting observations:
             * Removing the DISTINCT removes the duplicate rows
             * Removing the ORDER BY removes the duplicate rows

            Interestingly, when the {{derived_merged}} optimization is on (the default), the rewritten query seems correct:

            {code:sql}
            EXPLAIN EXTENDED SELECT anon.month_and_year, (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue FROM ( SELECT id, value, CONCAT(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year FROM revenue ) as anon GROUP BY anon.month_and_year ORDER BY average_revenue;
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            | 1 | SIMPLE | revenue | ALL | NULL | NULL | NULL | NULL | 8 | 100.00 | Using temporary; Using filesort |
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            {code}

            {code:sql}
            SHOW WARNINGS;
            {code}

            {code}
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            | Level | Code | Message |
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            | Note | 1003 | select concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) AS `month_and_year`,sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) AS `average_revenue` from `test`.`revenue` group by concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) order by sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) |
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            {code}

            If I run the rewritten query, it produce the correct result:
            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 3-2000 | 200.0000 |
            | 1-2000 | 200.0000 |
            | 2-2000 | 266.6667 |
            +----------------+-----------------+
            {code}

            But the original query produces a different result. This is particularly puzzling.
            bugra.gedik Bugra Gedik made changes -
            Description {code:sql}
            CREATE TABLE revenue(id int, month int, year int, value int);
            INSERT INTO revenue values (1, 1, 2000, 100), (2, 2, 2000, 200), (3, 1, 2000, 300), (4, 2, 2000, 400);
            {code}

            {code:sql}
            SELECT
                anon.month_and_year,
                (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue
            FROM (
                SELECT
                    id, value,
                    concat(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year
                FROM revenue
            ) as anon
            GROUP BY anon.month_and_year
            ORDER BY average_revenue;
            {code}

            Produces

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 100.0000 |
            | 2-2000 | 200.0000 |
            | 1-2000 | 300.0000 |
            | 2-2000 | 400.0000 |
            +----------------+-----------------+
            {code}

            Removing the order by clause gives:
            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            Turning off the {{derived_merge}} optimization gives:
            {code:sql}
            set session optimizer_switch="derived_merge=off";
            {code}

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            The issue seems to be somewhat similar to MDEV-17775, but in my case there is no join.

            Also, here are a few more interesting observations:
             * Removing the DISTINCT removes the duplicate rows
             * Removing the ORDER BY removes the duplicate rows

            Interestingly, when the {{derived_merged}} optimization is on (the default), the rewritten query seems correct:

            {code:sql}
            EXPLAIN EXTENDED SELECT anon.month_and_year, (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue FROM ( SELECT id, value, CONCAT(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year FROM revenue ) as anon GROUP BY anon.month_and_year ORDER BY average_revenue;
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            | 1 | SIMPLE | revenue | ALL | NULL | NULL | NULL | NULL | 8 | 100.00 | Using temporary; Using filesort |
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            {code}

            {code:sql}
            SHOW WARNINGS;
            {code}

            {code}
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            | Level | Code | Message |
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            | Note | 1003 | select concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) AS `month_and_year`,sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) AS `average_revenue` from `test`.`revenue` group by concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) order by sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) |
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            {code}

            If I run the rewritten query, it produce the correct result:
            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 3-2000 | 200.0000 |
            | 1-2000 | 200.0000 |
            | 2-2000 | 266.6667 |
            +----------------+-----------------+
            {code}

            But the original query produces a different result. This is particularly puzzling.
            {code:sql}
            CREATE TABLE revenue(id int, month int, year int, value int);
            INSERT INTO revenue values (1, 1, 2000, 100), (2, 2, 2000, 200), (3, 1, 2000, 300), (4, 2, 2000, 400);
            {code}

            {code:sql}
            SELECT
                anon.month_and_year,
                (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue
            FROM (
                SELECT
                    id, value,
                    concat(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year
                FROM revenue
            ) as anon
            GROUP BY anon.month_and_year
            ORDER BY average_revenue;
            {code}

            Produces

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 100.0000 |
            | 2-2000 | 200.0000 |
            | 1-2000 | 300.0000 |
            | 2-2000 | 400.0000 |
            +----------------+-----------------+
            {code}

            Removing the order by clause gives:
            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            Turning off the {{derived_merge}} optimization gives:
            {code:sql}
            set session optimizer_switch="derived_merge=off";
            {code}

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            The issue seems to be somewhat similar to MDEV-17775, but in my case there is no join.

            Also, here are a few more interesting observations:
             * Removing the DISTINCT removes the duplicate rows
             * Removing the ORDER BY removes the duplicate rows

            Interestingly, when the {{derived_merged}} optimization is on (the default), the rewritten query seems correct:

            {code:sql}
            EXPLAIN EXTENDED SELECT anon.month_and_year, (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue FROM ( SELECT id, value, CONCAT(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year FROM revenue ) as anon GROUP BY anon.month_and_year ORDER BY average_revenue;
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            | 1 | SIMPLE | revenue | ALL | NULL | NULL | NULL | NULL | 8 | 100.00 | Using temporary; Using filesort |
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            {code}

            {code:sql}
            SHOW WARNINGS;
            {code}

            {code}
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            | Level | Code | Message |
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            | Note | 1003 | select concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) AS `month_and_year`,sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) AS `average_revenue` from `test`.`revenue` group by concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) order by sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) |
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            {code}

            If I run the rewritten query (the one from above), it produces the correct result:
            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 3-2000 | 200.0000 |
            | 1-2000 | 200.0000 |
            | 2-2000 | 266.6667 |
            +----------------+-----------------+
            {code}

            But the original query produces an incorrect result. This is particularly puzzling.
            bugra.gedik Bugra Gedik made changes -
            Description {code:sql}
            CREATE TABLE revenue(id int, month int, year int, value int);
            INSERT INTO revenue values (1, 1, 2000, 100), (2, 2, 2000, 200), (3, 1, 2000, 300), (4, 2, 2000, 400);
            {code}

            {code:sql}
            SELECT
                anon.month_and_year,
                (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue
            FROM (
                SELECT
                    id, value,
                    concat(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year
                FROM revenue
            ) as anon
            GROUP BY anon.month_and_year
            ORDER BY average_revenue;
            {code}

            Produces

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 100.0000 |
            | 2-2000 | 200.0000 |
            | 1-2000 | 300.0000 |
            | 2-2000 | 400.0000 |
            +----------------+-----------------+
            {code}

            Removing the order by clause gives:
            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            Turning off the {{derived_merge}} optimization gives:
            {code:sql}
            set session optimizer_switch="derived_merge=off";
            {code}

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            The issue seems to be somewhat similar to MDEV-17775, but in my case there is no join.

            Also, here are a few more interesting observations:
             * Removing the DISTINCT removes the duplicate rows
             * Removing the ORDER BY removes the duplicate rows

            Interestingly, when the {{derived_merged}} optimization is on (the default), the rewritten query seems correct:

            {code:sql}
            EXPLAIN EXTENDED SELECT anon.month_and_year, (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue FROM ( SELECT id, value, CONCAT(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year FROM revenue ) as anon GROUP BY anon.month_and_year ORDER BY average_revenue;
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            | 1 | SIMPLE | revenue | ALL | NULL | NULL | NULL | NULL | 8 | 100.00 | Using temporary; Using filesort |
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            {code}

            {code:sql}
            SHOW WARNINGS;
            {code}

            {code}
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            | Level | Code | Message |
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            | Note | 1003 | select concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) AS `month_and_year`,sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) AS `average_revenue` from `test`.`revenue` group by concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) order by sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) |
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            {code}

            If I run the rewritten query (the one from above), it produces the correct result:
            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 3-2000 | 200.0000 |
            | 1-2000 | 200.0000 |
            | 2-2000 | 266.6667 |
            +----------------+-----------------+
            {code}

            But the original query produces an incorrect result. This is particularly puzzling.
            {code:sql}
            CREATE TABLE revenue(id int, month int, year int, value int);
            INSERT INTO revenue values (1, 1, 2000, 100), (2, 2, 2000, 200), (3, 1, 2000, 300), (4, 2, 2000, 400);
            {code}

            {code:sql}
            SELECT
                anon.month_and_year,
                (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue
            FROM (
                SELECT
                    id, value,
                    concat(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year
                FROM revenue
            ) as anon
            GROUP BY anon.month_and_year
            ORDER BY average_revenue;
            {code}

            Produces

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 100.0000 |
            | 2-2000 | 200.0000 |
            | 1-2000 | 300.0000 |
            | 2-2000 | 400.0000 |
            +----------------+-----------------+
            {code}

            Removing the order by clause gives:
            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            Turning off the {{derived_merge}} optimization gives:
            {code:sql}
            set session optimizer_switch="derived_merge=off";
            {code}

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            The issue seems to be somewhat similar to MDEV-17775, but in my case there is no join.

            Also, here are a few more interesting observations:
             * Removing the DISTINCT removes the duplicate rows
             * Removing the ORDER BY removes the duplicate rows

            Interestingly, when the {{derived_merged}} optimization is on (the default), the rewritten query seems correct:

            {code:sql}
            EXPLAIN EXTENDED
            SELECT
                anon.month_and_year,
                (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue
            FROM (
                SELECT
                    id, value,
                    concat(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year
                FROM revenue
            ) as anon
            GROUP BY anon.month_and_year
            ORDER BY average_revenue;
            {code}
            produces:
            {code}
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            | 1 | SIMPLE | revenue | ALL | NULL | NULL | NULL | NULL | 8 | 100.00 | Using temporary; Using filesort |
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            {code}

            {code:sql}
            SHOW WARNINGS;
            {code}

            {code}
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            | Level | Code | Message |
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            | Note | 1003 | select concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) AS `month_and_year`,sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) AS `average_revenue` from `test`.`revenue` group by concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) order by sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) |
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            {code}

            If I run the rewritten query (the one from above), it produces the correct result:
            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 3-2000 | 200.0000 |
            | 1-2000 | 200.0000 |
            | 2-2000 | 266.6667 |
            +----------------+-----------------+
            {code}

            But the original query produces an incorrect result. This is particularly puzzling.
            bugra.gedik Bugra Gedik made changes -
            Description {code:sql}
            CREATE TABLE revenue(id int, month int, year int, value int);
            INSERT INTO revenue values (1, 1, 2000, 100), (2, 2, 2000, 200), (3, 1, 2000, 300), (4, 2, 2000, 400);
            {code}

            {code:sql}
            SELECT
                anon.month_and_year,
                (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue
            FROM (
                SELECT
                    id, value,
                    concat(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year
                FROM revenue
            ) as anon
            GROUP BY anon.month_and_year
            ORDER BY average_revenue;
            {code}

            Produces

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 100.0000 |
            | 2-2000 | 200.0000 |
            | 1-2000 | 300.0000 |
            | 2-2000 | 400.0000 |
            +----------------+-----------------+
            {code}

            Removing the order by clause gives:
            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            Turning off the {{derived_merge}} optimization gives:
            {code:sql}
            set session optimizer_switch="derived_merge=off";
            {code}

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            The issue seems to be somewhat similar to MDEV-17775, but in my case there is no join.

            Also, here are a few more interesting observations:
             * Removing the DISTINCT removes the duplicate rows
             * Removing the ORDER BY removes the duplicate rows

            Interestingly, when the {{derived_merged}} optimization is on (the default), the rewritten query seems correct:

            {code:sql}
            EXPLAIN EXTENDED
            SELECT
                anon.month_and_year,
                (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue
            FROM (
                SELECT
                    id, value,
                    concat(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year
                FROM revenue
            ) as anon
            GROUP BY anon.month_and_year
            ORDER BY average_revenue;
            {code}
            produces:
            {code}
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            | 1 | SIMPLE | revenue | ALL | NULL | NULL | NULL | NULL | 8 | 100.00 | Using temporary; Using filesort |
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            {code}

            {code:sql}
            SHOW WARNINGS;
            {code}

            {code}
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            | Level | Code | Message |
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            | Note | 1003 | select concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) AS `month_and_year`,sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) AS `average_revenue` from `test`.`revenue` group by concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) order by sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) |
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            {code}

            If I run the rewritten query (the one from above), it produces the correct result:
            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 3-2000 | 200.0000 |
            | 1-2000 | 200.0000 |
            | 2-2000 | 266.6667 |
            +----------------+-----------------+
            {code}

            But the original query produces an incorrect result. This is particularly puzzling.
            {code:sql}
            CREATE TABLE revenue(id int, month int, year int, value int);
            INSERT INTO revenue values (1, 1, 2000, 100), (2, 2, 2000, 200), (3, 1, 2000, 300), (4, 2, 2000, 400);
            {code}

            {code:sql}
            SELECT
                anon.month_and_year,
                (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue
            FROM (
                SELECT
                    id, value,
                    concat(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year
                FROM revenue
            ) as anon
            GROUP BY anon.month_and_year
            ORDER BY average_revenue;
            {code}

            Produces

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 100.0000 |
            | 2-2000 | 200.0000 |
            | 1-2000 | 300.0000 |
            | 2-2000 | 400.0000 |
            +----------------+-----------------+
            {code}

            Removing the order by clause gives:
            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            Turning off the {{derived_merge}} optimization gives:
            {code:sql}
            set session optimizer_switch="derived_merge=off";
            {code}

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            The issue seems to be somewhat similar to MDEV-17775, but in my case there is no join.

            Also, here are a few more interesting observations:
             * Removing the DISTINCT removes the duplicate rows
             * Removing the ORDER BY removes the duplicate rows

            Interestingly, when the {{derived_merged}} optimization is on (the default), the rewritten query seems correct:

            {code:sql}
            EXPLAIN EXTENDED
            SELECT
                anon.month_and_year,
                (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue
            FROM (
                SELECT
                    id, value,
                    concat(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year
                FROM revenue
            ) as anon
            GROUP BY anon.month_and_year
            ORDER BY average_revenue;
            {code}
            produces:
            {code}
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            | 1 | SIMPLE | revenue | ALL | NULL | NULL | NULL | NULL | 8 | 100.00 | Using temporary; Using filesort |
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            {code}

            {code:sql}
            SHOW WARNINGS;
            {code}

            {code}
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            | Level | Code | Message |
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            | Note | 1003 | select concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) AS `month_and_year`,sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) AS `average_revenue` from `test`.`revenue` group by concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) order by sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) |
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            {code}

            If I run the rewritten query (the one from above), it produces the correct result:
            {code:sql}
            select concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) AS `month_and_year`,sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) AS `average_revenue` from `test`.`revenue` group by concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) order by sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`)
            {code}

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 3-2000 | 200.0000 |
            | 1-2000 | 200.0000 |
            | 2-2000 | 266.6667 |
            +----------------+-----------------+
            {code}

            But the original query produces an incorrect result. This is particularly puzzling.
            varun Varun Gupta (Inactive) made changes -
            Assignee Varun Gupta [ varun ]
            varun Varun Gupta (Inactive) made changes -
            Fix Version/s 10.3 [ 22126 ]
            Fix Version/s 10.4 [ 22408 ]
            varun Varun Gupta (Inactive) made changes -
            Status Open [ 1 ] Confirmed [ 10101 ]
            varun Varun Gupta (Inactive) made changes -
            Fix Version/s 10.1 [ 16100 ]
            Fix Version/s 10.2 [ 14601 ]
            bugra.gedik Bugra Gedik made changes -
            Description {code:sql}
            CREATE TABLE revenue(id int, month int, year int, value int);
            INSERT INTO revenue values (1, 1, 2000, 100), (2, 2, 2000, 200), (3, 1, 2000, 300), (4, 2, 2000, 400);
            {code}

            {code:sql}
            SELECT
                anon.month_and_year,
                (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue
            FROM (
                SELECT
                    id, value,
                    concat(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year
                FROM revenue
            ) as anon
            GROUP BY anon.month_and_year
            ORDER BY average_revenue;
            {code}

            Produces

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 100.0000 |
            | 2-2000 | 200.0000 |
            | 1-2000 | 300.0000 |
            | 2-2000 | 400.0000 |
            +----------------+-----------------+
            {code}

            Removing the order by clause gives:
            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            Turning off the {{derived_merge}} optimization gives:
            {code:sql}
            set session optimizer_switch="derived_merge=off";
            {code}

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            The issue seems to be somewhat similar to MDEV-17775, but in my case there is no join.

            Also, here are a few more interesting observations:
             * Removing the DISTINCT removes the duplicate rows
             * Removing the ORDER BY removes the duplicate rows

            Interestingly, when the {{derived_merged}} optimization is on (the default), the rewritten query seems correct:

            {code:sql}
            EXPLAIN EXTENDED
            SELECT
                anon.month_and_year,
                (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue
            FROM (
                SELECT
                    id, value,
                    concat(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year
                FROM revenue
            ) as anon
            GROUP BY anon.month_and_year
            ORDER BY average_revenue;
            {code}
            produces:
            {code}
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            | 1 | SIMPLE | revenue | ALL | NULL | NULL | NULL | NULL | 8 | 100.00 | Using temporary; Using filesort |
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            {code}

            {code:sql}
            SHOW WARNINGS;
            {code}

            {code}
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            | Level | Code | Message |
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            | Note | 1003 | select concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) AS `month_and_year`,sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) AS `average_revenue` from `test`.`revenue` group by concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) order by sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) |
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            {code}

            If I run the rewritten query (the one from above), it produces the correct result:
            {code:sql}
            select concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) AS `month_and_year`,sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) AS `average_revenue` from `test`.`revenue` group by concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) order by sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`)
            {code}

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 3-2000 | 200.0000 |
            | 1-2000 | 200.0000 |
            | 2-2000 | 266.6667 |
            +----------------+-----------------+
            {code}

            But the original query produces an incorrect result. This is particularly puzzling.
            {code:sql}
            CREATE TABLE revenue(id int, month int, year int, value int);
            INSERT INTO revenue values (1, 1, 2000, 100), (2, 2, 2000, 200), (3, 1, 2000, 300), (4, 2, 2000, 400);
            {code}

            {code:sql}
            SELECT
                anon.month_and_year,
                (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue
            FROM (
                SELECT
                    id, value,
                    concat(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year
                FROM revenue
            ) as anon
            GROUP BY anon.month_and_year
            ORDER BY average_revenue;
            {code}

            Produces

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 100.0000 |
            | 2-2000 | 200.0000 |
            | 1-2000 | 300.0000 |
            | 2-2000 | 400.0000 |
            +----------------+-----------------+
            {code}

            Removing the order by clause gives:
            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            Turning off the {{derived_merge}} optimization gives:
            {code:sql}
            set session optimizer_switch="derived_merge=off";
            {code}

            {code}
            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            The issue seems to be somewhat similar to MDEV-17775, but in my case there is no join.

            Also, here are a few more interesting observations:
             * Removing the DISTINCT removes the duplicate rows
             * Removing the ORDER BY removes the duplicate rows

            Interestingly, when the {{derived_merged}} optimization is on (the default), the rewritten query seems correct:

            {code:sql}
            EXPLAIN EXTENDED
            SELECT
                anon.month_and_year,
                (SUM(anon.value) / COUNT(DISTINCT anon.id)) AS average_revenue
            FROM (
                SELECT
                    id, value,
                    concat(CAST(month AS CHAR(2)), '-', CAST(year AS CHAR(4))) AS month_and_year
                FROM revenue
            ) as anon
            GROUP BY anon.month_and_year
            ORDER BY average_revenue;
            {code}
            produces:
            {code}
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            | 1 | SIMPLE | revenue | ALL | NULL | NULL | NULL | NULL | 8 | 100.00 | Using temporary; Using filesort |
            +------+-------------+---------+------+---------------+------+---------+------+------+----------+---------------------------------+
            {code}

            {code:sql}
            SHOW WARNINGS;
            {code}

            {code}
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            | Level | Code | Message |
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            | Note | 1003 | select concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) AS `month_and_year`,sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) AS `average_revenue` from `test`.`revenue` group by concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) order by sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) |
            +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            {code}

            If I run the rewritten query (the one from above), it produces the correct result:
            {code:sql}
            select concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) AS `month_and_year`,sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`) AS `average_revenue` from `test`.`revenue` group by concat(cast(`test`.`revenue`.`month` as char(2) charset utf8mb4),'-',cast(`test`.`revenue`.`year` as char(4) charset utf8mb4)) order by sum(`test`.`revenue`.`value`) / count(distinct `test`.`revenue`.`id`)
            {code}

            {code}

            +----------------+-----------------+
            | month_and_year | average_revenue |
            +----------------+-----------------+
            | 1-2000 | 200.0000 |
            | 2-2000 | 300.0000 |
            +----------------+-----------------+
            {code}

            But the original query produces an incorrect result. This is particularly puzzling.
            varun Varun Gupta (Inactive) made changes -
            Status Confirmed [ 10101 ] In Progress [ 3 ]
            varun Varun Gupta (Inactive) made changes -
            Priority Critical [ 2 ] Major [ 3 ]
            varun Varun Gupta (Inactive) added a comment - Patch http://lists.askmonty.org/pipermail/commits/2019-December/014081.html
            varun Varun Gupta (Inactive) made changes -
            Assignee Varun Gupta [ varun ] Igor Babaev [ igor ]
            Status In Progress [ 3 ] In Review [ 10002 ]
            varun Varun Gupta (Inactive) added a comment - New Patch http://lists.askmonty.org/pipermail/commits/2019-December/014111.html
            varun Varun Gupta (Inactive) made changes -
            igor Igor Babaev (Inactive) made changes -
            Assignee Igor Babaev [ igor ] Varun Gupta [ varun ]

            Introduced val_*_result functions for Item_direct_view_ref to make sure to get the value from the item it is referring to.

            varun Varun Gupta (Inactive) added a comment - Introduced val_*_result functions for Item_direct_view_ref to make sure to get the value from the item it is referring to.
            varun Varun Gupta (Inactive) made changes -
            Fix Version/s 10.1.44 [ 23912 ]
            Fix Version/s 10.2.31 [ 24017 ]
            Fix Version/s 10.3.22 [ 24018 ]
            Fix Version/s 10.4.12 [ 24019 ]
            Fix Version/s 10.5.1 [ 24029 ]
            Fix Version/s 10.2 [ 14601 ]
            Fix Version/s 10.1 [ 16100 ]
            Fix Version/s 10.3 [ 22126 ]
            Fix Version/s 10.4 [ 22408 ]
            Resolution Fixed [ 1 ]
            Status In Review [ 10002 ] Closed [ 6 ]

            Test is not stable because order by the same value, please fix it.

            sanja Oleksandr Byelkin added a comment - Test is not stable because order by the same value, please fix it.
            sanja Oleksandr Byelkin made changes -
            Resolution Fixed [ 1 ]
            Status Closed [ 6 ] Stalled [ 10000 ]
            varun Varun Gupta (Inactive) added a comment - Made the test stable in this commit https://github.com/MariaDB/server/commit/1adc559370cc53ca69e225739a942287eba1b974
            varun Varun Gupta (Inactive) made changes -
            Resolution Fixed [ 1 ]
            Status Stalled [ 10000 ] Closed [ 6 ]
            varun Varun Gupta (Inactive) made changes -
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 100689 ] MariaDB v4 [ 156914 ]

            People

              varun Varun Gupta (Inactive)
              bugra.gedik Bugra Gedik
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.