Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-7195

AVG() loses precision in INT context

Details

    • Bug
    • Status: Closed (View Workflow)
    • Minor
    • Resolution: Fixed
    • 10.0.14, 10.1.0
    • 10.1.8
    • OTHER
    • None

    Description

      AVG(bigint_expression) loses precision when used in integer context.
      This script demonstrates the problem using bit shift, to provide integer context:

      DROP TABLE IF EXISTS t1;
      CREATE TABLE t1 (
        auto SERIAL,
        fld1 bigint unsigned NOT NULL,
        companynr tinyint(2) unsigned zerofill DEFAULT '00' NOT NULL,
        UNIQUE fld1 (fld1)
      );
      INSERT INTO t1 VALUES (1,0x7FFFFFFFFFFFFFFF,00);
      INSERT INTO t1 VALUES (2,0x7FFFFFFFFFFFFFFE,37); 
      INSERT INTO t1 VALUES (3,0x7FFFFFFFFFFFFFFC,37);  
      SELECT companynr, AVG(fld1), AVG(fld1)<<0 AS avg1, CAST(AVG(fld1) AS UNSIGNED)<<0 AS avg2 FROM t1 GROUP BY companynr;

      The result is:

      +-----------+--------------------------+---------------------+---------------------+
      | companynr | AVG(fld1)                | avg1                | avg2                |
      +-----------+--------------------------+---------------------+---------------------+
      |        00 | 9223372036854775807.0000 | 9223372036854775808 | 9223372036854775807 |
      |        37 | 9223372036854775805.0000 | 9223372036854775808 | 9223372036854775805 |
      +-----------+--------------------------+---------------------+---------------------+

      Notice, the values in the column avg1 lost precision.
      If I wrap AVG into CAST(AS UNSIGNED), like in the column avg2, then the result is Ok.

      The problem is in this piece of code in item_sum.cc, which uses double routines:

      longlong Item_avg_field::val_int()
      {
        return (longlong) rint(val_real());
      }

      It should use DECIMAL routines when working with DECIMAL or INTEGER arguments.

      Attachments

        Activity

          bar Alexander Barkov created issue -
          bar Alexander Barkov made changes -
          Field Original Value New Value
          Description AVG(bigint_expression looses precision when used in integer context.
          This script demonstrates the problem using bit shift, to provide integer context:

          {code}
          DROP TABLE IF EXISTS t1;
          CREATE TABLE t1 (
            auto SERIAL,
            fld1 bigint unsigned NOT NULL,
            companynr tinyint(2) unsigned zerofill DEFAULT '00' NOT NULL,
            UNIQUE fld1 (fld1)
          );
          INSERT INTO t1 VALUES (1,0xFFFFFFFFFFFFFFFF,00);
          INSERT INTO t1 VALUES (2,0xFFFFFFFFFFFFFFFE,37);
          INSERT INTO t1 VALUES (3,0xFFFFFFFFFFFFFFFC,37);
          SELECT companynr,AVG(fld1),AVG(fld1)<<0 AS avg1,CAST(AVG(fld1) AS UNSIGNED) <<0 AS avg2 FROM t1 GROUP BY companynr;
          {code}
          The result is:
          {noformat}
          +-----------+---------------------------+---------------------+----------------------+
          | companynr | AVG(fld1) | avg1 | avg2 |
          +-----------+---------------------------+---------------------+----------------------+
          | 00 | 18446744073709551615.0000 | 9223372036854775808 | 18446744073709551615 |
          | 37 | 18446744073709551613.0000 | 9223372036854775808 | 18446744073709551613 |
          +-----------+---------------------------+---------------------+----------------------+
          {noformat}
          Notice, the values in the column avg1 lost precision.
          If I wrap AVG into CAST(AS UNSIGNED), like in the column avg2, then the result is Ok.
          AVG(bigint_expression) looses precision when used in integer context.
          This script demonstrates the problem using bit shift, to provide integer context:

          {code}
          DROP TABLE IF EXISTS t1;
          CREATE TABLE t1 (
            auto SERIAL,
            fld1 bigint unsigned NOT NULL,
            companynr tinyint(2) unsigned zerofill DEFAULT '00' NOT NULL,
            UNIQUE fld1 (fld1)
          );
          INSERT INTO t1 VALUES (1,0xFFFFFFFFFFFFFFFF,00);
          INSERT INTO t1 VALUES (2,0xFFFFFFFFFFFFFFFE,37);
          INSERT INTO t1 VALUES (3,0xFFFFFFFFFFFFFFFC,37);
          SELECT companynr,AVG(fld1),AVG(fld1)<<0 AS avg1,CAST(AVG(fld1) AS UNSIGNED) <<0 AS avg2 FROM t1 GROUP BY companynr;
          {code}
          The result is:
          {noformat}
          +-----------+---------------------------+---------------------+----------------------+
          | companynr | AVG(fld1) | avg1 | avg2 |
          +-----------+---------------------------+---------------------+----------------------+
          | 00 | 18446744073709551615.0000 | 9223372036854775808 | 18446744073709551615 |
          | 37 | 18446744073709551613.0000 | 9223372036854775808 | 18446744073709551613 |
          +-----------+---------------------------+---------------------+----------------------+
          {noformat}
          Notice, the values in the column avg1 lost precision.
          If I wrap AVG into CAST(AS UNSIGNED), like in the column avg2, then the result is Ok.
          bar Alexander Barkov made changes -
          Summary AVG() looses precision in INT context AVG() loses precision in INT context
          bar Alexander Barkov made changes -
          Description AVG(bigint_expression) looses precision when used in integer context.
          This script demonstrates the problem using bit shift, to provide integer context:

          {code}
          DROP TABLE IF EXISTS t1;
          CREATE TABLE t1 (
            auto SERIAL,
            fld1 bigint unsigned NOT NULL,
            companynr tinyint(2) unsigned zerofill DEFAULT '00' NOT NULL,
            UNIQUE fld1 (fld1)
          );
          INSERT INTO t1 VALUES (1,0xFFFFFFFFFFFFFFFF,00);
          INSERT INTO t1 VALUES (2,0xFFFFFFFFFFFFFFFE,37);
          INSERT INTO t1 VALUES (3,0xFFFFFFFFFFFFFFFC,37);
          SELECT companynr,AVG(fld1),AVG(fld1)<<0 AS avg1,CAST(AVG(fld1) AS UNSIGNED) <<0 AS avg2 FROM t1 GROUP BY companynr;
          {code}
          The result is:
          {noformat}
          +-----------+---------------------------+---------------------+----------------------+
          | companynr | AVG(fld1) | avg1 | avg2 |
          +-----------+---------------------------+---------------------+----------------------+
          | 00 | 18446744073709551615.0000 | 9223372036854775808 | 18446744073709551615 |
          | 37 | 18446744073709551613.0000 | 9223372036854775808 | 18446744073709551613 |
          +-----------+---------------------------+---------------------+----------------------+
          {noformat}
          Notice, the values in the column avg1 lost precision.
          If I wrap AVG into CAST(AS UNSIGNED), like in the column avg2, then the result is Ok.
          AVG(bigint_expression) loses precision when used in integer context.
          This script demonstrates the problem using bit shift, to provide integer context:

          {code}
          DROP TABLE IF EXISTS t1;
          CREATE TABLE t1 (
            auto SERIAL,
            fld1 bigint unsigned NOT NULL,
            companynr tinyint(2) unsigned zerofill DEFAULT '00' NOT NULL,
            UNIQUE fld1 (fld1)
          );
          INSERT INTO t1 VALUES (1,0xFFFFFFFFFFFFFFFF,00);
          INSERT INTO t1 VALUES (2,0xFFFFFFFFFFFFFFFE,37);
          INSERT INTO t1 VALUES (3,0xFFFFFFFFFFFFFFFC,37);
          SELECT companynr,AVG(fld1),AVG(fld1)<<0 AS avg1,CAST(AVG(fld1) AS UNSIGNED) <<0 AS avg2 FROM t1 GROUP BY companynr;
          {code}
          The result is:
          {noformat}
          +-----------+---------------------------+---------------------+----------------------+
          | companynr | AVG(fld1) | avg1 | avg2 |
          +-----------+---------------------------+---------------------+----------------------+
          | 00 | 18446744073709551615.0000 | 9223372036854775808 | 18446744073709551615 |
          | 37 | 18446744073709551613.0000 | 9223372036854775808 | 18446744073709551613 |
          +-----------+---------------------------+---------------------+----------------------+
          {noformat}
          Notice, the values in the column avg1 lost precision.
          If I wrap AVG into CAST(AS UNSIGNED), like in the column avg2, then the result is Ok.
          bar Alexander Barkov made changes -
          Description AVG(bigint_expression) loses precision when used in integer context.
          This script demonstrates the problem using bit shift, to provide integer context:

          {code}
          DROP TABLE IF EXISTS t1;
          CREATE TABLE t1 (
            auto SERIAL,
            fld1 bigint unsigned NOT NULL,
            companynr tinyint(2) unsigned zerofill DEFAULT '00' NOT NULL,
            UNIQUE fld1 (fld1)
          );
          INSERT INTO t1 VALUES (1,0xFFFFFFFFFFFFFFFF,00);
          INSERT INTO t1 VALUES (2,0xFFFFFFFFFFFFFFFE,37);
          INSERT INTO t1 VALUES (3,0xFFFFFFFFFFFFFFFC,37);
          SELECT companynr,AVG(fld1),AVG(fld1)<<0 AS avg1,CAST(AVG(fld1) AS UNSIGNED) <<0 AS avg2 FROM t1 GROUP BY companynr;
          {code}
          The result is:
          {noformat}
          +-----------+---------------------------+---------------------+----------------------+
          | companynr | AVG(fld1) | avg1 | avg2 |
          +-----------+---------------------------+---------------------+----------------------+
          | 00 | 18446744073709551615.0000 | 9223372036854775808 | 18446744073709551615 |
          | 37 | 18446744073709551613.0000 | 9223372036854775808 | 18446744073709551613 |
          +-----------+---------------------------+---------------------+----------------------+
          {noformat}
          Notice, the values in the column avg1 lost precision.
          If I wrap AVG into CAST(AS UNSIGNED), like in the column avg2, then the result is Ok.
          AVG(bigint_expression) loses precision when used in integer context.
          This script demonstrates the problem using bit shift, to provide integer context:

          {code}
          DROP TABLE IF EXISTS t1;
          CREATE TABLE t1 (
            auto SERIAL,
            fld1 bigint unsigned NOT NULL,
            companynr tinyint(2) unsigned zerofill DEFAULT '00' NOT NULL,
            UNIQUE fld1 (fld1)
          );
          INSERT INTO t1 VALUES (1,0xFFFFFFFFFFFFFFFF,00);
          INSERT INTO t1 VALUES (2,0xFFFFFFFFFFFFFFFE,37);
          INSERT INTO t1 VALUES (3,0xFFFFFFFFFFFFFFFC,37);
          SELECT companynr, AVG(fld1), AVG(fld1)<<0 AS avg1, CAST(AVG(fld1) AS UNSIGNED)<<0 AS avg2 FROM t1 GROUP BY companynr;
          {code}
          The result is:
          {noformat}
          +-----------+---------------------------+---------------------+----------------------+
          | companynr | AVG(fld1) | avg1 | avg2 |
          +-----------+---------------------------+---------------------+----------------------+
          | 00 | 18446744073709551615.0000 | 9223372036854775808 | 18446744073709551615 |
          | 37 | 18446744073709551613.0000 | 9223372036854775808 | 18446744073709551613 |
          +-----------+---------------------------+---------------------+----------------------+
          {noformat}
          Notice, the values in the column avg1 lost precision.
          If I wrap AVG into CAST(AS UNSIGNED), like in the column avg2, then the result is Ok.
          bar Alexander Barkov made changes -
          Description AVG(bigint_expression) loses precision when used in integer context.
          This script demonstrates the problem using bit shift, to provide integer context:

          {code}
          DROP TABLE IF EXISTS t1;
          CREATE TABLE t1 (
            auto SERIAL,
            fld1 bigint unsigned NOT NULL,
            companynr tinyint(2) unsigned zerofill DEFAULT '00' NOT NULL,
            UNIQUE fld1 (fld1)
          );
          INSERT INTO t1 VALUES (1,0xFFFFFFFFFFFFFFFF,00);
          INSERT INTO t1 VALUES (2,0xFFFFFFFFFFFFFFFE,37);
          INSERT INTO t1 VALUES (3,0xFFFFFFFFFFFFFFFC,37);
          SELECT companynr, AVG(fld1), AVG(fld1)<<0 AS avg1, CAST(AVG(fld1) AS UNSIGNED)<<0 AS avg2 FROM t1 GROUP BY companynr;
          {code}
          The result is:
          {noformat}
          +-----------+---------------------------+---------------------+----------------------+
          | companynr | AVG(fld1) | avg1 | avg2 |
          +-----------+---------------------------+---------------------+----------------------+
          | 00 | 18446744073709551615.0000 | 9223372036854775808 | 18446744073709551615 |
          | 37 | 18446744073709551613.0000 | 9223372036854775808 | 18446744073709551613 |
          +-----------+---------------------------+---------------------+----------------------+
          {noformat}
          Notice, the values in the column avg1 lost precision.
          If I wrap AVG into CAST(AS UNSIGNED), like in the column avg2, then the result is Ok.
          AVG(bigint_expression) loses precision when used in integer context.
          This script demonstrates the problem using bit shift, to provide integer context:

          {code}
          DROP TABLE IF EXISTS t1;
          CREATE TABLE t1 (
            auto SERIAL,
            fld1 bigint unsigned NOT NULL,
            companynr tinyint(2) unsigned zerofill DEFAULT '00' NOT NULL,
            UNIQUE fld1 (fld1)
          );
          INSERT INTO t1 VALUES (1,0xFFFFFFFFFFFFFFFF,00);
          INSERT INTO t1 VALUES (2,0xFFFFFFFFFFFFFFFE,37);
          INSERT INTO t1 VALUES (3,0xFFFFFFFFFFFFFFFC,37);
          SELECT companynr, AVG(fld1), AVG(fld1)<<0 AS avg1, CAST(AVG(fld1) AS UNSIGNED)<<0 AS avg2 FROM t1 GROUP BY companynr;
          {code}
          The result is:
          {noformat}
          +-----------+---------------------------+---------------------+----------------------+
          | companynr | AVG(fld1) | avg1 | avg2 |
          +-----------+---------------------------+---------------------+----------------------+
          | 00 | 18446744073709551615.0000 | 9223372036854775808 | 18446744073709551615 |
          | 37 | 18446744073709551613.0000 | 9223372036854775808 | 18446744073709551613 |
          +-----------+---------------------------+---------------------+----------------------+
          {noformat}
          Notice, the values in the column avg1 lost precision.
          If I wrap AVG into CAST(AS UNSIGNED), like in the column avg2, then the result is Ok.

          The problem is in this piece of code in item_sum.cc, which uses double routines:
          {code}
          longlong Item_avg_field::val_int()
          {
            return (longlong) rint(val_real());
          }
          {code}
          It should use DECIMAL routines when working with DECIMAL or INTEGER arguments.
          bar Alexander Barkov made changes -
          Description AVG(bigint_expression) loses precision when used in integer context.
          This script demonstrates the problem using bit shift, to provide integer context:

          {code}
          DROP TABLE IF EXISTS t1;
          CREATE TABLE t1 (
            auto SERIAL,
            fld1 bigint unsigned NOT NULL,
            companynr tinyint(2) unsigned zerofill DEFAULT '00' NOT NULL,
            UNIQUE fld1 (fld1)
          );
          INSERT INTO t1 VALUES (1,0xFFFFFFFFFFFFFFFF,00);
          INSERT INTO t1 VALUES (2,0xFFFFFFFFFFFFFFFE,37);
          INSERT INTO t1 VALUES (3,0xFFFFFFFFFFFFFFFC,37);
          SELECT companynr, AVG(fld1), AVG(fld1)<<0 AS avg1, CAST(AVG(fld1) AS UNSIGNED)<<0 AS avg2 FROM t1 GROUP BY companynr;
          {code}
          The result is:
          {noformat}
          +-----------+---------------------------+---------------------+----------------------+
          | companynr | AVG(fld1) | avg1 | avg2 |
          +-----------+---------------------------+---------------------+----------------------+
          | 00 | 18446744073709551615.0000 | 9223372036854775808 | 18446744073709551615 |
          | 37 | 18446744073709551613.0000 | 9223372036854775808 | 18446744073709551613 |
          +-----------+---------------------------+---------------------+----------------------+
          {noformat}
          Notice, the values in the column avg1 lost precision.
          If I wrap AVG into CAST(AS UNSIGNED), like in the column avg2, then the result is Ok.

          The problem is in this piece of code in item_sum.cc, which uses double routines:
          {code}
          longlong Item_avg_field::val_int()
          {
            return (longlong) rint(val_real());
          }
          {code}
          It should use DECIMAL routines when working with DECIMAL or INTEGER arguments.
          AVG(bigint_expression) loses precision when used in integer context.
          This script demonstrates the problem using bit shift, to provide integer context:

          {code}
          DROP TABLE IF EXISTS t1;
          CREATE TABLE t1 (
            auto SERIAL,
            fld1 bigint unsigned NOT NULL,
            companynr tinyint(2) unsigned zerofill DEFAULT '00' NOT NULL,
            UNIQUE fld1 (fld1)
          );
          INSERT INTO t1 VALUES (1,0xFFFFFFFFFFFFFFFF,00);
          INSERT INTO t1 VALUES (2,0xFFFFFFFFFFFFFFFE,37);
          INSERT INTO t1 VALUES (3,0xFFFFFFFFFFFFFFFC,37);
          SELECT companynr, AVG(fld1), AVG(fld1)<<0 AS avg1, CAST(AVG(fld1) AS UNSIGNED)<<0 AS avg2 FROM t1 GROUP BY companynr;
          {code}
          The result is:
          {noformat}
          +-----------+--------------------------+---------------------+---------------------+
          | companynr | AVG(fld1) | avg1 | avg2 |
          +-----------+--------------------------+---------------------+---------------------+
          | 00 | 9223372036854775807.0000 | 9223372036854775808 | 9223372036854775807 |
          | 37 | 9223372036854775805.0000 | 9223372036854775808 | 9223372036854775805 |
          +-----------+--------------------------+---------------------+---------------------+
          {noformat}
          Notice, the values in the column avg1 lost precision.
          If I wrap AVG into CAST(AS UNSIGNED), like in the column avg2, then the result is Ok.

          The problem is in this piece of code in item_sum.cc, which uses double routines:
          {code}
          longlong Item_avg_field::val_int()
          {
            return (longlong) rint(val_real());
          }
          {code}
          It should use DECIMAL routines when working with DECIMAL or INTEGER arguments.
          bar Alexander Barkov made changes -
          Description AVG(bigint_expression) loses precision when used in integer context.
          This script demonstrates the problem using bit shift, to provide integer context:

          {code}
          DROP TABLE IF EXISTS t1;
          CREATE TABLE t1 (
            auto SERIAL,
            fld1 bigint unsigned NOT NULL,
            companynr tinyint(2) unsigned zerofill DEFAULT '00' NOT NULL,
            UNIQUE fld1 (fld1)
          );
          INSERT INTO t1 VALUES (1,0xFFFFFFFFFFFFFFFF,00);
          INSERT INTO t1 VALUES (2,0xFFFFFFFFFFFFFFFE,37);
          INSERT INTO t1 VALUES (3,0xFFFFFFFFFFFFFFFC,37);
          SELECT companynr, AVG(fld1), AVG(fld1)<<0 AS avg1, CAST(AVG(fld1) AS UNSIGNED)<<0 AS avg2 FROM t1 GROUP BY companynr;
          {code}
          The result is:
          {noformat}
          +-----------+--------------------------+---------------------+---------------------+
          | companynr | AVG(fld1) | avg1 | avg2 |
          +-----------+--------------------------+---------------------+---------------------+
          | 00 | 9223372036854775807.0000 | 9223372036854775808 | 9223372036854775807 |
          | 37 | 9223372036854775805.0000 | 9223372036854775808 | 9223372036854775805 |
          +-----------+--------------------------+---------------------+---------------------+
          {noformat}
          Notice, the values in the column avg1 lost precision.
          If I wrap AVG into CAST(AS UNSIGNED), like in the column avg2, then the result is Ok.

          The problem is in this piece of code in item_sum.cc, which uses double routines:
          {code}
          longlong Item_avg_field::val_int()
          {
            return (longlong) rint(val_real());
          }
          {code}
          It should use DECIMAL routines when working with DECIMAL or INTEGER arguments.
          AVG(bigint_expression) loses precision when used in integer context.
          This script demonstrates the problem using bit shift, to provide integer context:

          {code}
          DROP TABLE IF EXISTS t1;
          CREATE TABLE t1 (
            auto SERIAL,
            fld1 bigint unsigned NOT NULL,
            companynr tinyint(2) unsigned zerofill DEFAULT '00' NOT NULL,
            UNIQUE fld1 (fld1)
          );
          INSERT INTO t1 VALUES (1,0x7FFFFFFFFFFFFFFF,00);
          INSERT INTO t1 VALUES (2,0x7FFFFFFFFFFFFFFE,37);
          INSERT INTO t1 VALUES (3,0x7FFFFFFFFFFFFFFC,37);
          SELECT companynr, AVG(fld1), AVG(fld1)<<0 AS avg1, CAST(AVG(fld1) AS UNSIGNED)<<0 AS avg2 FROM t1 GROUP BY companynr;
          {code}
          The result is:
          {noformat}
          +-----------+--------------------------+---------------------+---------------------+
          | companynr | AVG(fld1) | avg1 | avg2 |
          +-----------+--------------------------+---------------------+---------------------+
          | 00 | 9223372036854775807.0000 | 9223372036854775808 | 9223372036854775807 |
          | 37 | 9223372036854775805.0000 | 9223372036854775808 | 9223372036854775805 |
          +-----------+--------------------------+---------------------+---------------------+
          {noformat}
          Notice, the values in the column avg1 lost precision.
          If I wrap AVG into CAST(AS UNSIGNED), like in the column avg2, then the result is Ok.

          The problem is in this piece of code in item_sum.cc, which uses double routines:
          {code}
          longlong Item_avg_field::val_int()
          {
            return (longlong) rint(val_real());
          }
          {code}
          It should use DECIMAL routines when working with DECIMAL or INTEGER arguments.
          ratzpo Rasmus Johansson (Inactive) made changes -
          Workflow MariaDB v2 [ 58714 ] MariaDB v3 [ 64772 ]
          bar Alexander Barkov made changes -
          Component/s OTHER [ 10125 ]
          Fix Version/s 10.1.8 [ 19605 ]
          Fix Version/s 10.1 [ 16100 ]
          Resolution Fixed [ 1 ]
          Status Open [ 1 ] Closed [ 6 ]
          serg Sergei Golubchik made changes -
          Workflow MariaDB v3 [ 64772 ] MariaDB v4 [ 148540 ]

          People

            bar Alexander Barkov
            bar Alexander Barkov
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.