[MCOL-1822] Change the default to use double when overflow occurs in SUM() and AVG() Created: 2018-10-18  Updated: 2019-05-29  Resolved: 2019-03-07

Status: Closed
Project: MariaDB ColumnStore
Component/s: ExeMgr
Affects Version/s: 1.2
Fix Version/s: 1.2.3

Type: New Feature Priority: Major
Reporter: David Hall (Inactive) Assignee: Daniel Lee (Inactive)
Resolution: Fixed Votes: 1
Labels: None

Issue Links:
Duplicate
duplicates MCOL-564 Aggregation overflow error Closed
is duplicated by MCOL-148 sum() and avg() return incorrect resu... Closed
Problem/Incident
causes MCOL-2233 SUBSTRING_INDEX() returns incorrect v... Closed
Relates
Sprint: 2018-20, 2018-21, 2019-01, 2019-02, 2019-03

 Description   

The SUM and AVG functions have logic to overflow from integer to double arithmetic when overflow occurs in int64. However, this logic is #defined out of the standard build.

The default logic should be to overflow into double if necessary, rather than aborting the query.



 Comments   
Comment by David Hall (Inactive) [ 2018-12-17 ]

While the logic is there for aggregation, it is not there for Window Function usage. This will need to be added.

We're adding this as an XML entry, so upgrade will also need to be touched.

Update: since we decided to go always on long double, there is no XML entry.

Comment by David Hall (Inactive) [ 2019-01-03 ]

The logic found in aggregation doesn't work. This will take a bit longer than expected.

Comment by David Hall (Inactive) [ 2019-03-05 ]

The solution is to use long double (80-bit float) for all SUM/AVG calculations. The code has been modified to handle long double everywhere except actual persistence in a table.

Comment by David Hall (Inactive) [ 2019-03-05 ]

If SUM creates a number of magnitude larger than can be held in an int, and then decimal arithmetic is done on the result, an overflow error will occur unless infinidb_double_for_decimal_math is set.

Example:
select s/d from (select b.f3, sum(a.f3) s, count(distinct a.f5) d from aggoverflowa a join aggoverflowb b on a.id=b.id group by 1) s1 order by 1;

Where f3 is INT or DECIMAL in a.

s/d tries to do decimal math. if sum(a.f3) > MAX(long int), a decimal overflow error is thrown.

Creating an automatic overflow here is difficult, since the datatype of the operation is determined in the connector, while the change to long double is determined in ExeMgr – too late to go back.

Since long double is now enabled everywhere, infinidb_double_for_decimal_math now causes long double to be used, allowing for more accuracy, at least on Linux.

Comment by Daniel Lee (Inactive) [ 2019-03-07 ]

Build verified: 1.2.3-1 nightly

server commit:
61f32f2
engine commit:
fddd108

Also performed regression tests.

Generated at Thu Feb 08 02:31:38 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.