[MDEV-26645] test main.func_math fails on MariaDB 10.6+ Created: 2021-09-20  Updated: 2022-03-29  Resolved: 2022-02-18

Status: Closed
Project: MariaDB Server
Component/s: Data types
Affects Version/s: 10.5, 10.6
Fix Version/s: 10.5.16, 10.6.8, 10.7.4, 10.8.3

Type: Bug Priority: Minor
Reporter: Otto Kekäläinen Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: None

Attachments: File func_math_tests_MDEV-26645.diff     File mariadb-gcc12-x86_64.log    
Issue Links:
Relates
relates to MDEV-21977 main.func_math fails due to undefined... Closed
relates to MDEV-20966 main.func_math fails in builbot on De... Stalled

 Description   

While working on 10.6 and 10.7 I've noticed that this test is permanently failing:

main.func_math                           w1 [ fail ]
        Test ended at 2021-09-18 22:58:28
 
CURRENT_TEST: main.func_math
mysqltest: At line 425: query 'SELECT 9223372036854775807 + 9223372036854775807' succeeded - should have failed with error ER_DATA_OUT_OF_RANGE (1690)...
 
The result from queries just before the failure was:
< snip >
ERROR 22003: DOUBLE value is out of range in 'exp(750)'
SELECT POW(10, 309);
ERROR 22003: DOUBLE value is out of range in 'pow(10,309)'
SELECT COT(0);
ERROR 22003: DOUBLE value is out of range in 'cot(0)'
SELECT DEGREES(1e307);
ERROR 22003: DOUBLE value is out of range in 'degrees(1e307)'
SELECT 9223372036854775808 + 9223372036854775808;
ERROR 22003: BIGINT UNSIGNED value is out of range in '9223372036854775808 + 9223372036854775808'
SELECT 18446744073709551615 + 1;
ERROR 22003: BIGINT UNSIGNED value is out of range in '18446744073709551615 + 1'
SELECT 1 + 18446744073709551615;
ERROR 22003: BIGINT UNSIGNED value is out of range in '1 + 18446744073709551615'
SELECT -2 + CAST(1 AS UNSIGNED);
ERROR 22003: BIGINT UNSIGNED value is out of range in '-2 + cast(1 as unsigned)'
SELECT CAST(1 AS UNSIGNED) + -2;
ERROR 22003: BIGINT UNSIGNED value is out of range in 'cast(1 as unsigned) + -2'
SELECT -9223372036854775808 + -9223372036854775808;
ERROR 22003: BIGINT value is out of range in '-9223372036854775808 + -9223372036854775808'
SELECT 9223372036854775807 + 9223372036854775807;

Example of a full log at https://launchpadlibrarian.net/559043027/buildlog_ubuntu-hirsute-s390x.mariadb-10.7_1%3A10.7.0~ubuntu21.04.1~1631999241.a60c5e60c84+10.7.gitlab.ci.benchmark_BUILDING.txt.gz

Applies to platforms: s390x, ppc64
(of the Launchpad platforms it does not apply to armhf, arm64, amd64)



 Comments   
Comment by Alice Sherepa [ 2021-09-20 ]

This also fails on 10.5: https://buildbot.mariadb.org/#/builders/309/builds/937/steps/6/logs/stdio

Comment by Sarah Julia Kriesch [ 2021-10-30 ]

It also fails for openSUSE.

Comment by Christian Boltz [ 2021-11-01 ]

The patch func_math_tests_MDEV-26645.diff I just attached "fixes" the test failure - but as you can see in the patched expected results, there are three queries that end up with a mathematically wrong result (probably an undetected overflow).

Therefore: please use that patch to get an idea which queries don't fail as expected, but DO NOT apply it (maybe except as a workaround to un-break the build in distribution packages for s390x and ppc64 until this bug is fixed).

BTW: This bug also seems to depend on the versions of the toolchain, libraries, etc. I see it when building for openSUSE Tumbleweed (rolling release), but on openSUSE Leap (stable release, which has older versions of everything) the queries fail as expected (= the tests succeed).

Comment by Otto Kekäläinen [ 2021-11-28 ]

This test failure is still present in 10.6, and it is visible for the newly uploaded 10.6 on the Debian builders for several non-amd64 architectures.

Overview of the latest build status in Debian: https://buildd.debian.org/status/package.php?p=mariadb-10.6

Comment by Danilo Spinella [ 2022-01-26 ]

This also fails on 10.6.5 on x86_64 when compiling MariaDB with GCC 12 on openSUSE Tumbleweed. The log from the tests is attached: mariadb-gcc12-x86_64.log.

Comment by Michael Widenius [ 2022-02-16 ]

I wrote the code that tests for the overflow, and it seems to work on most machines.
To fix this, I would need access to a machine where it fails so that I can test this in a debugger.
This could easily be the compiler trying to optimize an arithmetic operation that it shouldn't.

Comment by Marko Mäkelä [ 2022-02-18 ]

This is similar to MDEV-21977, which affected integer division.

A signed integer overflow in an expression like a+b or a-b is undefined behavior. The compiler is allowed to assume that no such overflow is possible, and to optimize away some code accordingly. Apparently, such optimizations were implemented in GCC a little earlier for the POWER and s390x back ends, and with GCC 12 they start to affect the AMD64 back end as well.

The fix is simple: Remove some special handling for WITH_UBSAN.
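
To illustrate the kind of check that stays valid under these optimizations, here is a minimal sketch in C. It is not the code from the MariaDB source or from the actual fix; the function name checked_add_ll is hypothetical. The idea is to compare the operands against the type limits before adding, so that the overflowing addition is never executed and the compiler has nothing it can assume away.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Fragile pattern (shown only as a comment, because executing it on
 * overflowing inputs is undefined behavior):
 *
 *     long long res = a + b;
 *     if (a > 0 && b > 0 && res < 0) { ...report overflow... }
 *
 * The compiler may assume that a + b never overflows and therefore
 * drop the "res < 0" branch entirely.
 */

/*
 * Hypothetical helper, not taken from the MariaDB source.
 * Returns true and stores a + b in *result if the sum fits in long long;
 * returns false (leaving *result untouched) if the sum would overflow.
 */
bool checked_add_ll(long long a, long long b, long long *result)
{
    /* Test against the limits first; neither subtraction can overflow
       because of the sign check on b that guards it. */
    if ((b > 0 && a > LLONG_MAX - b) ||
        (b < 0 && a < LLONG_MIN - b))
        return false;

    *result = a + b;   /* safe: the sum is known to be in range */
    return true;
}
```

With this sketch, checked_add_ll(9223372036854775807LL, 9223372036854775807LL, &r) returns false on every platform, which corresponds to the ER_DATA_OUT_OF_RANGE error that the test expects for SELECT 9223372036854775807 + 9223372036854775807.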

Comment by Otto Kekäläinen [ 2022-03-02 ]

I backported this to 10.6 in Debian, and it has worked so far on both the Launchpad and the official Debian builders. Thanks!
https://salsa.debian.org/mariadb-team/mariadb-server/-/commit/4a96913a88cd868885813f20eb66524cbae85229

Generated at Thu Feb 08 09:46:52 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.