  1. MariaDB Server
  2. MDEV-15430

type_float.test floating point error clang-4

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Not a Bug
    • Affects Version/s: 10.2(EOL), 10.3(EOL)
    • Fix Version/s: 10.2.17, 10.2.18
    • Component/s: Tests
    • Labels: None

    Description

      When compiling with clang 4.0.1 in Travis CI, we get the following errors:

      main.type_datetime_hires 'innodb'        w5 [ fail ]
              Test ended at 2018-02-26 16:38:32
       
      CURRENT_TEST: main.type_datetime_hires
      --- /home/travis/build/ottok/mariadb/mysql-test/r/type_datetime_hires.result	2018-02-26 16:10:37.771159137 +0000
      +++ /home/travis/build/ottok/mariadb/mysql-test/r/type_datetime_hires.reject	2018-02-26 16:38:32.047346337 +0000
      @@ -15,14 +15,14 @@
       0000-00-00 00:00:00.000
       2010-12-11 00:20:03.123
       2010-12-11 01:02:03.456
      -2010-12-11 03:04:05.789
      +2010-12-11 03:04:05.785
       2010-12-11 15:47:11.123
       select truncate(a, 6) from t1;
       truncate(a, 6)
       0.000000
       20101211002003.120000
       20101211010203.457031
      -20101211030405.790000
      +20101211030405.785000
       20101211154711.120000
       select a DIV 1 from t1;
       a DIV 1
      @@ -33,21 +33,21 @@
       20101211154711
       select group_concat(distinct a) from t1;
       group_concat(distinct a)
      -0000-00-00 00:00:00.000,2010-12-11 00:20:03.123,2010-12-11 01:02:03.456,2010-12-11 03:04:05.789,2010-12-11 15:47:11.123
      +0000-00-00 00:00:00.000,2010-12-11 00:20:03.123,2010-12-11 01:02:03.456,2010-12-11 03:04:05.785,2010-12-11 15:47:11.123
       alter table t1 engine=innodb;
       select * from t1 order by a;
       a
       0000-00-00 00:00:00.000
       2010-12-11 00:20:03.123
       2010-12-11 01:02:03.456
      -2010-12-11 03:04:05.789
      +2010-12-11 03:04:05.785
       2010-12-11 15:47:11.123
       select * from t1 order by a+0;
       a
       0000-00-00 00:00:00.000
       2010-12-11 00:20:03.123
       2010-12-11 01:02:03.456
      -2010-12-11 03:04:05.789
      +2010-12-11 03:04:05.785
       2010-12-11 15:47:11.123
       drop table t1;
       create table t1 (a datetime(4)) engine=innodb;
       
      select_pkeycache                    w4 [ fail ]
              Test ended at 2018-02-26 16:39:27
       
      CURRENT_TEST: main.select_pkeycache
      --- /home/travis/build/ottok/mariadb/mysql-test/r/select_pkeycache.result	2018-02-26 16:10:37.735158872 +0000
      +++ /home/travis/build/ottok/mariadb/mysql-test/r/select_pkeycache.reject	2018-02-26 16:39:27.235746730 +0000
      @@ -2133,7 +2133,6 @@
       wss_type
       select wss_type from t1 where wss_type ='102935229216544093';
       wss_type
      -102935229216544093
       select wss_type from t1 where wss_type =102935229216544093;
       wss_type
       102935229216544093
       
      select                              w1 [ fail ]
              Test ended at 2018-02-26 16:41:12
       
      CURRENT_TEST: main.select
      --- /home/travis/build/ottok/mariadb/mysql-test/r/select.result	2018-02-26 16:10:37.735158872 +0000
      +++ /home/travis/build/ottok/mariadb/mysql-test/r/select.reject	2018-02-26 16:41:12.140507768 +0000
      @@ -2133,7 +2133,6 @@
       wss_type
       select wss_type from t1 where wss_type ='102935229216544093';
       wss_type
      -102935229216544093
       select wss_type from t1 where wss_type =102935229216544093;
       wss_type
       102935229216544093
      

      These are all related failures in mysys/dtoa.c when converting from a string to a floating point number. For large values there seems to be a loss of precision.
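
To illustrate why a value of this size is sensitive to the string-to-double conversion, here is a minimal Python sketch. It does not exercise mysys/dtoa.c; it relies on Python's own correctly rounded conversions and assumes that the comparison between the integer column and the quoted literal is carried out on double-precision values. The variable names are illustrative only.

```python
# The literal from the failing main.select / main.select_pkeycache hunks.
text = "102935229216544093"
exact = int(text)

# A double has a 53-bit significand, so not every integer above 2**53
# can be represented exactly.
assert exact > 2**53

from_string = float(text)    # string -> double (correctly rounded in CPython)
from_integer = float(exact)  # integer -> double (correctly rounded in CPython)

# Neither conversion is exact, but both must round to the same double for
# a comparison such as wss_type = '102935229216544093' to find the row.
rounding_error = int(from_string) - exact
matches = from_string == from_integer

print(f"rounding error: {rounding_error}, comparison matches: {matches}")
```

A string-to-double routine that lands on a neighbouring double instead would make the two values differ, which would be consistent with the row disappearing from the first query in the diffs above.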

      The full list of test failures is:
      main.type_datetime_hires main.select_pkeycache main.select main.select_jcl6 main.type_float main.func_str main.type_time_hires main.type_timestamp_hires

          Activity

            cvicentiu Vicențiu Ciorbaru created issue -
            elenst Elena Stepanova made changes -
            Field Original Value New Value
            Labels Tests
            elenst Elena Stepanova made changes -
            Fix Version/s 10.2 [ 14601 ]
            Fix Version/s 10.3 [ 22126 ]
            danblack Daniel Black added a comment -

            I have a feeling the cause of this was the same as what I was investigating in MDEV-14419.
            elenst Elena Stepanova made changes -
            Labels Tests
            elenst Elena Stepanova made changes -
            Component/s Tests [ 10800 ]
            otto Otto Kekäläinen made changes -
            teodor Teodor Mircea Ionita (Inactive) made changes -
            Assignee Vicentiu Ciorbaru [ cvicentiu ] Teodor Mircea Ionita [ teodor ]

            teodor Teodor Mircea Ionita (Inactive) added a comment - edited

            I ran the following:

            time watch -bde "./mtr --force --max-test-fail=10 --parallel=4 --mysqld="--thread_stack=500000" main.type_datetime_hires main.select_pkeycache main.select main.select_jcl6 main.type_float main.func_str main.type_time_hires main.type_timestamp_hires"

            real	244m49.072s
            user	198m23.313s

            This was on the latest 10.3 Release build with -O3. Each full run takes around 9 seconds, which amounts to approximately 1632 runs. No failures so far. Clang is Apple LLVM version 9.1.0 (clang-902.0.39.1).

            Maybe try on 10.2? Should I be using some extra flags? Maybe also try with Clang on Linux?
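
As a sanity check on the run count quoted above, the arithmetic can be reproduced in a few lines of Python. The 9-second figure is the approximate per-run time given in the comment, so the result is only an estimate.

```python
# Wall-clock time reported by `time`: 244m49.072s
elapsed_seconds = 244 * 60 + 49.072

# Approximate duration of one full run of the eight tests, per the comment
seconds_per_run = 9

estimated_runs = round(elapsed_seconds / seconds_per_run)
print(estimated_runs)
```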

            teodor Teodor Mircea Ionita (Inactive) added a comment -

            No-show for branch 10.2 too:

            [10.2|CHERRY-PICKING|●11] $ time watch -bde "./mtr --force --max-test-fail=10 --parallel=4 --mysqld="--thread_stack=500000" main.type_datetime_hires main.select_pkeycache main.select main.select_jcl6 main.type_float main.func_str main.type_time_hires main.type_timestamp_hires"

            real	220m51.917s
            user	140m0.388s
            sys	145m17.313s

            Going to set up a build on Ubuntu 16.04 LTS next with Clang 5 (what Travis uses) and test on that in the same manner. Unfortunately, the macOS builds on Travis haven't been succeeding for a long while now; maybe it would be worth looking at MDEV-15778 sometime before 10.3 GA.

            otto Otto Kekäläinen added a comment -

            teodor You could try running an Ubuntu 14.04 virtual machine and reproduce it there. It should be very easy to reproduce what Travis-CI does, as the setup and logs are fully public and define the entire environment.

            teodor Teodor Mircea Ionita (Inactive) added a comment -

            otto I was just doing that, replicating the environment from Travis, only with 16.04 since I have it readily available; it only needed an apt upgrade. If I get a no-show on that too, I will set up a 14.04 then.

            cvicentiu Vicențiu Ciorbaru added a comment -

            teodor What cmake configure line are you running? Make sure to mimic the one from Travis. I recall reproducing this required -DCMAKE_BUILD_TYPE=RelWithDebInfo.

            Also, this is not a race condition; it was always reproducible, so there's no point in repeating a test if it's not reproducible on the first run.
            otto Otto Kekäläinen made changes -
            Attachment travis failing build.png [ 45540 ]

            otto Otto Kekäläinen added a comment -

            See screenshot: this is the build that contains this permanently failing test.
            otto Otto Kekäläinen added a comment - - edited

            @teodor I have protected branches enabled in my own repository, so you can always compare the code to the 10.3 I have to see what the code looked like before it started to error (assuming the problem is the code, not an underlying dependency that updated and introduced this): https://travis-ci.org/ottok/mariadb/branches

            (Protected branch: My own 10.3 branch will always be green, as I cannot push on it any commits that did not pass Travis. My work happens in ok-* branches that may fail occasionally, and because of this bug currently always.)


            teodor Teodor Mircea Ionita (Inactive) added a comment -

            Dumping the results I have so far with cross-compiler testing:

                * Only repro with clang4 on Ubuntu 14.04 and 16.04
                    * clang version 4.0.1-svn305264-1~exp1 (branches/release_40); 14.04
                    * clang version 4.0.0-1ubuntu1~16.04.2 (tags/RELEASE_400/rc1)
                * No show:
                    * macOS 10.13 clang-9.1
                    * On 14.04:
                        * clang3.3-9,5
                        * gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4
                        * gcc 4.4
                    * On 16.04:
                        * gcc-5.4 (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
                        * gcc-4.7 (Ubuntu/Linaro 4.7.4-3ubuntu12) 4.7.4
                        * clang version 5.0.0-3~16.04.1 (tags/RELEASE_500/final)
                    * On 17.10:
                        * gcc-7 (Ubuntu 7.2.0-8ubuntu3.2) 7.2.0
                        * gcc-6 (Ubuntu 6.4.0-8ubuntu1) 6.4.0 20171010

            Working on disabling the affected tests for the affected clang version in Travis. Another option, suggested by otto, is to just drop support for this version.
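
One way to act on this matrix when deciding whether to skip the affected tests is to look at the major version the compiler reports. The Python sketch below is hypothetical: the helper name `is_affected_clang` and the idea of parsing a `clang --version` banner are assumptions made for illustration, not what the Travis configuration in this thread actually does. The sample strings are the version banners quoted in the comments above.

```python
import re

# Matches upstream-style banners such as "clang version 4.0.1-svn305264-1~exp1 ...".
# Apple's "Apple LLVM version ..." banner deliberately does not match.
_CLANG_BANNER = re.compile(r"\bclang version (\d+)\.")


def is_affected_clang(version_banner: str) -> bool:
    """Return True if the banner reports upstream clang 4.x (hypothetical helper)."""
    match = _CLANG_BANNER.search(version_banner)
    return match is not None and int(match.group(1)) == 4


banners = [
    "clang version 4.0.1-svn305264-1~exp1 (branches/release_40)",
    "clang version 4.0.0-1ubuntu1~16.04.2 (tags/RELEASE_400/rc1)",
    "clang version 5.0.0-3~16.04.1 (tags/RELEASE_500/final)",
    "Apple LLVM version 9.1.0 (clang-902.0.39.1)",
]

# One boolean per banner, in the same order as the list above.
affected = [is_affected_clang(banner) for banner in banners]
print(affected)
```

Keying on the major version alone is a simplification; it treats every 4.x build as affected, which matches the two reproducing versions listed but has not been checked against other 4.x builds.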
            danblack Daniel Black added a comment -

            If you want to try clang for the main branches there's a repo and packages: http://apt.llvm.org/.

            Is https://github.com/MariaDB/server/pull/505/commits/9fce41be75a1620f11bdcbfd305c4ede1919ae16 the workaround you need?

            Note "cross-compiler" has a build/target architecture difference which isn't the same as what you are doing.


            teodor Teodor Mircea Ionita (Inactive) added a comment -

            That could work; however, here is the alternative otto mentioned:

            https://travis-ci.org/shinnok/server/builds/382495667
            https://github.com/shinnok/server/commit/1b4fc3985dd368e2fab92c930f2a97c7d3c5837d
            http://apt.llvm.org/trusty/pool/main/l/ - has clang6 too

            Dropping clang4 makes the config a tad cleaner, and the VERSION no. can be the same to keep gcc on par with clang (we have the issue recorded in Jira after all). I would even go one step further and do this also:

            https://github.com/MariaDB/server/commit/8e6f1b9f1e555cec2faaa14c950984de4e1be5bc

            Just to make the build setup less confusing. See example here:

            https://travis-ci.org/shinnok/server/builds/382504284

            cvicentiu Vicențiu Ciorbaru added a comment -

            Compiler bug within clang 4
            cvicentiu Vicențiu Ciorbaru made changes -
            Fix Version/s 10.2.17 [ 23111 ]
            Fix Version/s 10.2.18 [ 23112 ]
            Fix Version/s 10.2 [ 14601 ]
            Fix Version/s 10.3 [ 22126 ]
            Resolution Not a Bug [ 6 ]
            Status Open [ 1 ] Closed [ 6 ]

            otto Otto Kekäläinen added a comment -

            I removed the skiplists for these tests in https://salsa.debian.org/mariadb-team/mariadb-10.5/-/commit/bde2cf481fa48a0dd85b9ad40e27ad5005ad1122 as we nowadays only run the main test suite as part of the builds (see debian/rules).
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 85769 ] MariaDB v4 [ 153872 ]

            People

              teodor Teodor Mircea Ionita (Inactive)
              cvicentiu Vicențiu Ciorbaru