[MCOL-4084] Integrated release builds cause random results for some queries Created: 2020-06-18  Updated: 2023-10-25  Resolved: 2023-10-25

Status: Closed
Project: MariaDB ColumnStore
Component/s: Storage Manager
Affects Version/s: 1.5.2
Fix Version/s: Icebox

Type: Bug Priority: Major
Reporter: Patrick LeBlanc (Inactive) Assignee: Leonid Fedorov
Resolution: Not a Bug Votes: 0
Labels: None

Attachments: Text File buildWarningsStrictAliasing.txt    
Issue Links:
Problem/Incident
causes MCOL-4073 DBT3 query #3 returns incorrect results Closed
causes MCOL-4074 DBT3 query #4 returns incorrect results Closed
causes MCOL-4075 DBT3 query #5 returns incorrect result Closed
causes MCOL-4076 DBT3 query #7 returns incorrect results Closed
causes MCOL-4077 DBT3 query #9 returns incorrect results Closed
causes MCOL-4078 DBT3 query #10 returns incorrect results Closed
causes MCOL-4079 DBT3 query #13 returns incorrect results Closed
causes MCOL-4080 DBT3 query #14 returns incorrect results Closed
causes MCOL-4081 DBT3 query #16 returns incorrect results Closed
causes MCOL-4082 DBT3 query #19 returns incorrect results Closed
Sub-Tasks:
Key        Summary                                 Type      Status  Assignee
MCOL-4075  DBT3 query #5 returns incorrect result  Sub-Task  Closed

 Description   

We're not sure why yet, but an integrated build (CS is built as part of the server build) with CMAKE_BUILD_TYPE=Release causes some queries with joins and aggregations to return random wrong results.

The easiest reproduction is SELECT COUNT(*) FROM LINEITEM, ORDERS WHERE L_ORDERNUM=O_ORDERNUM

The correct result is the number of rows in the LINEITEM table. When reproducing, you will see a much smaller number.
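
To illustrate why the expected count equals the LINEITEM row count, here is a minimal sketch. It assumes every LINEITEM row references exactly one ORDERS row and that the order key is unique in ORDERS, so an inner join on that key emits one row per LINEITEM row. The function name and the in-memory vectors are hypothetical and stand in for the real tables; this is not ColumnStore code.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Counts the rows an inner equi-join on the order key would produce,
// assuming order keys are unique in ORDERS (hypothetical in-memory model).
inline std::size_t join_count(const std::vector<std::int64_t>& lineitem_order_keys,
                              const std::vector<std::int64_t>& orders_keys)
{
    // Build side: the set of order keys present in ORDERS.
    const std::unordered_set<std::int64_t> orders(orders_keys.begin(), orders_keys.end());

    // Probe side: each LINEITEM row matches at most one ORDERS row.
    std::size_t matches = 0;
    for (std::int64_t key : lineitem_order_keys)
    {
        if (orders.count(key) != 0)
            ++matches;
    }
    return matches;
}
```

When every LINEITEM key is present in ORDERS, join_count returns the size of the LINEITEM input, so any smaller result from the real query points at lost matches rather than at the data.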

Switching the build type to RelWithDebInfo makes it work, as does a non-integrated build (CS repo is built separately from the server).

For now I've set our ci/cd system to do RelWithDebInfo builds so we don't create broken pkgs.

The goal of this top-level ticket is to find and fix the cause of the problem with integrated release builds.

Existing tickets fixed by this build type change can be made subtasks of this ticket. I would suggest that if the build type change fixes an existing ticket, that ticket can also be marked fixed. Such tickets would be considered symptoms of this ticket, not separate problems.



 Comments   
Comment by Gregory Dorman (Inactive) [ 2020-06-19 ]

Isn't this supposed to be "closed"?

Comment by Sergei Golubchik [ 2020-06-19 ]

As far as I understand, this ticket is about fixing the actual problem with Release builds, not about avoiding it by using RelWithDebInfo.

If some gcc optimization causes this bug to appear and the workaround is to reduce the optimization level, then it's almost certain you'll see this bug again on a newer gcc version or on a different compiler.

Comment by Daniel Lee (Inactive) [ 2020-06-19 ]

This issue caused DBT3 queries to return incorrect results (see "causes" in "issue links" above). I have closed the tickets for the DBT3 queries. Once this issue is fixed, I have an MTR test case to validate it.

Comment by Ben Thompson (Inactive) [ 2020-08-10 ]

The difference in build flags appears to be that the "Release" build type does not use the -fno-strict-aliasing option. With strict-aliasing warnings enabled, the build throws numerous warnings within columnstore and a few other places in integrated builds. Should we consider building the Release build type with the -fno-strict-aliasing option as well?

Attaching every warning for strict aliasing in my local build:
buildWarningsStrictAliasing.txt
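
For context, a minimal sketch of the kind of code that strict-aliasing warnings usually point at, along with a well-defined alternative. GCC enables -fstrict-aliasing at -O2 and above, which lets the optimizer assume that pointers to unrelated types never refer to the same object. The helpers below (float_bits, read_u64) are hypothetical illustrations and are not taken from the ColumnStore sources or from the attached warning list. The float test values assume IEEE 754 single precision.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Pattern that -Wstrict-aliasing typically flags (shown as a comment only,
// because its behaviour is undefined):
//
//     float f = 1.0f;
//     std::uint32_t bits = *reinterpret_cast<std::uint32_t*>(&f);
//
// Well-defined alternative: copy the object representation with memcpy.
inline std::uint32_t float_bits(float value)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits wide");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// Same idea for reading a 64-bit value from a raw byte buffer at any offset,
// instead of casting the byte pointer to std::uint64_t*.
inline std::uint64_t read_u64(const unsigned char* buf, std::size_t len, std::size_t offset)
{
    // Bounds check; compiled out when NDEBUG is defined.
    assert(buf != nullptr && offset + sizeof(std::uint64_t) <= len);
    std::uint64_t value = 0;
    std::memcpy(&value, buf + offset, sizeof value);
    return value;
}
```

Compilers generally turn a fixed-size memcpy like this into a plain load, so rewriting the flagged sites this way need not cost performance, whereas -fno-strict-aliasing only tells the optimizer to stop relying on the rule.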

Generated at Thu Feb 08 02:47:40 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.