[MCOL-5621] Aggreate function LEAST returning wrong value Created: 2023-11-29  Updated: 2023-12-02

Status: Open
Project: MariaDB ColumnStore
Component/s: None
Affects Version/s: 23.02.3
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Magnus Månsson Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None


 Description   

I have installed Columnstore via Docker (docker pull mariadb/columnstore:23.02.3). After a week I found code that wasn't working properly and I have tracked it down to the aggregate function LEAST() in Columnstore.

I have not tested with 23.02.4 since Python is compiled with optimize flags that aren't supported by our CPUs (https://github.com/mariadb-corporation/mariadb-columnstore-docker/issues/31.

I have simplified the SQL causing the issue and tried it in InnoDB vs Columnstore engines, with different results. I have also managed to rewrite the query for Columnstore so that it actually returns a proper result.

Here is a first example, using InnoDB. The column hindex returns 1 as it should. It comes from LEAST(1, 2).

CREATE TABLE _test (user_id INT, favorite_points INT) ENGINE=InnoDB;
INSERT into _test (user_id, favorite_points) VALUES (123, 1);
INSERT into _test (user_id, favorite_points) VALUES (123, 1);
 
SELECT favorite_points, csum, LEAST(favorite_points, csum) AS hindex
FROM (
  SELECT user_id, favorite_points,
    SUM(COUNT(*)) OVER (PARTITION BY user_id ORDER BY favorite_points DESC) AS csum
  FROM _test
  GROUP BY user_id, favorite_points
) AS a
;

Second example is identical, except that I am using ENGINE=Columnstore. Now LEAST(1,2) returns 2 instead.

CREATE TABLE _test (user_id INT, favorite_points INT) ENGINE=Columnstore;
INSERT into _test (user_id, favorite_points) VALUES (123, 1);
INSERT into _test (user_id, favorite_points) VALUES (123, 1);
 
SELECT favorite_points, csum, LEAST(favorite_points, csum) AS hindex
FROM (
  SELECT user_id, favorite_points,
    SUM(COUNT(*)) OVER (PARTITION BY user_id ORDER BY favorite_points DESC) AS csum
  FROM _test
  GROUP BY user_id, favorite_points
) AS a
;

A third example where I rewrote the query by moving the LEAST() statement. This works properly.

CREATE TABLE _test (user_id INT, favorite_points INT) ENGINE=Columnstore;
INSERT into _test (user_id, favorite_points) VALUES (123, 1);
INSERT into _test (user_id, favorite_points) VALUES (123, 1);
 
SELECT favorite_points, csum, hindex
FROM (
  SELECT user_id, favorite_points,
    SUM(COUNT(*)) OVER (PARTITION BY user_id ORDER BY favorite_points DESC) AS csum,
    LEAST(favorite_points, SUM(COUNT(*)) OVER (PARTITION BY user_id ORDER BY favorite_points DESC)) AS hindex
  FROM _test
  GROUP BY user_id, favorite_points
) AS a
;

I intentionally didn't include any DROP TABLE statements above, to avoid accidents.


Generated at Thu Feb 08 02:59:13 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.