Details
-
Bug
-
Status: Open (View Workflow)
-
Major
-
Resolution: Unresolved
-
23.02.3
-
None
-
None
-
None
Description
I have installed Columnstore via Docker (docker pull mariadb/columnstore:23.02.3). After a week I found code that wasn't working properly and I have tracked it down to the aggregate function LEAST() in Columnstore.
I have not tested with 23.02.4 since Python is compiled with optimize flags that aren't supported by our CPUs (https://github.com/mariadb-corporation/mariadb-columnstore-docker/issues/31.
I have simplified the SQL causing the issue and tried it in InnoDB vs Columnstore engines, with different results. I have also managed to rewrite the query for Columnstore so that it actually returns a proper result.
Here is a first example, using InnoDB. The column hindex returns 1 as it should. It comes from LEAST(1, 2).
CREATE TABLE _test (user_id INT, favorite_points INT) ENGINE=InnoDB; |
INSERT into _test (user_id, favorite_points) VALUES (123, 1); |
INSERT into _test (user_id, favorite_points) VALUES (123, 1); |
|
|
SELECT favorite_points, csum, LEAST(favorite_points, csum) AS hindex |
FROM ( |
SELECT user_id, favorite_points, |
SUM(COUNT(*)) OVER (PARTITION BY user_id ORDER BY favorite_points DESC) AS csum |
FROM _test |
GROUP BY user_id, favorite_points |
) AS a |
;
|
Second example is identical, except that I am using ENGINE=Columnstore. Now LEAST(1,2) returns 2 instead.
CREATE TABLE _test (user_id INT, favorite_points INT) ENGINE=Columnstore; |
INSERT into _test (user_id, favorite_points) VALUES (123, 1); |
INSERT into _test (user_id, favorite_points) VALUES (123, 1); |
|
|
SELECT favorite_points, csum, LEAST(favorite_points, csum) AS hindex |
FROM ( |
SELECT user_id, favorite_points, |
SUM(COUNT(*)) OVER (PARTITION BY user_id ORDER BY favorite_points DESC) AS csum |
FROM _test |
GROUP BY user_id, favorite_points |
) AS a |
;
|
A third example where I rewrote the query by moving the LEAST() statement. This works properly.
CREATE TABLE _test (user_id INT, favorite_points INT) ENGINE=Columnstore; |
INSERT into _test (user_id, favorite_points) VALUES (123, 1); |
INSERT into _test (user_id, favorite_points) VALUES (123, 1); |
|
|
SELECT favorite_points, csum, hindex |
FROM ( |
SELECT user_id, favorite_points, |
SUM(COUNT(*)) OVER (PARTITION BY user_id ORDER BY favorite_points DESC) AS csum, |
LEAST(favorite_points, SUM(COUNT(*)) OVER (PARTITION BY user_id ORDER BY favorite_points DESC)) AS hindex |
FROM _test |
GROUP BY user_id, favorite_points |
) AS a |
;
|
I intentionally didn't include any DROP TABLE statements above, to avoid accidents.