[MCOL-4719] ColumnStore ignores where clause: inline view with a window function Created: 2021-05-12  Updated: 2023-06-12  Resolved: 2021-08-04

Status: Closed
Project: MariaDB ColumnStore
Component/s: None
Affects Version/s: 5.5.1
Fix Version/s: 6.2.1, 6.2.2

Type: Bug Priority: Blocker
Reporter: Edward Stoever Assignee: Daniel Lee (Inactive)
Resolution: Fixed Votes: 1
Labels: None
Environment:

10.5.9-6-MariaDB-enterprise-log
Columnstore 5.5


Sprint: 2021-9, 2021-10

 Description   

When columnstore_select_handler=on this query should return only 2 rows. Instead it returns all rows.

create database test1;
use test1;
 
CREATE TABLE test_table (
category CHAR(1),
count INTEGER(1)
) ENGINE=COLUMNSTORE;
 
INSERT INTO test_table (category, count) VALUES ('A', 1);
INSERT INTO test_table (category, count) VALUES ('A', 2);
INSERT INTO test_table (category, count) VALUES ('B', 3);
INSERT INTO test_table (category, count) VALUES ('B', 4);
 
set columnstore_select_handler=off;
select * FROM (
SELECT count / SUM(count) OVER (PARTITION BY category) AS ratio
FROM test_table
) a
where ratio > .5;
ratio
0.6667
0.5714
 
set columnstore_select_handler=on;
select * FROM (
SELECT count / SUM(count) OVER (PARTITION BY category) AS ratio
FROM test_table
) a
where ratio > .5;
ratio
0.6667
0.3333
0.5714
0.4286



 Comments   
Comment by David Hall (Inactive) [ 2021-07-19 ]

Another, perhaps more palatable, workaround:
select * FROM ( SELECT (count / SUM(count) OVER (PARTITION BY category)) AS ratio FROM test_table ) a having ratio > .5;
--------

ratio

--------

0.6667
0.5714

--------

Comment by David Hall (Inactive) [ 2021-07-19 ]

I have run this test on 1.1, 1.2, and 1.4. All showed the same deviant behavior.

Comment by David Hall (Inactive) [ 2021-07-21 ]

Analysis:
Window Functions and aggregates aren't complete until well after filtering is done. This is why you can't use a window function in a WHERE clause. If you try, it will generate an error:

SELECT SUM(count) OVER (PARTITION BY category) AS x FROM test_table where SUM(count) OVER (PARTITION BY category) < 6;
ERROR 4015 (HY000): Window function is allowed only in SELECT list and ORDER BY clause

Optimization attempts to push filters from the outer select to the inner select. This isn't done if the filter is on a result of an aggregation or window function. When the inner select doesn't have the operator, it notices that the filter is on the window function and doesn't optimize.

But when the inner function has an operator that includes the window function, the code only sees the operator and attempts to optimize, causing an illegal filter on the result of a window function which appears to turn into a noop in the lower code.

The fix is to have the code delve into the operator (or function or whatever) to see what windows or aggregates might lurk inside and suppress the optimization accordingly.

Comment by David Hall (Inactive) [ 2021-07-22 ]

QA
I added the following two queries to mtr using the above defined table:

# a subselect with an arithmetic operator on a window function
select * FROM (SELECT count / SUM(count) OVER (PARTITION BY category) AS ratio FROM test_table) a where ratio > .5;
+--------+
| ratio  |
+--------+
| 0.6667 |
| 0.5714 |
+--------+

# a subselect with a function containing an arithmetic operator on a window function
select * FROM (SELECT round(0+SUM(count) OVER (PARTITION BY category),3) AS x FROM test_table ) a where x>3;
+-------+
| x     |
+-------+
| 7.000 |
| 7.000 |
+-------+

You should be able to create a subquery with any number of nested functions and operators with internal aggregates and window functions.

Comment by Daniel Lee (Inactive) [ 2021-08-04 ]

Build verified: 6.2.1-1 ( #2921)

Generated at Thu Feb 08 02:52:28 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.