[MCOL-37] Five windowing functions that are returning incorrect results Created: 2016-05-04  Updated: 2016-11-29  Resolved: 2016-11-29

Status: Closed
Project: MariaDB ColumnStore
Component/s: None
Affects Version/s: None
Fix Version/s: 1.0.5

Type: Bug Priority: Major
Reporter: Daniel Lee (Inactive) Assignee: David Thompson (Inactive)
Resolution: Fixed Votes: 0
Labels: relnote

Sprint: 2016-22, 2016-23

 Description   

There was an open bug from Calpont time, #5309. I retested all windows functions using the latest Calpont build and confirmed they are still not working. Here is the email that I sent out to group on April 29, 2016.

Hi, All;

I just want to provide everyone with an update on couple items regarding windowing testing for ColumnStore.

1) Windowing test cases, part of the autopilot test suite, has been fixed and is now working
a) I have an Oracle 11g XE instance setup and I was able to generate reference results for a 1mg TPCH database
b) I was able to run the test on the latest InfiniDB released by MariaDB (4.6.6-2). I have yet tried MariaDB ColumnStore.

2) Status of Windowing functions support on InfiniDB 4.6.6-2 (not ColumnStore)
My test results indicated that the following 5 functions are producing incorrect results.
NTH_VALUE
FIRST_VALUE
LAST_VALUE
LEAD
LAG
I reinstalled the latest release from InfiniDB (4.6.0-1) and still got the same incorrect results. I even regenerated the Oracle reference results and that did not help. I searched the old Bugzilla database from Calpont and there was bug 5309 with an OPEN status. The target milestone for the bug was set to release 5.0, which was never released, on March 9th, 2014. Therefore, we can conclude that the above functions are currently not working correctly and they need to be fixed in ColumnStore.

Let me know if you have questions

Thanks



 Comments   
Comment by David Thompson (Inactive) [ 2016-10-12 ]

Please clarify with specific examples. I tested some simple use cases and they seemed to work ok. I assume that there are specific syntax or data use cases that might not work?

Comment by Daniel Lee (Inactive) [ 2016-10-28 ]

Build tested: 1.0.4-1 (beta1)

I did another round of tests and the 5 mentioned windowing functions still producing the non-matching results compared to Oracle's results.

The test uses a TPCH 1 mb dataset. For each of the 5 functions, there are lots of queries for the test and many of them failed to match. Below are the first failed query for each of the 5 functions.

select o_custkey, First_value(o_orderkey) OVER ( ORDER BY o_custkey ASC ROWS 15 PRECEDING) from orders;
select o_custkey, Lag(o_orderkey,1) OVER (PARTITION BY abs(o_custkey)+10 ORDER BY o_custkey ) from orders;
select o_custkey, Last_value(o_orderkey) OVER ( ORDER BY o_custkey ASC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) from orders;
select o_custkey, Lead(o_orderkey,1) OVER (PARTITION BY abs(o_custkey)+10 ORDER BY o_custkey ) from orders;
select o_custkey, Nth_value(o_custkey,2) OVER ( ) from orders;

I think we can start with investigating into these 5 queries. If issues found and fixed, I can run the test again to see if all other queries will be fixed.

When testing, one thing to be careful is that the order of the returned values could be different between Oracle and ColumnStore. We have been sorting both sets of results for comparison.

Let me know if additional info is needed.

Comment by David Thompson (Inactive) [ 2016-11-08 ]

A problem here is that the results are going to be inconsistent between 2 reference systems and even within different instances of columnstore. This is because order by in a window function is generally supposed to be performed on a meaningfully ordered column that will provide a consistent order. By ordering on the customer key this is not meaningfully ordered since it's not at all unique and so the source data may be in arbitrary order resulting in different but valid results.

For testing i'd suggest we order by orderKey to be completely safe. Partitioning over custKey is a good example.

Comment by David Thompson (Inactive) [ 2016-11-15 ]

Moving to in test to verify, no code changes were made.https://mariadb.com/kb/en/mariadb/columnstore-window-functions/ documents behavior.

Generated at Thu Feb 08 02:18:02 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.