[MCOL-521] add distributed regression aggregate and window functions Created: 2017-01-18 Updated: 2018-11-05 Resolved: 2018-11-05 |
|
| Status: | Closed |
| Project: | MariaDB ColumnStore |
| Component/s: | ExeMgr |
| Affects Version/s: | None |
| Fix Version/s: | 1.2.0 |
| Type: | New Feature | Priority: | Major |
| Reporter: | David Thompson (Inactive) | Assignee: | Patrick LeBlanc (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
| Sprint: | 2018-12, 2018-13, 2018-14, 2018-15, 2018-16, 2018-17, 2018-18 | ||||||||||||||||||||||||
| Description |
|
add support for the regr_* functions as aggregate and window functions. |
| Comments |
| Comment by David Thompson (Inactive) [ 2017-06-03 ] | |||||||
|
This has been scoped out of 1.1 due to the fact that there is currently no support for aggregate functions with multiple parameters. This would need to be added first. | |||||||
| Comment by David Hall (Inactive) [ 2018-07-31 ] | |||||||
|
For convenience, I've copied the requirements here:
Function
When the regression line is linear for dependent variable y and independent variable x, such that it can be represented by y = a + bx, the regression coefficient is the constant (b) that represents the rate of change of y with respect to x, i.e. the slope of the line:
Slope(b) = (NΣXY - (ΣX)(ΣY)) / (NΣX² - (ΣX)²)
The purpose of this feature is to support the above listed regression functions as aggregate and window functions. MariaDB ColumnStore shall support the following aggregate functions.
Example details to be added: TBD | |||||||
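As an illustration of the slope formula quoted in the requirements, here is a minimal Python sketch that accumulates the sums N, ΣX, ΣY, ΣXY and ΣX² in a single pass. The function name `regr_slope` and the (y, x) argument order mirror the SQL function, but this is not the ColumnStore implementation. Skipping pairs that contain a missing value and returning `None` when the denominator is zero are assumptions about the expected NULL behaviour, not something stated in this ticket.

```python
def regr_slope(pairs):
    """Least-squares slope b of y = a + b*x, computed from (y, x) pairs.

    Assumption: a pair is ignored when either value is None, and the
    result is None when no slope can be computed (no usable pairs, or
    all x values are equal so the denominator is zero).
    """
    n = 0
    sum_x = sum_y = sum_xy = sum_xx = 0.0
    for y, x in pairs:
        if x is None or y is None:
            continue  # skip incomplete pairs
        n += 1
        sum_x += x
        sum_y += y
        sum_xy += x * y
        sum_xx += x * x

    # Slope(b) = (N*SUM(XY) - SUM(X)*SUM(Y)) / (N*SUM(X^2) - SUM(X)^2)
    denominator = n * sum_xx - sum_x * sum_x
    if n == 0 or denominator == 0:
        return None
    return (n * sum_xy - sum_x * sum_y) / denominator


# Points on the line y = 1 + 2x, plus one incomplete pair that is skipped.
print(regr_slope([(1, 0), (3, 1), (5, 2), (None, 7)]))
```

Because the result depends only on running sums, the same accumulation can be kept per group for an aggregate or updated as a frame moves for a window function; how ColumnStore actually distributes this work is not described here.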
| Comment by David Hall (Inactive) [ 2018-10-01 ] | |||||||
|
Due to the upgrade to MariaDB 10.3, a manual merge was necessary and branch | |||||||
| Comment by Andrew Hutchings (Inactive) [ 2018-10-02 ] | |||||||
|
Unfortunately CentOS 6 buildbot failed:
| |||||||
| Comment by David Hall (Inactive) [ 2018-10-02 ] | |||||||
|
Removed the "typename". Interesting that it compiled on my CentOS 6 with that in there. | |||||||
| Comment by Daniel Lee (Inactive) [ 2018-10-11 ] | |||||||
|
Build verified: Forgot to capture git info. The build was made on the morning of Oct 10. Verified the regression*() functions using the datatypetestm and 1mb DBT3 orders tables and against Oracle 18. Test results were within expectation, with some expected variance. Also verified the same regr*() functions in windowing function syntax. Each function has 247 queries and all of them execute without any errors. The test was done on a build made today. commit 6b44f0d9c453ede53024f525b7ddf32b5323171b Merge pull request #134 from mariadb-corporation/versionCmakeFix port changes for mysql_version cmake to fix columnstore RPM packaging /root/columnstore/mariadb-columnstore-server/mariadb-columnstore-engine Merge pull request #588 from mariadb-corporation/ /root/columnstore/mariadb-columnstore-tools Merge pull request #14 from mariadb-corporation/ | |||||||
| Comment by Daniel Lee (Inactive) [ 2018-10-11 ] | |||||||
|
I checked some of the window function test results for each of the regr*() functions. There seem to be issues with regr_r2() and regr_slope(): the differences between MCS 1.2 and Oracle 18 are more than just significant digits/precision. Here is one slope example. Below are the last 10 rows of results. Oracle: columnstore: | |||||||
| Comment by Daniel Lee (Inactive) [ 2018-10-11 ] | |||||||
|
Closing this ticket. Identified issue is being tracked by | |||||||
| Comment by David Hall (Inactive) [ 2018-11-05 ] | |||||||
|
Missing the stub code for distinct_count() | |||||||
| Comment by David Hall (Inactive) [ 2018-11-05 ] | |||||||
|
This is the stub code for distinct_count. Just close this JIRA again after merge. | |||||||
| Comment by David Hall (Inactive) [ 2018-11-05 ] | |||||||
|
Missing code has been merged. There's no need for Test to be involved.