Details
- Type: New Feature
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Fix Version/s: 1.1.2
- Component/s: None
- Labels: None
- Sprint: 2017-25, 2018-01, 2018-02, 2018-03, 2018-04, 2018-05, 2018-06, 2018-07
Description
We should support a data adapter that bridges Spark (both Scala and PySpark) to ColumnStore. The intended use case is publishing ML results to ColumnStore, both as the system of record for those results and to enable easier consumption of that data with SQL alongside other data already stored in MariaDB.
Broadly speaking, the goal is to take a DataFrame object and serialize it to a ColumnStore table using mcsapi. This requires new code to bridge the Spark world to mcsapi. The first implementation can assume that an appropriate table already exists, but it would be valuable to create or adapt code that generates suitable ColumnStore CREATE TABLE statements, to be run as stage one before writing the data. A minimal sketch of both stages follows.
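The sketch below is one possible shape for the adapter, assuming PySpark and the Python binding of mcsapi (pymcsapi). The database and table names, the helper function names, and the Spark-to-ColumnStore type mapping are illustrative assumptions, not part of this issue; a real implementation would need a complete type matrix and better error handling.

```python
# Sketch only: names and type mapping below are hypothetical.
import pymcsapi
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType

# Hypothetical mapping from Spark SQL types to ColumnStore column types.
SPARK_TO_CS = {
    "IntegerType": "INT",
    "LongType": "BIGINT",
    "DoubleType": "DOUBLE",
    "StringType": "VARCHAR(255)",
    "BooleanType": "TINYINT",
}

def create_table_statement(database, table, schema: StructType) -> str:
    """Stage 1: derive a ColumnStore CREATE TABLE statement from a
    DataFrame schema, to be executed through a regular MariaDB connection."""
    cols = ", ".join(
        "%s %s" % (f.name, SPARK_TO_CS.get(type(f.dataType).__name__, "TEXT"))
        for f in schema.fields
    )
    return ("CREATE TABLE IF NOT EXISTS %s.%s (%s) ENGINE=columnstore"
            % (database, table, cols))

def export_dataframe(df, database, table):
    """Stage 2: serialize the DataFrame into an existing ColumnStore
    table via the mcsapi bulk insert interface."""
    driver = pymcsapi.ColumnStoreDriver()
    bulk = driver.createBulkInsert(database, table, 0, 0)
    try:
        # toLocalIterator() streams partitions to the Spark driver one at
        # a time, avoiding a full collect() of the DataFrame into memory.
        for row in df.toLocalIterator():
            for i, value in enumerate(row):
                bulk.setColumn(i, value)
            bulk.writeRow()
        bulk.commit()
    except Exception:
        bulk.rollback()
        raise

if __name__ == "__main__":
    spark = SparkSession.builder.appName("cs-export-sketch").getOrCreate()
    df = spark.createDataFrame([(1, 0.87), (2, 0.34)], ["id", "score"])
    print(create_table_statement("ml_results", "scores", df.schema))
    export_dataframe(df, "ml_results", "scores")
```

The split reflects how mcsapi works: bulk inserts bypass the MariaDB SQL layer and write directly into ColumnStore, so the row data goes through mcsapi, while DDL such as CREATE TABLE still has to run through an ordinary MariaDB connection.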