Details
- Type: New Feature
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Fix Version/s: 1.1.2
- Component/s: None
- Labels: None
- Sprint: 2017-25, 2018-01, 2018-02, 2018-03, 2018-04, 2018-05, 2018-06, 2018-07
Description
We should support a data adapter that bridges Spark (both Scala and PySpark) to ColumnStore. The intended use case is publishing ML results to ColumnStore, both as the system of record for those results and to enable easier consumption of that data with SQL alongside other data already stored in MariaDB.
Broadly speaking, the goal is to take a DataFrame object and serialize it to a ColumnStore table using mcsapi. This requires new code to bridge the Spark world to mcsapi. The first implementation can assume that an appropriate table already exists, but it would be valuable to create or adapt code that generates suitable ColumnStore CREATE TABLE statements, to be run as stage one before writing the data. A minimal sketch of both stages follows.
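The sketch below is one possible shape for the adapter, assuming PySpark and the Python binding of mcsapi (pymcsapi). The database and table names, the helper function names, and the Spark-to-ColumnStore type mapping are illustrative assumptions, not part of this issue; a real implementation would need a complete type matrix and better error handling.

```python
# Sketch only: names and type mapping below are hypothetical.
import pymcsapi
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType

# Hypothetical mapping from Spark SQL types to ColumnStore column types.
SPARK_TO_CS = {
    "IntegerType": "INT",
    "LongType": "BIGINT",
    "DoubleType": "DOUBLE",
    "StringType": "VARCHAR(255)",
    "BooleanType": "TINYINT",
}

def create_table_statement(database, table, schema: StructType) -> str:
    """Stage 1: derive a ColumnStore CREATE TABLE statement from a
    DataFrame schema, to be executed through a regular MariaDB connection."""
    cols = ", ".join(
        "%s %s" % (f.name, SPARK_TO_CS.get(type(f.dataType).__name__, "TEXT"))
        for f in schema.fields
    )
    return ("CREATE TABLE IF NOT EXISTS %s.%s (%s) ENGINE=columnstore"
            % (database, table, cols))

def export_dataframe(df, database, table):
    """Stage 2: serialize the DataFrame into an existing ColumnStore
    table via the mcsapi bulk insert interface."""
    driver = pymcsapi.ColumnStoreDriver()
    bulk = driver.createBulkInsert(database, table, 0, 0)
    try:
        # toLocalIterator() streams partitions to the Spark driver one at
        # a time, avoiding a full collect() of the DataFrame into memory.
        for row in df.toLocalIterator():
            for i, value in enumerate(row):
                bulk.setColumn(i, value)
            bulk.writeRow()
        bulk.commit()
    except Exception:
        bulk.rollback()
        raise

if __name__ == "__main__":
    spark = SparkSession.builder.appName("cs-export-sketch").getOrCreate()
    df = spark.createDataFrame([(1, 0.87), (2, 0.34)], ["id", "score"])
    print(create_table_statement("ml_results", "scores", df.schema))
    export_dataframe(df, "ml_results", "scores")
```

The split reflects how mcsapi works: bulk inserts bypass the MariaDB SQL layer and write directly into ColumnStore, so the row data goes through mcsapi, while DDL such as CREATE TABLE still has to run through an ordinary MariaDB connection.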