MariaDB ColumnStore / MCOL-1360

Spark connector performs all its work on the Driver

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Duplicate
    • None
    • Icebox
    • None
    • None

    Description

      Line 25 of mariadb-columnstore-api/spark-connector/scala/src/main/scala/com/mariadb/columnstore/api/connector/ColumnStoreExporter.scala calls df.collect(), which pulls all the data onto the Spark Driver node. This runs counter to having a distributed cluster, and it forces the Driver node to have enough RAM to hold the entire dataset before any of it is sent to the ColumnStore database.
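
To make the difference concrete, here is a minimal sketch contrasting the collect-on-Driver pattern described above with a per-partition write. It is an illustration under stated assumptions, not the connector's actual code: `RowSink`, `PartitionedExport`, `exportViaDriver`, and `exportPerPartition` are hypothetical names standing in for whatever bulk-write handle the connector uses. Only `DataFrame.collect()` and `RDD.foreachPartition` are real Spark calls.

```scala
import org.apache.spark.sql.{DataFrame, Row}

// Hypothetical stand-in for a bulk-insert handle; NOT the connector's real API.
trait RowSink extends Serializable {
  def open(): Unit
  def write(row: Row): Unit
  def commit(): Unit
}

object PartitionedExport {

  // Pattern described in the issue: collect() materialises every row
  // in Driver memory before a single row is written.
  def exportViaDriver(df: DataFrame, sink: RowSink): Unit = {
    val rows: Array[Row] = df.collect()
    sink.open()
    rows.foreach(sink.write)
    sink.commit()
  }

  // Alternative sketch: each executor streams its own partition to the sink,
  // so no node has to hold the full dataset.
  // newSink is a factory so that every partition gets its own handle,
  // created on the executor that processes it.
  def exportPerPartition(df: DataFrame, newSink: () => RowSink): Unit = {
    df.rdd.foreachPartition { rows =>
      if (rows.nonEmpty) {            // skip empty partitions
        val sink = newSink()
        sink.open()
        rows.foreach(sink.write)      // rows is an iterator; nothing is buffered here
        sink.commit()
      }
    }
  }
}
```

One trade-off of the per-partition variant: each partition commits independently, so a failure partway through can leave some partitions written and others not. Whether that is acceptable depends on the transactional guarantees the target database needs.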

          People

            Assignee: Unassigned
            Reporter: Charles Coleman (ccoleman)
            Votes: 0
            Watchers: 2
