  1. MariaDB ColumnStore
  2. MCOL-3579

Manually set distribution key

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Do

    Description

      We want manual control over the distribution key in a multi-PM layout, so that a JOIN between two large tables (undesirable, but sometimes unavoidable in analytics) can be pushed down without redistribution. This should work without manually splitting the data and loading it with cpimport; such a split is not supported by cpimport's modern replacement, the bulk load API.

      The distribution key should be set in the CREATE TABLE statement: either by using the PARTITION BY clause (which, AFAIK, is currently unused in ColumnStore), by adding a new keyword, or simply by specifying the key in a comment (a technique long used by the Spider engine, for example).
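
      To illustrate the comment-based option, here is a minimal sketch of how a loader might extract the key from a table comment. The `distribution_key=` token, the example table, and the function name `parse_distribution_key` are all hypothetical; ColumnStore defines no such syntax today.

```python
import re

# Hypothetical DDL: the "distribution_key=..." comment token is an assumed
# convention for this sketch, not existing ColumnStore syntax.
EXAMPLE_DDL = """
CREATE TABLE orders (
    order_id BIGINT,
    customer_id BIGINT,
    amount DECIMAL(12,2)
) ENGINE=ColumnStore COMMENT='distribution_key=customer_id';
"""

# One column name, optionally followed by more comma-separated names.
_KEY_RE = re.compile(
    r"distribution_key\s*=\s*([A-Za-z0-9_]+(?:\s*,\s*[A-Za-z0-9_]+)*)"
)


def parse_distribution_key(table_comment):
    """Return the key columns named in a table comment, or [] if none."""
    match = _KEY_RE.search(table_comment)
    if match is None:
        return []  # no key given: the caller keeps the default distribution
    return [col.strip() for col in match.group(1).split(",")]
```

      An empty result would signal that the table keeps the current default behaviour.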

      The distribution key should be either a single column or a list of columns. If no key is specified, the current method should be retained (a full hash of the row, I think).
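
      A minimal sketch of how a row could be routed to a PM under this rule, assuming hash-based placement. The function name `pm_for_row`, the use of SHA-256, and the dict-based row format are illustrative assumptions, not ColumnStore internals.

```python
import hashlib


def pm_for_row(row, key_columns, pm_count):
    """Pick a 0-based PM index for a row given as a dict of column -> value."""
    # With no explicit key, fall back to hashing every column, mirroring the
    # "full hash of the row" default that the description assumes.
    columns = key_columns if key_columns else sorted(row)
    # Join values with a separator unlikely to appear in the data, so that
    # ("ab", "c") and ("a", "bc") produce different payloads.
    payload = "\x1f".join(str(row[col]) for col in columns).encode("utf-8")
    digest = hashlib.sha256(payload).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % pm_count
```

      Because only the key columns feed the hash, rows from different tables that share the same key value land on the same PM, which is what makes a local JOIN possible.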

      It is OK to not support changing the distribution key via ALTER TABLE as this could be a lengthy process; alternatives include dump, drop and re-import.

      If a custom distribution key is set, it should be observed at least by the new bulk load API and its client library/bindings/utilities. Not backporting to cpimport could be OK.

      Changes will have to be made to the metadata storage (so that the distribution key would have an explicit definition) and to the import facilities (which will need to read and use the specified key).

      This feature implicitly requires that data redistribution on JOIN be turned off if both tables are partitioned by the same key. A similar requirement and mechanism for replicated tables is described in MCOL-3258.
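
      The check implied here could look roughly like the sketch below. It assumes both tables are hashed with the same function over the same number of PMs and that matching key columns have compatible types; `join_is_colocated` and its arguments are hypothetical names, not part of the ColumnStore planner.

```python
def join_is_colocated(left_key, right_key, join_pairs):
    """Return True if an equi-join needs no redistribution.

    left_key / right_key: ordered lists of distribution key columns.
    join_pairs: set of (left_column, right_column) equality conditions.
    """
    if not left_key or not right_key or len(left_key) != len(right_key):
        return False  # a table without an explicit key is never co-located
    # Every key column must be equated with the key column in the same
    # position on the other side; extra join conditions are harmless.
    return all(pair in join_pairs for pair in zip(left_key, right_key))
```

      For example, `orders` keyed on `customer_id` joined to `customers` keyed on `id` with the condition `orders.customer_id = customers.id` would qualify, while a join on any non-key column would still require redistribution.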

            People

              toddstoffel Todd Stoffel (Inactive)
              assen.totin Assen Totin (Inactive)
              Votes: 0
              Watchers: 3
