MariaDB ColumnStore / MCOL-6033

Increase cpimport batch size up to the current max size of an Extent, which is 8,000,000 rows


Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: 23.10.6
    • Component/s: None
    • Labels: None

Description

cpimport is a standalone binary for batch data ingestion. It has multiple modes of operation. In mode 1, cpimport reads data either locally or from S3, breaks the data into batches, and tries to distribute the batches equally across all dbroots (units of the storage layer available in the cluster, described in /etc/columnstore/Columnstore).
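
For illustration, a minimal mode-1 load might look like the following sketch (mydb, mytable, and the file path are placeholder names; -m selects the mode of operation):

    cpimport -m 1 mydb mytable /path/to/data.csv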

cpimport has a `-q` parameter that controls the size of the batch. As of now, the maximum batch size is hardcoded to 10,000 rows, which is far smaller than the logical storage unit called an Extent, which holds 8,000,000 rows. If a user supplies pre-sorted data, the ordering of that data will be mostly lost: consecutive 10,000-row batches are dispatched to different dbroots, so the rows that end up filling a single Extent come from many non-adjacent batches. The maximum batch size must be made equal to the size of the Extent.
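
Once the cap is raised, a user loading pre-sorted data could keep an Extent's worth of rows together by setting the batch size to the Extent size, as in this sketch (placeholder names again; under the current hardcoded cap the same command would be limited to 10,000 rows per batch):

    cpimport -m 1 -q 8000000 mydb mytable /path/to/sorted_data.csv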

Also recommended: a documentation change about pre-sorted ranges, explaining that loading with a batch size of less than 8,000,000 rows breaks up Extent ranges.


People

    Kristina Pavlova (kristina)
    Roman (drrtuy)
    Roman
    Aleksei Bukhalov

    Votes: 0
    Watchers: 2

Dates

    Created:
    Updated:
    Resolved:
