  1. MariaDB ColumnStore
  2. MCOL-608

INSERT converted to bulk load


Details

    • New Feature
    • Status: Closed (View Workflow)
    • Minor
    • Resolution: Won't Fix
    • None
    • Icebox
    • cpimport
    • None

    Description

      One advantage of ColumnStore is that it allows standard DML in addition to bulk loading. The issue is that DML is still very slow. Many application developers probably appreciate being able to run DML instead of using the bulk loading facilities: data might be streamed, or there may be some other reason that makes classic bulk loading cumbersome. I thus came up with the idea of using MaxScale to convert INSERTs to bulk loads. The way I see it working is this:
      1. All SQL sent to ColumnStore passes through MaxScale.
      2. Any SELECT, UPDATE or DELETE is passed directly to ColumnStore.
      3. INSERTs are passed to a module that writes the data in the INSERT to a file, named in some way after the table being inserted into and in a cpimport-friendly format, so that each table has a separate file.
      4. The file is rotated when a limit is reached on the number of records in the file, the time since the first record was placed in the file, or the size of the file, whichever comes first. On rotation, the file is renamed using some sequence and a new file is created.
      5. After the switch has taken place, an external script is run asynchronously, with the filename as an argument.
      6. The script will then run cpimport, or whatever else actually, to import the data into ColumnStore.
      7. The script should also support doing a distributed load somehow, assuming that the file MaxScale creates is shared across the nodes.
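
      Steps 3 to 6 above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a MaxScale module: the names TableSpool, format_field and cpimport_command, the file naming scheme, the default thresholds and the pipe delimiter are all hypothetical choices, and parsing the INSERT statement itself is left out. The command line assumes the basic "cpimport db table file" form.

```python
import os
import time
from pathlib import Path


def format_field(value):
    # Assumption: NULL is written as an empty field. Escaping of the
    # delimiter inside values is not handled in this sketch.
    return "" if value is None else str(value)


def cpimport_command(database, table, path):
    # Assumed invocation shape: cpimport <db> <table> <file>.
    return ["cpimport", database, table, str(path)]


class TableSpool:
    """Buffers rows for one table in a delimited file and rotates it."""

    def __init__(self, directory, table, on_rotate, max_rows=10000,
                 max_age_s=5.0, max_bytes=1 << 20, clock=time.monotonic):
        self.directory = Path(directory)
        self.table = table
        self.on_rotate = on_rotate      # called with (table, rotated_path)
        self.max_rows = max_rows
        self.max_age_s = max_age_s
        self.max_bytes = max_bytes
        self.clock = clock
        self.active = self.directory / f"{table}.tbl"
        self.seq = 0
        self.rows = 0
        self.first_row_at = None

    def append(self, values):
        # Step 3: write one row per line to the per-table file.
        line = "|".join(format_field(v) for v in values) + "\n"
        with open(self.active, "a", encoding="utf-8", newline="\n") as f:
            f.write(line)
        if self.rows == 0:
            self.first_row_at = self.clock()
        self.rows += 1
        self.maybe_rotate()

    def maybe_rotate(self):
        # Step 4: rotate on row count, age of the first row, or file size,
        # whichever limit is reached first. Call this from a timer as well,
        # so that a quiet table is still flushed when it gets too old.
        if self.rows == 0:
            return None
        too_many = self.rows >= self.max_rows
        too_old = self.clock() - self.first_row_at >= self.max_age_s
        too_big = self.active.stat().st_size >= self.max_bytes
        if too_many or too_old or too_big:
            return self.rotate()
        return None

    def rotate(self):
        # Rename using a sequence number; the next append creates a new file.
        self.seq += 1
        rotated = self.directory / f"{self.table}.{self.seq:06d}.tbl"
        os.replace(self.active, rotated)
        self.rows = 0
        self.first_row_at = None
        # Step 5: hand the rotated file to the loader.
        self.on_rotate(self.table, rotated)
        return rotated
```

      In a real setup, on_rotate would presumably start the external script without waiting for it, for example with subprocess.Popen(cpimport_command("mydb", table, path)), so that INSERT handling is not blocked while the load runs. Locking, crash recovery and the distributed load in step 7 are not covered here.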

      This would allow a user to run standard INSERTs at bulk-load speed without code changes. Admittedly, the load would be asynchronous, but the performance benefits should outweigh this many times over.


          People

            toddstoffel Todd Stoffel (Inactive)
            karlsson Anders Karlsson

