Uploaded image for project: 'MariaDB ColumnStore'
  1. MariaDB ColumnStore
  2. MCOL-4000

Add an option to allow using cpimport for LDI and INSERT..SELECT in a transaction

Details

    • Task
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 1.4.3
    • 1.4.4
    • MDB Plugin
    • None
    • 2020-7

    Description

      CS has a session variable, columnstore_use_import_for_batchinsert, that has 2 values, ON (default) and OFF. This allows a user to set the option of using cpimport (ON) or the regular DML (OFF) for LDI and INSERT SELECT statements.

      If a user performs these queries in a transaction (such as by setting autocommit=0 or using the START TRANSACTION statement), however, CS fallsback to using DML for the operation. This behaviour is because of the current limitation of cpimport to handle rollbacks.

      We want to add a third option to columnstore_use_import_for_batchinsert: ALWAYS. This value will always use cpimport for LDI and INSERT SELECT queries, irrespective of whether the query runs in a transaction or not. This gives the user the ability to use a faster import method, with the caveat that if a user issues a rollback of the transaction, it will have no effect as the data would have already been committed to actual database files by cpimport.

      Attachments

        Issue Links

          Activity

            Following values can be used for columnstore_use_import_for_batchinsert:

            OFF (0): Do not use cpimport for LDI or INSERT..SELECT
            ON (1, default): Use cpimport for LDI or INSERT..SELECT only for non-transactional queries
            ALWAYS (2): Use cpimport for LDI or INSERT..SELECT from within a transaction as well as for non-transactional queries

            The session variable can be changed for either the current session using the 'set' command,
            or for all future sessions using the 'set global' command.

            For QA:
            Create a table with a few columns and a large data file (e.g. 100k or 200k records) to be imported using LDI.

            1. First test the current default behaviour, i.e. columnstore_use_import_for_batchinsert=ON:

            • Perform LDI inside a transaction (e.g. by using the START TRANSACTION command, or SET AUTOCOMMIT=0): this should use the regular DML operation. Doing a rollback will undo the load from the table.
            • Perform LDI outside the transaction: this will use cpimport for the load. To check the operation uses cpimport, use the linux top command. This operation will also run magnitudes of order faster than the previous one from within a transaction that used DML. Performing a rollback with not undo the load.

            2. Test the new behaviour by setting columnstore_use_import_for_batchinsert=ALWAYS:

            • Perform LDI inside a transaction: this time, it should use cpimport for the operation. Doing a rollback will not undo the load from the table.
            • Perform LDI outside the transaction: this should again use cpimport for the load. Performing a rollback with not undo the load.

            Same tests can be repeated for INSERT ... SELECT.

            tntnatbry Gagan Goel (Inactive) added a comment - Following values can be used for columnstore_use_import_for_batchinsert: OFF (0): Do not use cpimport for LDI or INSERT..SELECT ON (1, default): Use cpimport for LDI or INSERT..SELECT only for non-transactional queries ALWAYS (2): Use cpimport for LDI or INSERT..SELECT from within a transaction as well as for non-transactional queries The session variable can be changed for either the current session using the 'set' command, or for all future sessions using the 'set global' command. For QA: Create a table with a few columns and a large data file (e.g. 100k or 200k records) to be imported using LDI. 1. First test the current default behaviour, i.e. columnstore_use_import_for_batchinsert=ON: Perform LDI inside a transaction (e.g. by using the START TRANSACTION command, or SET AUTOCOMMIT=0): this should use the regular DML operation. Doing a rollback will undo the load from the table. Perform LDI outside the transaction: this will use cpimport for the load. To check the operation uses cpimport, use the linux top command. This operation will also run magnitudes of order faster than the previous one from within a transaction that used DML. Performing a rollback with not undo the load. 2. Test the new behaviour by setting columnstore_use_import_for_batchinsert=ALWAYS: Perform LDI inside a transaction: this time, it should use cpimport for the operation. Doing a rollback will not undo the load from the table. Perform LDI outside the transaction: this should again use cpimport for the load. Performing a rollback with not undo the load. Same tests can be repeated for INSERT ... SELECT.

            For QA: I have added instructions above on how to test this feature. In addition, we have added test cases to the regression suite for this: mysql/queries/working_dml/misc/MCOL-4000.sql

            tntnatbry Gagan Goel (Inactive) added a comment - For QA: I have added instructions above on how to test this feature. In addition, we have added test cases to the regression suite for this: mysql/queries/working_dml/misc/ MCOL-4000 .sql

            Created MCOL-4007 for tracking this issue in 1.5. Removing fixed version of 1.5 in this ticket.

            dleeyh Daniel Lee (Inactive) added a comment - Created MCOL-4007 for tracking this issue in 1.5. Removing fixed version of 1.5 in this ticket.

            Build verified: 1.4.4-1 (Jenkins 20200518)

            Tested all these modes (On, Off, Always) for the columnstore_use_import_for_batchinsert variable, with and without transaction.

            Tested for commit, auto commit, and rollback, for both LDI and insert..select.

            dleeyh Daniel Lee (Inactive) added a comment - Build verified: 1.4.4-1 (Jenkins 20200518) Tested all these modes (On, Off, Always) for the columnstore_use_import_for_batchinsert variable, with and without transaction. Tested for commit, auto commit, and rollback, for both LDI and insert..select.

            People

              dleeyh Daniel Lee (Inactive)
              tntnatbry Gagan Goel (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.