Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-34719

Disable purge for LOAD DATA INFILE into empty table

Details

    Description

      According to thiru,
      https://github.com/MariaDB/server/commit/3cef4f8f0fc88ae5bfae4603d8d600ec84cc70a9
      optimization, with "single undo log record TRX_UNDO_EMPTY" does not work
      It does not work if I LOCK TABLE WRITE prior to LOAD DATA INFILE, it does not work if I start transaction after LOCK TABLE WRITE, and it does not work, if I also add "set autocommit=0" to the mix.

      Loading 1,4G worth of data into empty table, with any of the above tricks, profiler shows 15% of the time spent on row_purge_reset_trx_id()

      Can't enable "bulk optimization" because of the bug MDEV-34703 I filed previously,
      can't increase innodb sort buffer sizes, because actual requirement for tens of GB worth data
      is unknown.

      Attachments

        Issue Links

          Activity

            I think that similar to what was done already in MDEV-11415 and MDEV-515, it would be best to disable row-level undo logging for LOAD into an empty table. MDEV-515 allows to replace the row-level undo logging with table-level logging and locking. We definitely would not want to have an extremely slow row-by-row rollback of the bulk insert in case the server was killed or the operation failed. So, table-level logging seems to be the way to go.

            As noted in MDEV-24813, the interaction of LOCK TABLE and transactions is rather broken.

            If I understood it correctly, for MDEV-33625 we’d want a way to enable MDEV-515 but disable MDEV-24621 for data loads. If I got it correctly, the current idea would be to create the clustered index row by row, and then create the indexes afterwards (MDEV-5171).

            marko Marko Mäkelä added a comment - I think that similar to what was done already in MDEV-11415 and MDEV-515 , it would be best to disable row-level undo logging for LOAD into an empty table. MDEV-515 allows to replace the row-level undo logging with table-level logging and locking. We definitely would not want to have an extremely slow row-by-row rollback of the bulk insert in case the server was killed or the operation failed. So, table-level logging seems to be the way to go. As noted in MDEV-24813 , the interaction of LOCK TABLE and transactions is rather broken. If I understood it correctly, for MDEV-33625 we’d want a way to enable MDEV-515 but disable MDEV-24621 for data loads. If I got it correctly, the current idea would be to create the clustered index row by row, and then create the indexes afterwards ( MDEV-5171 ).

            the table is loaded in the same order as SELECT * INTO OUTFILE, i.e in primary key order, or whatever Innodb thinks is primary key (unique not null?). So it is a good order already . If DISABLE/ENABLE KEYS worked in Innodb out-of-the-box, this would be the best thing ever, then nothing would need to be done. Otherwise, we might parse SHOW CREATE TABLE output and delay creation of all secondary indexes, and foreign key constraints until after LOAD DATA finishes. This seems good at least "on paper", percona has that option, so why can't we.

            MDEV-24621 just did not work. temp is not infinite, sort buffer size is not infinite, but tables to be loaded in LOAD DATA can be hundreds of GB, that's normal for production quality databases.

            wlad Vladislav Vaintroub added a comment - the table is loaded in the same order as SELECT * INTO OUTFILE, i.e in primary key order, or whatever Innodb thinks is primary key (unique not null?). So it is a good order already . If DISABLE/ENABLE KEYS worked in Innodb out-of-the-box, this would be the best thing ever, then nothing would need to be done. Otherwise, we might parse SHOW CREATE TABLE output and delay creation of all secondary indexes, and foreign key constraints until after LOAD DATA finishes. This seems good at least "on paper", percona has that option, so why can't we. MDEV-24621 just did not work. temp is not infinite, sort buffer size is not infinite, but tables to be loaded in LOAD DATA can be hundreds of GB, that's normal for production quality databases.

            wlad, is there any LOAD DATA syntax that asserts that the data is sorted by the primary key? As far as I can tell, heap organized tables (such as MyISAM or Aria) do not care about the order, but InnoDB certainly would care and could skip a merge sort step for the clustered index, which would allow it to build the index page by page, with fewer buffer pool operations and less log records written.

            marko Marko Mäkelä added a comment - wlad , is there any LOAD DATA syntax that asserts that the data is sorted by the primary key? As far as I can tell, heap organized tables (such as MyISAM or Aria) do not care about the order, but InnoDB certainly would care and could skip a merge sort step for the clustered index, which would allow it to build the index page by page, with fewer buffer pool operations and less log records written.
            wlad Vladislav Vaintroub added a comment - - edited

            There is no such syntax. In vast majority of cases mariadb-import cares about, the output for Innodb is naturally sorted by primary key already, because it comes from "SELECT * from Innodb_table into outfile" order, for "SELECT *" it would be silly not to take clustered index to for this operation (although it is possible, and I have seen Innodb sorting by another secondary key in SELECT *, I'd think it is very rare in practice)

            There is an option in mariadb-dump to force output by primary key order --order-by-primary , and perhaps it makes sense to always do that for Innodb.

            MyISAM and Aria are not really interesting here, they store as they want to, the objective here is to improve Innodb load times.

            wlad Vladislav Vaintroub added a comment - - edited There is no such syntax. In vast majority of cases mariadb-import cares about, the output for Innodb is naturally sorted by primary key already, because it comes from "SELECT * from Innodb_table into outfile" order, for "SELECT *" it would be silly not to take clustered index to for this operation (although it is possible, and I have seen Innodb sorting by another secondary key in SELECT *, I'd think it is very rare in practice) There is an option in mariadb-dump to force output by primary key order --order-by-primary , and perhaps it makes sense to always do that for Innodb. MyISAM and Aria are not really interesting here, they store as they want to, the objective here is to improve Innodb load times.
            wlad Vladislav Vaintroub added a comment - - edited

            About this page-by-page optimization, I filed MDEV-34703 previously. I can't take this optimization, it does not work for me, on Linux, on Windows, anywhere. It is much slower as no-optimization at all, also it causes OOM on Linux, so, a no-go

            wlad Vladislav Vaintroub added a comment - - edited About this page-by-page optimization, I filed MDEV-34703 previously. I can't take this optimization, it does not work for me, on Linux, on Windows, anywhere. It is much slower as no-optimization at all, also it causes OOM on Linux, so, a no-go

            I see that some regression tests are failing. That would have to be fixed before I can review this.

            marko Marko Mäkelä added a comment - I see that some regression tests are failing. That would have to be fixed before I can review this.

            MDEV-34703 already addresses this issue and avoids the temporary file usage for clustered
            index and avoids writing undo log for the load data statement. It reduces the I/O, temporary file usage and reduces the time significantly
            for the load statement during bulk insert mode.

            thiru Thirunarayanan Balathandayuthapani added a comment - - edited MDEV-34703 already addresses this issue and avoids the temporary file usage for clustered index and avoids writing undo log for the load data statement. It reduces the I/O, temporary file usage and reduces the time significantly for the load statement during bulk insert mode.

            Technically, it is not exactly a duplicate of MDEV-34703, it would be better if everyone could benefit from reduced purge and bufferpool activity, without setting 2 session variables.

            wlad Vladislav Vaintroub added a comment - Technically, it is not exactly a duplicate of MDEV-34703 , it would be better if everyone could benefit from reduced purge and bufferpool activity, without setting 2 session variables.

            People

              thiru Thirunarayanan Balathandayuthapani
              wlad Vladislav Vaintroub
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.