Uploaded image for project: 'MariaDB ColumnStore'
  1. MariaDB ColumnStore
  2. MCOL-4900

LOAD DATA LOCAL INFILE is slow when there is log for synchronizeWithJournal

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed (View Workflow)
    • Priority: Major
    • Resolution: Not a Bug
    • Affects Version/s: 6.1.1
    • Fix Version/s: 6.4.1
    • Component/s: Documentation
    • Labels:
      None
    • Environment:
      Amazon EC2
    • Sprint:
      2021-13, 2021-14, 2021-15, 2021-16, 2021-17

      Description

      customer reported that LDLI(LOAD DATA LOCAL INFILE) got slower than before when using along with S3. From the debug.log, the followings are the time taken per each import process.

      #1 Oct 18 03:25:53 - Oct 18 03:42:05 : rea_bigsumplus_20200705000000.txt - For table reax6.bigsumplus: 24680000 rows processed and 24680000 rows inserted / 18 min
      #2 Oct 18 03:44:24 - Oct 18 05:47:40: rea_bigsumplus_20200706000000.txt - For table reax6.bigsumplus: 29040000 rows processed and 29040000 rows inserted / 2 hour and 3 min
      #3 Oct 18 13:16:01 - Oct 18 13:33:16: rea_bigsumplus_20200707000000.txt - For table reax6.bigsumplus: 27280000 rows processed and 27280000 rows inserted / 17 min
      #4 Oct 18 13:35:25 - Oct 18 16:18:02: rea_bigsumplus_20200708000000.txt - For table reax6.bigsumplus: 26857196 rows processed and 26857196 rows inserted / 2 hour 43 min
      #5 Oct 18 17:47:22 - Oct 18 20:28:53: rea_bigsumplus_20200709000000.txt - For table reax6.bigsumplus: 27560000 rows processed and 27560000 rows inserted / 2 hour 41 min
      

      Inserted rows aren't much different. Looking at #2,4,5, it spent the most of time on synchronizeWithJournal().

      [From #2] - took 1 hour 31 minutes
      ======================================================================================
      Oct 18 04:14:32 mariadb-x6-pm1 StorageManager[29081]: synchronizeWithJournal(): no journal file found for data1/a295d85e-ca19-4633-876d-712f0fbfe075_10485760_3606487_data1~systemFiles~dbrm~BRM_saves_journal
      Oct 18 04:14:32 mariadb-x6-pm1 StorageManager[29081]: synchronizeWithJournal(): data1/a295d85e-ca19-4633-876d-712f0fbfe075_10485760_3606487_data1~systemFiles~dbrm~BRM_saves_journal has no journal, but it does exist in the cloud. This indicates that an overlapping syncWithJournal() call handled it properly.
      ....
      Oct 18 05:45:53 mariadb-x6-pm1 StorageManager[29081]: synchronizeWithJournal(): data1/4ccd1fe4-7c13-431c-b41f-2140be7e2673_0_2891776_data1~000.dir~000.dir~050.dir~004.dir~146.dir~FILE002.cdf has no journal, but it does exist in the cloud. This indicates that an overlapping syncWithJournal() call handled it properly.
      ======================================================================================
       
      [From #4] - took 2 hour 20minutes
      ======================================================================================
      Oct 18 13:35:39 mariadb-x6-pm1 StorageManager[29081]: synchronizeWithJournal(): no metadata found for data1/bulkRollback/12772_data/12816.p147.s0.tmp. It must have been deleted.
      Oct 18 13:47:09 mariadb-x6-pm1 StorageManager[29081]: synchronizeWithJournal(): no journal file found for data1/81bfaa41-a078-492c-b836-527099f7f0dd_0_4325376_data1~000.dir~000.dir~049.dir~229.dir~148.dir~FILE001.cdf
      ...
      Oct 18 15:54:48 mariadb-x6-pm1 StorageManager[29081]: synchronizeWithJournal(): data1/8e71ebc1-37db-430e-a8b7-a6c4ae4606b1_0_4874240_data1~000.dir~000.dir~049.dir~229.dir~149.dir~FILE001.cdf has no journal, but it does exist in the cloud. This indicates that an overlapping syncWithJournal() call handled it properly.
      ======================================================================================
       
      [From #5] - took 1 hour 50 minutes
      ======================================================================================
      Oct 18 18:27:30 mariadb-x6-pm1 StorageManager[29081]: synchronizeWithJournal(): no journal file found for data1/559a05c5-d34e-4d08-9552-d71179904486_0_1105920_data1~000.dir~000.dir~049.dir~229.dir~150.dir~FILE002.cdf
      Oct 18 18:27:30 mariadb-x6-pm1 StorageManager[29081]: synchronizeWithJournal(): data1/559a05c5-d34e-4d08-9552-d71179904486_0_1105920_data1~000.dir~000.dir~049.dir~229.dir~150.dir~FILE002.cdf has no journal, but it does exist in the cloud. This indicates that an overlapping syncWithJournal() call handled it properly.
      ...
      Oct 18 20:17:16 mariadb-x6-pm1 StorageManager[29081]: synchronizeWithJournal(): no journal file found for data1/335dd7cb-4b49-4f87-b4fd-7e06e4312e14_0_3809280_data1~000.dir~000.dir~050.dir~005.dir~150.dir~FILE002.cdf
      Oct 18 20:17:16 mariadb-x6-pm1 StorageManager[29081]: synchronizeWithJournal(): data1/335dd7cb-4b49-4f87-b4fd-7e06e4312e14_0_3809280_data1~000.dir~000.dir~050.dir~005.dir~150.dir~FILE002.cdf has no journal, but it does exist in the cloud. This indicates that an overlapping syncWithJournal() call handled it properly.
      ======================================================================================
      
      

      What does synchronizeWithJournal do and when it happens?

        Attachments

          Activity

            People

            Assignee:
            ben.thompson Ben Thompson
            Reporter:
            allen.lee@mariadb.com Allen Lee
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

              Dates

              Created:
              Updated:
              Resolved:

                Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.