MariaDB ColumnStore / MCOL-2040

mcsimport load results in a worse compression ratio and more used disk space than cpimport

    Details

    • Type: Bug
    • Status: Closed (View Workflow)
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 1.2.3, 1.2.2
    • Fix Version/s: Icebox
    • Component/s: mcsapi, mcsimport
    • Labels:
      None
    • Environment:
      MCS single server; 64 GB memory; 8 CPUs; CentOS 7.5

      Description

      Loading data with mcsimport results in a worse compression ratio and more used disk space than loading the same data with cpimport.

      Expected: mcsimport should complete the load with no degradation in compression ratio and no increase in disk space usage compared to cpimport.

      Notes:
      mcsimport was installed and run locally on the MCS server, so that the comparison with cpimport excludes network delay.
      mcsimport is installed with the mariadb-columnstore-api-cpp and mariadb-columnstore-tools packages.
      cpimport comes with the mariadb-columnstore installation; cpimport was run in mode 1 (m1).

      MCS Load Method   Scale Factor   Data Volume   Used Disk Space   Compression_Ratio
      mcsimport         100            100GB         31.69 GB          5.0409:1
      cpimport          100            100GB         21.5 GB           5.0392:1

      MCS Load Method   Scale Factor   Data Volume   Used Disk Space
      mcsimport         1000           1TB           493.46 GB
      cpimport          1000           1TB           360.13 GB
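
      The relative overhead implied by the figures above can be worked out with a few lines of Python. This is only a sketch for illustration: the numbers are copied from the summary tables, and the helper name `overhead_percent` is hypothetical, not part of any ColumnStore tool.

```python
# Used disk space in GB, copied from the summary tables in this report.
used_gb = {
    100: {"mcsimport": 31.69, "cpimport": 21.5},
    1000: {"mcsimport": 493.46, "cpimport": 360.13},
}


def overhead_percent(mcsimport_gb: float, cpimport_gb: float) -> float:
    """Extra disk used by mcsimport, as a percentage of the cpimport footprint.

    Hypothetical helper for illustration only.
    """
    return (mcsimport_gb - cpimport_gb) / cpimport_gb * 100.0


for scale_factor, usage in used_gb.items():
    extra = usage["mcsimport"] - usage["cpimport"]
    pct = overhead_percent(usage["mcsimport"], usage["cpimport"])
    print(f"SF {scale_factor}: +{extra:.2f} GB ({pct:.1f}% more than cpimport)")
```

      With these inputs the mcsimport load takes about 47% more disk at scale factor 100 and about 37% more at scale factor 1000.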

      *cpimport*
      Start Load Test columnstore_info.total_usage
      TOTAL_DATA_SIZE   TOTAL_DISK_USAGE
      249.05 GB         83.51 GB

      End Load Test columnstore_info.total_usage
      TOTAL_DATA_SIZE   TOTAL_DISK_USAGE
      311.82 GB         105.01 GB
       
       
      columnstore_info.table_usage
      TABLE_SCHEMA  TABLE_NAME              DATA_DISK_USAGE  DICT_DATA_USAGE  TOTAL_USAGE
      tpcds_100     call_center             48.74 MB         36.14 MB         84.88 MB
      tpcds_100     catalog_page            13.07 MB         8.03 MB          21.10 MB
      tpcds_100     catalog_returns         1.52 GB          0.00 Bytes       1.52 GB
      tpcds_100     catalog_sales           5.07 GB          0.00 Bytes       5.07 GB
      tpcds_100     customer                808.14 MB        264.06 MB        1.05 GB
      tpcds_100     customer_address        720.10 MB        206.08 MB        926.18 MB
      tpcds_100     customer_demographics   304.07 MB        4.02 MB          308.09 MB
      tpcds_100     date_dim                25.22 MB         4.02 MB          29.23 MB
      tpcds_100     household_demographics  6.04 MB          2.01 MB          8.05 MB
      tpcds_100     income_band             3.02 MB          0.00 Bytes       3.02 MB
      tpcds_100     inventory               454.61 MB        0.00 Bytes       454.61 MB
      tpcds_100     item                    34.17 MB         272.10 MB        306.27 MB
      tpcds_100     promotion               17.40 MB         8.03 MB          25.43 MB
      tpcds_100     reason                  5.02 MB          4.02 MB          9.04 MB
      tpcds_100     ship_mode               11.05 MB         10.04 MB         21.09 MB
      tpcds_100     store                   45.73 MB         34.13 MB         79.86 MB
      tpcds_100     store_returns           2.20 GB          0.00 Bytes       2.20 GB
      tpcds_100     store_sales             3.40 GB          0.00 Bytes       3.40 GB
      tpcds_100     time_dim                13.58 MB         8.03 MB          21.61 MB
      tpcds_100     warehouse               23.61 MB         20.08 MB         43.69 MB
      tpcds_100     web_page                16.36 MB         6.02 MB          22.38 MB
      tpcds_100     web_returns             768.19 MB        0.00 Bytes       768.19 MB
      tpcds_100     web_sales               5.13 GB          0.00 Bytes       5.13 GB
      tpcds_100     web_site                41.70 MB         32.12 MB         73.83 MB
       
      columnstore_info.compression_ratio
      COMPRESSION_RATIO
      5.0392:1
      
      

      *mcsimport*
       
      Start Load Test columnstore_info.total_usage
      TOTAL_DATA_SIZE   TOTAL_DISK_USAGE
      249.05 GB         83.51 GB

      End Load Test columnstore_info.total_usage
      TOTAL_DATA_SIZE   TOTAL_DISK_USAGE
      311.83 GB         115.20 GB
       
      columnstore_info.table_usage
      TABLE_SCHEMA  TABLE_NAME              DATA_DISK_USAGE  DICT_DATA_USAGE  TOTAL_USAGE
      tpcds_100     call_center             48.74 MB         36.14 MB         84.88 MB
      tpcds_100     catalog_page            13.07 MB         8.03 MB          21.10 MB
      tpcds_100     catalog_returns         1.69 GB          0.00 Bytes       1.69 GB
      tpcds_100     catalog_sales           8.51 GB          0.00 Bytes       8.51 GB
      tpcds_100     customer                808.14 MB        264.06 MB        1.05 GB
      tpcds_100     customer_address        720.10 MB        206.08 MB        926.18 MB
      tpcds_100     customer_demographics   304.07 MB        4.02 MB          308.09 MB
      tpcds_100     date_dim                25.22 MB         4.02 MB          29.23 MB
      tpcds_100     household_demographics  6.04 MB          2.01 MB          8.05 MB
      tpcds_100     income_band             3.02 MB          0.00 Bytes       3.02 MB
      tpcds_100     inventory               1.00 GB          0.00 Bytes       1.00 GB
      tpcds_100     item                    34.17 MB         272.10 MB        306.27 MB
      tpcds_100     promotion               17.40 MB         8.03 MB          25.43 MB
      tpcds_100     reason                  5.02 MB          4.02 MB          9.04 MB
      tpcds_100     ship_mode               11.05 MB         10.04 MB         21.09 MB
      tpcds_100     store                   45.73 MB         34.13 MB         79.86 MB
      tpcds_100     store_returns           2.50 GB          0.00 Bytes       2.50 GB
      tpcds_100     store_sales             5.75 GB          0.00 Bytes       5.75 GB
      tpcds_100     time_dim                13.58 MB         8.03 MB          21.61 MB
      tpcds_100     warehouse               23.61 MB         20.08 MB         43.69 MB
      tpcds_100     web_page                16.36 MB         6.02 MB          22.38 MB
      tpcds_100     web_returns             768.19 MB        0.00 Bytes       768.19 MB
      tpcds_100     web_sales               8.51 GB          0.00 Bytes       8.51 GB
      tpcds_100     web_site                41.70 MB         32.12 MB         73.83 MB
       
       
      columnstore_info.compression_ratio
      COMPRESSION_RATIO
      5.0409:1
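
      To see where the extra space goes, the two table_usage listings can be compared row by row. The sketch below is illustrative only: it hard-codes the six tpcds_100 tables whose TOTAL_USAGE differs between the cpimport and mcsimport listings (all other rows are identical), and it assumes the reported units are binary (1 GB = 1024 MB), which the report does not state. The names `to_bytes`, `total_usage` and `growth` are hypothetical.

```python
# Assumption: sizes are reported in binary units (1 GB = 1024 MB).
UNITS = {"Bytes": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}


def to_bytes(size: str) -> float:
    """Convert a string such as '454.61 MB' into a byte count."""
    value, unit = size.split()
    return float(value) * UNITS[unit]


# TOTAL_USAGE per table as (cpimport, mcsimport), copied from the listings.
total_usage = {
    "catalog_returns": ("1.52 GB", "1.69 GB"),
    "catalog_sales": ("5.07 GB", "8.51 GB"),
    "inventory": ("454.61 MB", "1.00 GB"),
    "store_returns": ("2.20 GB", "2.50 GB"),
    "store_sales": ("3.40 GB", "5.75 GB"),
    "web_sales": ("5.13 GB", "8.51 GB"),
}

# Growth factor of each table when loaded with mcsimport instead of cpimport.
growth = {
    table: to_bytes(mcs) / to_bytes(cp)
    for table, (cp, mcs) in total_usage.items()
}

# Total extra space across the differing tables, in GB.
total_extra_gb = sum(
    to_bytes(mcs) - to_bytes(cp) for cp, mcs in total_usage.values()
) / UNITS["GB"]

for table, ratio in sorted(growth.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{table:16s} {ratio:.2f}x")
print(f"extra space in differing tables: {total_extra_gb:.2f} GB")
```

      Under that unit assumption the per-table differences add up to roughly 10.2 GB, in line with the gap between the two End Load Test TOTAL_DISK_USAGE values (115.20 GB versus 105.01 GB). The tables that grow are the large fact tables with no dictionary data; the dimension tables are the same size in both loads.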
      
      

              People

              Assignee:
              Unassigned
              Reporter:
              Zdravelina Sokolovska (winstone) (Inactive)