  1. MariaDB ColumnStore
  2. MCOL-3567

Overallocation of blocks with LONGBLOB

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: Icebox
    • Component/s: writeengine
    • Labels: None

    Description

      Using the ColumnStore bulk write API (mcsapi) with the following table and insert loop:

      CREATE TABLE IF NOT EXISTS sessions (
        k_idjob int NOT NULL,
        t_ver varchar(255) DEFAULT NULL,
        b_data longblob DEFAULT NULL 
      ) ENGINE=Columnstore;
      

      try
      {
          bulk = driver->createBulkInsert("test_cs", "sessions", 0, 0);
          for (i = 0; i < nrecs; i++)
          {
              bulk->setColumn(0, (i + 1));                    // k_idjob: int NOT NULL
              bulk->setColumn(1, "blob test");                // t_ver: varchar
              bulk->setColumn(2, str.c_str(), str.length());  // b_data: longblob
              bulk->writeRow();
              totrecs++;
          }
          bulk->commit(); // commit nrecs rows
          summary = bulk->getSummary();
          std::cout << "BulkInsert Block " << j << ": "
                    << summary.getRowsInsertedCount() << " records inserted in "
                    << summary.getExecutionTime() << " seconds, total " << totrecs << std::endl;
      }
      catch (mcsapi::ColumnStoreError &e1)
      {
          std::cout << "BulkInsert Block " << j << " error caught: " << e1.what() << std::endl;
          // createBulkInsert itself may have thrown, leaving bulk unset
          if (bulk != nullptr)
              bulk->rollback();
      }
      delete bulk; // deleting a null pointer is a no-op
      bulk = nullptr;
      

      str is a simple 10000-character string, the same for every row.
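For anyone reproducing this, the payload could be built along the following lines. This is only a sketch: the ticket gives the length (10000) but not the contents, so the fill character and both helper names (`makeBlobPayload`, `expectedBlobBytes`) are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: builds the constant payload written to b_data.
// Only the length (10000) comes from the ticket; the fill character is assumed.
std::string makeBlobPayload(std::size_t length = 10000, char fill = 'x')
{
    return std::string(length, fill);
}

// Hypothetical helper: raw blob bytes handed to the API for `rows` identical rows.
std::uint64_t expectedBlobBytes(std::uint64_t rows, const std::string& payload)
{
    return rows * static_cast<std::uint64_t>(payload.length());
}
```

Comparing `expectedBlobBytes(totrecs, str)` with the file sizes reported for the table gives a rough sense of how much space the blob data should need before compression.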

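      ColumnStore appears to be over-allocating blocks for the table: the last file listed (OBJECT_ID 3018) takes roughly 14GB on disk (FILE_SIZE) to hold roughly 700MB of compressed data (COMPRESSED_DATA_SIZE):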

      MariaDB [test_cs]> select * from information_schema.columnstore_files;
      +-----------+------------+--------------+------------------------------------------------------------------------------------------+-------------+----------------------+
      | OBJECT_ID | SEGMENT_ID | PARTITION_ID | FILENAME                                                                                 | FILE_SIZE   | COMPRESSED_DATA_SIZE |
      +-----------+------------+--------------+------------------------------------------------------------------------------------------+-------------+----------------------+
      |      3003 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/187.dir/000.dir/FILE000.cdf |     2105344 |               122880 |
      |      3004 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/188.dir/000.dir/FILE000.cdf |     1056768 |                73728 |
      |      3005 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/189.dir/000.dir/FILE000.cdf |     1056768 |                73728 |
      |      3006 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/190.dir/000.dir/FILE000.cdf |     2105344 |               122880 |
      |      3007 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/191.dir/000.dir/FILE000.cdf |     2105344 |               122880 |
      |      3008 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/192.dir/000.dir/FILE000.cdf |     2105344 |               122880 |
      |      3009 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/193.dir/000.dir/FILE000.cdf |     2105344 |               122880 |
      |      3010 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/194.dir/000.dir/FILE000.cdf |     2105344 |               122880 |
      |      3011 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/195.dir/000.dir/FILE000.cdf |     1056768 |                73728 |
      |      3012 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/196.dir/000.dir/FILE000.cdf |     1056768 |                73728 |
      |      3013 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/197.dir/000.dir/FILE000.cdf |     2105344 |               122880 |
      |      3014 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/198.dir/000.dir/FILE000.cdf |     2105344 |               122880 |
      |      3015 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/199.dir/000.dir/FILE000.cdf |     2105344 |               122880 |
      |      3016 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/200.dir/000.dir/FILE000.cdf |     2113536 |               229376 |
      |      3017 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/201.dir/000.dir/FILE000.cdf |     2113536 |               229376 |
      |      3018 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/202.dir/000.dir/FILE000.cdf | 14697111552 |            746594304 |
      +-----------+------------+--------------+------------------------------------------------------------------------------------------+-------------+----------------------+
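The size gap in the OBJECT_ID 3018 row can be quantified with a little arithmetic. The sketch below uses the two figures copied from that row; the 8 KB block size is an assumption about ColumnStore's on-disk layout that the ticket does not state, and all names are hypothetical.

```cpp
#include <cstdint>

// Assumption: ColumnStore stores data in 8 KB blocks (not stated in the ticket).
constexpr std::uint64_t kBlockBytes = 8192;

struct FileStats
{
    std::uint64_t fileSize;            // FILE_SIZE column, in bytes
    std::uint64_t compressedDataSize;  // COMPRESSED_DATA_SIZE column, in bytes
};

// Values copied from the OBJECT_ID 3018 row of columnstore_files.
constexpr FileStats kObject3018{14697111552ULL, 746594304ULL};

// How many times larger the file is than the compressed data it holds.
double allocationRatio(const FileStats& f)
{
    if (f.compressedDataSize == 0)
        return 0.0;  // avoid dividing by zero for an empty file
    return static_cast<double>(f.fileSize) / static_cast<double>(f.compressedDataSize);
}

// Number of whole blocks the file occupies, under the block-size assumption.
std::uint64_t blockCount(const FileStats& f)
{
    return f.fileSize / kBlockBytes;
}

// Bytes in the file that are not accounted for by compressed data.
std::uint64_t bytesBeyondCompressedData(const FileStats& f)
{
    return f.fileSize - f.compressedDataSize;
}
```

For the 3018 row, `allocationRatio(kObject3018)` comes out at roughly 19.7, which matches the "14GB for 700MB" observation.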
      

      Whilst de-duplication won't apply, we likely shouldn't be creating that many empty blocks.

            People

              jrojas Jose Rojas (Inactive)
              LinuxJedi Andrew Hutchings (Inactive)