[MCOL-3567] Overallocation of blocks with LONGBLOB Created: 2019-10-18  Updated: 2019-11-08  Resolved: 2019-11-08

Status: Closed
Project: MariaDB ColumnStore
Component/s: writeengine
Affects Version/s: None
Fix Version/s: Icebox

Type: Bug Priority: Major
Reporter: Andrew Hutchings (Inactive) Assignee: Jose Rojas (Inactive)
Resolution: Duplicate Votes: 0
Labels: None

Issue Links:
Duplicate
duplicates MCOL-2276 Unable to insert certain row Closed

 Description   

Using the ColumnStore write API:

CREATE TABLE IF NOT EXISTS sessions (
  k_idjob int NOT NULL,
  t_ver varchar(255) DEFAULT NULL,
  b_data longblob DEFAULT NULL 
) ENGINE=Columnstore;

// Excerpt: bulk, driver, str, nrecs, totrecs, i and j are declared outside this fragment.
try
{
    bulk = driver->createBulkInsert("test_cs", "sessions", 0, 0);
    for (i = 0; i < nrecs; i++)
    {
        bulk->setColumn(0, (i + 1));                   // int NOT NULL
        bulk->setColumn(1, "blob test");               // varchar
        bulk->setColumn(2, str.c_str(), str.length()); // blob
        bulk->writeRow();
        totrecs++;
    }
    bulk->commit(); // commit nrecs rows
    summary = bulk->getSummary();
    std::cout << "BulkInsert Block " << j << ": " << summary.getRowsInsertedCount()
              << " records inserted in " << summary.getExecutionTime()
              << " seconds, total " << totrecs << std::endl;
}
catch (mcsapi::ColumnStoreError &e1)
{
    std::cout << "BulkInsert Block " << j << " error caught: " << e1.what() << std::endl;
    // bulk may still be null here if createBulkInsert() itself threw
    if (bulk != nullptr)
        bulk->rollback();
}
delete bulk; // deleting a null pointer is a no-op
bulk = nullptr;

str is a simple string of length 10000, the same for every row.
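For anyone trying to reproduce this, a minimal sketch of how such a payload could be built is shown here. The helper name makePayload and the fill character 'x' are assumptions made for illustration; the report only states the length and that the value is identical for every row.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of mcsapi): returns a string made of
// `length` copies of `fill`. The default length matches the report;
// the fill character is an assumption.
std::string makePayload(std::size_t length = 10000, char fill = 'x')
{
    return std::string(length, fill);
}
```

With this in place, `std::string str = makePayload();` before the insert loop would give the same value for every row.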

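ColumnStore appears to be allocating far more blocks to the table than the data needs. The last file in the listing (OBJECT_ID 3018) has a FILE_SIZE of about 14.7 GB but holds only about 747 MB of compressed data: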

MariaDB [test_cs]> select * from information_schema.columnstore_files;
+-----------+------------+--------------+------------------------------------------------------------------------------------------+-------------+----------------------+
| OBJECT_ID | SEGMENT_ID | PARTITION_ID | FILENAME                                                                                 | FILE_SIZE   | COMPRESSED_DATA_SIZE |
+-----------+------------+--------------+------------------------------------------------------------------------------------------+-------------+----------------------+
|      3003 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/187.dir/000.dir/FILE000.cdf |     2105344 |               122880 |
|      3004 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/188.dir/000.dir/FILE000.cdf |     1056768 |                73728 |
|      3005 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/189.dir/000.dir/FILE000.cdf |     1056768 |                73728 |
|      3006 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/190.dir/000.dir/FILE000.cdf |     2105344 |               122880 |
|      3007 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/191.dir/000.dir/FILE000.cdf |     2105344 |               122880 |
|      3008 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/192.dir/000.dir/FILE000.cdf |     2105344 |               122880 |
|      3009 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/193.dir/000.dir/FILE000.cdf |     2105344 |               122880 |
|      3010 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/194.dir/000.dir/FILE000.cdf |     2105344 |               122880 |
|      3011 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/195.dir/000.dir/FILE000.cdf |     1056768 |                73728 |
|      3012 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/196.dir/000.dir/FILE000.cdf |     1056768 |                73728 |
|      3013 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/197.dir/000.dir/FILE000.cdf |     2105344 |               122880 |
|      3014 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/198.dir/000.dir/FILE000.cdf |     2105344 |               122880 |
|      3015 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/199.dir/000.dir/FILE000.cdf |     2105344 |               122880 |
|      3016 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/200.dir/000.dir/FILE000.cdf |     2113536 |               229376 |
|      3017 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/201.dir/000.dir/FILE000.cdf |     2113536 |               229376 |
|      3018 |          0 |            0 | /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/202.dir/000.dir/FILE000.cdf | 14697111552 |            746594304 |
+-----------+------------+--------------+------------------------------------------------------------------------------------------+-------------+----------------------+

Whilst de-duplication won't apply, we likely shouldn't be creating that many empty blocks.
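To put a number on the overallocation, the following sketch works from the FILE_SIZE and COMPRESSED_DATA_SIZE values of the OBJECT_ID 3018 row in the listing above. The 8192-byte block size is an assumption that this report does not state, and the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Figures copied from the columnstore_files row for OBJECT_ID 3018.
const std::uint64_t kFileSize = 14697111552ULL;         // FILE_SIZE, bytes
const std::uint64_t kCompressedDataSize = 746594304ULL; // COMPRESSED_DATA_SIZE, bytes

// Assumption (not stated in this report): blocks are 8192 bytes.
const std::uint64_t kAssumedBlockSize = 8192;

// Number of whole blocks needed to hold `bytes`, rounding up.
std::uint64_t blocksFor(std::uint64_t bytes)
{
    return (bytes + kAssumedBlockSize - 1) / kAssumedBlockSize;
}

// How many times larger the file is than the data it holds.
double overallocationRatio(std::uint64_t fileSize, std::uint64_t dataSize)
{
    assert(dataSize != 0); // a ratio against zero bytes is meaningless
    return static_cast<double>(fileSize) / static_cast<double>(dataSize);
}
```

Under that block-size assumption, blocksFor(kFileSize) is 1794081 and blocksFor(kCompressedDataSize) is 91137, and overallocationRatio(kFileSize, kCompressedDataSize) comes out at roughly 19.7, so the file is close to twenty times larger than the data it holds.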



 Comments   
Comment by Andrew Hutchings (Inactive) [ 2019-11-08 ]

Jose's fix for MCOL-2276 along with another fix has resolved this.
