[MCOL-4924] Insert Cache - Table not accessible after LDI caused a version buffer overflow error Created: 2021-11-16  Updated: 2023-12-15  Resolved: 2023-12-15

Status: Closed
Project: MariaDB ColumnStore
Component/s: DMLProc
Affects Version/s: 6.2.1
Fix Version/s: 23.10

Type: Bug Priority: Major
Reporter: Daniel Lee (Inactive) Assignee: Gagan Goel (Inactive)
Resolution: Won't Fix Votes: 0
Labels: None

Issue Links:
Blocks
blocks MCOL-4914 Complete ColumnStore Cache Part 2 Open

 Description   

Build tests: 6.2.1-1 (#3379)

columnstore_cache_inserts=ON
columnstore_cache_use_import=OFF
columnstore_use_import_for_batchinsert=ON
<VersionBufferFileSize>4GB</VersionBufferFileSize>

Tried to LDI 15000000 rows into the dbt3 orders table. LDI command executed fine, but flushing cache hit the version buffer overflow error.

In previous release, when buffer overflow error occurred, the DML comment is automatically rollbacked. What is the expected behavior now since rollback is not supported?

Since rollback is not supported when insert cache is on, should DML bypass the versioning?

MariaDB [insertcache]> load data infile "/data/qa/source/dbt3/10g/orders.tbl" into table orders fields terminated by "|";
Query OK, 15000000 rows affected (27.781 sec)        
Records: 15000000  Deleted: 0  Skipped: 0  Warnings: 0
 
MariaDB [insertcache]> select count(*) from orders;
ERROR 1815 (HY000): Internal error: CAL0006: MCS-2008: The version buffer overflowed. Increase VersionBufferFileSize or limit the rows to be processed.
MariaDB [insertcache]> select count(*) from orders;
ERROR 1815 (HY000): Internal error: CAL0006: There is no extent information for table orders
MariaDB [insertcache]> exit
Bye
[centos8:root~]# mysql insertcache
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
 
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 662
Server version: 10.6.5-1-MariaDB MariaDB Server
 
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
 
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
 
MariaDB [insertcache]> select count(*) from orders;
ERROR 1815 (HY000): Internal error: load failed. The detailed error information is listed in err.log.
 
MariaDB [insertcache]> drop table orders;
ERROR 1815 (HY000): Internal error: CAL0009: Drop table failed due to  MCS-2009: Unable to perform the drop table operation because cpimport with PID 86556 is currently holding the table lock for session -1.  
 
[centos8:root~]# ./viewtablelock
 There is 1 table lock
 
  Table               LockID  Process   PID    Session   Txn  CreationTime              State    DBRoots  
  insertcache.orders  1519    cpimport  86556  BulkLoad  n/a  Tue Nov 16 14:17:56 2021  LOADING  1        
 
[centos8:root~]# ./cleartablelock 1519
writeengine[87601]: 41.291158 |0|0|0| I 19 CAL0088: ClearTableLock: Initiating cleartablelock command for table insertcache.orders; lock-1519
Rolling back and clearing table lock for table insertcache.orders; table lock 1519
 
Sending rollback request to PM1...
Successful rollback response from PM1
controllernode[87601]: 41.372013 |0|0|0| E 29 CAL0000: DBRM: error: SessionManager::rolledback() failed (valid error code)         
Sending cleanup request to PM1...
Successful cleanup response from PM1
 
writeengine[87601]: 41.388516 |0|0|0| I 19 CAL0089: ClearTableLock: cleartablelock command for table insertcache.orders; lock-1519.  Completed successfully
Table lock 1519 for table insertcache.orders is cleared.
 
MariaDB [insertcache]> drop table orders;
Query OK, 0 rows affected, 1 warning (0.826 sec)
 
MariaDB [insertcache]> show warnings;
+---------+------+------------------------------------------------------------------+
| Level   | Code | Message                                                          |
+---------+------+------------------------------------------------------------------+
| Warning | 1017 | Can't find file: 'orders' (errno: 2 "No such file or directory") |
+---------+------+------------------------------------------------------------------+
1 row in set (0.001 sec)
 
MariaDB [insertcache]> 

When DELETE caused an version buffer overflow error, the table is still in good condition

[centos8:root~]# cpimport insertcache lineitem /data/qa/source/dbt3/10g/lineitem.tbl 
cpimport.bin[87711]: 30.150928 |0|0|0| I 34 CAL0086: Initiating BulkLoad: -L /var/log/mariadb/columnstore/cpimport/ -P pm1-87711 -T SYSTEM -u003f45bc-51d0-4560-801f-834c8782cb6e insertcache lineitem /data/qa/source/dbt3/10g/lineitem.tbl 
Locale = en_US.UTF-8
Using table OID 4252 as the default JOB ID
Input file(s) will be read from : /usr/bin
Job description file : /var/log/mariadb/columnstore/data/bulk/tmpjob/4252_D20211116_T153430_S229365_Job_4252.xml
Log file for this job: /var/log/mariadb/columnstore/cpimport/Job_4252.log
2021-11-16 15:34:30 (87711) INFO : successfully loaded job file /var/log/mariadb/columnstore/data/bulk/tmpjob/4252_D20211116_T153430_S229365_Job_4252.xml
2021-11-16 15:34:30 (87711) INFO : Job file loaded, run time for this step : 0.189884 seconds
cpimport.bin[87711]: 30.378885 |0|0|0| I 34 CAL0081: Start BulkLoad: JobId-4252; db-insertcache
2021-11-16 15:34:30 (87711) INFO : PreProcessing check starts
2021-11-16 15:34:30 (87711) INFO : input data file /data/qa/source/dbt3/10g/lineitem.tbl
2021-11-16 15:34:30 (87711) INFO : PreProcessing check completed
2021-11-16 15:34:30 (87711) INFO : preProcess completed, run time for this step : 0.052151 seconds
2021-11-16 15:34:30 (87711) INFO : No of Read Threads Spawned = 1
2021-11-16 15:34:30 (87711) INFO : No of Parse Threads Spawned = 3
2021-11-16 15:36:55 (87711) INFO : For table insertcache.lineitem: 59986052 rows processed and 59986052 rows inserted.
cpimport.bin[87711]: 55.216259 |0|0|0| I 34 CAL0083: BulkLoad: JobId-4252; finished loading table insertcache.lineitem; 59986052 rows inserted
writeengine[87711]: 55.216318 |0|0|0| I 19 CAL0008: Bulkload |Job: /var/log/mariadb/columnstore/data/bulk/tmpjob/4252_D20211116_T153430_S229365_Job_4252.xml |For table insertcache.lineitem: 59986052 rows processed and 59986052 rows inserted.
2021-11-16 15:36:55 (87711) INFO : Bulk load completed, total run time : 145.22 seconds
 
cpimport.bin[87711]: 55.409266 |0|0|0| I 34 CAL0082: End BulkLoad: JobId-4252; status-SUCCESS
[centos8:root~]# mysql insertcache
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
 
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 665
Server version: 10.6.5-1-MariaDB MariaDB Server
 
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
 
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
 
MariaDB [insertcache]> delete from lineitem;
ERROR 1815 (HY000): Internal error: CAL0002: Delete Failed:  MCS-2008: The version buffer overflowed. Increase VersionBufferFileSize or limit the rows to be processed.
MariaDB [insertcache]> select count(*) from lineitem;
+----------+
| count(*) |
+----------+
| 59986052 |
+----------+
1 row in set (2.531 sec)



 Comments   
Comment by David Hall (Inactive) [ 2021-11-19 ]

It's probable that any LOAD DATA INFILE would have a problem if version buffer overflow. This should be looked into. But this has not yet been tested.

Comment by Daniel Lee (Inactive) [ 2021-12-10 ]

Build tested: 6.2.2-1 (#3496)

Could not reproduce the issue with 4GB version buffer, not even doubling the dataset size to 20gb orders.tbl.

I lowered the version buffer to 4MB, instead of 4GB, and I saw the issue again. The following kept coming up when the orders table is queried.

MariaDB [mytest]> select count from orders;
ERROR 1815 (HY000): Internal error: CAL0006: MCS-2008: The version buffer overflowed. Increase VersionBufferFileSize or limit the rows to be processed.
MariaDB [mytest]> select count from orders;
ERROR 1815 (HY000): Internal error: CAL0006: MCS-2008: The version buffer overflowed. Increase VersionBufferFileSize or limit the rows to be processed.

Comment by David Hall (Inactive) [ 2022-03-31 ]

Regardless of the settings of columnstore_cache_use_import and columnstore_use_import_for_batchinsert, version buffer overflow should result in a rollback and the table should be accessible.

Comment by David Hall (Inactive) [ 2022-03-31 ]

Daniel: Please run a test where
columnstore_cache_inserts=OFF
columnstore_cache_use_import=OFF
columnstore_use_import_for_batchinsert=OFF

Run an LDI with 150000000 rows in an attempt to cause version buffer overflow. Report if the error is recovered from (rollback works).

If it fails to rollback correctly, re-assign this to me.
If it succeeds in the rollback, re-assign to Gagan.

In either case, let me know what happened.

Comment by Daniel Lee (Inactive) [ 2022-04-18 ]

Build tested: Build tests: 6.2.1-1 (#3379)

I confirmed with Mr. Hall that he want to load 15,000,000 rows, not 150,000,000 rows.

I executed the requested test on the same columnstore build which the issue was oringally reported on. It actually took 3 times of the dataset (45,000,000 rows) to cause the version buffer overflow error. When that happened, the subsequent SELECT statement worked fine.

I also repeated the original test (with (45,000,000 rows) on the same build, as well as on the latest from develop-6 (drone build 4287). Both still have the reported issue.

Generated at Thu Feb 08 02:54:02 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.