Details
-
New Feature
-
Status: Closed (View Workflow)
-
Critical
-
Resolution: Fixed
-
None
-
None
-
2021-9, 2021-10, 2021-11, 2021-12, 2021-13, 2021-14, 2021-15, 2021-16, 2021-17
Description
The objective of this task is to advance ColumnStore write cache - as defined in MCOL-3875.
First phase of it was limited to supporting HTAP replication only. This task is extending it into more general status: INSERTS and LDI.
Known problems to be solved under priority:
1. Once the feature is activated in configs, all previous columnstore tables can no longer be accessed. Once de-activated, they come back.
2. Once the feature is activated, and a table is created and populated, it can be dropped, but then cannot be re-created. (this is confirmed as not happening anymore, not a problem).
3. In case of multi-node CS cluster set up for DDL replication from primary to the rest, when the cache is flushed, it appears that the data replication occurs also (in addition to DDL). It is supposed to be suppressed in a normal CS cluster operation.
4. Under heavier concurrent bombardment of LDIs, some sessions go into "waiting for table lock" state, which in some cases clears after a few hours (unclear why, perhaps some sort of timeout), and some remain in that state forever. Waiting for the lock is understandable - cache is being locked for the duration of a flush. But even if flush takes a few seconds, why would some sessions lock out for hours, and some forever? (workaround - MCOL-4790, use columnstore_cache_use_import)
Basics of the algorithms (full description in 3875):
- inserts are accumulated in cache, transactions serialized by cache locks
- Rollbacks will roll back cache
- Accumulated inserts are flushed into columnstore on various events (all non inserts, including SELECT - and settable size threshold defaulted to 500,000 rows in the cache).
- we may want to consider relaxing the above rule and letting SELECTS run as "dirty reads"
Testing instructions
Part 1. Positive tests.
- AUTOCOMMIT=ON, do a lot of individual INSERTs. With cache_insert and cpimport_cache_insert the speed should be orders of magnitude greater.
- Do same inserts but this time within multiple transactions - have something like 3000-4000 inserts in each transaction.
- Do lots of LDIs (also 3000-4000 rows each) in rapid loop (see test scripts in
MCOL-4790as an example). Should be a huge difference in speed. - Do above from multiple concurrent sessions - the speed improvements should be even greater.
Part 2. Adversarial tests.
- While running AUTOCOMMIT=ON and inserts, also do occasional updates, deletes, and even selects. These reduce the effectiveness of the feature (cache has to be flushed every time these happen). The feature is not useful for workflows if they do not have a predominance of uninterrupted inserts or LDIs.
- Put some updates, deletes or selects inside transactions which do inserts (say, one of those for each 500 inserts). compare the speed.
- Run transactions with inserts from multiple sessions, but rollback some and not others. Make sure the end result data is correct.
For QA:
To test Point 1:
Start mariadb with columnstore_cache_inserts set to OFF. Then create a table, perform some INSERTs. After this, restart the server with columnstore_cache_inserts set to ON. Make sure the table created when the cache inserts were disabled is now accessible.
To test Point 3:
Launch a 3-node MCS cluster as normal, but make sure the cache is enabled on the primary and the replica nodes using the conf file. Create a table and perform some inserts into the primary. Then perform SELECTs on this table on the primary as well as the replica nodes. You should see the correct number of records in the table. Before the fix, if a single record is inserted, it will be duplicated on the replica nodes as well, and when the cache is flushed on the replicas, the record will be inserted into the CS table. So in the end, you would see 3 copies of the same record (provided a SELECT is executed on the primary as well as both the replicas) instead of 1. This is now fixed.
Point 4 is already tested and closed in
MCOL-4790.