Details
- New Feature
- Status: Confirmed
- Major
- Resolution: Unresolved
- None
Description
MCS has a notion of an empty value for columnar segment/token files and dictionaries. An empty value corresponds to an empty record when the table data is viewed in row orientation. Empty records result from:
- DELETE DML statements.
- bulk insertion operations that use cpimport explicitly or implicitly, e.g. INSERT..SELECT uses cpimport internally to ingest the data.
When the number of empty records is significant, they waste disk space and CPU time.
The project will deliver functionality to:
- analyze empty values/records percentage in an extent, file, partition, table.
- manually clean up empty values to reduce disk space usage.
- automatically clean up empty values using a background worker.
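As a rough illustration of the analysis item above, the following Python sketch computes the percentage of empty slots over one or more extents. It is not MCS code: `EMPTY`, `empty_percentage`, and the modelling of an extent as a plain list are all assumptions made for the example; the real empty-value markers depend on the column type.

```python
EMPTY = None  # hypothetical stand-in for the real empty-value marker


def empty_percentage(extents):
    """Return the percentage of empty slots across a list of extents.

    Each extent is modelled as a list of values for one column. Pass a
    single extent for an extent-level figure, or all extents of a file,
    partition, or table for a rolled-up figure.
    """
    total = sum(len(extent) for extent in extents)
    if total == 0:
        return 0.0
    empty = sum(1 for extent in extents for value in extent if value is EMPTY)
    return 100.0 * empty / total
```

Weighting by slot count rather than averaging per-extent percentages keeps the rolled-up figure correct when extents are filled to different levels.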
Here is the initial scenario that the automated background worker should follow to clean up empty records in a partition:
- take the next partition to clean up
- take the next extent in the chosen partition; lock it and store its original HWM
- open the segment file that holds the extent for the column with min(col width) in the partition
- take a pointer to the last used value in the extent
- while !eof of the min-width extent:
  - find the next empty value in the extent
  - save the block in the Version Buffer if it is not saved yet
  - replace the empty value with the value at the pointer
  - update the pointer
- reduce the HWM, dropping the empty values
- take the next extent in the chosen partition; lock it and store its original HWM
See FSM diagram for details.
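The per-extent part of the scenario can be sketched in Python as below. This is a simplified in-memory model under stated assumptions, not the MCS implementation: columns are plain lists, `EMPTY`, `compact_extent`, and `probe_col` are hypothetical names, the HWM is counted in rows rather than blocks, and locking and Version Buffer saves are left out. Holes are filled with rows taken from the tail, so row order is not preserved.

```python
EMPTY = None  # hypothetical stand-in for the real empty-value marker


def compact_extent(columns, probe_col, hwm):
    """Fill empty rows below `hwm` with rows taken from the tail.

    columns:   dict of column name -> list of values, all the same length.
    probe_col: name of the column scanned for empties; the scenario uses
               the narrowest column for this.
    hwm:       number of row slots in use (simplified; rows, not blocks).
    Returns the reduced hwm.
    """
    probe = columns[probe_col]

    # Point `last` at the last used (non-empty) row.
    last = hwm - 1
    while last >= 0 and probe[last] is EMPTY:
        last -= 1

    pos = 0
    while pos < last:
        if probe[pos] is EMPTY:
            # Move the tail row into the hole in every column.
            for values in columns.values():
                values[pos] = values[last]
                values[last] = EMPTY
            # Update the pointer, skipping empties at the tail.
            last -= 1
            while last > pos and probe[last] is EMPTY:
                last -= 1
        pos += 1

    # Everything after `last` is now empty and can be dropped.
    return last + 1
```

For example, an extent with values `[1, EMPTY, 3, EMPTY, 5, EMPTY]` in the probe column and an HWM of 6 ends up with its used rows packed into the first three slots and a returned HWM of 3.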
TBD: exception processing.
Issue Links
- relates to: MCOL-4887 Background worker to automate on-disk data housekeeping (Closed)