Details
-
New Feature
-
Status: Confirmed
-
Major
-
Resolution: Unresolved
-
None
Description
MCS has a notion of an empty value for columnar segment/token files and dictionaries. An empty value corresponds to an empty record when the table data is viewed in row orientation. Empty records result from:
- DELETE DML statements.
- bulk insertion operations that leverage cpimport explicitly or implicitly, e.g. INSERT..SELECT uses cpimport to ingest the data internally.
When the number of empty records is significant, they waste disk space and CPU time.
The project will deliver functionality to:
- analyze the percentage of empty values/records in an extent, file, partition, or table.
- manually clean up empty values to reduce disk space usage.
- automatically clean up empty values using a background worker.
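As a rough illustration of the first item, the share of empty values in an extent could be computed as below. This is a minimal sketch, not MCS code: the extent is modelled as a plain Python list, and `EMPTY`, `empty_percentage`, and the `hwm` argument (taken here as the count of slots in use) are hypothetical names chosen for the example.

```python
# Hypothetical stand-in for the column's empty-value marker.
EMPTY = None


def empty_percentage(values, hwm):
    """Return the percentage of empty values among the first `hwm` slots.

    `values` models one column of an extent as a list; `hwm` is assumed
    to be the number of slots currently in use.
    """
    used = values[:hwm]
    if not used:
        return 0.0
    empties = sum(1 for v in used if v is EMPTY)
    return 100.0 * empties / len(used)
```

The same ratio could be aggregated over the extents of a file, partition, or table by summing the empty and used counts before dividing.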
Here is the initial scenario that the automated background worker should follow to clean up empty records in a partition:
- take the next partition to clean up
  - take the next extent in the chosen partition; lock it and store its original HWM
    - open the segment file that holds the extent of the column with min(col width) in the partition
    - take a pointer to the last used value in the extent
    - while !eof of the min-width extent
      - find the next empty value in the extent
      - save the block in the Version Buffer if it is not saved yet
      - replace the empty value with the value at the pointer
      - update the pointer
      - reduce the HWM, dropping the empty values
  - take the next extent in the chosen partition; lock it and store its original HWM
See FSM diagram for details.
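The inner loop of the scenario above can be sketched for a single column as follows. This is only an illustration under stated assumptions, not the MCS implementation: the extent is a Python list, `EMPTY` and `BLOCK_SIZE` are hypothetical, the Version Buffer is modelled as a list of block numbers, and locking and on-disk I/O are left out.

```python
EMPTY = None    # hypothetical stand-in for the column's empty-value marker
BLOCK_SIZE = 4  # hypothetical values-per-block, kept tiny for illustration


def compact_extent(values, hwm):
    """Fill empty slots below `hwm` with values taken from the tail.

    `values` models one column of an extent; `hwm` is assumed to be the
    number of slots in use. Returns (new_hwm, saved_blocks), where
    saved_blocks lists the blocks copied aside before being modified.
    """
    saved_blocks = []  # models "save block in Version Buffer if not yet"

    def save_block(slot):
        block = slot // BLOCK_SIZE
        if block not in saved_blocks:
            saved_blocks.append(block)

    head = 0
    tail = hwm - 1
    while True:
        # Move the tail pointer back to the last used (non-empty) value.
        while tail >= 0 and values[tail] is EMPTY:
            tail -= 1
        # Scan forward for the next empty value.
        while head < tail and values[head] is not EMPTY:
            head += 1
        if head >= tail:
            break
        save_block(head)
        save_block(tail)
        values[head] = values[tail]  # replace the empty value
        values[tail] = EMPTY
        tail -= 1                    # update the pointer
    # Everything after `tail` is empty, so the HWM can be reduced.
    return tail + 1, saved_blocks
```

For example, `compact_extent([1, None, 3, None, None, 6, 7, None], 8)` leaves the four used values in the first four slots and returns a new HWM of 4. Note that this changes row order, and in a real table the same moves would presumably have to be applied to every column of the extent so that rows stay aligned.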
TBD: exception processing
Activity
Rank | Ranked higher |
Rank | Ranked higher |
Assignee | Todd Stoffel [ toddstoffel ] |
Fix Version/s | Icebox [ 22302 ] |
Resolution | Won't Do [ 10201 ] | |
Status | Open [ 1 ] | Closed [ 6 ] |
Assignee | Todd Stoffel [ toddstoffel ] | Roman [ drrtuy ] |
Resolution | Won't Do [ 10201 ] | |
Status | Closed [ 6 ] | Stalled [ 10000 ] |
Status | Stalled [ 10000 ] | Confirmed [ 10101 ] |
Labels | gsoc24 |
Labels | gsoc24 | gsoc24 gsoc25 |