Details
- Type: New Feature
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
At the moment, ByteStream uses a growing buffer for serialized data. This can lead to increased memory consumption under certain conditions. For example, when dumping RGData to disk, we create a ByteStream object that consumes at least as much memory as the RGData itself. Moreover, if compression is enabled, we allocate another buffer for the compressed data.
It would be nice to have one or both of the following:
- support for streaming: the caller provides read/write callbacks that supply more data to deserialize or consume the part that has already been serialized
- zero-copy processing for large portions of data, such as RGData::rowData.
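To make the two options concrete, here is a minimal sketch of what a callback-based writer could look like. Everything in it is hypothetical: `StreamingWriter`, `WriteCallback`, `append` and `appendZeroCopy` are illustrative names, not part of the existing ByteStream API. It assumes the sink consumes the bytes (for example, writes or compresses them) before the callback returns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical callback type: receives a finished piece of serialized data.
// The pointer is only valid for the duration of the call.
using WriteCallback = std::function<void(const uint8_t* data, size_t len)>;

// Hypothetical writer that stages small values in a fixed-size chunk
// instead of a buffer that grows with the total serialized size.
class StreamingWriter
{
 public:
  // Precondition: chunkSize > 0.
  StreamingWriter(size_t chunkSize, WriteCallback sink)
   : fChunkSize(chunkSize), fSink(std::move(sink))
  {
    assert(chunkSize > 0);
    fChunk.reserve(chunkSize);
  }

  // Copy small values into the chunk; hand the chunk to the sink when full.
  void append(const void* src, size_t len)
  {
    const uint8_t* p = static_cast<const uint8_t*>(src);
    while (len > 0)
    {
      const size_t room = fChunkSize - fChunk.size();
      const size_t n = len < room ? len : room;
      fChunk.insert(fChunk.end(), p, p + n);
      p += n;
      len -= n;
      if (fChunk.size() == fChunkSize)
        flush();
    }
  }

  // Large payloads (e.g. row data) skip the staging chunk entirely.
  // Pending bytes are flushed first so the output order is preserved,
  // then the caller's buffer is passed to the sink without copying.
  void appendZeroCopy(const uint8_t* data, size_t len)
  {
    flush();
    if (len > 0)
      fSink(data, len);
  }

  // Hand any staged bytes to the sink.
  void flush()
  {
    if (!fChunk.empty())
    {
      fSink(fChunk.data(), fChunk.size());
      fChunk.clear();
    }
  }

 private:
  std::vector<uint8_t> fChunk;
  size_t fChunkSize;
  WriteCallback fSink;
};
```

With a writer of this shape, peak memory for the serialized form is bounded by the chunk size rather than by the size of the RGData, and the sink could feed a streaming compressor or a file directly. A read-side counterpart would take a callback that refills the chunk on demand.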
Issue Links
- includes: MCOL-3758 Parallel sorting 2nd phase and on disk spill capability. (Stalled)