Details
- Type: Task
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
Often, columns have a few values that are very frequent.
For this case, one can use a histogram that is a collection of (value, frequency) pairs.
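A minimal sketch of such a (value, frequency) histogram, written in Python for illustration only. The function names, the choice of storing frequencies as row fractions, and the uniform-spread fallback for values outside the list are assumptions, not something this task specifies.

```python
from collections import Counter


def build_mcv_histogram(values, k):
    """Return the k most frequent values as (value, frequency) pairs.

    Frequency is the fraction of rows holding that value (an assumption;
    raw counts would work as well).
    """
    total = len(values)
    if total == 0:
        return []
    counts = Counter(values)
    return [(v, c / total) for v, c in counts.most_common(k)]


def estimate_eq_selectivity(mcv, value, n_distinct):
    """Estimate the fraction of rows where column = value.

    Values in the histogram use their stored frequency. For any other value
    we assume (hypothetically) that the remaining rows are spread evenly
    over the remaining distinct values.
    """
    for v, freq in mcv:
        if v == value:
            return freq
    remaining_mass = max(0.0, 1.0 - sum(f for _, f in mcv))
    remaining_distinct = n_distinct - len(mcv)
    if remaining_distinct <= 0:
        return 0.0
    return remaining_mass / remaining_distinct


# Example column: one dominant value, one fairly common value, two rare ones.
column = ["US"] * 6 + ["DE"] * 2 + ["FR", "JP"]
mcv = build_mcv_histogram(column, k=2)
```

With this example column, the two stored pairs cover 80% of the rows, and the remaining 20% is divided between the two values that are not stored.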
There are algorithms, some of them approximate, that find the most common values using a limited amount of memory and/or only a sample of the table.
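One well-known algorithm of this kind is the Misra-Gries frequent-items summary. The task does not name a specific algorithm, so the sketch below is only an illustration of the idea. It keeps at most k - 1 counters in a single pass, and any value that occurs more than n/k times among n scanned rows is guaranteed to be among the surviving candidates.

```python
def misra_gries(stream, k):
    """One-pass frequent-items summary using at most k - 1 counters.

    Returned counts are lower bounds on the true counts (each is too low
    by at most n/k, where n is the number of items seen).
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement every counter and drop the zeros.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# 100 rows: "a" appears 50 times, "b" 30 times, plus 20 one-off values.
rows = []
for i in range(10):
    rows += ["a"] * 5 + ["b"] * 3 + [f"rare{2 * i}", f"rare{2 * i + 1}"]

candidates = misra_gries(rows, k=4)
```

Because the counts are only lower bounds, the candidates would typically be checked against exact counts in a second pass, or against a sample of the table, before being stored as (value, frequency) pairs.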
Another important property is that collecting and storing most common values generalizes to tuples of values from multiple columns.
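As a rough illustration of the multi-column case, the same counting can be applied to value tuples instead of single values. The row layout (dictionaries keyed by column name) and the function name are hypothetical and chosen only for this example.

```python
from collections import Counter


def build_multicolumn_mcv(rows, columns, k):
    """Return the k most frequent value combinations for the given columns.

    Each entry is (tuple_of_values, frequency), with frequency expressed as
    a fraction of all rows.
    """
    total = len(rows)
    if total == 0:
        return []
    counts = Counter(tuple(row[c] for c in columns) for row in rows)
    return [(combo, n / total) for combo, n in counts.most_common(k)]


# Example rows with two correlated columns.
table = (
    [{"city": "Paris", "country": "FR"}] * 4
    + [{"city": "Berlin", "country": "DE"}] * 3
    + [{"city": "Lyon", "country": "FR"}] * 2
    + [{"city": "Paris", "country": "US"}]
)
pair_mcv = build_multicolumn_mcv(table, ["city", "country"], k=2)
```

Storing the combination directly captures correlation between the columns. In this example the pair (Paris, FR) covers 40% of the rows, whereas multiplying the single-column frequencies (50% for Paris, 60% for FR) would suggest 30%.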
Issue Links
- relates to MDEV-21130 Histograms: use JSON as on-disk format (Closed)