[MDEV-21829] Use packed sort keys in Unique objects Created: 2020-02-27 Updated: 2024-01-24 |
|
| Status: | In Review |
| Project: | MariaDB Server |
| Component/s: | Optimizer |
| Fix Version/s: | 11.5 |
| Type: | New Feature | Priority: | Major |
| Reporter: | Sergei Petrunia | Assignee: | Michael Widenius |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||||||||||||||||||||||||||
| Description |
|
The task deals with packing the values stored in the Unique tree for each record. BackgroundIn the implementation of Unique for variable sized fields like VARCHAR, For example: lets say there is a query like:
and column col is defined as VARCHAR(200) in latin character set, so for each So this limitation prompted to store packed values inside the Unique tree. Unique class is currently used in
Implementation details1) Packed key format for Unique treeThe keys are stored in a new variable-size data format called "packed". Lets say the record in the Unique tree looks like:
key_length is the length of the whole key. key_length is stored in 4 bytes, Each packed value is encoded as follows
The format of the packed_value depends on the data type. NULLS are ignored for COUNT(DISTINCT col) and GROUP_CONCAT(DISTINCT col1 ....) . For EITS we try NULLS are only considered for JSON_ARRAYAGG currently and the packed value
Note: Note: This is the case where the Unique tree would store only one column. So when the key is of single component Note: How is packing done ? 2) Which format to useContrary to sort keys in Filesort, keys in Unique tree don't have mem-comparable We may consider adding a heuristic to disallow packing for small VARCHAR's and CHARs. |
| Comments |
| Comment by Varun Gupta (Inactive) [ 2020-03-17 ] | ||||||||
|
Debugging the case with group_concat, I see that the keys are compared with collation specific function and not memcmp
But the allocation for each node of the unique tree is still fixed size | ||||||||
| Comment by Varun Gupta (Inactive) [ 2020-03-19 ] | ||||||||
|
The branch is 10.5-mdev21829 | ||||||||
| Comment by Varun Gupta (Inactive) [ 2020-06-12 ] | ||||||||
|
The code needs to be adjusted according to the with the pushes regarding | ||||||||
| Comment by Varun Gupta (Inactive) [ 2020-06-23 ] | ||||||||
|
After going through the code again, I see that packing is currently not handled for JSON_ARRAYAGG function. | ||||||||
| Comment by Varun Gupta (Inactive) [ 2020-06-30 ] | ||||||||
|
The latest patch is in the branch 10.5-mdev21829 | ||||||||
| Comment by Varun Gupta (Inactive) [ 2020-12-07 ] | ||||||||
|
The updated branch is bb-10.6-mdev21829 | ||||||||
| Comment by Michael Widenius [ 2023-11-17 ] | ||||||||
|
The old code is now in second review and reworked to take into account some comments from Igor and me. | ||||||||
| Comment by Vicențiu Ciorbaru [ 2024-01-12 ] | ||||||||
| Comment by Sergei Petrunia [ 2024-01-24 ] | ||||||||
|
Task for testing this MDEV-33305 (NOT including performance) |