Details
-
Bug
-
Status: Closed (View Workflow)
-
Critical
-
Resolution: Fixed
-
None
Description
UUIDs are stored in a binary form, but not byte-swapped like in MySQL's uuid_to_bin https://dev.mysql.com/doc/refman/8.0/en/miscellaneous-functions.html#function_uuid-to-bin
The order provided by UUID_TO_BIN() is not really friendly to index prefix compression:
it only reorders the timestamp segments, while the clock-seq and the node-ID segments are still at the end. This ruins the whole idea of prefix compression.
Let's sort UUID in a really index-friendly order, by moving the constant part to the left and the variable part to the right.
So for comparison purposes we'll reorder segments, but keep the byte order inside the segments (i.e. in big endian order):
MariaDB binary UUIDv1 representation, as returned by the UUID() function:
|
|
llllllll-mmmm-Vhhh-vsss-nnnnnnnnnnnn
|
|
Binary sortable representation:
|
|
nnnnnnnnnnnn-vsss-Vhhh-mmmm-llllllll
|
|
Sign Section Bits Bytes Pos PosBinSortable
|
------------- ------- ---- ----- --- --------------
|
llllllll time low 32 4 0 12
|
mmmm time mid 16 2 4 10
|
Vhhh version and time hi 16 2 6 8
|
vsss variant and clock seq 16 2 8 6
|
nnnnnnnnnnnn node ID 48 6 10 0
|
Attachments
Issue Links
- causes
-
MDEV-29959 UUID Sorting
-
- Closed
-
- is caused by
-
MDEV-4958 Adding datatype UUID
-
- Closed
-
- relates to
-
MDEV-20477 Merge binlog extended metadata support from the upstream
-
- Closed
-
-
MDEV-27588 MyISAM/Aria descending index compression is not as efficient as ascending
-
- Closed
-
- links to
danblack, I agree, for externally generated UUIDs, it's better to place "vsss" at the end.
Just tested this script:
DELIMITER $$
DO
$$
DELIMITER ;
nnnnnnnnnnnn-vsss-Vhhh-mmmm-llllllll gives the ratio of 0.56:
Data_length: 1114112
Index_length: 624640
nnnnnnnnnnnn-Vhhh-mmmm-llllllll-vsss gives the ratio of 0.68:
Data_length: 1114112
Index_length: 764928
It's probably still good enough.
Note, the "user visible" order llllllll-mmmm-Vhhh-vsss-nnnnnnnnnnnn, which is now in the preview branch, gives the ratio of 1.44. This one is really bad.
Data_length: 1114112
Index_length: 1611776