[MDEV-29959] UUID Sorting Created: 2022-11-06  Updated: 2023-12-19  Resolved: 2023-07-06

Status: Closed
Project: MariaDB Server
Component/s: Server
Affects Version/s: 11.0.0, 10.7, 10.8, 10.9, 10.10, 10.11, 11.0
Fix Version/s: 10.9.8, 10.10.6, 10.11.5, 11.0.3

Type: Bug Priority: Critical
Reporter: Rick Strafy Assignee: Sergei Golubchik
Resolution: Fixed Votes: 4
Labels: uuid

Attachments: File UUID-old-new.diff    
Issue Links:
Blocks
blocks MDEV-31626 implement inet4->inet6 cast Closed
Problem/Incident
causes MDEV-32112 Random incorrect uuid errors since 10... Closed
causes MDEV-33065 Insert in table with constraint by UU... Open
is caused by MDEV-26664 Store UUIDs in a more efficient manner Closed
Relates
relates to MDEV-31926 UUID v7 are compared incorrectly Closed

 Description   

Ricky's original entry:

Hi, is it possible to revert rearranging bits when inserting new uuid?
I have no idea how did anyone come to conclusion to rearrange uuid
bits before inserting and suppose that every inserted uuid is a
version 1 uuid, and not even bother checking version that's being
inserted. UuidV6,7,8 and ULID are now useless with uuid type in
mariadb and they will cause fragmentation, and it's impossible to sort
rows.

Monty;
This Jira entry is about the new UUID type in MariadB 10.7

It seams that the main confusion is that when doing an ORDER BY
on the new UUID type, the UUID's is not coming in lexical order
but in storage order.

I have addded a little bit of background to allow us to better
understand and discuss the issues with UUID and also give other
readers more understand of the different UUID options that MariaDB
supports.

MariaDB has a few different uuid's (all unique in the server and
across servers)

uuid()

  • Universal Unique Identifiers (UUIDs), as in DCE 1.1: Remote Procedure Call,
  • Unique among all MariaDB installations
  • String, 5 parts separated with '-', 36 bytes
  • Starts 'scrambled time' (time-low, time-mid, time-high) followed by a set
    system unique constants
  • 'Random' from the user point of view

sys_guid()

  • Like the above but without '-'
  • String, 32 bytes

uuid_short()

  • 'MySQL's original space efficient uuid'. Unique among a MariaDB cluster.
  • Server id - timestamp - incrementor
  • Each call generates a slightly higher number
  • longlong, 8 bytes ; Efficient storage!
  • 'Sequential' from the user point of view.

UUID data type in 10.7.0

  • String representation defined in RFC4122.
  • Stored in 'index-friendly manner'. This means that the last
    'constant part' of uuid is stored first and then the time in high-to-low
    byte order. Time bytes are store in big-endian format (same as uuid())
    This only works when the input comes from the MariaDB UUID() function (UUID v1).
  • Newer uuid's will be >= than older id's
  • 16 bytes
  • Storage efficient for storage_engines that can do prefix compression.
  • Can be confusion when doing ORDER BY UUID_column as the order
    is in storage order, not 'string value order'.

Other things

  • For sorting there is no difference if timestamp is stored first or
    timestamp is stored last after a 'constant part'. The main issue
    for this Jira entry seams to be the the timestamp value is
    internally reordered to get a better storage and bulk insert rate.
  • There is no problem in sorting any of the above UUIDs. As Barkov
    shows with an example, one can always ensure lexical order also
    for the UUID type.
  • For index storage efficiency, storing timestamp last in an UUID is
    better as it allows the storage engine to do prefix-compression and
    can reduce the 16 byte key to 8 bytes or less.
  • The main purpose of an UUID is that they should be unique for the
    server and also across different systems. The storage order of
    bytes does not matter for this to hold true. As long as an
    application does not depend on the sort order of UUID's (and
    applications should not), the MariaDB UUID type is compatible with
    any other UUID or any other database server.

For storage engine performance there are two things that one would
like to optimize related to UUID's:

1) When doing bulk insert, having UUID in order makes inserts of the
primary key faster and makes the primary key index smaller (initially).
2) When doing single insert from multiple threads, having UUID in 'random
order' is better as there will be less page collisions between multiple
threads and one can get higher insert throughput.

In other words, it depends on the usage of UUID keys how they should
be stored. There is no obvious 'best way'.

When using UUID type or uuid_short() one gets benefit 1)
By using UUID() strings, one gets benefit 2)

After reading the comment on this Jira entry, the conclusion with current
code is:

  • If your application is not depending on the sort order of UUID and you want to optimizer for bulk insert and less storage of the UUID key, then use the UUID type.
  • If your application requires UUID's to be sorted as strings, if you have your own version of UUID or you want to optimize for concurrency between multiple threads, then use BINARY(32) and the sys_guid() function.
  • We should consider adding another UUID type that will store things in the given order. Another option would be to add a type that would
    store HEX strings in binary and show them as hex strings. This could be universally useful and could be a building block for a new UUID
    type that stores things in exact byte order.
    The HEX type would also work for UUID's, but then one would miss the '-' in the output string.
  • As Sergei points out in the comments, we should be able to detect other standard UUID's types > version 5 and not swap bits in them. He has already been working on making this happen (time table not yet defined). For 'own' UUID types, a HEX type could be an efficient.


 Comments   
Comment by Sergei Golubchik [ 2022-11-06 ]

At of now MariaDB can only generate UUID v1. For externally generated UUIDs, rfc4122 specifies only UUID versions up to v5 (where v3 to v5 are unordered, so bit rearranging doesn't hurt them).

v6 to v8 are specified in a draft that was created on 23 June 2022 and that expires on 25 December 2022.

You're right that if there will be new standard UUID versions where bit rearranging decreases locality then MariaDB should be changed to look into the version field and rearrange bits conditionally.

Comment by Rick Strafy [ 2022-11-06 ]

Hm, but lot of the main uuid libraries (I checked php, ruby, python, rust, go) have already implemented at least v6 and v7 uuid, even if they are still a draft, it's clear that they will be time based and time must be within the first part to make it sortable.

People are starting to experiment with that new versions, and it was really surprising how sorting works, so I'm switching back to binary(16) or postgres, I think the best time for that change is now when there is barely no one using that new datetype, since last LTS is 10.6, because it may be a BC break, I suggest not to do such magic at all, since there are people who may use ULIDs with UUID string representation, and that place where should be version specification is a random number 0-f, so if there is condition only for v1 uuid, every ~16. generated ulid will be inserted randomly and not at the end of the table.

Comment by Rick Strafy [ 2022-11-06 ]

And people are switching from BINARY(16) or VARCHAR(36) to UUID, and when they figure out that sorting is totally different, they need to stay on binary, or they need to rearrange bits, and if they do that and you will revert the change, well.

Comment by Jan [ 2022-11-08 ]

I think taht good enough solution would by types UUID(1) and UUID(0). One is reanranged, second one is not. I dont understand how someone could think, that forced rearranging is good idea.

Comment by Rick Strafy [ 2022-11-11 ]

Will it be at least opt-out?

Comment by Sergei Golubchik [ 2022-11-18 ]

I think the server should just unconditionally bit-swap UUID versions 5 and below, and not bit-swap versions 6 and above. There's hardly a reason to make this optional with opt-in or opt-out.

Technically, there is no reason to bit-swap versions 3 to 5, but it's likely too late, we might have users who store those UUIDs in a table. And v6+ are new enough. Also, perhaps, CHECK TABLE FOR UPGRADE will be able to detect it.

Comment by Jan [ 2022-11-18 ]

Mako no sense to me. Wouldn't be better just let current stored UUID let as is and when you resave than server choose swap or not? Same problem is for v6+, there may be late too and i'm not sure if it's valid argument.

Comment by Rick Strafy [ 2022-11-18 ]

I won't argue with that, because for now I will be thankful for any fix of versions above 5, because I stayed on uuid since I believe this will be resolved, and I'm getting full memory consumption when selecting and migrating data, it was doing ok with binary and ULID.

Is there any eta for fixing this at least for v6 and v7? Because I want to release new version of my app at the end of the month, and I want to have optimal db performance, currently I'm using MariaDB 10.10.

The only problem I see is that if someone will use uuid as his custom integer representation, or ULID (https://github.com/ulid/spec), migration from binary to uuid will be painful when they realize that "v1" uuid is saved differently and the sorting will skip every X. uuid. It is a good idea to rearrange bits in v1 so it can be sortable, but I think it should be optional, but I'm happy if it will be fixed at least for versions > 5.

Btw, will 10.11 be LTS?

Comment by Rick Strafy [ 2022-11-28 ]

Hi, any progress on this?

Comment by Alexander Barkov [ 2022-12-01 ]

The UUID data type was designed to store the result of the function UUID(), which returns UUID version 1 variant 1, which is timestamp based UUID with timestamp components stored in big-endian order.

Therefore ordering is optimized for this particular combination of version and variant to achieve best index performance: The earlier in time UUID() was generated - the earlier it's sorted. This scripts demonstrates ordering:

CREATE OR REPLACE TABLE t1 (id INT AUTO_INCREMENT PRIMARY KEY, c1 UUID);
INSERT INTO t1 (c1) VALUES ('11111112-2222-1332-ffff-040300000000');
INSERT INTO t1 (c1) VALUES ('11111111-2222-1333-ffff-040300000000');
SELECT * FROM t1 ORDER BY c1;

+----+--------------------------------------+
| id | c1                                   |
+----+--------------------------------------+
|  1 | 11111112-2222-1332-ffff-040300000000 | Timestamp starts with 1332, so it's smaller.
|  2 | 11111111-2222-1333-ffff-040300000000 | Timestamp starts with 1333, so it's greater.
+----+--------------------------------------+

The underlying code could be extended to analyse UUID value fragments responsible for version/variant and apply different byte shuffling (or no shuffling at all) for different version/variant combinations.
I'm afraid we cannot provide a very quick solution, it will take some time.

However, there is no a need to switch back to BINARY(16) to sovle the ordering problem.
If you need to order an UUID column lexicographically, please use CAST(..AS BINARY):

SELECT * FROM t1 ORDER BY CAST(c1 AS BINARY);

+----+--------------------------------------+
| id | c1                                   |
+----+--------------------------------------+
|  2 | 11111111-2222-1333-ffff-040300000000 |
|  1 | 11111112-2222-1332-ffff-040300000000 |
+----+--------------------------------------+

Comment by Rick Strafy [ 2022-12-01 ]

I understand, but I still think this magical byte shuffling should be optional, like UUID(something), because it's very unexpected and I think no other DB system supporting UUID will do that, so people can easily store any 128 bit value as uuid (ULIDs for example). UUID types are designed as db-independent, I believe there are more users that are generating UUID somewhere else, and inserting it into DB, this is very important advantage od that type, it could be generated anywhere and not rely on auto increment in db, so created "entity" is valid even before insertion to database and you know its id.

I tried using CAST(id AS BINARY), but it's 10 000x slower than normal sort in table with 6 000 000 records, 3.1481s vs 0.0003s, so I'm using datetime column for now, It's not so accurate but so far I can live with that. I also tried migrating to another DB platform that sorts uuid like binary, but there were much slower simple selects, so it wasn't worth it.

So I will wait for a solution (I'm using UUID v7 now, so even version checking will help me), I see more people are starting switching from binary to uuid and have the same problem - https://dba.stackexchange.com/questions/315494/mariadb-uuid-datatype-not-orders-correctly

Comment by Sergei Golubchik [ 2022-12-02 ]

It looks like we can implement a solution that automatically swaps bits in UUID versions 5 and below, and does not swap in versions 6 and above. But as far as I can see it will make it totally impossible to store UUID v8 or above with variant 0. It doesn't look like a combination people would use though, variant 0 is from 1988 and was obsoleted long before version 8 was introduced, so may be it's a limitation we can live with.

Alternatively, it could be a different type where the version is specified explicitly (for example, UUID(7) or UUIDv7) which enforces the correct version, but this looks less usable to me

Comment by Anton [ 2022-12-18 ]

I can confirm that this is a big problem if I want to use modern uuid v6 or v7. Earlier such behavior was ok, but nowadays since we have modern lexicographic sortable uuids it's absolutely needs to be at least optional behavior.
Now we have a situation that we can't effectively use native UUID type with uuid if it not old standard uuid v1.
Moreover such behavior not obvious and I spend a time to finding out why lexicographic sortable uuid is not actually sorted in the correct order.

Comment by Trevor Gross [ 2022-12-21 ]

What would a solution that automatically swaps <=v5 look like? As in, some rows would be reordered and others not (this sounds scary) or the `uuid()` function produces reordered v1 (i.e., v6) UUIDs?

I personally am +1 on a different datatype rather than any sort of automatic reordering, something like `UUID(1)` or `UUID_FIXED`. Or renaming the current type to `UUID_REORDER_V1` and re-adding a `UUID` type that doesn't reorder, if that's possible in some way. The reordering thing was awesome when it came out and definitely fit a need, it's really too bad that it seems like a decision that came back to bite a bit.

It will be interesting to see if ISO SQL develops a spec for this after v6-8 UUIDs are finalized, if so it would be good to be compliant (I'm sure some of the MariaDB team would be aware of this)

Comment by Anton [ 2022-12-21 ]

Different datatypes is a wrong way because different uuids versions should be compatible with each others. UUID datatype should accept all version uuid at the same time.

Comment by Rick Strafy [ 2022-12-30 ]

I think the best move will be to remove that behavior completely and make BC break for the sake of the majority of users. Also it will be expected and compatible with PostgreSQL and other DB platforms.

Because I doubt there are people using latest MariaDB 10.7+ which isn't even LTS, together with oldest UUID version, in the time MariaDB 10.7 came out, there were already lexicographic options like ULID for primary keys.

Another thing is as I mentioned - custom UUID versions and ULIDs will not work, if you want to reassemble bits based on version position in UUID string.

We are humans, we all make mistakes or rash decisions, but there is still option to take it back, it's not too late - since there is no LTS yet, and most people are using only LTS versions. Version 10.11 is only in RC state, there's still a time.

Also I can see there is 11.0.0 alpha, maybe the best time to do it there. That reassembling feature is today completely obsolete, when people have the option to use newer UUID versions instead of v1, and legacy apps with v1 UUID did not have ti sorted anyway, so if they were using it as PK and faced performance issue, maybe they were reordering bits before insertion/read on their backend.

Comment by Rick Strafy [ 2023-01-16 ]

Any progress on this issue?

Comment by Rick Strafy [ 2023-01-30 ]

I would appreciate some response or eta of resolving this bug, I have new version of web app with UUIDv7 migration ready and I'm waiting for months, I'm considering now doing rewrite to bigint as pk, because I want to be app as fast as possible without db fragmentation, since there are tables with a >10 mil. rows and there are intensive api calls.

Comment by Daniel Black [ 2023-01-31 ]

serg - I did a first attempt covering the storage - https://github.com/MariaDB/server/pull/2468

Additional guidance on where and how to implement a migration appreciated.

Comment by Illia Somov [ 2023-04-27 ]

Please get this resolved. This is ridiculous behavior, we cannot use native MariaDB UUIDs in our project because of it.

I see how you can solve it in a way without BC break: just add some sort of parameter to UUID type e.g. `UUID(rearrange=true)`, dunno, which defaults to `false` in new UUIDs, but for existing databases it's `true` to prevent breaking existing data.

Comment by Sergei Golubchik [ 2023-05-17 ]

introducing UUID type that only reorders bytes for version <= 5 and using UUID "old" logic (that always reorders) for already existing tables — this is easy, I've attached commits that do it.

But to teach MariaDB how to convert old UUID to the new one (which is needed for joins, UNION, even ALTER TABLE) — this requires non-trivial refactoring of the data type plugin framework. It'll need some time.

Comment by Trevor Gross [ 2023-05-17 ]

Is the plan to have a new UUID type (`uuid2` or similar) that detects version upon insert and reorders <=v5, then does the reverse upon select? Meaning that if you insert some v1 and some v7 UUIDs into the same column, some will be reordered before store and others won't?

Auto-rearranging would work nicely for anyone who a combination of v1 with v6 or v7 - but it seems like unneeded work for all other cases, and could have some unintuitive results (e.g. columns that have a mix of any other types for whatever reason). Especially since the isn't a point in reordering v3/v4/v5 either.

I would just prefer a new type that doesn't ever reorder (`binary(16)` with hex/unhex and formatting) instead of one that tries to automatically rearrange some types of UUIDs. This way is just easier to understand, and it makes sense for everyone not using v1 - which will probably eventually become the norm, since v7/v7 are meant to supercede v1.

Comment by Sergei Golubchik [ 2023-05-18 ]

The plan is to fix existing UUID, not to add a new type. I'd prefer to have one type that always works in all cases, instead of forcing user to choose which UUID type to use in every specific case.
And adding a new type that doesn't ever reorder won't make the implementation simpler anyway. The refactoring problem still need to be solved.

Comment by Sergei Golubchik [ 2023-06-27 ]

see commits 83d97632bb91..68ba39d0b0b (in the bb-11.1-serg branch)

Comment by Illia Somov [ 2023-06-28 ]

Hi, Sergei.

Just want to make sure there is no misunderstanding. In the commits you've pushed and which were automatically mentioned above, you change the behavior such that UUID bytes are not swapped only for version 8. Personally I don't understand such choice. The original issue description specifically mentions UUIDs version 6 and version 7 which are broken by the current MariaDB behavior, but those seem to not be targeted by your fix. UUIDv8 is a version reserved for custom implementation of a UUID which can depend on a specific implementation, whereas UUIDs v6 and v7 are time-based and already have their bytes swapped for database storage. It's v6 and v7 which are the main source of the problem, not v8.

More to that, the only UUID version for which it makes sense to reorder the bytes is UUIDv1 (UPD: and v2), which is the only "old" time-based UUID (by "old" here I mean defined in RFC 4122, prior to this RFC draft: https://www.ietf.org/archive/id/draft-peabody-dispatch-new-uuid-format-04.html), while the other versions (2, 3, 4, 5, 8) are not time based or (6, 7) have their time bytes already reordered.

Comment by Sergei Golubchik [ 2023-06-28 ]

Sorry, it was a typo in the comment (the commit went through a lot of iterations over the last month).
It doesn't swap bytes starting from the v6. The relevant line is this: https://github.com/MariaDB/server/blob/68ba39d0b0/plugin/type_uuid/sql_type_uuid.h#L179

179
    if (force_swap || (from[6] > 0 && from[6] < 0x60 && from[8] < 0))

And from[6] < 0x60 condition means "version < 6"

Thanks for spotting this, I'll update the comment when I recommit

Comment by Illia Somov [ 2023-06-28 ]

Oh, good to know, thank you!

What do you think about not swapping the bytes for other versions (2, 3, 4 and 5) as well? It really makes sense only for v1, the other four are not time-based.

Comment by Illia Somov [ 2023-06-28 ]

UPD: I am sorry, it seems UUIDv2 is also time-based. It's not actively used (if used at all) in the wild and I got a bit confused. But v3, v4 and v5 are certainly not time-based.

Comment by Sergei Golubchik [ 2023-06-28 ]

It's for backward compatibility. Supposedly v3/v4/v5 are used for a long time now. And it doesn't hurt to swap bytes there, so better to keep them as they were. If I'd do a new implementation from scratch, then, perhaps, I'd only swap bytes in v1 and v2.

Comment by Evgenii [ 2023-07-05 ]

@serg
Will sorting work correctly if the table has some rows with v1 and some rows with v7?

Comment by Sergei Golubchik [ 2023-07-05 ]

Yes

Comment by Alexander Barkov [ 2023-07-06 ]

This patch is OK to push:

https://github.com/MariaDB/server/commit/ef84f8137b7c150635c23e1493f8620a4c8527ef

Comment by Evgenii [ 2023-07-09 ]

@serg
Is there an approximate date when the release 10.11.5 will appear, including this fix?

Comment by Sergei Golubchik [ 2023-07-09 ]

Yes. See the home page of https://jira.mariadb.org

Comment by Evgenii [ 2023-08-15 ]

@serg Good afternoon, I tried the latest release version 10.11.5 which was released yesterday on August 14th. This bug is present and incorrectly processes queries like "where id < uuid_v7".

How to reproduce:
1. create "TEST" table (id: UUID, model_value: STRING) and fill the data

id;model_value
0189f81d-498e-72f8-b0fa-246d7be1adb1;30
0189f81d-335c-71c7-a620-3f093ccc39c3;29
0189f81d-2f49-72fa-8ed9-4bd2fe509e59;28
0189f81d-2ae6-7204-b694-84602a2560c7;27
0189f81d-2620-717b-b873-1829f25a4d53;26
0189f81c-f940-7161-9c48-9677e11bc40c;25
0189f81c-d4cf-71c1-93f1-ddd005eadb8b;24
0189f81c-d07f-7143-8c8b-b19457138aef;23
0189f81c-cbec-707d-9b45-b007920b5ef6;22
0189f81c-c795-71d0-bc3f-8b7b74abe044;21
0189f81c-c33d-7100-a39c-80ea40a090fc;20
0189f81c-bf35-70a0-a288-8d80473bc015;19
0189f81c-baf8-718a-870c-cd41e677c3f4;18
0189f81c-b5f1-721d-b135-22c38153aab6;17
0189f81c-b011-71cd-bf2a-d6be3cea6dd7;16
0189f81c-9934-721f-90dc-05a65a79f4b6;15
0189f81c-94d6-733d-9953-2addf3a83be4;14
0189f81c-9016-71a6-a340-ec300186a4d9;13
0189f81c-8bca-73c9-8ffb-225a39a5dc49;12
0189f81c-868b-71d1-830e-9b4ae08aa782;11
0189f81c-8219-731a-b648-0e19e64d0ec0;10
0189f81c-7e01-710a-91ce-49e62163fa29;9
0189f81c-7a05-724e-a7e1-9baf65a91ecb;8
0189f81c-75b8-7223-bd6f-b60601be3aa1;7
0189f81c-71fc-7303-b850-4538b119c5fc;6
0189f81c-6e06-71a3-a47a-664a606b08d2;5
0189f81c-6a38-710e-8730-8babd48c28f4;4
0189f81c-650e-7171-b962-045cc4033000;3
0189f81c-4eff-70c7-a361-caffaf2c0666;2
0189f81c-49e9-7279-8087-9f8e25730957;1

If i run query "SELECT id, model_value FROM test ORDER BY ID DESC" - everything works correctly and i got all this rows.

But if i run query "SELECT id, model_value FROM test where ID < '0189f81c-c795-71d0-bc3f-8b7b74abe044' ORDER BY ID DESC", then i got the result

id;model_value
0189f81c-c33d-7100-a39c-80ea40a090fc;20
0189f81c-b5f1-721d-b135-22c38153aab6;17
0189f81c-9934-721f-90dc-05a65a79f4b6;15
0189f81c-94d6-733d-9953-2addf3a83be4;14
0189f81c-8bca-73c9-8ffb-225a39a5dc49;12
0189f81c-8219-731a-b648-0e19e64d0ec0;10
0189f81c-7e01-710a-91ce-49e62163fa29;9
0189f81c-71fc-7303-b850-4538b119c5fc;6
0189f81c-6e06-71a3-a47a-664a606b08d2;5
0189f81c-650e-7171-b962-045cc4033000;3

The "model_value" column shows that some rows are missing - 19, 18, 16, 13, 11, 8, 7, 4, 2, 1

@serg Is there a quick fix for this bug? I have been waiting for this feature since the end of 2022

Comment by Sergei Golubchik [ 2023-08-15 ]

xpun, this is a new bug, next time (if there will be one) please, report it separately, otherwise it'll likely be missed. This time, as I've noticed it, I've reported it as MDEV-31926.

Generated at Thu Feb 08 10:12:34 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.