[MDEV-4958] Adding datatype UUID - Jira

Details

Type: Task
Status: Closed
Priority: Critical
Resolution: Fixed
Fix Version/s: 10.7.1
Component/s: N/A
Labels:
- Preview_10.7
- uuid

Epic Link:
New data types

Description

The current way of handling UUIDs in MariaDB is not very user-friendly. To do it right, you have to strip all four "-" characters and store the value in a BINARY(16) column, and so on. That is a lot of work, and sadly people just use VARCHAR() instead because it's easier. But that is a huge performance problem.

To fix that, I propose adding a "uuid" datatype to MariaDB the same way PostgreSQL did: http://www.postgresql.org/docs/9.2/static/datatype-uuid.html

That would make working with UUIDs a lot easier and faster.
Thanks and greetings
Leo

Attachments

Issue Links

blocks

CONJ-899 Support UUID Object

Closed

MDEV-15854 Implement uuid_to_bin, bin_to_uuid and is_uuid functions

Open

causes

MDEV-26615 uuid() function on default column value can violate PK/Unique constrants

Closed

MDEV-26616 uuid data type - convert from text with binary data based on length

Closed

MDEV-26664 Store UUIDs in a more efficient manner

Closed

MDEV-26732 Assertion `0' failed in Item::val_native

Closed

MDEV-26742 Assertion `field->type_handler() == this' failed in FixedBinTypeBundle<NATIVE_LEN, MAX_CHAR_LEN>::Type_handler_fbt::stored_field_cmp_to_item

Closed

MDEV-26785 Hyphens inside the value of uuid datatype

Closed

MDEV-34981 Functions missing from INFORMATION_SCHEMA.SQL_FUNCTIONS

Closed

duplicates

MDEV-5593 Feature request native support for UUID's as a column type

Closed

is blocked by

MDEV-4912 Data type plugin API version 1

Closed

MDEV-20890 Illegal mix of collations with UUID()

Closed

relates to

MDEV-27207 Assertion `!m_null_value' failed in int FixedBinTypeBundle<FbtImpl>::cmp_item_fbt::compare or in cmp_item_inet6::compare

Closed

MDEV-31926 UUID v7 are compared incorrectly

Closed

MDEV-33442 REPAIR TABLE corrupts UUIDs

Closed

MDEV-8605 MariaDB not use DEFAULT value even when inserted NULL for NOT NULLABLE column.

Closed

MDEV-11339 Support UUID v4 generation

Closed

MDEV-23748 support not-MAC-address based UUID versions

Closed

MDEV-27015 Assertion `!is_null()' failed in FixedBinTypeBundle<FbtImpl>::Fbt FixedBinTypeBundle<FbtImpl>::Field_fbt::to_fbt()

Closed

MDEV-28491 Uuid. "UPDATE/DELETE" not working "WHERE id IN (SELECT id FROM ..)"

Closed

MDEV-31137 UUID type is never used for user variables

Closed

MDEV-33827 UUID() returns a NULL-able result

Closed

MDEV-35427 Assertion `is_null() >= item->null_value' failed in Timestamp_or_zero_datetime_native_null::Timestamp_or_zero_datetime_native_null on EXECUTE

Closed

links to

NEWSEQUENTIALID (SQL Server)

(4 causes, 1 duplicates, 2 is blocked by, 11 relates to, 1 links to)

Activity

Oliver Hoff (Inactive) added a comment - 2013-08-27 22:16 - edited

A datatype may be less useful for current applications, because of BC schemas, but I would recommend it for new schemas.
For current schemas, some (robust) conversion between the textual and binary representations of UUIDs would be very helpful, although these functions can be shimmed on legacy systems.

Sergei Golubchik added a comment - 2013-08-30 18:03

You can convert from text uuid to binary with something like unhex(replace(uuid, "-", "")).

But yes, a dedicated type would be much more convenient, I agree.
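
For client-side code, the same conversion can be sketched in Python. This is only an illustration under stated assumptions: the function names `text_to_bin` and `bin_to_text` are hypothetical, and the input is assumed to be the usual 36-character dashed form.

```python
def text_to_bin(text_uuid: str) -> bytes:
    """Mirror UNHEX(REPLACE(uuid, '-', '')): drop the dashes, decode the hex digits."""
    raw = bytes.fromhex(text_uuid.replace("-", ""))
    if len(raw) != 16:
        raise ValueError("expected exactly 32 hex digits")
    return raw


def bin_to_text(raw: bytes) -> str:
    """Inverse direction: 16 bytes back to the dashed 8-4-4-4-12 form."""
    if len(raw) != 16:
        raise ValueError("expected exactly 16 bytes")
    h = raw.hex()
    return f"{h[0:8]}-{h[8:12]}-{h[12:16]}-{h[16:20]}-{h[20:32]}"
```

The length checks are the part that is easy to forget when every application writes this by hand, which is one argument for a dedicated type.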

Jacob Rhoden (Inactive) added a comment - 2013-10-24 08:40

Recommend that this is implemented as a native data type, not plugin or other method, given the fact that people will be using this as a primary key.

Sergei Golubchik added a comment - 2014-07-21 23:07

Also, the benefits of the dedicated type would be (quoting MDEV-5593):

- more comfortable, because users do not have to do hex/unhex and remove dashes
- faster index than the varchar variant
- automatic generation of a UUID if a record is inserted with a NULL PK

Michael Amado added a comment - 2015-10-08 09:36

Just in case performance could be improved in the initial implementation I thought I'd link this article:
https://www.percona.com/blog/2014/12/19/store-uuid-optimized-way/ (author: Karthik Appigatla)
It suggests reordering the bits when storing UUIDs (version 1) has a big performance benefit when the UUID is used as a PK for InnoDB. I assume XtraDB would be the same. Ideally this bit shuffling would be handled so clients can naively insert/select the conventionally arranged version 1 UUIDs, but get the performance benefits of the reordering.

You can count me as someone who would benefit greatly from a performant UUID data type. Thanks!

Federico Razzoli added a comment - 2015-10-21 15:29

I hope that automatic generation will work even if the UUID column is not a PK - just like timestamps.

VAROQUI Stephane added a comment - 2015-11-11 11:56

Storing UUIDs with time locality:
http://mysqlserverteam.com/author/guilhem/

Daniel Black added a comment - 2018-12-13 00:00

The functions implemented/ported in MDEV-15854 include the lower-level functions for a UUID datatype.

Rick James added a comment - 2020-01-31 23:33

Recommend: When storing the bits into BINARY(16), rearrange them. Rationale...

Type-1 UUIDs (which MySQL has always used) include a timestamp. But the bits of the time are scrambled. This means that indexing on a UUID leads to random accesses. When an index (or table) is big, 'consecutive' rows tend to be scattered.

By simply shuffling the bits, you can make a Type-1 UUID act like an AUTO_INCREMENT or TIMESTAMP – you can get "locality of reference" for accessing chronologically 'close' records.

More discussion here: http://mysql.rjweb.org/doc.php/uuid , and the 8.0.0 Changelog: UUID_TO_BIN() and BIN_TO_UUID() convert between UUID values in string and binary formats (represented as hexadecimal characters and VARBINARY(16), respectively). This permits conversion of string UUID values to binary values that take less storage space. UUID values converted to binary can be represented in a way that permits improved indexing efficiency.
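
The rearrangement described above can be sketched in Python. This is a minimal sketch, not the layout MariaDB ended up using: it assumes the swap is the one I understand MySQL 8's `UUID_TO_BIN(uuid, 1)` to perform (the time-high and time-low groups trade places), and the names `swap_time_fields`, `unswap_time_fields` and `make_v1` are hypothetical helpers for illustration.

```python
import uuid


def swap_time_fields(raw: bytes) -> bytes:
    """Standard layout is time_low(4) time_mid(2) time_hi(2) rest(8).
    Put time_hi first so byte order follows the timestamp."""
    return raw[6:8] + raw[4:6] + raw[0:4] + raw[8:16]


def unswap_time_fields(stored: bytes) -> bytes:
    """Inverse of swap_time_fields: restore the standard layout."""
    return stored[4:8] + stored[2:4] + stored[0:2] + stored[8:16]


def make_v1(timestamp: int) -> uuid.UUID:
    """Build a version-1-shaped UUID from a 60-bit timestamp (fixed clock_seq and node)."""
    time_low = timestamp & 0xFFFFFFFF
    time_mid = (timestamp >> 32) & 0xFFFF
    time_hi_version = ((timestamp >> 48) & 0x0FFF) | 0x1000  # version nibble = 1
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version, 0x80, 0x00, 0x123456789ABC))


earlier = make_v1(0x1FFFFFFFF)
later = make_v1(0x200000000)

# In the standard layout the later UUID sorts first, because time_low leads.
print(earlier.bytes < later.bytes)
# After the swap, byte order agrees with chronological order.
print(swap_time_fields(earlier.bytes) < swap_time_fields(later.bytes))
```

Because the version nibble is constant for Type-1 values, leading with time_hi does not disturb the ordering among them.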

Alexander Barkov added a comment - 2020-02-11 10:36

SQL Server introduces a separate data type for this:

https://docs.microsoft.com/en-us/sql/t-sql/functions/newsequentialid-transact-sql?view=sql-server-ver15

Steven Ayre added a comment - 2020-08-12 22:54 - edited

Rick, would you suggest that it always reorders? That incurs a small performance penalty which is unnecessary for v4 UUIDs.

Perhaps a UUID(0) and UUID(1) argument to replicate the swap_flag on UUID_TO_BIN() could be used to toggle that behaviour?

Alexander, NEWSEQUENTIALID is a function returning a uuid-like 128bit value to store in a uniqueidentifier column. It isn't compatible with v1/v2 UUIDs. Are you suggesting two separate data types?

A flag affecting the storage would seem preferable to me, similar to the fsp argument on the DATETIME type.

Rick James added a comment - 2020-08-13 22:37

I lean toward always reordering. (But I don't have a strong opinion.) Some debating points (on both sides):

- Nothing else will look at the layout on disk, so always swapping works.
- MySQL 8 has a flag – I see this as clutter with no benefit.
- Would mysqldump (etc) always convert to the standard 36-char HEX layout?
- Performance penalty is insignificant. (I repeatedly remind users that all the overhead of fetching a row is much higher than simple function calls.)
- Compatibility with MySQL 8 (with its optional flag) may be more important than other arguments. (MariaDB has been drifting away from its "drop-in compatibility" claim.)
- Swapping does not hurt type-4 (etc) uuids.

Steven Ayre added a comment - 2020-08-18 22:50 - edited

> MySQL 8 has a flag – I see this as clutter with no benefit.
I've implemented the MySQL 8 functions as a UDF (https://github.com/SteveAyre/uuid2bin) to use until this type is added, and the reordering has a 5% performance hit. So there is a benefit. What impact it will have on the data type will depend on the implementation and yes fetching the row is more significant, but it won't be free. If you're working on millions of rows it may be important.

Rick James added a comment - 2020-08-19 01:54

In some situations, there is a significant performance advantage for the reshuffle.

Given:

- A huge table (bigger than the buffer_pool)
- PRIMARY KEY(uuid) – Type 1 with the bits rearranged as described
- The working set is "recent" data

Then the activity on the table will use much less I/O, hence be much faster than if the bits were not rearranged.

(A similar argument can be made for a secondary index starting with UUID.)

Steven Ayre added a comment - 2020-08-20 20:30 - edited

An upgrade path should be considered for existing databases that use byte reordering (MySQL 8, or MariaDB with either stored functions or a UDF):

- Importing a dump containing BINARY(16) into a schema using the UUID type
- ALTER TABLE MODIFY on a column from the BINARY(16) to the UUID type

In these cases the UUID type will need to know whether the timestamp bytes are reordered or not.

Personally I like the idea of UUID(0) and UUID(1) a lot as it gives a way to specify this, or as an alternative two different UUID types.

Perhaps the default behaviour should be to reorder if the DBA hasn't specified one.

Daniel Black added a comment - 2021-02-15 11:26

An example of the ideal case for this: https://dba.stackexchange.com/questions/285410/innodb-primary-key-advice

Rick James added a comment - 2021-02-15 22:25

Another thought... Instead of shuffling the bits when the UUID is stored, make use of the "datatype UUID" to shuffle the bits as it is being used in BTree accesses.

That is, the storage has the straightforward mapping between bits and hex with dashes; no rearranging.

Just as FLOAT and DECIMAL and SIGNED/UNSIGNED must interpret the bits differently when comparing, UUID would shuffle the bits when comparing.
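
The comparison-time variant can be sketched in Python as a sort key: the stored bytes keep the standard layout, and only the comparison sees the shuffled view. This is a hedged illustration, not the server's comparator; `compare_key` is a hypothetical name and the three sample values are made-up Type-1-shaped UUIDs with increasing timestamps.

```python
import uuid


def compare_key(raw: bytes) -> bytes:
    """Comparison-only view: time_hi, time_mid, time_low, then the rest.
    The stored 16 bytes are never rewritten."""
    return raw[6:8] + raw[4:6] + raw[0:4] + raw[8:16]


# Stored values, standard byte layout, listed in chronological order.
chronological = [
    "ffffffff-0001-1000-8000-123456789abc",
    "00000000-0002-1000-8000-123456789abc",
    "00000001-0002-1000-8000-123456789abc",
]
stored = [uuid.UUID(s).bytes for s in reversed(chronological)]

# Plain byte comparison scatters them; the key restores time order.
by_bytes = [str(uuid.UUID(bytes=b)) for b in sorted(stored)]
by_key = [str(uuid.UUID(bytes=b)) for b in sorted(stored, key=compare_key)]
print(by_bytes)
print(by_key)
```

The trade-off against shuffling on store is that the key is recomputed on every comparison, while dumps and raw reads of the column need no conversion.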

Daniel Black added a comment - 2021-09-20 11:30

To all those that are watching, a preview release is available for testing and feedback:

https://mariadb.org/10-7-preview-feature-uuid-data-type/

Alice Sherepa added a comment - 2021-10-29 12:57

I think it could be merged into 10.7

Slawomir Dymitrow added a comment - 2022-01-22 08:49

Hi guys, so excited to see the UUID data type as a feature! One question - does the actually implemented feature use the bits-shuffle solution to optimize performance? Or was this only an idea that didn't get implemented?

Ian Gilfillan added a comment - 2022-01-22 09:15

eXsio, see MDEV-26664

Alexander Barkov added a comment - 2022-01-22 10:23

eXsio,

Also please see these comments:
https://github.com/MariaDB/server/blob/10.7/plugin/type_uuid/sql_type_uuid.h#L29

People

Assignee: Alice Sherepa

Reporter: Leo Unglaub

Votes: 36

Watchers: 37

Dates

Created: 2013-08-27 22:04

Updated: 2025-03-11 17:03

Resolved: 2021-11-10 16:02

MariaDB Server