Details

    Description

      The current way of handling uuid's in MariaDB is not very user friendly. If you want to do it right you have to replace all 4 "-" chars with "" and store it in a binary(16) type and so on. That is a lot of work and sadly people start to use just varchar() for it because it's easier. But that is a huge performence problem.

      To fix that i would propose to add a "uuid" datatype to MariaDB the same way PostgreSQL did it. http://www.postgresql.org/docs/9.2/static/datatype-uuid.html

      That would make working with uuid's a lot easier and faster.
      Thanks and greetings
      Leo

      Attachments

        Issue Links

          Activity

            backbone Oliver Hoff (Inactive) added a comment - - edited

            A datatype will maybe less useful for current applications, because of BC schemas, but I would recommend it for new schema.
            For current schemas some (robust) conversion between textual and binary representation of UUIDs would be very helpful, although this functions can be shimed on legacy systems.

            backbone Oliver Hoff (Inactive) added a comment - - edited A datatype will maybe less useful for current applications, because of BC schemas, but I would recommend it for new schema. For current schemas some (robust) conversion between textual and binary representation of UUIDs would be very helpful, although this functions can be shimed on legacy systems.

            You can convert from text uuid to binary with something like unhex(replace(uuid, "-", "")).

            But yes, a dedicated type would be much more convenient, I agree.

            serg Sergei Golubchik added a comment - You can convert from text uuid to binary with something like unhex(replace(uuid, "-", "")). But yes, a dedicated type would be much more convenient, I agree.

            Recommend that this is implemented as a native data type, not plugin or other method, given the fact that people will be using this as a primary key.

            corydoras Jacob Rhoden (Inactive) added a comment - Recommend that this is implemented as a native data type, not plugin or other method, given the fact that people will be using this as a primary key.

            Also, the benefits of the dedicated type would be (quoting MDEV-5593):

            • more comfortable because they do not have to do hex / unhex and remove dash
            • faster index than varchar variant
            • automatic generation of UUID if inserted record with NULL PK
            serg Sergei Golubchik added a comment - Also, the benefits of the dedicated type would be (quoting MDEV-5593 ): more comfortable because they do not have to do hex / unhex and remove dash faster index than varchar variant automatic generation of UUID if inserted record with NULL PK
            amado Michael Amado added a comment -

            Just in case performance could be improved in the initial implementation I thought I'd link this article:
            https://www.percona.com/blog/2014/12/19/store-uuid-optimized-way/ (author: Karthik Appigatla)
            It suggests reordering the bits when storing UUIDs (version 1) has a big performance benefit when the UUID is used as a PK for InnoDB. I assume XtraDB would be the same. Ideally this bit shuffling would be handled so clients can naively insert/select the conventionally arranged version 1 UUIDs, but get the performance benefits of the reordering.

            You can count me as someone who would benefit greatly from a performant UUID data type. Thanks!

            amado Michael Amado added a comment - Just in case performance could be improved in the initial implementation I thought I'd link this article: https://www.percona.com/blog/2014/12/19/store-uuid-optimized-way/ (author: Karthik Appigatla) It suggests reordering the bits when storing UUIDs (version 1) has a big performance benefit when the UUID is used as a PK for InnoDB. I assume XtraDB would be the same. Ideally this bit shuffling would be handled so clients can naively insert/select the conventionally arranged version 1 UUIDs, but get the performance benefits of the reordering. You can count me as someone who would benefit greatly from a performant UUID data type. Thanks!

            I hope that automatic generation will work even if the UUID column is not a PK - just like timestamps.

            f_razzoli Federico Razzoli added a comment - I hope that automatic generation will work even if the UUID column is not a PK - just like timestamps.

            Storing UUID with time locality :
            http://mysqlserverteam.com/author/guilhem/

            stephane@skysql.com VAROQUI Stephane added a comment - Storing UUID with time locality : http://mysqlserverteam.com/author/guilhem/
            danblack Daniel Black added a comment -

            Functions implemented/ported in MDEV-15854 which include the lower level functions for a UUID datatype.

            danblack Daniel Black added a comment - Functions implemented/ported in MDEV-15854 which include the lower level functions for a UUID datatype.
            rjasdfiii Rick James added a comment -

            Recommend: When storing the bits into BINARY(16), rearrange them. Rationale...

            Type-1 UUIDs (which MySQL has always used) include a timestamp. But the bits of the time are scrambled. This means that indexing on a UUID leads to random accesses. When an index (or table) is big, 'consecutive' rows tend to be scattered.

            By simply shuffling the bits, you can make a Type-1 UUID act like an AUTO_INCREMENT or TIMESTAMP – you can get "locality of reference" for accessing chronologically 'close' records.

            More discussion here: http://mysql.rjweb.org/doc.php/uuid , and the 8.0.0 Changelog: UUID_TO_BIN() and BIN_TO_UUID() convert between UUID values in string and binary formats (represented as hexadecimal characters and VARBINARY(16), respectively). This permits conversion of string UUID values to binary values that take less storage space. UUID values converted to binary can be represented in a way that permits improved indexing efficiency.

            rjasdfiii Rick James added a comment - Recommend: When storing the bits into BINARY(16), rearrange them. Rationale... Type-1 UUIDs (which MySQL has always used) include a timestamp. But the bits of the time are scrambled. This means that indexing on a UUID leads to random accesses. When an index (or table) is big, 'consecutive' rows tend to be scattered. By simply shuffling the bits, you can make a Type-1 UUID act like an AUTO_INCREMENT or TIMESTAMP – you can get "locality of reference" for accessing chronologically 'close' records. More discussion here: http://mysql.rjweb.org/doc.php/uuid , and the 8.0.0 Changelog: UUID_TO_BIN() and BIN_TO_UUID() convert between UUID values in string and binary formats (represented as hexadecimal characters and VARBINARY(16), respectively). This permits conversion of string UUID values to binary values that take less storage space. UUID values converted to binary can be represented in a way that permits improved indexing efficiency.
            bar Alexander Barkov added a comment - SQL Server introduces a separate data type for this: https://docs.microsoft.com/en-us/sql/t-sql/functions/newsequentialid-transact-sql?view=sql-server-ver15
            steveayre Steven Ayre added a comment - - edited

            Rick, would you suggest that it always reorders? That incurs a small performance penalty which is unnecessary for v4 UUIDs.

            Perhaps a UUID(0) and UUID(1) argument to replicate the swap_flag on UUID_TO_BIN() could be used to toggle that behaviour?

            Alexander, NEWSEQUENTIALID is a function returning a uuid-like 128bit value to store in a uniqueidentifier column. It isn't compatible with v1/v2 UUIDs. Are you suggesting two separate data types?

            A flag affecting the storage would seem preferable to me, similar to the fsp argument on the DATETIME type.

            steveayre Steven Ayre added a comment - - edited Rick, would you suggest that it always reorders? That incurs a small performance penalty which is unnecessary for v4 UUIDs. Perhaps a UUID(0) and UUID(1) argument to replicate the swap_flag on UUID_TO_BIN() could be used to toggle that behaviour? Alexander, NEWSEQUENTIALID is a function returning a uuid-like 128bit value to store in a uniqueidentifier column. It isn't compatible with v1/v2 UUIDs. Are you suggesting two separate data types? A flag affecting the storage would seem preferable to me, similar to the fsp argument on the DATETIME type.
            rjasdfiii Rick James added a comment -

            I lean toward always reordering. (But I don't have a strong opinion.) Some debating points (on both sides):

            • Nothing else will look at the layout on disk, so always swapping works.
            • MySQL 8 has a flag – I see this as clutter with no benefit.
            • Would mysqldump (etc) always convert to the standard 36-char HEX layout?
            • Performance penalty is insignificant. (I repeatedly remind users that all the overhead of fetching a row is much higher than simple function calls.)
            • Compatibility with MySQL 8 (with its optional flag) may be more important than other arguments. (MariaDB has been drifting away from its "drop-in compatibility" claim.)
            • Swapping does not hurt type-4 (etc) uuids.
            rjasdfiii Rick James added a comment - I lean toward always reordering. (But I don't have a strong opinion.) Some debating points (on both sides): Nothing else will look at the layout on disk, so always swapping works. MySQL 8 has a flag – I see this as clutter with no benefit. Would mysqldump (etc) always convert to the standard 36-char HEX layout? Performance penalty is insignificant. (I repeatedly remind users that all the overhead of fetching a row is much higher than simple function calls.) Compatibility with MySQL 8 (with its optional flag) may be more important than other arguments. (MariaDB has been drifting away from its "drop-in compatibility" claim.) Swapping does not hurt type-4 (etc) uuids.
            steveayre Steven Ayre added a comment - - edited

            > MySQL 8 has a flag – I see this as clutter with no benefit.
            I've implemented the MySQL 8 functions as a UDF (https://github.com/SteveAyre/uuid2bin) to use until this type is added, and the reordering has a 5% performance hit. So there is a benefit. What impact it will have on the data type will depend on the implementation and yes fetching the row is more significant, but it won't be free. If you're working on millions of rows it may be important.

            steveayre Steven Ayre added a comment - - edited > MySQL 8 has a flag – I see this as clutter with no benefit. I've implemented the MySQL 8 functions as a UDF ( https://github.com/SteveAyre/uuid2bin ) to use until this type is added, and the reordering has a 5% performance hit. So there is a benefit. What impact it will have on the data type will depend on the implementation and yes fetching the row is more significant, but it won't be free. If you're working on millions of rows it may be important.
            rjasdfiii Rick James added a comment -

            In some situations, there is a significant performance advantage for the reshuffle.

            Given:

            • A huge table (bigger than the buffer_pool)
            • PRIMARY KEY(uuid) – Type 1 with the bits rearranged as described
            • The working set is "recent" data
              Then the activity on the table will use much less I/O, hence be much faster than if the bits were not rearranged.

            (A similar argument can be made for a secondary index starting with UUID.)

            rjasdfiii Rick James added a comment - In some situations, there is a significant performance advantage for the reshuffle. Given: A huge table (bigger than the buffer_pool) PRIMARY KEY(uuid) – Type 1 with the bits rearranged as described The working set is "recent" data Then the activity on the table will use much less I/O, hence be much faster than if the bits were not rearranged. (A similar argument can be made for a secondary index starting with UUID.)
            steveayre Steven Ayre added a comment - - edited

            Should consider an upgrade path from existing databases using byte reordering (MySQL 8 or MariaDB with either stored functions or UDF)...

            • Importing a dump containing BINARY(16) into a schema using the UUID type
            • ALTER TABLE MODIFY on a column from the BINARY(16) to UUID type
              In these cases the UUID type will need to know whether the timestamp bytes are reordered or not.

            Personally I like the idea of UUID(0) and UUID(1) a lot as it gives a way to specify this, or as an alternative two different UUID types.

            Perhaps the default behaviour should be to reorder if the DBA hasn't specified one.

            steveayre Steven Ayre added a comment - - edited Should consider an upgrade path from existing databases using byte reordering (MySQL 8 or MariaDB with either stored functions or UDF)... Importing a dump containing BINARY(16) into a schema using the UUID type ALTER TABLE MODIFY on a column from the BINARY(16) to UUID type In these cases the UUID type will need to know whether the timestamp bytes are reordered or not. Personally I like the idea of UUID(0) and UUID(1) a lot as it gives a way to specify this, or as an alternative two different UUID types. Perhaps the default behaviour should be to reorder if the DBA hasn't specified one.
            danblack Daniel Black added a comment - An example of the ideal case for this: https://dba.stackexchange.com/questions/285410/innodb-primary-key-advice
            rjasdfiii Rick James added a comment -

            Another thought... Instead of shuffling the bits when the UUID is stored, make use of the "datatype UUID" to shuffle the bits as it is being used in BTree accesses.

            That is, the storage has the straightforward mapping between bits and hex with dashes; no rearranging.

            Just as FLOAT and DECIMAL and SIGNED/UNSIGNED must interpret the bits differently when comparing, UUID would shuffle the bits when comparing.

            rjasdfiii Rick James added a comment - Another thought... Instead of shuffling the bits when the UUID is stored, make use of the "datatype UUID" to shuffle the bits as it is being used in BTree accesses. That is, the storage has the straightforward mapping between bits and hex with dashes; no rearranging. Just as FLOAT and DECIMAL and SIGNED/UNSIGNED must interpret the bits differently when comparing, UUID would shuffle the bits when comparing.
            danblack Daniel Black added a comment -

            To all those that are watching, a preview release is available for testing and feedback:

            https://mariadb.org/10-7-preview-feature-uuid-data-type/

            danblack Daniel Black added a comment - To all those that are watching, a preview release is available for testing and feedback: https://mariadb.org/10-7-preview-feature-uuid-data-type/
            alice Alice Sherepa added a comment -

            I think it could be merged into 10.7

            alice Alice Sherepa added a comment - I think it could be merged into 10.7

            Hi guys, so excited to see the UUID data type as a feature! One question - does the actually implemented feature use the bits-shuffle solution to optimize performance? Or was this only an idea that didn't get implemented?

            eXsio Slawomir Dymitrow added a comment - Hi guys, so excited to see the UUID data type as a feature! One question - does the actually implemented feature use the bits-shuffle solution to optimize performance? Or was this only an idea that didn't get implemented?
            greenman Ian Gilfillan added a comment - eXsio see MDEV-26664
            bar Alexander Barkov added a comment - eXsio , Also please see these comments: https://github.com/MariaDB/server/blob/10.7/plugin/type_uuid/sql_type_uuid.h#L29

            People

              alice Alice Sherepa
              leo.unglaub Leo Unglaub
              Votes:
              36 Vote for this issue
              Watchers:
              37 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.