Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-30164

System variable for default collations

    XMLWordPrintable

Details

    Description

      We're planning to change default collation for Unicode character sets to uca1400_ai_ci soon.

      A hard coded change in the collation library is not desirable, and is hardly even possible.

      It should be implemented on the server (rather than collation library) level, to set default collation overrides per-session:

      • For old-to-new replication
      • Some users may still want to use the legacy xxx_general_ci instead of uca1400_ai_ci.
      • We'll likely add more UCA versions in the future, so users will also want to choose the desired default UCA version.
      • Some users will want to have accent sensitive comparison by default, i.e. uca140_as_ci instead of uca1400_ai_ci.

      A new system variable @@character_set_collations

      The easiest way to solve all these problems is introducing a new system variable, in this format:
      'charset1=collation1,charset2=collation2,charset3=collation3'.
      That is, a comma separated list of charset/collation pairs.

      For example:

      SET @@character_set_collations='utf8mb4=utf8mb4_uca1400_ai_ci, utf8mb3=utf8mb4_uca1400_ai_ci';
      

      The new variable will understand contextually typed collations:

      # No needs for the charset prefix in the collation name:
      SET @@character_set_collations='utf8mb4=uca1400_ai_ci, utf8mb3=uca1400_ai_ci';
      

      The new variable will be global and session. The global value should be settable in the server command line and in my.cnf.

      Convenience syntax (postponed)

      Originally there was a proposal to add a convenience syntax, however we won't add it under terms of this task:

      Let's additionally introduce a nicer syntax to set this variable in a more readable way:

      SET CHARACTER SET COLLATION uca1400_ai_ci FOR utf8mb3, utf8mb4, ucs2, utf16, utf32;
      

      Usage example:

      SET @@character_set_collations='utf8mb4=uca1400_ai_ci';
      CREATE OR REPLACE TABLE t1 (a TEXT CHARACTER SET utf8mb4);
      SHOW CREATE TABLE t1;
      

      +-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | Table | Create Table                                                                                                                                                     |
      +-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | t1    | CREATE TABLE `t1` (
        `a` text CHARACTER SET utf8mb4 COLLATE utf8mb4_uca1400_ai_ci DEFAULT NULL
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1 COLLATE=latin1_swedish_ci |
      +-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      

      Notice, the column was created with the collation utf8mb4_uca1400_ai_ci, which is the current session default collation for utf8mb4 as stated in @@character_set_collations.

      The new variable will take effect in all cases where a character set is explicitly or implicitly specified without an explicit COLLATE clause, including but not limited to:

      • Column collation
      • Table collation
      • Database collation
      • CHAR(expr USING csname)
      • CONVERT(expr USING csname)
      • CAST(expr AS CHAR CHARACTER SET csname)
      • '' - character string literal
      • _utf8mb3'text' - a character string literal with an introducer
      • _utf8mb3 X'61' - a character string literal with an introducer with hex notation
      • _utf8mb3 0x61 - a character string literal with an introducer with hex hybrid notation
      • @@collation_connection after a SET NAMES without COLLATE

      Attachments

        Issue Links

          Activity

            People

              bar Alexander Barkov
              bar Alexander Barkov
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.