  1. MariaDB Server
  2. MDEV-5241

Collation incompatibilities with MySQL-5.6

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.0.4, 5.5.33a
    • Fix Version/s: 10.0.6
    • Component/s: None
    • Labels: None

    Description

      There are incompatibilities between some MariaDB and MySQL collations
      which we need to solve somehow.

      Problems

      1.

      The utf8_croatian_ci and ucs2_croatian_ci collations appeared in MariaDB-5.1 at the end of 2009, based on Alexander Barkov's patch from: http://collation-charts.org/articles/croatian.htm

      Later, the Croatian collations were added into MySQL-5.6.

      However, the MariaDB Croatian collations use the latest version of the rules from http://unicode.org/cldr/trac/browser/trunk/common/collation/hr.xml while MySQL implements an older version.

      The difference is in only three letters, but that is enough to make the indexes incompatible.

      As a result:

      • utf8_croatian_ci (ID 213) is different in MariaDB and MySQL
      • ucs2_croatian_ci (ID 149) is different in MariaDB and MySQL

      2.

      Later, MySQL-5.5 added support for utf8mb4, utf16, utf32. When merging the new character sets (MySQL-5.5 -> MariaDB-5.5) the MariaDB team added the following corresponding collations, for symmetry with utf8 and ucs2:

      • utf8mb4_croatian_ci (ID=245)
      • utf16_croatian_ci (ID=215)
      • utf32_croatian_ci (ID=214)

      But when collations with the same names finally appeared in MySQL-5.6, they were not given the same IDs. So IDs 214 and 215 are assigned in MySQL-5.6 to other collations, and ID 245 carries MySQL's version of the Croatian rules.

      This is what we have in MariaDB:

      mysql> SELECT COLLATION_NAME, ID FROM INFORMATION_SCHEMA.COLLATIONS
         --> WHERE COLLATION_NAME LIKE 'u%croat%';
      +---------------------+-----+
      | COLLATION_NAME      | ID  |
      +---------------------+-----+
      | ucs2_croatian_ci    | 149 |
      | utf8_croatian_ci    | 213 |
      | utf32_croatian_ci   | 214 |
      | utf16_croatian_ci   | 215 |
      | utf8mb4_croatian_ci | 245 |
      +---------------------+-----+
      5 rows in set (0.01 sec)

      This is what we have in MySQL-5.6:

      mysql> SELECT COLLATION_NAME, ID FROM INFORMATION_SCHEMA.COLLATIONS
         --> WHERE ID IN (149,213,214,215,245);
      +---------------------+-----+
      | COLLATION_NAME      | ID  | Problem:
      +---------------------+-----+
      | ucs2_croatian_ci    | 149 | MySQL rules differ from MariaDB rules
      | utf8_croatian_ci    | 213 | MySQL rules differ from MariaDB rules
      | utf8_unicode_520_ci | 214 | MariaDB utf32_croatian_ci
      | utf8_vietnamese_ci  | 215 | MariaDB utf16_croatian_ci
      | utf8mb4_croatian_ci | 245 | MySQL rules differ from MariaDB rules
      +---------------------+-----+
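
      The two listings above can be compared mechanically. The following is a minimal Python sketch, using only the IDs and names shown in the two query outputs; the helper name classify() is hypothetical and not part of either server. It separates the two kinds of conflict: an ID that keeps its name but has different rules, and an ID that names an unrelated collation in MySQL-5.6.

```python
# Collation IDs and names as listed in the two
# INFORMATION_SCHEMA.COLLATIONS outputs above.
MARIADB = {
    149: "ucs2_croatian_ci",
    213: "utf8_croatian_ci",
    214: "utf32_croatian_ci",
    215: "utf16_croatian_ci",
    245: "utf8mb4_croatian_ci",
}
MYSQL56 = {
    149: "ucs2_croatian_ci",
    213: "utf8_croatian_ci",
    214: "utf8_unicode_520_ci",
    215: "utf8_vietnamese_ci",
    245: "utf8mb4_croatian_ci",
}


def classify(collation_id):
    """Describe the conflict for one collation ID (hypothetical helper)."""
    maria = MARIADB[collation_id]
    mysql = MYSQL56[collation_id]
    if maria == mysql:
        # Same ID and same name, but the description states that the
        # Croatian tailoring rules differ between the two servers.
        return "same name, different rules"
    # The ID refers to an unrelated collation in MySQL-5.6.
    return "ID reused: MariaDB %s vs MySQL %s" % (maria, mysql)


conflicts = {cid: classify(cid) for cid in sorted(MARIADB)}
for cid, kind in conflicts.items():
    print(cid, kind)
```

      Name equality alone cannot show that the rules differ; that part comes from the description, not from the comparison.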

      Solution

      Collation changes

      • Bar moves MariaDB-5.5 xxx_croatian_ci collations to new IDs (preferably, outside of the 0..255 range), without changing the collation name.
      • Bar merges MySQL-5.6 xxx_croatian_ci using MySQL-5.6 IDs, but changing the names to xxx_croatian_mysql56_ci.

      Detect attempts to open tables with the old MariaDB collations.

      Bar fixes TABLE_SHARE::init_from_binary_frm_image() and adds an error message for tables created by any MariaDB version prior to 10.0.5 that have indexes using collation IDs 213, 149, 245, 215, or 214:

      +---------------------+---------+-----+---------+----------+---------+
      | Collation           | Charset | Id  | Default | Compiled | Sortlen |
      +---------------------+---------+-----+---------+----------+---------+
      | utf8_croatian_ci    | utf8    | 213 |         | Yes      |       8 |
      | ucs2_croatian_ci    | ucs2    | 149 |         | Yes      |       8 |
      | utf8mb4_croatian_ci | utf8mb4 | 245 |         | Yes      |       8 |
      | utf16_croatian_ci   | utf16   | 215 |         | Yes      |       8 |
      | utf32_croatian_ci   | utf32   | 214 |         | Yes      |       8 |
      +---------------------+---------+-----+---------+----------+---------+

      ER_TABLE_NEEDS_UPGRADE looks suitable for this purpose:

      "Table upgrade required. Please do \"REPAIR TABLE `%-.32s`\" or dump/reload to fix it!"

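      The detection rule described above can be sketched in Python as follows. This is an illustration only, not the server code in TABLE_SHARE::init_from_binary_frm_image(): the function names, and the assumption that the creating server version and the index collation IDs are already available as plain values, are hypothetical. The ID set, the 10.0.5 cutoff, and the message text are taken from the description.

```python
# Collation IDs that the description lists as affected.
OLD_CROATIAN_IDS = {213, 149, 245, 215, 214}

# Tables written by versions before this one are checked (per the description).
FIXED_VERSION = (10, 0, 5)

# Text of ER_TABLE_NEEDS_UPGRADE as quoted above. "%-.32s" prints at most
# 32 characters of the table name.
MESSAGE = ('Table upgrade required. Please do "REPAIR TABLE `%-.32s`" '
           'or dump/reload to fix it!')


def needs_upgrade(created_by_version, index_collation_ids):
    """True if an old table has an index on one of the affected IDs.

    created_by_version: (major, minor, patch) tuple; how it is obtained
    from the table definition is outside this sketch.
    index_collation_ids: collation IDs used by the table's indexed columns.
    """
    if created_by_version >= FIXED_VERSION:
        return False
    return any(cid in OLD_CROATIAN_IDS for cid in index_collation_ids)


def check_table(name, created_by_version, index_collation_ids):
    """Return the error text for an affected table, or None otherwise."""
    if needs_upgrade(created_by_version, index_collation_ids):
        return MESSAGE % name
    return None


# A table from 10.0.4 with an index on collation 213 is reported.
print(check_table("t1", (10, 0, 4), [33, 213]))
# The same index in a table written by 10.0.5 is not.
print(check_table("t1", (10, 0, 5), [213]))
```

      Columns that use an affected collation but are not indexed do not trigger the error in this sketch, matching the wording "that have indexes using collation IDs".
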
      mysql_upgrade

      Monty will try to fix REPAIR to solve the conflicting IDs problem.

      quick REPAIR

      In the long term we can add a quick REPAIR that replaces collation IDs in table definitions in FRM files and in engine-specific structure definitions (e.g. in MYI files for MyISAM) without having to do a full repair of the table.

            People

              bar Alexander Barkov
              serg Sergei Golubchik