Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
10.0.4, 5.5.33a
None
None
Description
There are incompatibilities between some MariaDB and MySQL collations
which we need to solve somehow.
Problems
1.
The utf8_croatian_ci and ucs2_croatian_ci collations appeared in MariaDB-5.1 at the end of 2009, based on Alexander Barkov's patch from: http://collation-charts.org/articles/croatian.htm
Later, the Croatian collations were added into MySQL-5.6.
However, the MariaDB Croatian collations use the latest version of the rules from http://unicode.org/cldr/trac/browser/trunk/common/collation/hr.xml while MySQL implements an older version.
The difference is in only 3 letters, but that is enough to make the indexes incompatible.
As a result:
- utf8_croatian_ci (ID 213) is different in MariaDB and MySQL
- ucs2_croatian_ci (ID 149) is different in MariaDB and MySQL
2.
Later, MySQL-5.5 added support for utf8mb4, utf16, utf32. When merging the new character sets (MySQL-5.5 -> MariaDB-5.5) the MariaDB team added the following corresponding collations, for symmetry with utf8 and ucs2:
- utf8mb4_croatian_ci (ID=245)
- utf16_croatian_ci (ID=215)
- utf32_croatian_ci (ID=214)
But when the collations with the same names finally appeared in MySQL-5.6, they were given different IDs. So in MySQL-5.6 the IDs 214 and 215 are assigned to unrelated collations, and ID 245 is utf8mb4_croatian_ci with the MySQL rules.
This is what we have in MariaDB:
mysql> SELECT COLLATION_NAME, ID FROM INFORMATION_SCHEMA.COLLATIONS
    -> WHERE COLLATION_NAME LIKE 'u%croat%';
+---------------------+-----+
| COLLATION_NAME      | ID  |
+---------------------+-----+
| ucs2_croatian_ci    | 149 |
| utf8_croatian_ci    | 213 |
| utf32_croatian_ci   | 214 |
| utf16_croatian_ci   | 215 |
| utf8mb4_croatian_ci | 245 |
+---------------------+-----+
5 rows in set (0.01 sec)
This is what we have in MySQL-5.6:
mysql> SELECT COLLATION_NAME, ID FROM INFORMATION_SCHEMA.COLLATIONS
    -> WHERE ID IN (149,213,214,215,245);
+---------------------+-----+
| COLLATION_NAME      | ID  | Problem:
+---------------------+-----+
| ucs2_croatian_ci    | 149 | MySQL rules differ from MariaDB rules
| utf8_croatian_ci    | 213 | MySQL rules differ from MariaDB rules
| utf8_unicode_520_ci | 214 | MariaDB utf32_croatian_ci
| utf8_vietnamese_ci  | 215 | MariaDB utf16_croatian_ci
| utf8mb4_croatian_ci | 245 | MySQL rules differ from MariaDB rules
+---------------------+-----+
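The two kinds of conflict listed above (same name with different rules, and an ID reused for another collation) can be told apart mechanically. The following is only an illustrative Python sketch built from the two result sets in this report; the function name `classify_conflicts` and the wording of the labels are hypothetical and are not part of any server code. It assumes, as this report states for the Croatian collations, that a matching name still means differing rules.

```python
# Collation IDs copied from the two query results in this report.
MARIADB_55 = {
    149: "ucs2_croatian_ci",
    213: "utf8_croatian_ci",
    214: "utf32_croatian_ci",
    215: "utf16_croatian_ci",
    245: "utf8mb4_croatian_ci",
}
MYSQL_56 = {
    149: "ucs2_croatian_ci",
    213: "utf8_croatian_ci",
    214: "utf8_unicode_520_ci",
    215: "utf8_vietnamese_ci",
    245: "utf8mb4_croatian_ci",
}


def classify_conflicts(mariadb, mysql):
    """Map each MariaDB collation ID to a short description of the conflict.

    Hypothetical helper: a matching name is reported as "rules differ"
    because the report says so for the Croatian collations; the code
    itself cannot compare collation rules.
    """
    result = {}
    for coll_id in sorted(mariadb):
        mysql_name = mysql.get(coll_id)
        if mysql_name is None:
            result[coll_id] = "ID unused in MySQL"
        elif mysql_name == mariadb[coll_id]:
            result[coll_id] = "same name, rules differ"
        else:
            result[coll_id] = "ID reused for " + mysql_name
    return result


if __name__ == "__main__":
    for coll_id, kind in classify_conflicts(MARIADB_55, MYSQL_56).items():
        print(coll_id, MARIADB_55[coll_id], "->", kind)
```

IDs 149, 213 and 245 fall into the first group, while 214 and 215 fall into the second, which matches the "Problem:" column above.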
Solution
Collation changes
- Bar moves MariaDB-5.5 xxx_croatian_ci collations to new IDs (preferably, outside of the 0..255 range), without changing the collation name.
- Bar merges MySQL-5.6 xxx_croatian_ci using MySQL-5.6 IDs, but changing the names to xxx_croatian_mysql56_ci.
Detect attempts to open tables with the old MariaDB collations.
Bar fixes TABLE_SHARE::init_from_binary_frm_image() and adds an error message for tables created by any MariaDB version prior to 10.0.5 that have indexes using collation IDs 213, 149, 245, 215, 214:
+---------------------+---------+-----+---------+----------+---------+
| Collation           | Charset | Id  | Default | Compiled | Sortlen |
+---------------------+---------+-----+---------+----------+---------+
| utf8_croatian_ci    | utf8    | 213 |         | Yes      |       8 |
| ucs2_croatian_ci    | ucs2    | 149 |         | Yes      |       8 |
| utf8mb4_croatian_ci | utf8mb4 | 245 |         | Yes      |       8 |
| utf16_croatian_ci   | utf16   | 215 |         | Yes      |       8 |
| utf32_croatian_ci   | utf32   | 214 |         | Yes      |       8 |
+---------------------+---------+-----+---------+----------+---------+
ER_TABLE_NEEDS_UPGRADE looks suitable for this purpose:
"Table upgrade required. Please do \"REPAIR TABLE `%-.32s`\" or dump/reload to fix it!"
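The check described above could look roughly like this. It is a Python sketch of the decision logic only, not the actual C++ change in TABLE_SHARE::init_from_binary_frm_image(). The names `needs_upgrade` and `upgrade_message`, and the way the creating version and index collation IDs are passed in, are assumptions made for illustration; how the server reads them from the FRM image is not modeled.

```python
# Collation IDs listed in this report as conflicting.
CONFLICTING_IDS = frozenset({213, 149, 245, 215, 214})

# "prior to 10.0.5" per the description above (assumed cut-off).
FIXED_IN = (10, 0, 5)


def needs_upgrade(created_by_version, index_collation_ids):
    """Return True if the table should be refused with a "needs upgrade" error.

    created_by_version: (major, minor, patch) tuple of the MariaDB version
        that created the table (hypothetical input).
    index_collation_ids: collation IDs used by the table's indexed columns
        (hypothetical input).
    """
    return created_by_version < FIXED_IN and any(
        coll_id in CONFLICTING_IDS for coll_id in index_collation_ids
    )


def upgrade_message(table_name):
    """Format the ER_TABLE_NEEDS_UPGRADE text quoted above.

    The server format uses %-.32s, so the name is cut to 32 characters here.
    """
    return (
        'Table upgrade required. Please do "REPAIR TABLE `%s`" '
        "or dump/reload to fix it!" % table_name[:32]
    )


if __name__ == "__main__":
    if needs_upgrade((5, 5, 30), [33, 213]):
        print(upgrade_message("t1"))
```

Tables without indexes on the affected collations, and tables created by 10.0.5 or later, pass through unchanged in this sketch.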
mysql_upgrade
Monty will try to fix REPAIR to solve the conflicting IDs problem.
quick REPAIR
In the long term, we can add a quick REPAIR that replaces collation IDs in table definitions in FRM files and in engine-specific structure definitions (e.g. in MYI files for MyISAM) without having to do a full repair of the table.
Issue Links
- relates to MDEV-16945 main.ctype_upgrade failed in buildbot, error upon mysql_upgrade (Open)