|
In the code we use two things to compare identifiers:
- my_strcasecmp(system_charset_info, ...), or just system_charset_info->strcasecmp(..) when used directly without the macro. This comparison way is used when the two strings are known to be 0-terminated.
- system_charset_info->strnncoll(...). This way is used when we know the lengths of the operands and/or the strings are not necessarily 0-terminated.
There is a problem with this choice:
- strcasecmp() performs accent sensitive case insensitive comparison
- strnncoll() performs accent insensitive case insensitive comparison
This mismatch causes troubles (see examples below).
To avoid this mismatch, we need a dedicated collation for identifiers, whose strnncoll() compares in the accent sensitive case insensitive way, to have both strcasecmp() and strnncoll() compare in the same style.
A tentative name is utf8_general_as_ci, or even utf8_identifier_ci.
Under terms of this task, it may not even be necessary to expose this collation to the SQL level. But later, we should revise INFORMATION_SCHEMA to use this collation for certain columns.
The following scripts demonstrate the difference:
- Index names demonstrate both accent sensitive (correct) and accent insensitive (wrong) behavior
CREATE OR REPLACE TABLE t1 (a INT, KEY a(a), KEY ä(a));
|
SHOW WARNINGS;
|
+-------+------+-------------------------------------------------------------------------------------+
|
| Level | Code | Message |
|
+-------+------+-------------------------------------------------------------------------------------+
|
| Note | 1831 | Duplicate index `ä`. This is deprecated and will be disallowed in a future release |
|
+-------+------+-------------------------------------------------------------------------------------+
|
Notice, index names were compared in accent insensitive way to display the warning.
However, on ALTER, they are compared in accent sensitive way as expected:
ALTER TABLE t1 DROP INDEX ä;
|
SHOW CREATE TABLE t1;
|
+-------+--------------------------------------------------------------------------------------------------------+
|
| Table | Create Table |
|
+-------+--------------------------------------------------------------------------------------------------------+
|
| t1 | CREATE TABLE `t1` (
|
`a` int(11) DEFAULT NULL,
|
KEY `a` (`a`)
|
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
|
+-------+--------------------------------------------------------------------------------------------------------+
|
The second ALTER does not drop anything:
ALTER TABLE t1 DROP INDEX ä;
|
ERROR 1091 (42000): Can't DROP INDEX `ä`; check that it exists
|
- SP variables are accent insensitive (wrong)
DELIMITER $$
|
BEGIN NOT ATOMIC
|
DECLARE a INT;
|
DECLARE ä INT;
|
END;
|
$$
|
DELIMITER ;
|
ERROR 1331 (42000): Duplicate variable: ä
|
|