[MDEV-30746] Regression in ucs2_general_mysql500_ci
Created: 2023-02-28  Updated: 2023-04-11  Resolved: 2023-03-01

Status: Closed
Project: MariaDB Server
Component/s: Character Sets
Affects Version/s: 10.4, 10.5, 10.6, 10.7, 10.8, 10.9, 10.10, 10.11, 11.0
Fix Version/s: 10.11.3, 11.0.2, 10.4.29, 10.5.20, 10.6.13, 10.7.8, 10.8.8, 10.9.6, 10.10.4

Type: Bug
Priority: Critical
Reporter: Alexander Barkov
Assignee: Unassigned
Resolution: Fixed
Votes: 0
Labels: regression-10.4

Issue Links:
Blocks
blocks MDEV-30577 Case folding for uca1400 collations i... Closed

Description

ucs2_general_mysql500_ci is a MySQL-5.0.0 compatibility collation. Unlike ucs2_general_ci, it treats 'ß' as a character distinct from 's' and sorts it after 's'. This holds in all MariaDB versions up to 10.3:

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (a VARCHAR(32) CHARACTER SET ucs2 COLLATE ucs2_general_mysql500_ci);
INSERT INTO t1 VALUES ('s'),('z'),(_latin1 0xDF);
SELECT GROUP_CONCAT(a) FROM t1 GROUP BY a ORDER BY a;

+-----------------+
| GROUP_CONCAT(a) |
+-----------------+
| s               |
| z               |
| ß               |
+-----------------+

Starting with 10.4, the same query returns a wrong result: 'ß' is grouped together with 's', exactly as ucs2_general_ci would do:

+-----------------+
| GROUP_CONCAT(a) |
+-----------------+
| s,ß             |
| z               |
+-----------------+

This is not expected. A compatibility collation should still provide the old MySQL-5.0.0 order.
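
The difference can also be checked without a table. The following is a minimal sketch, not part of the original report, and it assumes that a direct comparison goes through the same collation code as GROUP BY. The column aliases mysql500_equal and general_equal are made up for illustration. Judging by the results above, a correct server should report 's' and 'ß' as different under ucs2_general_mysql500_ci and as equal under ucs2_general_ci, while an affected server would presumably report them as equal in both columns:

```sql
-- Hypothetical spot check: compare 's' and 'ß' directly under both collations.
-- Both operands are converted to ucs2 and given the same explicit collation.
SELECT
  CONVERT('s' USING ucs2) COLLATE ucs2_general_mysql500_ci
    = CONVERT(_latin1 0xDF USING ucs2) COLLATE ucs2_general_mysql500_ci
    AS mysql500_equal,   -- expected to be false where the old order is preserved
  CONVERT('s' USING ucs2) COLLATE ucs2_general_ci
    = CONVERT(_latin1 0xDF USING ucs2) COLLATE ucs2_general_ci
    AS general_equal;    -- ucs2_general_ci treats 'ß' and 's' as equal
```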

The order was broken by:

commit a8efe7ab1f28e2219df5ae9aa88fa63c40ad1066
Author: Alexander Barkov <bar@mariadb.com>
Date:   Fri Oct 19 14:20:31 2018 +0400
 
    MDEV-17502 MDEV-17474 Change Unicode xxx_general_ci and xxx_bin collation implementation to "inline" style

Note that the corresponding utf8mb3 collation, utf8mb3_general_mysql500_ci, is not affected: it returns results in the expected order in all MariaDB versions:

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (a VARCHAR(32) CHARACTER SET utf8mb3 COLLATE utf8mb3_general_mysql500_ci);
INSERT INTO t1 VALUES ('s'),('z'),(_latin1 0xDF);
SELECT GROUP_CONCAT(a) FROM t1 GROUP BY a ORDER BY a;

+-----------------+
| GROUP_CONCAT(a) |
+-----------------+
| s               |
| z               |
| ß               |
+-----------------+
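
To compare the two character sets in a single run, both collations can be placed side by side in one table. This is only a sketch under assumptions: the column names u and m and the output aliases are invented here, and it assumes COUNT(DISTINCT ...) distinguishes values the same way GROUP BY does in the queries above. If that holds, an affected server would be expected to report one group fewer for the ucs2 column than for the utf8mb3 column, and a fixed server the same number for both:

```sql
-- Hypothetical side-by-side check of the ucs2 and utf8mb3 compatibility collations.
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (
  u VARCHAR(32) CHARACTER SET ucs2 COLLATE ucs2_general_mysql500_ci,
  m VARCHAR(32) CHARACTER SET utf8mb3 COLLATE utf8mb3_general_mysql500_ci
);
-- Same three values as in the report, stored once per character set.
INSERT INTO t1 VALUES ('s','s'),('z','z'),(_latin1 0xDF,_latin1 0xDF);
-- Count how many distinct values each collation sees.
SELECT COUNT(DISTINCT u) AS ucs2_groups,
       COUNT(DISTINCT m) AS utf8mb3_groups
FROM t1;
DROP TABLE t1;
```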

