Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Not a Bug
-
10.4.13, 10.4.18, 10.5.6, 10.5.10
-
None
Description
MariaDB's implementation of CAST is different (more "strict") than MySQL's, and this creates additional compatibility problems.
Consider two examples, MySQL 8.0.23 first:
openxs@ao756:~/dbs/8.0$ bin/mysql -uroot --socket=/tmp/mysql8.sock
|
Welcome to the MySQL monitor. Commands end with ; or \g.
|
Your MySQL connection id is 10
|
Server version: 8.0.23 Source distribution
|
|
Copyright (c) 2000, 2021, Oracle and/or its affiliates.
|
|
Oracle is a registered trademark of Oracle Corporation and/or its
|
affiliates. Other names may be trademarks of their respective
|
owners.
|
|
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
|
|
mysql> select cast(unhex('FF') as char(1) charset utf8mb4);
|
+----------------------------------------------+
|
| cast(unhex('FF') as char(1) charset utf8mb4) |
|
+----------------------------------------------+
|
| ▒ |
|
+----------------------------------------------+
|
1 row in set (0.00 sec)
|
|
mysql> select hex(cast(unhex('FF') as char(1) charset utf8mb4));
|
+---------------------------------------------------+
|
| hex(cast(unhex('FF') as char(1) charset utf8mb4)) |
|
+---------------------------------------------------+
|
| FF |
|
+---------------------------------------------------+
|
1 row in set (0.00 sec)
|
|
mysql> select CAST(UNHEX('D800AC01') AS CHAR(3072) CHARACTER SET utf16);
|
+-----------------------------------------------------------+
|
| CAST(UNHEX('D800AC01') AS CHAR(3072) CHARACTER SET utf16) |
|
+-----------------------------------------------------------+
|
| ?¬ |
|
+-----------------------------------------------------------+
|
1 row in set (0.00 sec)
|
|
mysql> select hex(CAST(UNHEX('D800AC01') AS CHAR(3072) CHARACTER SET utf16));
|
+----------------------------------------------------------------+
|
| hex(CAST(UNHEX('D800AC01') AS CHAR(3072) CHARACTER SET utf16)) |
|
+----------------------------------------------------------------+
|
| D800AC01 |
|
+----------------------------------------------------------------+
|
1 row in set (0.00 sec)
|
In the former we see invalid UTF8 byte value, 0xFF, accepted. In the latter we see "unpaired surrogate" bytes in UTF16, both accepted silently. Now let's consider the same statements in MariaDB:
openxs@ao756:~/dbs/maria10.5$ bin/mysql --socket=/tmp/mariadb.sock
|
Welcome to the MariaDB monitor. Commands end with ; or \g.
|
Your MariaDB connection id is 3
|
Server version: 10.5.10-MariaDB MariaDB Server
|
|
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
|
|
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
|
|
MariaDB [(none)]> select cast(unhex('FF') as char(1) charset utf8mb4);
|
+----------------------------------------------+
|
| cast(unhex('FF') as char(1) charset utf8mb4) |
|
+----------------------------------------------+
|
| ? |
|
+----------------------------------------------+
|
1 row in set, 1 warning (0,034 sec)
|
|
MariaDB [(none)]> show warnings\G
|
*************************** 1. row ***************************
|
Level: Warning
|
Code: 1300
|
Message: Invalid utf8mb4 character string: '\xFF'
|
1 row in set (0,000 sec)
|
|
MariaDB [(none)]> select CAST(UNHEX('D800AC01') AS CHAR(3072) CHARACTER SET utf16);
|
+-----------------------------------------------------------+
|
| CAST(UNHEX('D800AC01') AS CHAR(3072) CHARACTER SET utf16) |
|
+-----------------------------------------------------------+
|
| ?¬? |
|
+-----------------------------------------------------------+
|
1 row in set, 1 warning (0,000 sec)
|
|
MariaDB [(none)]> show warnings\G
|
*************************** 1. row ***************************
|
Level: Warning
|
Code: 1300
|
Message: Invalid utf16 character string: '\xD8\x00\xAC\x01'
|
1 row in set (0,000 sec)
|
|
MariaDB [(none)]> select hex(CAST(UNHEX('D800AC01') AS CHAR(3072) CHARACTER SET utf16));
|
+----------------------------------------------------------------+
|
| hex(CAST(UNHEX('D800AC01') AS CHAR(3072) CHARACTER SET utf16)) |
|
+----------------------------------------------------------------+
|
| 003F00AC003F |
|
+----------------------------------------------------------------+
|
1 row in set, 1 warning (0,000 sec)
|
|
MariaDB [(none)]> show warnings\G *************************** 1. row ***************************
|
Level: Warning
|
Code: 1300
|
Message: Invalid utf16 character string: '\xD8\x00\xAC\x01'
|
1 row in set (0,000 sec)
|
We see warnings that in DML or DDL case will be errors in strict SQL mode. Moreover, in the latter case with warning a different hex value is produced.
IMHO this is a bug, but if intended, I'd ask for some mode/setting that would make CAST work the same way as in MySQL, for compatibility of existing applications.