Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
This result is wrong:
mysql> SELECT HEX(CAST(0xA341 AS CHAR CHARACTER SET gb2312));
+------------------------------------------------+
| HEX(CAST(0xA341 AS CHAR CHARACTER SET gb2312)) |
+------------------------------------------------+
| A341                                           |
+------------------------------------------------+
1 row in set (1.15 sec)
0xA341 is not a well-formed gb2312 byte sequence:
mysql> SELECT _gb2312 0xA341;
ERROR 1300 (HY000): Invalid gb2312 character string: 'A341'
0xA3 is a multi-byte head, but it is not followed by a valid multi-byte tail.
The expected result would be to replace the bad byte 0xA3 with '?' and return 0x3F41.
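The expected replacement can be sketched outside the server. This is a minimal Python illustration, not the server's code: the helper name `fix_gb2312` is hypothetical, and the byte ranges (head 0xA1..0xF7, tail 0xA1..0xFE) are an assumption about what counts as a well-formed gb2312 pair.

```python
def fix_gb2312(data: bytes) -> bytes:
    """Replace every byte that does not start a well-formed sequence with '?'.

    Assumed ranges (not taken from this report):
    head 0xA1..0xF7, tail 0xA1..0xFE, single bytes below 0x80.
    """
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # Plain single-byte character: copy as is.
            out.append(b)
            i += 1
        elif (0xA1 <= b <= 0xF7 and i + 1 < len(data)
              and 0xA1 <= data[i + 1] <= 0xFE):
            # Well-formed two-byte character: copy both bytes.
            out += data[i:i + 2]
            i += 2
        else:
            # Bad byte: replace only this byte and rescan from the next one.
            out.append(0x3F)
            i += 1
    return bytes(out)


result = fix_gb2312(bytes.fromhex("A341"))
print(result.hex().upper())
```

Under these assumptions the bad head byte 0xA3 becomes 0x3F and the following 0x41 is kept as an ordinary single-byte character.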
Additionally, badly formed sequences are converted to something strange during
character set conversion:
mysql> SELECT HEX(CONVERT(CAST(0xA341 AS CHAR CHARACTER SET gb2312) USING utf16));
+---------------------------------------------------------------------+
| HEX(CONVERT(CAST(0xA341 AS CHAR CHARACTER SET gb2312) USING utf16)) |
+---------------------------------------------------------------------+
| FF21                                                                |
+---------------------------------------------------------------------+
1 row in set (0.00 sec)
A341 was converted to "U+FF21 FULLWIDTH LATIN CAPITAL LETTER A", which is wrong.
It seems A341 was erroneously treated as A3C1, which is the correct gb2312 encoding of U+FF21.
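As a cross-check outside the server, Python's built-in `gb2312` codec (an independent implementation, used here only for comparison) can be asked about both byte sequences. The final comparison shows one way the two tails could be conflated, namely if the high bit of the tail byte were ignored; that is a guess at the mechanism, not something this report confirms.

```python
import unicodedata

# A3C1 is a well-formed gb2312 sequence; decode it and look up its name.
good = bytes.fromhex("A3C1").decode("gb2312")
print(f"U+{ord(good):04X}", unicodedata.name(good))

# A341 has a valid head but an invalid tail; strict decoding should refuse it.
try:
    bytes.fromhex("A341").decode("gb2312")
    rejected = False
except UnicodeDecodeError:
    rejected = True
print("A341 rejected:", rejected)

# Hypothetical explanation: 0x41 and 0xC1 differ only in the high bit,
# so a lookup that masks it off cannot tell them apart.
same_low_bits = (0x41 & 0x7F) == (0xC1 & 0x7F)
print("tails collide when the high bit is masked:", same_low_bits)
```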
Issue Links
- is blocked by MDEV-6566 Different INSERT behaviour on bad bytes with and without character set conversion (Closed)