Details
Type: Bug
Status: Closed
Priority: Minor
Resolution: Fixed
5.5.39, 10.0.13
None
Description
Byte sequence 0x8EA0 is erroneously accepted as a valid ujis/eucjpms code:
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (a VARCHAR(10) CHARACTER SET ujis);
INSERT INTO t1 VALUES (0x8EA0);
SELECT HEX(a), CHAR_LENGTH(a) FROM t1;
returns:
+--------+----------------+
| HEX(a) | CHAR_LENGTH(a) |
+--------+----------------+
| 8EA0   |              2 |
+--------+----------------+
This is wrong, because 0xA0 is not a valid second byte after the 0x8E lead byte. The correct code ranges for ujis are:
[x00-x7F]                 # ASCII/JIS-Roman (one byte/char)
[x8E][xA1-xDF]            # half-width katakana (two bytes/char)
[x8F][xA1-xFE][xA1-xFE]   # JIS X 0212-1990 (three bytes/char)
[xA1-xFE][xA1-xFE]        # JIS X 0208:1997 (two bytes/char)
The same problem is observed with eucjpms.
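
For illustration, the ujis code ranges listed above can be written as a small well-formedness check. This is only a sketch in Python, not the server's actual C implementation; the function names ujis_char_length and is_valid_ujis are hypothetical, and the sketch covers only the four ujis ranges given in this report.

```python
def ujis_char_length(buf: bytes, pos: int = 0) -> int:
    """Return the byte length of the well-formed ujis character starting
    at buf[pos], or 0 if the bytes there do not match any listed range."""
    remaining = len(buf) - pos
    if remaining <= 0:
        return 0
    b0 = buf[pos]
    if b0 <= 0x7F:
        # [x00-x7F]: ASCII/JIS-Roman, one byte
        return 1
    if b0 == 0x8E:
        # [x8E][xA1-xDF]: half-width katakana, two bytes
        if remaining >= 2 and 0xA1 <= buf[pos + 1] <= 0xDF:
            return 2
        return 0
    if b0 == 0x8F:
        # [x8F][xA1-xFE][xA1-xFE]: JIS X 0212-1990, three bytes
        if (remaining >= 3
                and 0xA1 <= buf[pos + 1] <= 0xFE
                and 0xA1 <= buf[pos + 2] <= 0xFE):
            return 3
        return 0
    if 0xA1 <= b0 <= 0xFE:
        # [xA1-xFE][xA1-xFE]: JIS X 0208:1997, two bytes
        if remaining >= 2 and 0xA1 <= buf[pos + 1] <= 0xFE:
            return 2
        return 0
    return 0


def is_valid_ujis(buf: bytes) -> bool:
    """Return True if buf consists only of well-formed ujis characters."""
    pos = 0
    while pos < len(buf):
        step = ujis_char_length(buf, pos)
        if step == 0:
            return False
        pos += step
    return True
```

Under these rules is_valid_ujis(bytes.fromhex("8EA0")) is False, while the neighbouring sequence 0x8EA1 (the first half-width katakana code) is accepted.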