Details
Bug
Status: Open
Minor
Resolution: Unresolved
5.3.12, 5.5.39, 10.0.13
None
None
Description
Start a terminal session using character set big5.
In gnome-terminal:
Terminal -> Character Coding -> Traditional Chinese (big5)
Make sure everything works fine (the first literal is the raw Big5 byte pair C8 40, which cannot be shown as text here):
LANG=zh_TW.big mysql --default-character-set=big5 --table << END
SET NAMES big5;
SELECT HEX(''),HEX('乂');
END
should return:
+----------+-----------+
| HEX('?') | HEX('乂') |
+----------+-----------+
| C840     | C940      |
+----------+-----------+
If you get a different output, then something is wrong with the terminal
character set settings.
Note that the character with the Big5 code C840 is unassigned
(it does not have a Unicode mapping), while the character with
the Big5 code C940 is assigned.
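The assigned/unassigned distinction can be checked outside the server. The following is a minimal sketch that uses Python's built-in `big5` codec as a stand-in; its mapping table is not the one MySQL/MariaDB uses, and the assumption here is only that the two tables agree for these two codes. The helper name `big5_to_unicode` is hypothetical.

```python
def big5_to_unicode(code_hex: str):
    """Return the Unicode character for a Big5 code, or None if it has no mapping.

    Assumption: Python's "big5" codec stands in for the server's Big5 table.
    """
    try:
        return bytes.fromhex(code_hex).decode("big5")
    except UnicodeDecodeError:
        # The codec has no Unicode mapping for this byte sequence.
        return None


for code in ("C840", "C940"):
    ch = big5_to_unicode(code)
    if ch is None:
        print(code, "has no Unicode mapping")
    else:
        print(code, "maps to", f"U+{ord(ch):04X}", ch)
```

A code that comes back as `None` is one that cannot survive any conversion step that passes through Unicode.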
Now create an ENUM with unassigned and assigned characters:
LANG=zh_TW.big mysql --default-character-set=big5 --table test << END
SET NAMES big5;
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (a ENUM('','乂') CHARACTER SET big5);
SHOW CREATE TABLE t1;
INSERT INTO t1 VALUES (''),('乂');
SELECT HEX(a),a FROM t1;
END
The output will be:
+-------+-----------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+-----------------------------------------------------------------------------------------------------------------+
| t1    | CREATE TABLE `t1` (
  `a` enum('?','乂') CHARACTER SET big5 DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------+-----------------------------------------------------------------------------------------------------------------+
+--------+------+
| HEX(a) | a    |
+--------+------+
| C840   |      |
| C940   | 乂   |
+--------+------+
Note that the unassigned character was converted to a question mark
in the SHOW CREATE TABLE output, but INSERT and SELECT work correctly.
Now dump and restore:
mysqldump --socket=/tmp/mysql.sock test >t1.sql
mysql -e "drop table t1" test
mysql test <t1.sql
mysql -e "select hex(a),a from t1" test
The output will be:
+--------+------+
| hex(a) | a    |
+--------+------+
| 3F     | ?    |
| C940   | 乂   |
+--------+------+
The unassigned character was lost: the restored table holds 3F ('?') where the original held C840.
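The 3F in the restored table is consistent with the ENUM definition passing through a Unicode conversion that writes '?' for any character without a mapping. The sketch below models that round trip in Python. It is an illustration under stated assumptions, not the server's code: Python's `big5` codec stands in for the server's conversion table, the '?' substitution is assumed from the SHOW CREATE TABLE output above, and the helpers `big5_chars` and `via_unicode` are hypothetical names.

```python
def big5_chars(raw: bytes):
    """Split a Big5 byte string into one-byte and two-byte characters."""
    i = 0
    while i < len(raw):
        # Bytes 0x81-0xFE are treated as lead bytes of a two-byte character.
        width = 2 if 0x81 <= raw[i] <= 0xFE and i + 1 < len(raw) else 1
        yield raw[i:i + width]
        i += width


def via_unicode(raw: bytes) -> bytes:
    """Convert Big5 -> Unicode -> Big5, writing '?' for unmapped characters."""
    converted = []
    for ch in big5_chars(raw):
        try:
            converted.append(ch.decode("big5"))
        except UnicodeDecodeError:
            # Assumed behaviour: the unmapped character becomes a question mark.
            converted.append("?")
    return "".join(converted).encode("big5")


for value in (bytes.fromhex("C840"), bytes.fromhex("C940")):
    print(value.hex().upper(), "->", via_unicode(value).hex().upper())
```

Under these assumptions the assigned character round-trips unchanged, while the unassigned one comes back as the single byte 3F, which matches the values seen after the restore.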