|
I create a database with gbk as a default character set,
create two tables of equal structure and populate table t1:
mysql -uroot --table <<END
|
DROP DATABASE IF EXISTS db1;
|
CREATE DATABASE db1 DEFAULT CHARACTER SET gbk;
|
USE db1;
|
CREATE TABLE t1 (a TEXT, b BLOB);
|
CREATE TABLE t2 (a TEXT, b BLOB);
|
INSERT INTO t1 VALUES (_gbk 0xEE5C, 0xEE5C8060);
|
SELECT HEX(a), HEX(b) FROM t1;
|
END
|
Then I want to dump the table t1 using "mysqldump -T" and then reload the dump into t2.
To test with different locale settings and mysqldump options, I create a helper shell script "dump.sh":
LANG=$LOCALE sudo sh -c '
|
rm -rf /var/lib/mysql/backups
|
mkdir /var/lib/mysql/backups
|
chmod a+rwx /var/lib/mysql/backups' &&
|
cd /var/lib/mysql &&
|
mysqldump $DUMPPARAM -uroot -Tbackups db1 t1 &&
|
mysql -uroot --table db1 << END
|
TRUNCATE TABLE t2;
|
LOAD DATA INFILE '/var/lib/mysql/backups/t1.txt' INTO TABLE t2;
|
SELECT HEX(LOAD_FILE('/var/lib/mysql/backups/t1.txt'));
|
SELECT HEX(a), HEX(b) FROM t2;
|
END
|
Now If I invoke the script using either of these commands:
LOCALE=zh_CN.gbk DUMPPARAM=--default-character-set=binary sh dump.sh
|
LOCALE=en_US.utf8 DUMPPARAM=--default-character-set=binary sh dump.sh
|
LOCALE=en_US.iso88591 DUMPPARAM=--default-character-set=binary sh dump.sh
|
LOCALE=C DUMPPARAM=--default-character-set=binary sh dump.sh
|
LOCALE=zh_CN.gbk DUMPPARAM= sh dump.sh
|
LOCALE=en_US.utf8 DUMPPARAM= sh dump.sh
|
LOCALE=en_US.iso88591 DUMPPARAM= sh dump.sh
|
LOCALE=C DUMPPARAM= sh dump.sh
|
it works fine and returns this output:
+-------------------------------------------------+
|
| HEX(LOAD_FILE('/var/lib/mysql/backups/t1.txt')) |
|
+-------------------------------------------------+
|
| EE5C09EE5C5C80600A |
|
+-------------------------------------------------+
|
+--------+----------+
|
| HEX(a) | HEX(b) |
|
+--------+----------+
|
| EE5C | EE5C8060 |
|
+--------+----------+
|
But if then I invoke the script using these commands:
LOCALE=zh_CN.gbk DUMPPARAM=--default-character-set=gbk sh dump.sh
|
LOCALE=en_US.utf8 DUMPPARAM=--default-character-set=gbk sh dump.sh
|
LOCALE=en_US.iso88591 DUMPPARAM=--default-character-set=gbk sh dump.sh
|
LOCALE=C DUMPPARAM=--default-character-set=gbk sh dump.sh
|
it returns a wrong BLOB value:
+-------------------------------------------------+
|
| HEX(LOAD_FILE('/var/lib/mysql/backups/t1.txt')) |
|
+-------------------------------------------------+
|
| EE5C09EE5C0A |
|
+-------------------------------------------------+
|
+--------+--------+
|
| HEX(a) | HEX(b) |
|
+--------+--------+
|
| EE5C | EE5C |
|
+--------+--------+
|
This looks wrong. According to the documentation, LOAD DATA INFILE treats the file as having the same character set with the default character set of the database, which is GBK. So dump with explicit --default-character-set=gbk followed by LOAD DATA which implicitly use gbk as the file character set do not work together.
|