Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-9893

mysqldump -T followed by LOAD DATA does not restore data well

    XMLWordPrintable

Details

    • Bug
    • Status: Open (View Workflow)
    • Major
    • Resolution: Unresolved
    • 5.5(EOL), 10.0(EOL), 10.1(EOL), 10.2(EOL)
    • 10.2(EOL)
    • Character Sets
    • None

    Description

      I create a database with gbk as a default character set,
      create two tables of equal structure and populate table t1:

      mysql -uroot --table <<END
      DROP DATABASE IF EXISTS db1;
      CREATE DATABASE db1 DEFAULT CHARACTER SET gbk;
      USE db1;
      CREATE TABLE t1 (a TEXT, b BLOB);
      CREATE TABLE t2 (a TEXT, b BLOB);
      INSERT INTO t1 VALUES (_gbk 0xEE5C, 0xEE5C8060);
      SELECT HEX(a), HEX(b) FROM t1;
      END
      

      Then I want to dump the table t1 using "mysqldump -T" and then reload the dump into t2.
      To test with different locale settings and mysqldump options, I create a helper shell script "dump.sh":

      LANG=$LOCALE sudo sh -c '          
      rm -rf /var/lib/mysql/backups
      mkdir /var/lib/mysql/backups
      chmod a+rwx /var/lib/mysql/backups' &&
      cd /var/lib/mysql &&
      mysqldump $DUMPPARAM -uroot -Tbackups db1 t1 &&                       
      mysql -uroot --table db1 << END
      TRUNCATE TABLE t2;
      LOAD DATA INFILE '/var/lib/mysql/backups/t1.txt' INTO TABLE t2;
      SELECT HEX(LOAD_FILE('/var/lib/mysql/backups/t1.txt'));
      SELECT HEX(a), HEX(b) FROM t2;
      END
      

      Now If I invoke the script using either of these commands:

      LOCALE=zh_CN.gbk      DUMPPARAM=--default-character-set=binary sh dump.sh
      LOCALE=en_US.utf8     DUMPPARAM=--default-character-set=binary sh dump.sh 
      LOCALE=en_US.iso88591 DUMPPARAM=--default-character-set=binary sh dump.sh 
      LOCALE=C              DUMPPARAM=--default-character-set=binary sh dump.sh 
      LOCALE=zh_CN.gbk      DUMPPARAM=  sh dump.sh
      LOCALE=en_US.utf8     DUMPPARAM=  sh dump.sh 
      LOCALE=en_US.iso88591 DUMPPARAM=  sh dump.sh 
      LOCALE=C              DUMPPARAM=  sh dump.sh 
      

      it works fine and returns this output:

      +-------------------------------------------------+
      | HEX(LOAD_FILE('/var/lib/mysql/backups/t1.txt')) |
      +-------------------------------------------------+
      | EE5C09EE5C5C80600A                              |
      +-------------------------------------------------+
      +--------+----------+
      | HEX(a) | HEX(b)   |
      +--------+----------+
      | EE5C   | EE5C8060 |
      +--------+----------+
      

      But if then I invoke the script using these commands:

      LOCALE=zh_CN.gbk      DUMPPARAM=--default-character-set=gbk sh dump.sh 
      LOCALE=en_US.utf8     DUMPPARAM=--default-character-set=gbk sh dump.sh 
      LOCALE=en_US.iso88591 DUMPPARAM=--default-character-set=gbk sh dump.sh 
      LOCALE=C              DUMPPARAM=--default-character-set=gbk sh dump.sh 
      

      it returns a wrong BLOB value:

      +-------------------------------------------------+
      | HEX(LOAD_FILE('/var/lib/mysql/backups/t1.txt')) |
      +-------------------------------------------------+
      | EE5C09EE5C0A                                    |
      +-------------------------------------------------+
      +--------+--------+
      | HEX(a) | HEX(b) |
      +--------+--------+
      | EE5C   | EE5C   |
      +--------+--------+
      

      This looks wrong. According to the documentation, LOAD DATA INFILE treats the file as having the same character set with the default character set of the database, which is GBK. So dump with explicit --default-character-set=gbk followed by LOAD DATA which implicitly use gbk as the file character set do not work together.

      Attachments

        Activity

          People

            bar Alexander Barkov
            bar Alexander Barkov
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.