MariaDB Server / MDEV-8765

mysqldump silently corrupts 4-byte UTF-8 data

Details

    Description

      Bug for Oracle MySQL: https://bugs.mysql.com/bug.php?id=71746

      This also affects MariaDB 10.0:

      [dvaneeden@dve-mac msb_ma10_0_20]$ ./my sqldump --skip-extended-insert unicodedata | grep DOLPHIN
      INSERT INTO `ucd` VALUES ('1F42C','?','DOLPHIN','So','0','ON','','','','','N','','','','','');
      [dvaneeden@dve-mac msb_ma10_0_20]$ ./my sqldump --skip-extended-insert --default-character-set=utf8mb4 unicodedata | grep DOLPHIN
      INSERT INTO `ucd` VALUES ('1F42C','🐬','DOLPHIN','So','0','ON','','','','','N','','','','','');
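
      The '?' in the first dump is consistent with the character passing through a character set that stores at most 3 bytes per character, as MySQL/MariaDB `utf8` does, while U+1F42C needs 4 bytes in UTF-8. The following is a minimal Python sketch of that effect. The helper `to_three_byte_charset` is hypothetical and only imitates the lossy conversion; it is not MariaDB code.

```python
def to_three_byte_charset(text: str) -> str:
    """Replace every character that needs more than 3 UTF-8 bytes with '?'.

    Hypothetical helper for illustration only; it imitates a lossy
    conversion to a 3-byte-max character set, not MariaDB's actual code.
    """
    return "".join(ch if len(ch.encode("utf-8")) <= 3 else "?" for ch in text)


dolphin = chr(0x1F42C)             # DOLPHIN, the code point from the `ucd` row above
encoded = dolphin.encode("utf-8")

print(encoded.hex())               # f09f90ac
print(len(encoded))                # 4
print(to_three_byte_charset("1F42C " + dolphin + " DOLPHIN"))  # 1F42C ? DOLPHIN
```

      Characters from the Basic Multilingual Plane need at most 3 bytes and pass through unchanged, which is why only supplementary characters such as emoji are affected.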

          Activity

            danblack Daniel Black added a comment -

            Fixed upstream in commit ebaff9fffc958030a57d8ea7f1f2d527cac1df64.

            MariaDB needs to change the MYSQL_UNIVERSAL_CLIENT_CHARSET #define in include/my_global.h to utf8mb4.
            mysqldump is the only place this is used.

            This is a trivial fix that prevents backup corruption, even if utf8mb4 isn't the default.


            svoj Sergey Vojtovich added a comment -

            Raised priority, as there is now a pull request.

            rutuja Rutuja Surve (Inactive) added a comment -

            @Sergey, could you add the link to the pull request here?

            svoj Sergey Vojtovich added a comment -

            rutuja, there's a link on the right side, under the "Development" section:
            https://github.com/MariaDB/server/pull/547

            teodor Teodor Mircea Ionita (Inactive) added a comment -

            Hi, I can confirm this on both 5.5 and 10.3 using the Unicode dataset available at:
            https://github.com/dveeden/mysqlunicodedata

            The fix in the associated PR#547 resolves the dump issue: after patching, mysqldump exports the proper UTF-8 characters instead of a garbage '?', without needing an explicit --default-character-set.

            As for the fix itself, at least the mysqldump* tests need adjusting. However, I can't speak to the overall implications of switching MYSQL_UNIVERSAL_CLIENT_CHARSET to utf8mb4 for the entire suite; someone better suited should evaluate that. Thank you!
            bar Alexander Barkov added a comment - edited

            This issue is critical for the JSON data type, which is an alias for LONGTEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_bin.
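
            To illustrate why this matters for JSON in particular, here is a small Python sketch. It assumes the lossy conversion simply replaces each supplementary character with '?'; the helper `lossy_three_byte` is hypothetical and stands in for that conversion. The point is that the damaged document still parses as valid JSON, so the corruption goes unnoticed on restore.

```python
import json


def lossy_three_byte(text: str) -> str:
    # Hypothetical stand-in for a conversion to a 3-byte-max character set:
    # every code point above U+FFFF (4 bytes in UTF-8) becomes '?'.
    return "".join(ch if ord(ch) <= 0xFFFF else "?" for ch in text)


original = {"name": "DOLPHIN", "glyph": "\U0001F42C"}

# Serialize without \u escapes so the 4-byte character appears literally.
dumped = json.dumps(original, ensure_ascii=False)

# The damaged text is still well-formed JSON and loads without error.
restored = json.loads(lossy_three_byte(dumped))

print(restored == original)   # False
print(restored["glyph"])      # ?
```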

            People

              bar Alexander Barkov
              dveeden Daniël van Eeden