MariaDB Server / MDEV-11777

REGEXP_REPLACE converts utf8mb4 supplementary characters to '?'

Details

    Description

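      The REGEXP_REPLACE function converts supplementary characters (code points outside the Basic Multilingual Plane, which take 4 bytes in UTF-8) into "?" when the character set is utf8mb4. This happens even when the pattern does not match and the input should be returned unchanged.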

      Because JIRA does not allow supplementary UTF-8 characters in the issue description, the example below builds the smiley character with CAST/UNHEX. The actual value should be _utf8mb4'<emoji>'.

      SELECT REGEXP_REPLACE(CAST(UNHEX('F09F9881') AS CHAR CHARACTER SET 'utf8mb4'), _utf8mb4'a', _utf8mb4'b') AS Text;
      

      Expected output: "<emoji>"
      Actual output: "?"
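
      A "?" on screen can also come from the client connection character set rather than from the function itself. The following is a minimal sketch, not part of the original report, that wraps the same call in HEX() so the result bytes can be inspected independently of how the client renders them. The SET NAMES statement and the InputHex/ResultHex aliases are assumptions added for illustration.

```sql
-- Assumption: make the connection character set utf8mb4 so that
-- client-side conversion is not what produces the '?'.
SET NAMES utf8mb4;

-- Same call as in the report, with HEX() around both the input and
-- the result. The pattern 'a' does not occur in the input, so the
-- two columns should be identical if the character is preserved.
SELECT
  HEX(CAST(UNHEX('F09F9881') AS CHAR CHARACTER SET utf8mb4)) AS InputHex,
  HEX(REGEXP_REPLACE(
        CAST(UNHEX('F09F9881') AS CHAR CHARACTER SET utf8mb4),
        _utf8mb4'a',
        _utf8mb4'b')) AS ResultHex;
```

      If ResultHex comes back as 3F (the byte for "?") while InputHex is F09F9881, the replacement is happening inside REGEXP_REPLACE and not in the display layer.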

            People

              serg Sergei Golubchik
              pimbroekhof Pim Broekhof
              Votes: 3
              Watchers: 10
