  1. MariaDB Server
  2. MDEV-35614

JSON_UNQUOTE doesn't work with emojis

Details

    Description

      JSON_UNQUOTE returns "?" when it's called with an emoji.
      For example, "😊" is encoded as "\\ud83d\\ude0a":

      SELECT JSON_UNQUOTE('"\\ud83d\\ude0a"');
      ?
      

      I believe this is a bug: JSON_UNQUOTE should return the emoji, not a question mark.
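For comparison, here is a minimal sketch of the expected decoding, using Python's standard `json` module rather than MariaDB (Python is used purely for illustration and is not part of the report). It assumes the server's default backslash escaping, so the SQL literal `'"\\ud83d\\ude0a"'` reaches JSON_UNQUOTE as the JSON text `"\ud83d\ude0a"`. A conforming decoder should combine the two escapes into the single code point U+1F60A.

```python
import json

# JSON text as JSON_UNQUOTE would see it after SQL string-literal
# unescaping (assumption: default sql_mode, backslash escapes enabled).
json_text = '"\\ud83d\\ude0a"'

# json.loads combines the escaped UTF-16 surrogate pair into one character.
decoded = json.loads(json_text)

print(decoded)                  # the emoji itself
print(hex(ord(decoded)))        # 0x1f60a
print(decoded.encode("utf-8"))  # b'\xf0\x9f\x98\x8a' (a 4-byte sequence)
```

The 4-byte UTF-8 encoding is why the result needs a utf8mb4 character set on the MariaDB side.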

          Activity

            serg Sergei Golubchik added a comment -

            rucha174, please confirm: did you approve all 6 commits in this PR or just the fix for MDEV-35614?
            rucha174 Rucha Deodhar added a comment -

            serg, all 6 commits.

            serg Sergei Golubchik added a comment -

            Asked about test cases in the PR.
            danblack Daniel Black added a comment - - edited

            Rebased to 10.11.

            After much searching, I'm pretty sure that test cases that reach the error condition aren't actually possible, as a utf16 escape sequence always maps to a utf8mb4 character.

            To clarify:

            • JSON strings are validated before anything else.
            • That validation includes conforming to the JSON spec's limited character set (much stricter than the connection character set), with the exceptions being the escaped parts of the JSON string.
            • The escaped parts of the JSON string are strictly validated against utf16.
            • The utf16 is mapped down to the full utf8(mb4) code point before being validated (again) when doing comparisons.
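The mapping argument above can be sketched with the standard UTF-16 surrogate arithmetic. This is not MariaDB's code, and `combine_surrogates` is a hypothetical helper name. It illustrates that every valid high/low surrogate pair lands in U+10000..U+10FFFF, a range that UTF-8 always encodes in 4 bytes.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Map a UTF-16 surrogate pair to its Unicode code point.

    Hypothetical helper for illustration; not taken from the MariaDB source.
    """
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError("not a high surrogate")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError("not a low surrogate")
    # Each surrogate carries 10 bits of payload; the pair is offset by 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# The pair from the bug report.
emoji_cp = combine_surrogates(0xD83D, 0xDE0A)
print(hex(emoji_cp))  # 0x1f60a

# The smallest and largest valid pairs bound the reachable range.
lowest_cp = combine_surrogates(0xD800, 0xDC00)
highest_cp = combine_surrogates(0xDBFF, 0xDFFF)
print(hex(lowest_cp), hex(highest_cp))  # 0x10000 0x10ffff

# Both ends of the range take 4 bytes in UTF-8.
print(len(chr(lowest_cp).encode("utf-8")), len(chr(highest_cp).encode("utf-8")))
```

Since the whole range fits in 4-byte UTF-8, a validated pair should never produce a code point that utf8mb4 cannot represent. That is consistent with the claim that the error condition is unreachable from a test case.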

            serg Sergei Golubchik added a comment -

            OK to push.

            People

              danblack Daniel Black
              csirmazbendeguz Bendeguz Csirmaz