Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-21124

JSON_UNQUOTE fails to process surrogate pairs

    XMLWordPrintable

Details

    • Bug
    • Status: Open (View Workflow)
    • Major
    • Resolution: Unresolved
    • 10.4.10, 10.2, 10.3, 10.4, 10.5
    • 10.4, 10.5
    • JSON
    • None

    Description

      While it is common to use MySQL's `JSON_UNQUOTE(JSON_EXTRACT())` combination, MariaDB additionally offers `JSON_VALUE` function to basically perform the same task. The problem is the results from the two methods differ when it comes to surrogate pairs, which are coincidentally what JSON uses to encode extended characters, e.g., emojis.

      It seems the linked issue MDEV-19760 addressed a similar problem in `JSON_VALID` — The fix did resolve a different issue that `JSON`-type fields rejects any surrogate pair because of the default `CHECK(JSON_VALID())` constraint.

      Here are simple test cases:

      SET @test1 = '"\\u00e4\\u00f6\"';
      SET @test2 = '"\\ud83d\\udc6c\\ud83c\\udfff"';
      SET @test3 = '"\\ud83d\\ude0a\"';
      SELECT JSON_VALUE(@test1, '$') AS VALUE, JSON_UNQUOTE(@test1) as UNQUOTE
      UNION
      SELECT JSON_VALUE(@test2, '$') AS VALUE, JSON_UNQUOTE(@test2) as UNQUOTE
      UNION
      SELECT JSON_VALUE(@test3, '$') AS VALUE, JSON_UNQUOTE(@test3) as UNQUOTE;
      

      And the result is attached. (Apparently JIRA does not support emojis.)

      Attachments

        Issue Links

          Activity

            People

              rucha174 Rucha Deodhar
              kiding Dongsung Kim
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.