Details
Description
While it is common to use MySQL's `JSON_UNQUOTE(JSON_EXTRACT())` combination, MariaDB additionally offers `JSON_VALUE` function to basically perform the same task. The problem is the results from the two methods differ when it comes to surrogate pairs, which are coincidentally what JSON uses to encode extended characters, e.g., emojis.
It seems the linked issue MDEV-19760 addressed a similar problem in `JSON_VALID` — The fix did resolve a different issue that `JSON`-type fields rejects any surrogate pair because of the default `CHECK(JSON_VALID())` constraint.
Here are simple test cases:
SET @test1 = '"\\u00e4\\u00f6\"'; |
SET @test2 = '"\\ud83d\\udc6c\\ud83c\\udfff"'; |
SET @test3 = '"\\ud83d\\ude0a\"'; |
SELECT JSON_VALUE(@test1, '$') AS VALUE, JSON_UNQUOTE(@test1) as UNQUOTE |
UNION
|
SELECT JSON_VALUE(@test2, '$') AS VALUE, JSON_UNQUOTE(@test2) as UNQUOTE |
UNION
|
SELECT JSON_VALUE(@test3, '$') AS VALUE, JSON_UNQUOTE(@test3) as UNQUOTE; |
And the result is attached. (Apparently JIRA does not support emojis.)
Attachments
Issue Links
- is duplicated by
-
MDEV-35614 JSON_UNQUOTE doesn't work with emojis
- In Review
- relates to
-
MDEV-14301 JSON_UNQUOTE returns incorrect results depending on the collation
- Stalled
-
MDEV-19760 json escaped unicode parse error
- Closed
- links to