Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- 10.1(EOL), 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5, 10.6
- None
Description
The REGEXP_REPLACE function converts supplementary characters (those with a 4-byte UTF-8 encoding) into "?" when the character set is utf8mb4.
Because JIRA does not allow supplementary UTF-8 characters in the issue description, I have used CAST/UNHEX to build the smiley character, but the actual value should be _utf8mb4'<emoji>'.
SELECT REGEXP_REPLACE(CAST(UNHEX('F09F9881') AS CHAR CHARACTER SET 'utf8mb4'), _utf8mb4'a', _utf8mb4'b') AS Text;
Expected output: "<emoji>"
Actual output: "?"
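For reference, here is a minimal Python sketch of what the byte sequence in the query represents and what an unchanged result would look like. It is only an illustration outside the server: Python's `re` module stands in for REGEXP_REPLACE, and the variable names are hypothetical. It does not reproduce the bug itself.

```python
import re

# Same bytes as UNHEX('F09F9881') in the query above.
raw = bytes.fromhex("F09F9881")
text = raw.decode("utf-8")

# The bytes decode to a single code point outside the Basic
# Multilingual Plane, which UTF-8 encodes in 4 bytes.
assert len(text) == 1
assert ord(text) == 0x1F601
assert len(text.encode("utf-8")) == 4

# The pattern 'a' does not occur in the input, so a correct
# replacement must return the input unchanged.
result = re.sub("a", "b", text)
print(result == text)                        # True
print(result.encode("utf-8").hex().upper())  # F09F9881
```

Since the pattern never matches, any output other than the original 4-byte sequence means the character was altered on the way through the function.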
Issue Links
- relates to
  - MDEV-32904 smiley emoji (F09F9883) valid in utf8 but not utf8mb4 (Closed)
  - MDEV-34012 Regular expression queries containing 4-byte multicharacters result in an error. (Closed)