[MDEV-6218] Wrong result of CHAR_LENGTH(non-BMP-character) with 3-byte utf8
| Created: | 2014-05-07 | Updated: | 2023-11-13 |
| Status: | Stalled |
| Project: | MariaDB Server |
| Component/s: | Character Sets |
| Affects Version/s: | 10.0.10, 10.2, 10.3 |
| Fix Version/s: | 10.5 |
| Type: | Bug | Priority: | Major |
| Reporter: | Alexander Barkov | Assignee: | Alexander Barkov |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | upstream |
| Description |
|
Note that I use "SET NAMES utf8" (which is a 3-byte character set). 0xF09F9881 is not a valid byte sequence in utf8 (it is valid only in utf8mb4). The expected result would be:
|
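As a side note on the byte sequence itself (this is not server output and not the expected result referred to above): a minimal Python sketch that decodes 0xF09F9881 outside the server. It shows that the four bytes encode a single code point above U+FFFF, which is why a character set limited to 3 bytes per character cannot represent it. The helper name `utf8_length` is hypothetical and only illustrates the standard UTF-8 length rule.

```python
# The byte sequence from the report: one 4-byte UTF-8 sequence.
raw = bytes.fromhex("F09F9881")

# Python's "utf-8" codec implements full (up to 4-byte) UTF-8,
# comparable in range to utf8mb4, so this decodes without error.
ch = raw.decode("utf-8")
code_point = ord(ch)


def utf8_length(cp: int) -> int:
    """Bytes that standard UTF-8 needs to encode code point cp (hypothetical helper)."""
    if cp < 0x80:
        return 1
    if cp < 0x800:
        return 2
    if cp < 0x10000:
        return 3  # the largest size a 3-byte utf8 character set can hold
    return 4      # non-BMP code points, U+10000 and above


print(f"U+{code_point:04X}")          # code point encoded by the four bytes
print(len(ch))                        # number of characters after decoding
print(utf8_length(code_point))        # bytes required in UTF-8
print(code_point > 0xFFFF)            # True means the character is outside the BMP
```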
| Comments |
| Comment by Sergei Petrunia [ 2014-05-07 ] |

Repeatable on mysql-5.6.17.
| Comment by Alexander Barkov [ 2014-06-10 ] |
|
LEFT also returns a wrong result:
| Comment by Alexander Barkov [ 2014-07-25 ] |
|
RIGHT returns a wrong result:
| Comment by Alexander Barkov [ 2014-07-25 ] |
|
SUBSTRING returns a wrong result:
| Comment by Alexander Barkov [ 2014-07-25 ] |
|
In this example, the returned string is also ill-formed:
It should probably replace unknown bytes with question marks.
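
The replacement idea can be sketched outside the server as follows. This is only an illustration in Python, not MariaDB code: the function names `sanitize_utf8mb3` and `_is_well_formed` are hypothetical, the choice of one question mark per rejected byte is an assumption (the report does not say how many question marks should be produced), and well-formedness is checked with Python's strict UTF-8 decoder, which may differ in details from the server's own rules for 3-byte utf8.

```python
def _is_well_formed(chunk: bytes) -> bool:
    """True if chunk decodes as strict UTF-8 (rejects overlong forms and surrogates)."""
    try:
        chunk.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def sanitize_utf8mb3(data: bytes) -> bytes:
    """Keep well-formed 1- to 3-byte UTF-8 sequences; replace every other byte with '?'.

    Assumption: one '?' per rejected byte, as a literal reading of
    "replace unknown bytes with question marks".
    """
    out = bytearray()
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            size = 1                 # ASCII
        elif 0xC2 <= lead <= 0xDF:
            size = 2                 # 2-byte lead
        elif 0xE0 <= lead <= 0xEF:
            size = 3                 # 3-byte lead
        else:
            size = 0                 # continuation byte, invalid lead, or 4-byte lead (0xF0..)
        chunk = data[i:i + size]
        if size and len(chunk) == size and _is_well_formed(chunk):
            out += chunk             # representable in a 3-byte character set
            i += size
        else:
            out += b"?"              # unknown byte
            i += 1
    return bytes(out)


# The sequence from this report is rejected byte by byte.
print(sanitize_utf8mb3(bytes.fromhex("F09F9881")))
# BMP characters pass through unchanged.
print(sanitize_utf8mb3("a\u00e9\u20ac".encode("utf-8")))
```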