[MDEV-22217] Make OS character sets "utf8" and "utf-8" map to MariaDB character set "utf8mb4" Created: 2020-04-10 Updated: 2020-04-12 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | Character Sets |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major |
| Reporter: | Geoff Montee (Inactive) | Assignee: | Ralf Gebhardt |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
| Description |
|
The OS character sets "utf8" and "utf-8" currently map to the MariaDB character set "utf8". In MariaDB, the "utf8" character set refers to the incomplete 3-byte version of the UTF-8 standard (which has "utf8mb3" as an alias), so it cannot store characters that need 4 bytes in UTF-8, that is, characters outside the Basic Multilingual Plane. It may be more appropriate for the OS character sets "utf8" and "utf-8" to map to the MariaDB character set "utf8mb4" instead. That way, UTF-8 clients would get access to the full UTF-8 standard by default in MariaDB. MySQL 8.0 has already made this change:
https://dev.mysql.com/doc/refman/8.0/en/charset-connection.html
For example, see here for MariaDB's current behavior:
We can see the relevant mapping in the code here: https://github.com/MariaDB/server/blob/mariadb-10.5.2/mysys/charset.c#L1384 |
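
To illustrate the request, here is a minimal Python sketch of the current and proposed mappings. It is not the code from mysys/charset.c: the names `CURRENT_MAPPING`, `PROPOSED_MAPPING`, `map_os_charset` and `fits_in_utf8mb3` are hypothetical, the tables only contain the two OS names discussed in this issue, and the case-insensitive lookup is an assumption. The helper at the end shows why the 3-byte limit matters for clients.

```python
# Hypothetical sketch; table contents and function names are illustrative
# and are NOT copied from mysys/charset.c.

# OS charset name -> MariaDB charset name, as described in this issue.
CURRENT_MAPPING = {"utf8": "utf8", "utf-8": "utf8"}

# The same OS names after the proposed change.
PROPOSED_MAPPING = {"utf8": "utf8mb4", "utf-8": "utf8mb4"}


def map_os_charset(os_name, mapping):
    """Return the MariaDB charset for an OS charset name, or None if unknown.

    Assumption: the lookup ignores case.
    """
    return mapping.get(os_name.lower())


def fits_in_utf8mb3(text):
    """Return True if every character needs at most 3 bytes in UTF-8."""
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)


if __name__ == "__main__":
    for name in ("utf8", "UTF-8"):
        print(name, "->", map_os_charset(name, CURRENT_MAPPING),
              "(current),", map_os_charset(name, PROPOSED_MAPPING), "(proposed)")
    # U+20AC (euro sign) needs 3 bytes; U+1F600 (an emoji) needs 4 bytes.
    print(fits_in_utf8mb3("\u20ac"), fits_in_utf8mb3("\U0001F600"))
```

With the current mapping, a client whose OS locale reports "UTF-8" ends up on a connection character set that cannot represent 4-byte characters such as emoji. With the proposed mapping, the same client would get "utf8mb4" without any extra configuration.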