Details
- Type: Task
- Status: Closed
- Priority: Major
- Resolution: Duplicate
Description
MySQL 5.7 introduced support for the Chinese national standard GB18030, an encoding for Unicode, in WL#4024.
For Chinese, Japanese, Korean (CJK) text, GB18030 can be an interesting option: unlike UTF-8, it needs only 2 (not 3) bytes for the most common CJK characters (those in its two-byte range), and unlike UTF-16, it needs only 1 byte (not 2) per ASCII character. The price is that characters outside the one- and two-byte ranges take 4 bytes, so most non-CJK, non-ASCII characters require a longer encoding than in UTF-8 or UTF-16.
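The size trade-off described above can be checked with Python's built-in codecs. The following is a minimal sketch, not part of the MySQL or MariaDB code; the helper names (`encoded_length`, `compare`) and the sample characters are chosen here purely for illustration. Note that the Hangul sample falls in the four-byte range of GB18030.

```python
# Illustrative sample characters (an assumption of this sketch, not from the issue).
SAMPLES = {
    "ASCII letter": "A",
    "common Han ideograph": "\u4e2d",  # U+4E2D
    "Hangul syllable": "\ud55c",       # U+D55C
    "Arabic letter": "\u0639",         # U+0639
}

ENCODINGS = ("gb18030", "utf-8", "utf-16-le")


def encoded_length(text, encoding):
    """Return the number of bytes `text` occupies in `encoding`."""
    return len(text.encode(encoding))


def compare(char):
    """Map each encoding name to the byte length of `char` in that encoding.

    utf-16-le is used instead of utf-16 so that the 2-byte
    byte-order mark is not counted.
    """
    return {enc: encoded_length(char, enc) for enc in ENCODINGS}


if __name__ == "__main__":
    for label, char in SAMPLES.items():
        print(f"{label} (U+{ord(char):04X}): {compare(char)}")
```

Running the script prints one line per sample, so the per-character cost of each encoding can be compared directly for the kind of text a given table is expected to hold.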
Because MariaDB 10.2 already incorporates the InnoDB of MySQL 5.7, the InnoDB adjustments for this should already be in place. The only missing piece should be that fts_is_charset_cjk() needs to return true for gb18030, so that the fulltext index uses a hash-based internal partitioning scheme.
Issue Links
- is duplicated by: MDEV-7495 Support GB18030 character set (Open)