MariaDB Server / MDEV-13081

Support the GB18030 encoding of Unicode

    Details

      Description

      MySQL 5.7 WL#4024 introduced support for the Chinese national standard GB18030, an encoding for Unicode.

      For Chinese, Japanese, and Korean (CJK) text, GB18030 can be an interesting option: unlike UTF-8, it needs only 2 bytes (not 3) for the CJK characters in its two-byte range, which covers the most common Chinese ideographs, and unlike UTF-16, it needs only 1 byte (not 2) per ASCII character. The price is that most non-CJK, non-ASCII characters require a longer encoding than in UTF-8 or UTF-16.
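
      The size trade-off can be checked with a short script. This is only an illustrative sketch using Python's built-in codecs, not anything taken from the MySQL or MariaDB implementation; the helper name encoded_lengths and the sample characters are chosen here for illustration.

```python
# Compare per-character storage cost across three Unicode encodings.
# "utf-16-le" is used instead of "utf-16" so that no byte-order mark is counted.
ENCODINGS = ("utf-8", "utf-16-le", "gb18030")


def encoded_lengths(text):
    """Return a dict mapping each encoding name to the encoded byte length of text."""
    return {enc: len(text.encode(enc)) for enc in ENCODINGS}


if __name__ == "__main__":
    # Illustrative samples (assumptions, not from the issue text):
    # an ASCII letter, a common Chinese ideograph, and a Latin-1 letter
    # that lies outside GB18030's two-byte range.
    for ch in ("A", "中", "ä"):
        print(repr(ch), encoded_lengths(ch))
```

      Running it on other sample strings shows how the total size depends on the mix of ASCII, CJK, and other characters in the data.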

      Because MariaDB 10.2 already incorporates the InnoDB of MySQL 5.7, the InnoDB adjustments for this should already be in place. The only missing piece should be for fts_is_charset_cjk() to return true for gb18030, so that a hash-based internal partitioning scheme is chosen for the fulltext index.

              People

              Assignee:
              Unassigned
              Reporter:
              marko Marko Mäkelä