  1. MariaDB Server
  2. MDEV-6776

ujis and eucjpms erroneously accept 0x8EA0 as a valid byte sequence

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 5.5.39, 10.0.13
    • Fix Version/s: 10.0.14
    • Component/s: Character Sets
    • Labels:
      None

      Description

      Byte sequence 0x8EA0 is erroneously accepted as a valid ujis/eucjpms code:

      DROP TABLE IF EXISTS t1;
      CREATE TABLE t1 (a VARCHAR(10) CHARACTER SET ujis);
      INSERT INTO t1 VALUES (0x8EA0);
      SELECT HEX(a), CHAR_LENGTH(a) FROM t1;

      returns:

      +--------+----------------+
      | HEX(a) | CHAR_LENGTH(a) |
      +--------+----------------+
      | 8EA0   |              2 |
      +--------+----------------+

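      This is wrong: after the lead byte 0x8E, the second byte 0xA0 falls outside the valid range xA1-xDF, so 0x8EA0 is not a well-formed character. The correct code ranges for ujis are: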

        [x00-x7F]                     # ASCII/JIS-Roman (one-byte/character)  
        [x8E][xA1-xDF]                # half-width katakana (two bytes/char)  
        [x8F][xA1-xFE][xA1-xFE]       # JIS X 0212-1990 (three bytes/char)  
        [xA1-xFE][xA1-xFE]            # JIS X 0208:1997 (two bytes/char)

      The same problem is observed with eucjpms.
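
The ranges listed above can be sketched as a small standalone validator. This is only an illustration of the table, not the server's actual character set code; the function names `ujis_char_length` and `is_well_formed_ujis` are hypothetical, and the sketch covers only the ujis ranges given in this report.

```python
def ujis_char_length(data: bytes, pos: int = 0) -> int:
    """Return the byte length of the well-formed ujis character starting
    at data[pos], or 0 if the bytes there do not match any listed range.

    Hypothetical helper: it encodes only the four ranges from the report.
    """
    remaining = len(data) - pos
    if remaining <= 0:
        return 0

    b0 = data[pos]
    if b0 <= 0x7F:
        return 1  # [x00-x7F]: ASCII/JIS-Roman

    if remaining >= 2:
        b1 = data[pos + 1]
        if b0 == 0x8E:
            # [x8E][xA1-xDF]: half-width katakana; 0xA0 is NOT allowed here
            return 2 if 0xA1 <= b1 <= 0xDF else 0
        if 0xA1 <= b0 <= 0xFE and 0xA1 <= b1 <= 0xFE:
            return 2  # [xA1-xFE][xA1-xFE]: JIS X 0208
        if (b0 == 0x8F and remaining >= 3
                and 0xA1 <= b1 <= 0xFE
                and 0xA1 <= data[pos + 2] <= 0xFE):
            return 3  # [x8F][xA1-xFE][xA1-xFE]: JIS X 0212

    return 0  # invalid lead byte, bad trail byte, or truncated sequence


def is_well_formed_ujis(data: bytes) -> bool:
    """Check that the whole byte string is a sequence of well-formed characters."""
    pos = 0
    while pos < len(data):
        length = ujis_char_length(data, pos)
        if length == 0:
            return False
        pos += length
    return True
```

Under these assumptions, `is_well_formed_ujis(bytes.fromhex("8EA0"))` is rejected, while the neighbouring sequence `8EA1` is accepted as a single two-byte character.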

            People

            Assignee:
            bar Alexander Barkov
            Reporter:
            bar Alexander Barkov