  1. MariaDB Server
  2. MDEV-28078

Garbage on multiple equal ENUMs with tricky character sets

Details

    Description

      I create a table with two similar ENUM columns, both using CHARACTER SET utf32:

      DROP TABLE IF EXISTS t1;
      CREATE TABLE t1 (
        c1 ENUM ('a','b') CHARACTER SET utf32 DEFAULT 'a',
        c2 ENUM ('a','b') CHARACTER SET utf32 DEFAULT 'a' 
      );
      SHOW CREATE TABLE t1;
      

      +-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | Table | Create Table                                                                                                                                                                |
      +-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | t1    | CREATE TABLE `t1` (
        `c1` enum('??','??') CHARACTER SET utf32 DEFAULT '??',
        `c2` enum('??','??') CHARACTER SET utf32 DEFAULT '??'
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
      +-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      

      Notice that SHOW CREATE TABLE returns garbage instead of the ENUM values.

      The problem happens in this piece of the code in table.cc:

          if (interval_nr && charset->mbminlen > 1)
          {
            /* Unescape UCS2 intervals from HEX notation */
            TYPELIB *interval= share->intervals + interval_nr - 1;
            unhex_type2(interval);
      

      As the two TYPELIBs are equal, only one copy of this TYPELIB is stored in the FRM file. But unhex_type2() is called twice, once for each column, so the already decoded values are decoded a second time.
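The double decode can be reproduced outside the server with a small model. This is only a sketch, not the code from table.cc: `Typelib`, `Column`, `hex_digit`, `unhex_in_place` and `open_columns` are hypothetical names, and the treatment of non-HEX bytes (decoded as 0) is an assumption made to keep the example well defined.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for TYPELIB: only the list of ENUM value names.
struct Typelib {
    std::vector<std::string> names;
};

// Simplified stand-in for a column: a pointer to its (possibly shared)
// interval plus the mbminlen of its character set.
struct Column {
    Typelib *interval;
    int mbminlen;
};

// Hypothetical helper: value of one HEX digit, 0 for any other byte.
static int hex_digit(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return 0;
}

// Model of an in-place HEX decode: every two input bytes become one byte.
static void unhex_in_place(Typelib &t) {
    for (std::string &name : t.names) {
        std::string out;
        for (std::size_t i = 0; i + 1 < name.size(); i += 2)
            out.push_back(static_cast<char>((hex_digit(name[i]) << 4) |
                                            hex_digit(name[i + 1])));
        name = out;
    }
}

// Mimics the per-column check quoted above: decode the interval of every
// column whose character set has mbminlen > 1, without checking whether
// the interval was already decoded for another column.
static void open_columns(std::vector<Column> &cols) {
    for (Column &c : cols)
        if (c.interval && c.mbminlen > 1)
            unhex_in_place(*c.interval);
}

// Two utf32 columns sharing one deduplicated interval.
static std::string double_decode_example() {
    Typelib shared;
    shared.names = {"00000061", "00000062"};
    std::vector<Column> cols = {Column{&shared, 4}, Column{&shared, 4}};
    open_columns(cols);
    return shared.names[0];  // decoded twice: 8 HEX chars -> 4 bytes -> 2 bytes
}
```

In this model the first pass turns `00000061` into the four utf32 bytes of `a`, and the second pass shrinks those four bytes to two meaningless ones, which is the kind of garbage seen in the SHOW CREATE TABLE output.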

      Note that TYPELIBs for tricky character sets like utf32 are stored in HEX notation. So the same problem is repeatable with a latin1 ENUM column whose values are equal to the HEX representations of the utf32 ENUM column's values:

      DROP TABLE IF EXISTS t1;
      CREATE TABLE t1 (
        c1 ENUM ('00000061','00000062') DEFAULT '00000061' COLLATE latin1_bin,
        c2 ENUM ('a','b') DEFAULT 'a' COLLATE utf32_general_ci
      );
      SHOW CREATE TABLE t1;
      

      +-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | Table | Create Table                                                                                                                                                                                                |
      +-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | t1    | CREATE TABLE `t1` (
        `c1` enum('\0\0\0a','\0\0\0b') CHARACTER SET latin1 COLLATE latin1_bin DEFAULT '\0\0\0a',
        `c2` enum('a','b') CHARACTER SET utf32 DEFAULT 'a'
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
      +-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      

      As in the previous example, only one copy of the TYPELIB is stored in the FRM file (because the two TYPELIBs are binary equal).
      unhex_type2() is then called on this TYPELIB to unescape the utf32 column's values, but the latin1 column points to the same TYPELIB, so its values are unescaped too.
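Why the two intervals end up binary equal can be sketched as follows. This is an illustrative model, not the actual FRM writer: `stored_form` and `stored_interval` are hypothetical names, the input is assumed to be ASCII only, and utf32 is assumed to be written as big-endian code units in upper-case HEX.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of how one ENUM value is serialized.
// Character sets with mbminlen > 1 (such as utf32) are written in HEX;
// single-byte character sets are written as is.
// Assumption: ASCII-only input, big-endian code units.
static std::string stored_form(const std::string &value, int mbminlen) {
    if (mbminlen <= 1)
        return value;
    static const char digits[] = "0123456789ABCDEF";
    std::string hex;
    for (unsigned char ch : value) {
        // Leading zero bytes of the code unit, two HEX digits per byte.
        hex.append(static_cast<std::size_t>(2 * (mbminlen - 1)), '0');
        hex.push_back(digits[ch >> 4]);
        hex.push_back(digits[ch & 0x0F]);
    }
    return hex;
}

// Serialized form of a whole interval (list of ENUM values).
static std::vector<std::string> stored_interval(
        const std::vector<std::string> &values, int mbminlen) {
    std::vector<std::string> out;
    for (const std::string &v : values)
        out.push_back(stored_form(v, mbminlen));
    return out;
}
```

Under these assumptions, `stored_interval({"a", "b"}, 4)` and `stored_interval({"00000061", "00000062"}, 1)` produce the same list of strings, so a deduplication step that compares only the stored bytes cannot tell the utf32 interval from the latin1 one.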

            People

              bar Alexander Barkov