Details
- Bug
- Status: Closed
- Critical
- Resolution: Fixed
- 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5, 10.6, 10.7(EOL), 10.8(EOL)
- None
Description
I create a table with two similar ENUM columns, both using CHARACTER SET utf32:
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (
  c1 ENUM ('a','b') CHARACTER SET utf32 DEFAULT 'a',
  c2 ENUM ('a','b') CHARACTER SET utf32 DEFAULT 'a'
);
SHOW CREATE TABLE t1;
+-------+----------------------------------------------------------+
| Table | Create Table                                             |
+-------+----------------------------------------------------------+
| t1    | CREATE TABLE `t1` (
  `c1` enum('??','??') CHARACTER SET utf32 DEFAULT '??',
  `c2` enum('??','??') CHARACTER SET utf32 DEFAULT '??'
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------+----------------------------------------------------------+
Note that SHOW CREATE TABLE returns garbage instead of the ENUM values.
The problem is in this piece of code in table.cc:
if (interval_nr && charset->mbminlen > 1)
{
  /* Unescape UCS2 intervals from HEX notation */
  TYPELIB *interval= share->intervals + interval_nr - 1;
  unhex_type2(interval);
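Because the two TYPELIBs are equal, only one copy of the TYPELIB is stored in the FRM file. But unhex_type2() is called twice on that single copy, once per column, so the second call unhexes values that are already unescaped.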
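The double unescape can be illustrated with a small Python model. This is only a sketch, not the server code: `TypeLib`, `hexchar_to_int` and `unhex_in_place` are hypothetical stand-ins, and the in-place, NUL-terminated loop that halves the stored length is an assumption about how unhex_type2() behaves.

```python
def hexchar_to_int(c: int) -> int:
    """Value of an ASCII hex digit, or -1 for any other byte."""
    ch = chr(c)
    return int(ch, 16) if ch in "0123456789abcdefABCDEF" else -1


class TypeLib:
    """Hypothetical stand-in for a TYPELIB: NUL-terminated buffers plus lengths."""

    def __init__(self, names):
        self.names = [bytearray(n) + b"\0" for n in names]
        self.lengths = [len(n) for n in names]

    def values(self):
        # What a column sees: the first `length` bytes of each buffer.
        return [bytes(buf[:n]) for buf, n in zip(self.names, self.lengths)]


def unhex_in_place(lib: TypeLib) -> None:
    """Assumed model of unhex_type2(): decode hex pairs in place, halve lengths."""
    for i, buf in enumerate(lib.names):
        src = dst = 0
        while buf[src] != 0:  # stops at the first NUL byte
            hi, lo = hexchar_to_int(buf[src]), hexchar_to_int(buf[src + 1])
            buf[dst] = ((hi << 4) + lo) & 0xFF
            dst += 1
            src += 2
        lib.lengths[i] //= 2


if __name__ == "__main__":
    # One shared copy of the value list for both c1 and c2, stored as hex.
    shared = TypeLib([b"00000061", b"00000062"])
    unhex_in_place(shared)   # call made for c1
    print(shared.values())   # valid utf32 bytes for 'a' and 'b'
    unhex_in_place(shared)   # call made for c2, on the same copy
    print(shared.values())   # two bytes per value: no longer valid utf32
```

In this model the second call leaves each value two bytes long, which is not a multiple of 4 and therefore cannot be valid utf32 data.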
Note that TYPELIBs for character sets with mbminlen > 1, such as utf32, are stored in HEX notation. So the same problem can be reproduced with a latin1 ENUM column whose values are equal to the HEX representation of the utf32 ENUM column's values:
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (
  c1 ENUM ('00000061','00000062') DEFAULT '00000061' COLLATE latin1_bin,
  c2 ENUM ('a','b') DEFAULT 'a' COLLATE utf32_general_ci
);
SHOW CREATE TABLE t1;
+-------+----------------------------------------------------------+
| Table | Create Table                                             |
+-------+----------------------------------------------------------+
| t1    | CREATE TABLE `t1` (
  `c1` enum('\0\0\0a','\0\0\0b') CHARACTER SET latin1 COLLATE latin1_bin DEFAULT '\0\0\0a',
  `c2` enum('a','b') CHARACTER SET utf32 DEFAULT 'a'
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------+----------------------------------------------------------+
As in the previous example, only one copy of the TYPELIB is stored in the FRM file, because the two TYPELIBs are binary equal.
unhex_type2() is called on this TYPELIB to unescape the utf32 column values, but the latin1 column points to the same TYPELIB, so its values get unescaped as well.
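The sharing can be sketched in Python as follows. This is a rough model under stated assumptions, not the FRM code: `stored_form`, `build_typelibs` and `open_table` are hypothetical helpers, and each column is described by an illustrative tuple of name, values, Python codec and mbminlen.

```python
def stored_form(values, encoding, mbminlen):
    """Assumed on-disk form: hex text when mbminlen > 1, raw bytes otherwise."""
    raw = [v.encode(encoding) for v in values]
    if mbminlen > 1:
        return [r.hex().encode("ascii") for r in raw]
    return raw


def build_typelibs(columns):
    """Keep one copy per distinct (binary equal) value list."""
    typelibs = []      # deduplicated value lists
    column_slot = {}   # column name -> index into typelibs
    for name, values, encoding, mbminlen in columns:
        form = stored_form(values, encoding, mbminlen)
        if form not in typelibs:
            typelibs.append(form)
        column_slot[name] = typelibs.index(form)
    return typelibs, column_slot


def open_table(columns):
    """Unhex the value list of every mbminlen > 1 column, as the bug describes."""
    typelibs, column_slot = build_typelibs(columns)
    for name, _values, _encoding, mbminlen in columns:
        if mbminlen > 1:
            slot = column_slot[name]
            typelibs[slot] = [bytes.fromhex(v.decode("ascii"))
                              for v in typelibs[slot]]
    return {name: typelibs[slot] for name, slot in column_slot.items()}


if __name__ == "__main__":
    columns = [
        ("c1", ["00000061", "00000062"], "latin-1", 1),
        ("c2", ["a", "b"], "utf-32-be", 4),
    ]
    for name, values in open_table(columns).items():
        print(name, values)
```

Both columns end up with the bytes `00 00 00 61` and `00 00 00 62`. That is correct for c2, but c1 now holds '\0\0\0a' and '\0\0\0b' instead of '00000061' and '00000062', which matches the SHOW CREATE TABLE output above.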
Issue Links
- blocks: MDEV-28062 Assertion `(length % 4) == 0' failed in my_lengthsp_utf32 on INSERT..SELECT (Closed)
- is duplicated by: MDEV-28062 Assertion `(length % 4) == 0' failed in my_lengthsp_utf32 on INSERT..SELECT (Closed)
- relates to: MDEV-28498 Incorrect information in file: './test/t0.frm' on CREATE TABLE (In Review)