[MDEV-19285] INSTANT ALTER from ascii_general_ci to latin1_general_ci produces corrupt data Created: 2019-04-19  Updated: 2022-04-15  Resolved: 2019-05-16

Status: Closed
Project: MariaDB Server
Component/s: Character Sets, Storage Engine - InnoDB
Affects Version/s: 10.4
Fix Version/s: 10.4.5

Type: Bug Priority: Major
Reporter: Alexander Barkov Assignee: Alexander Barkov
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Problem/Incident
is caused by MDEV-15564 Avoid table rebuild in ALTER TABLE on... Closed
Relates
relates to MDEV-19284 INSTANT ALTER with ucs2-to-utf16 conv... Closed
relates to MDEV-28323 Redundant Item_func_conv_charset on W... Open

 Description   

Copying ALTER from ascii_general_ci to latin1_general_ci may fail in case of 8-bit characters:

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (a VARCHAR(10) CHARACTER SET ascii COLLATE ascii_general_ci, PRIMARY KEY(a)) ENGINE=InnoDB;
INSERT INTO t1 VALUES ('a'),(0xC0),('b');
ALTER TABLE t1 ALGORITHM=COPY, MODIFY a VARCHAR(10) CHARACTER SET latin1 COLLATE latin1_general_ci;

Instant ALTER performs such conversion silently, but then CHECK TABLE rerpots corrupted data:

ALTER TABLE t1 ALGORITHM=INSTANT, MODIFY a VARCHAR(10) CHARACTER SET latin1 COLLATE latin1_general_ci;
CHECK TABLE t1;

+---------+-------+----------+---------------------------------------------------+
| Table   | Op    | Msg_type | Msg_text                                          |
+---------+-------+----------+---------------------------------------------------+
| test.t1 | check | Warning  | InnoDB: The B-tree of index PRIMARY is corrupted. |
| test.t1 | check | error    | Corrupt                                           |
+---------+-------+----------+---------------------------------------------------+

Instant ALTER should be disallowed from ascii_general_ci to xxx_general_ci.



 Comments   
Comment by Alexander Barkov [ 2019-04-19 ]

A similar problem is repeatable on alter from ascii_general_ci to utf8_general_ci.

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (a VARCHAR(10) CHARACTER SET ascii COLLATE ascii_general_ci, PRIMARY KEY(a)) ENGINE=InnoDB;
INSERT INTO t1 VALUES ('a'),(0xC0),('b');
ALTER TABLE t1 ALGORITHM=COPY, MODIFY a VARCHAR(10) CHARACTER SET utf8 COLLATE utf8_general_ci;

ERROR 1366 (22007): Incorrect string value: '\xC0' for column `test`.`t1`.`a` at row 3

ALTER TABLE t1 ALGORITHM=INSTANT, MODIFY a VARCHAR(10) CHARACTER SET utf8 COLLATE utf8_general_ci;
SELECT HEX(a) FROM t1;

+--------+
| HEX(a) |
+--------+
| 61     |
| 62     |
| C0     |
+--------+

Notice, 0xC0 is a not well-formed utf8 string.

Comment by Alexander Barkov [ 2019-04-19 ]

A similar problem is repeatable on alter from ascii_bin to utf8_bin.

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (a VARCHAR(10) CHARACTER SET ascii COLLATE ascii_bin, PRIMARY KEY(a)) ENGINE=InnoDB;
INSERT INTO t1 VALUES ('a'),(0xC0),('b');
ALTER TABLE t1 ALGORITHM=COPY, MODIFY a VARCHAR(10) CHARACTER SET utf8 COLLATE utf8_bin;

ERROR 1366 (22007): Incorrect string value: '\xC0' for column `test`.`t1`.`a` at row 3

ALTER TABLE t1 ALGORITHM=INSTANT, MODIFY a VARCHAR(10) CHARACTER SET utf8 COLLATE utf8_general_ci;
SELECT HEX(a) FROM t1;

+--------+
| HEX(a) |
+--------+
| 61     |
| 62     |
| C0     |
+--------+

Notice, 0xC0 is a not well-formed utf8 string.

Comment by Alexander Barkov [ 2019-05-16 ]

Closed by a join patch with MDEV-19284

Generated at Thu Feb 08 08:50:27 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.