[MDEV-17301] Change of COLLATE unnecessarily requires ALGORITHM=COPY Created: 2018-09-27  Updated: 2022-09-01  Resolved: 2019-06-22

Status: Closed
Project: MariaDB Server
Component/s: Data Definition - Alter Table
Affects Version/s: 10.0, 10.1, 10.2, 10.3, 10.4
Fix Version/s: 10.4.7

Type: Bug Priority: Major
Reporter: Marko Mäkelä Assignee: Eugene Kosov (Inactive)
Resolution: Fixed Votes: 0
Labels: ddl, instant

Issue Links:
Blocks
is blocked by MDEV-18961 Converting charset to BINARY should b... Closed
is blocked by MDEV-19283 Move the code from Field_str::is_equa... Closed
Duplicate
is duplicated by MDEV-18583 Avoid copying when changing the type ... Closed
Problem/Incident
causes MDEV-22771 Assertion `fields[i].same(instant.fie... Closed
causes MDEV-26294 Duplicate entries in unique index not... Closed
causes MDEV-27280 server crashes on CHECK TABLE after C... Closed
Relates
relates to MDEV-11424 Instant ALTER TABLE of failure-free r... Closed
relates to MDEV-15564 Avoid table rebuild in ALTER TABLE on... Closed
relates to MDEV-17300 MariaDB Galera cluster crash after is... Closed
relates to MDEV-29436 Change text column collation unnecess... Open

 Description   

The COLLATE attribute of a column only affects the way how any indexes that are built on the column will be sorted. Thus, it should be possible to change the collation of a column without rebuilding the table, except if the column is part of the PRIMARY KEY.

Currently, changing the collation requires the table to be rebuilt with ALGORITHM=COPY, in all cases below:

--source include/have_innodb.inc
CREATE TABLE t1 (
 a CHAR(1) COLLATE utf8_bin PRIMARY KEY,
 b CHAR(1) COLLATE utf8_bin UNIQUE,
 c CHAR(1) COLLATE utf8_bin
) ENGINE=InnoDB;
ALTER TABLE t1 CHANGE COLUMN c c CHAR(1) COLLATE utf8_general_ci,
ALGORITHM=INSTANT;
ALTER TABLE t1 CHANGE COLUMN b b CHAR(1) COLLATE utf8_general_ci,
ALGORITHM=NOCOPY; # implies DROP INDEX b, ADD UNIQUE INDEX(b)
ALTER TABLE t1 CHANGE COLUMN a a CHAR(1) COLLATE utf8_general_ci,
ALGORITHM=INPLACE; # rebuild due to DROP PRIMARY KEY, ADD PRIMARY KEY(a)
DROP TABLE t1;

(When testing versions predating MDEV-13134, replace the ALGORITHM=INSTANT and ALGORITHM=NOCOPY with the broader clause ALGORITHM=INPLACE.)



 Comments   
Comment by Marko Mäkelä [ 2018-09-27 ]

kevg, this is closely related to MDEV-15564.

Comment by Marko Mäkelä [ 2019-02-20 ]

kevg, is this a duplicate of MDEV-15564?

Comment by Marko Mäkelä [ 2019-02-20 ]

Without remembering about this task, I recently filed MDEV-18583, which duplicates this. Here is a test case for 10.4 with MDEV-15564 present:

--source include/have_innodb.inc
CREATE TABLE t1 (
 a VARCHAR(1) COLLATE utf8_bin PRIMARY KEY,
 b VARCHAR(1) COLLATE utf8_bin UNIQUE,
 c VARCHAR(1) COLLATE utf8_bin
) ENGINE=InnoDB;
INSERT INTO t1 VALUES('a','b','c');
--enable_info
ALTER TABLE t1 MODIFY c VARCHAR(1) COLLATE utf8_general_ci,
ALGORITHM=INSTANT;
# FIXME: ALGORITHM=NOCOPY for DROP INDEX b, ADD UNIQUE INDEX(b)
ALTER TABLE t1 MODIFY b VARCHAR(1) COLLATE utf8_general_ci;
# FIXME: ALGORITHM=INPLACE can rebuild for DROP PRIMARY KEY, ADD PRIMARY KEY(a)
ALTER TABLE t1 MODIFY a VARCHAR(1) COLLATE utf8_general_ci;
--disable_info
CHECK TABLE t1;
SELECT * FROM t1;
DROP TABLE t1;
 
CREATE TABLE t2(a VARCHAR(1) COLLATE utf8_bin PRIMARY KEY);
INSERT INTO t2 VALUES('a');
# FIXME: should allow ALGORITHM=INSTANT (compatible collation)
--enable_info
ALTER TABLE t2 MODIFY a VARCHAR(1) COLLATE utf8mb4_bin;
--disable_info
SELECT * FROM t2;
DROP TABLE t2;

Comment by Eugene Kosov (Inactive) [ 2019-04-01 ]

Requesting a first round of review.

Comment by Marko Mäkelä [ 2019-04-02 ]

I don’t think that we should try to support the table-rebuilding case in InnoDB.
If the SQL layer sets the correct flags, InnoDB should support this via the existing flags for adding and dropping (effectively rebuilding) all affected indexes.

The only InnoDB change should be related to determining whether the character encoding of CHAR(n) columns can be changed instantaneously. For ROW_FORMAT=REDUNDANT, InnoDB cannot allow that, because the columns would always be stored as n*mbmaxlen bytes.

For other ROW_FORMAT than REDUNDANT, InnoDB can allow instantaneous CHAR(n) changes from utf8mb3 to utf8mb4, because it internally uses a variable-length encoding (n*mbminlenn*mbmaxlen bytes) when mbminlen!=mbmaxlen.

Comment by Alexander Barkov [ 2019-05-17 ]

Eugene, please adjust your patch on top of the recent changes in 10.4.
Thanks.

Comment by Alexander Barkov [ 2019-06-21 ]

The set of patches in https://github.com/MariaDB/server/commits/bb-10.4-MDEV-17301-charset-nocopy are OK.
Please join this patch:
NFC: move generic code to a generic method
with the main patch:
MDEV-17301 Change of COLLATE unnecessarily requires ALGORITHM=COPY

Ok to push after this (and after fixing buildbot compilation failures).

Comment by Marko Mäkelä [ 2022-09-01 ]

MDEV-26294 essentially reverted this optimization in 10.4.26 and 10.5.17 when the column is part of an index. In 10.6.9, the affected indexes will be rebuilt using the correct collation information.

Comment by Thirunarayanan Balathandayuthapani [ 2022-09-01 ]

MDEV-29436 has been filed because changing column collation for text still requires COPY algorithm

Generated at Thu Feb 08 08:35:26 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.