Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-17301

Change of COLLATE unnecessarily requires ALGORITHM=COPY

Details

    Description

      The COLLATE attribute of a column only affects the way how any indexes that are built on the column will be sorted. Thus, it should be possible to change the collation of a column without rebuilding the table, except if the column is part of the PRIMARY KEY.

      Currently, changing the collation requires the table to be rebuilt with ALGORITHM=COPY, in all cases below:

      --source include/have_innodb.inc
      CREATE TABLE t1 (
       a CHAR(1) COLLATE utf8_bin PRIMARY KEY,
       b CHAR(1) COLLATE utf8_bin UNIQUE,
       c CHAR(1) COLLATE utf8_bin
      ) ENGINE=InnoDB;
      ALTER TABLE t1 CHANGE COLUMN c c CHAR(1) COLLATE utf8_general_ci,
      ALGORITHM=INSTANT;
      ALTER TABLE t1 CHANGE COLUMN b b CHAR(1) COLLATE utf8_general_ci,
      ALGORITHM=NOCOPY; # implies DROP INDEX b, ADD UNIQUE INDEX(b)
      ALTER TABLE t1 CHANGE COLUMN a a CHAR(1) COLLATE utf8_general_ci,
      ALGORITHM=INPLACE; # rebuild due to DROP PRIMARY KEY, ADD PRIMARY KEY(a)
      DROP TABLE t1;
      

      (When testing versions predating MDEV-13134, replace the ALGORITHM=INSTANT and ALGORITHM=NOCOPY with the broader clause ALGORITHM=INPLACE.)

      Attachments

        Issue Links

          Activity

            I don’t think that we should try to support the table-rebuilding case in InnoDB.
            If the SQL layer sets the correct flags, InnoDB should support this via the existing flags for adding and dropping (effectively rebuilding) all affected indexes.

            The only InnoDB change should be related to determining whether the character encoding of CHAR(n) columns can be changed instantaneously. For ROW_FORMAT=REDUNDANT, InnoDB cannot allow that, because the columns would always be stored as n*mbmaxlen bytes.

            For other ROW_FORMAT than REDUNDANT, InnoDB can allow instantaneous CHAR(n) changes from utf8mb3 to utf8mb4, because it internally uses a variable-length encoding (n*mbminlenn*mbmaxlen bytes) when mbminlen!=mbmaxlen.

            marko Marko Mäkelä added a comment - I don’t think that we should try to support the table-rebuilding case in InnoDB. If the SQL layer sets the correct flags, InnoDB should support this via the existing flags for adding and dropping (effectively rebuilding) all affected indexes. The only InnoDB change should be related to determining whether the character encoding of CHAR(n) columns can be changed instantaneously. For ROW_FORMAT=REDUNDANT , InnoDB cannot allow that, because the columns would always be stored as n*mbmaxlen bytes. For other ROW_FORMAT than REDUNDANT , InnoDB can allow instantaneous CHAR(n) changes from utf8mb3 to utf8mb4, because it internally uses a variable-length encoding ( n*mbminlen … n*mbmaxlen bytes) when mbminlen!=mbmaxlen .

            Eugene, please adjust your patch on top of the recent changes in 10.4.
            Thanks.

            bar Alexander Barkov added a comment - Eugene, please adjust your patch on top of the recent changes in 10.4. Thanks.

            The set of patches in https://github.com/MariaDB/server/commits/bb-10.4-MDEV-17301-charset-nocopy are OK.
            Please join this patch:
            NFC: move generic code to a generic method
            with the main patch:
            MDEV-17301 Change of COLLATE unnecessarily requires ALGORITHM=COPY

            Ok to push after this (and after fixing buildbot compilation failures).

            bar Alexander Barkov added a comment - The set of patches in https://github.com/MariaDB/server/commits/bb-10.4-MDEV-17301-charset-nocopy are OK. Please join this patch: NFC: move generic code to a generic method with the main patch: MDEV-17301 Change of COLLATE unnecessarily requires ALGORITHM=COPY Ok to push after this (and after fixing buildbot compilation failures).

            MDEV-26294 essentially reverted this optimization in 10.4.26 and 10.5.17 when the column is part of an index. In 10.6.9, the affected indexes will be rebuilt using the correct collation information.

            marko Marko Mäkelä added a comment - MDEV-26294 essentially reverted this optimization in 10.4.26 and 10.5.17 when the column is part of an index. In 10.6.9, the affected indexes will be rebuilt using the correct collation information.

            MDEV-29436 has been filed because changing column collation for text still requires COPY algorithm

            thiru Thirunarayanan Balathandayuthapani added a comment - MDEV-29436 has been filed because changing column collation for text still requires COPY algorithm

            People

              kevg Eugene Kosov (Inactive)
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.