Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-18584

Avoid copying when altering CHAR column in InnoDB table

Details

    Description

      We do not allow instant character set changes of CHAR columns in InnoDB. Initially, we allowed it for ROW_FORMAT=REDUNDANT as part of MDEV-15563, but that had to be reverted in MDEV-18627.

      In the DYNAMIC, COMPACT and COMPRESSED formats, CHAR columns could be instantly extended in those special cases when they are internally stored as variable-length:

      • when the column length (chars*mbmaxlen) exceeds 255 bytes
      • when using a variable-length character set (mbminlen!=mbmaxlen), such as UTF-8
      • when the column type is CHAR(0)

      For the REDUNDANT format, it is best to disallow such instantaneous changes for CHAR columns, and let them remain fixed-size, always explicitly storing the same length for every column. The mbminlen!=mbmaxlen optimization was introduced for COMPACT,DYNAMIC,COMPRESSED only.

      Attachments

        Issue Links

          Activity

            If we implement this, we must be careful not to introduce any bug like MDEV-26294 in case the columns are indexed.

            marko Marko Mäkelä added a comment - If we implement this, we must be careful not to introduce any bug like MDEV-26294 in case the columns are indexed.

            I think that the changes would be around the following code:

            diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc
            index 7acbe79d732..a761af2a8ec 100644
            --- a/storage/innobase/handler/ha_innodb.cc
            +++ b/storage/innobase/handler/ha_innodb.cc
            @@ -21026,16 +21026,8 @@ bool ha_innobase::can_convert_string(const Field_string *field,
               if (new_type.type_handler() != field->type_handler())
                 return false;
             
            -  if (new_type.char_length != field->char_length())
            -    return false;
            -
               const Charset field_cs(field->charset());
             
            -  if (new_type.length != field->max_display_length() &&
            -      (!m_prebuilt->table->not_redundant() ||
            -       field_cs.mbminlen() == field_cs.mbmaxlen()))
            -    return false;
            -
               if (new_type.charset != field->charset())
               {
                 if (!field_cs.encoding_allows_reinterpret_as(new_type.charset))
            

            The conditions that the patch is removing will have to be adapted and likely moved later within the function. The condition m_prebuilt->table->not_redundant() && new_type.char_length >= field->char_length() && }} must hold in this optimization. Additionally, either {{field_cs.mbminlen() < field_cs.mbmaxlen()) must hold or the column must be already long enough so that it is internally stored in a variable-length format.

            We must carefully review what happens when retrieving data from a variable-length stored CHAR column that is shorter than n*mbminlen bytes. If the data is being padded correctly, then there should be no issue with that. Also, as already noted, we must check what happens when an index is created on the full column or a prefix of it.

            marko Marko Mäkelä added a comment - I think that the changes would be around the following code: diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc index 7acbe79d732..a761af2a8ec 100644 --- a/storage/innobase/handler/ha_innodb.cc +++ b/storage/innobase/handler/ha_innodb.cc @@ -21026,16 +21026,8 @@ bool ha_innobase::can_convert_string(const Field_string *field, if (new_type.type_handler() != field->type_handler()) return false; - if (new_type.char_length != field->char_length()) - return false; - const Charset field_cs(field->charset()); - if (new_type.length != field->max_display_length() && - (!m_prebuilt->table->not_redundant() || - field_cs.mbminlen() == field_cs.mbmaxlen())) - return false; - if (new_type.charset != field->charset()) { if (!field_cs.encoding_allows_reinterpret_as(new_type.charset)) The conditions that the patch is removing will have to be adapted and likely moved later within the function. The condition m_prebuilt->table->not_redundant() && new_type.char_length >= field->char_length() && }} must hold in this optimization. Additionally, either {{field_cs.mbminlen() < field_cs.mbmaxlen()) must hold or the column must be already long enough so that it is internally stored in a variable-length format. We must carefully review what happens when retrieving data from a variable-length stored CHAR column that is shorter than n*mbminlen bytes. If the data is being padded correctly, then there should be no issue with that. Also, as already noted, we must check what happens when an index is created on the full column or a prefix of it.

            People

              thiru Thirunarayanan Balathandayuthapani
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.