[MDEV-18584] Avoid copying when altering CHAR column in InnoDB table Created: 2019-02-14  Updated: 2022-11-09

Status: Confirmed
Project: MariaDB Server
Component/s: Data Definition - Alter Table, Storage Engine - InnoDB
Affects Version/s: 10.4.3
Fix Version/s: 10.4

Type: Bug Priority: Major
Reporter: Marko Mäkelä Assignee: Thirunarayanan Balathandayuthapani
Resolution: Unresolved Votes: 0
Labels: instant, types

Issue Links:
Problem/Incident
is caused by MDEV-15564 Avoid table rebuild in ALTER TABLE on... Closed
Relates
relates to MDEV-26294 Duplicate entries in unique index not... Closed

 Description   

We do not allow instant character set changes of CHAR columns in InnoDB. Initially, we allowed it for ROW_FORMAT=REDUNDANT as part of MDEV-15563, but that had to be reverted in MDEV-18627.

In the DYNAMIC, COMPACT and COMPRESSED formats, CHAR columns could be instantly extended in those special cases when they are internally stored as variable-length:

  • when the column length (chars*mbmaxlen) exceeds 255 bytes
  • when using a variable-length character set (mbminlen!=mbmaxlen), such as UTF-8
  • when the column type is CHAR(0)
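
The three cases above can be sketched as a single predicate. This is only an illustration for the non-REDUNDANT formats: the struct CharsetInfo and the function char_stored_as_variable_length() are hypothetical names and are not part of the InnoDB source.

```cpp
// Hypothetical, simplified description of a character set.
// This is NOT InnoDB's real type; it only carries the two limits we need.
struct CharsetInfo {
  unsigned mbminlen;  // minimum bytes per character
  unsigned mbmaxlen;  // maximum bytes per character
};

// Returns true if a CHAR(chars) column would be internally stored as
// variable-length in ROW_FORMAT=COMPACT, DYNAMIC or COMPRESSED,
// following the three cases listed in the description.
constexpr bool char_stored_as_variable_length(unsigned chars, CharsetInfo cs) {
  const unsigned max_bytes = chars * cs.mbmaxlen;
  return max_bytes > 255               // column length exceeds 255 bytes
      || cs.mbminlen != cs.mbmaxlen    // variable-length character set
      || chars == 0;                   // CHAR(0)
}
```

For example, CHAR(10) in latin1 (1 byte per character) stays fixed-length under this predicate, while CHAR(10) in utf8mb3 (1 to 3 bytes per character) counts as variable-length.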

For the REDUNDANT format, it is best to disallow such instantaneous changes for CHAR columns and let them remain fixed-size, always explicitly storing the same length for every column. The mbminlen!=mbmaxlen optimization was introduced only for COMPACT, DYNAMIC and COMPRESSED.



 Comments   
Comment by Marko Mäkelä [ 2022-02-21 ]

If we implement this, we must be careful not to introduce any bug like MDEV-26294 in case the columns are indexed.

Comment by Marko Mäkelä [ 2022-11-09 ]

I think that the changes would be around the following code:

diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc
index 7acbe79d732..a761af2a8ec 100644
--- a/storage/innobase/handler/ha_innodb.cc
+++ b/storage/innobase/handler/ha_innodb.cc
@@ -21026,16 +21026,8 @@ bool ha_innobase::can_convert_string(const Field_string *field,
   if (new_type.type_handler() != field->type_handler())
     return false;
 
-  if (new_type.char_length != field->char_length())
-    return false;
-
   const Charset field_cs(field->charset());
 
-  if (new_type.length != field->max_display_length() &&
-      (!m_prebuilt->table->not_redundant() ||
-       field_cs.mbminlen() == field_cs.mbmaxlen()))
-    return false;
-
   if (new_type.charset != field->charset())
   {
     if (!field_cs.encoding_allows_reinterpret_as(new_type.charset))

The conditions that the patch is removing will have to be adapted and likely moved later within the function. The condition m_prebuilt->table->not_redundant() && new_type.char_length >= field->char_length() must hold in this optimization. Additionally, either field_cs.mbminlen() < field_cs.mbmaxlen() must hold or the column must already be long enough that it is internally stored in a variable-length format.
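
A minimal standalone sketch of the combined condition described above, not the actual change to ha_innobase::can_convert_string(). The names CharsetLimits and may_extend_char_instantly() are hypothetical, the character set is assumed to stay unchanged, and "already long enough" is assumed to mean the 255-byte threshold from the description.

```cpp
// Hypothetical stand-in for the character set limits (not a server type).
struct CharsetLimits {
  unsigned mbminlen;  // minimum bytes per character
  unsigned mbmaxlen;  // maximum bytes per character
};

// Sketch of the proposed rule for instantly extending a CHAR column.
// Assumption: the character set itself does not change.
constexpr bool may_extend_char_instantly(bool not_redundant,
                                         unsigned old_chars,
                                         unsigned new_chars,
                                         CharsetLimits cs) {
  // ROW_FORMAT=REDUNDANT keeps CHAR columns fixed-size.
  if (!not_redundant)
    return false;
  // Only extension (or an unchanged length) is considered, never shrinking.
  if (new_chars < old_chars)
    return false;
  // Either the character set is variable-length ...
  const bool variable_charset = cs.mbminlen < cs.mbmaxlen;
  // ... or the existing column is already stored as variable-length
  // (assumed here to mean more than 255 bytes, as in the description).
  const bool already_variable = old_chars * cs.mbmaxlen > 255;
  return variable_charset || already_variable;
}
```

Under these assumptions, extending CHAR(10) to CHAR(20) in utf8mb3 on a DYNAMIC table would qualify, while the same change in latin1 would not, because the existing 10-byte column is stored as fixed-length.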

We must carefully review what happens when retrieving data from a variable-length stored CHAR column that is shorter than n*mbminlen bytes. If the data is being padded correctly, then there should be no issue with that. Also, as already noted, we must check what happens when an index is created on the full column or a prefix of it.

Generated at Thu Feb 08 08:45:13 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.