[MDEV-27864] Alter table modify column for same data type copying table Created: 2022-02-16  Updated: 2023-05-02  Resolved: 2023-05-02

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 10.4.21
Fix Version/s: N/A

Type: Bug Priority: Critical
Reporter: Scott Sommerville (Inactive) Assignee: Alexander Barkov
Resolution: Incomplete Votes: 3
Labels: alter, ddl, json
Environment:

Amazon RDS


Attachments: Text File cs0373101-orig-10.4.txt     Text File cs0373101-orig-10.5.txt     File cs0373101-orig.sql    
Issue Links:
Relates
relates to MDEV-15564 Avoid table rebuild in ALTER TABLE on... Closed
relates to MDEV-27018 IF and COALESCE lose "json" property Closed
relates to MDEV-30669 Changing the Data Type of a Column fr... Open

 Description   

The customer is observing the copying of the entire table when the DDL statement "alter table <table_name> modify column <column_name> json;" is ran against a table with an already existing data-type of JSON for the same column. I have recreated this in my local tests and have been able to compare with non JSON data-types where the table is not copied/recreated.
Below you can see the table with the JSON field (dummy data) being recreate/copied:

MariaDB [CS0373101]> select count(*) from test2;
+----------+
| count(*) |
+----------+
| 1000000 |
+----------+
1 row in set (0.249 sec)
 
MariaDB [CS0373101]> alter table test2 ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
Query OK, 0 rows affected (4.673 sec)
Records: 0 Duplicates: 0 Warnings: 0
 
MariaDB [CS0373101]> update test2 set output=replace(output,"'",'"');
Query OK, 1000000 rows affected (6.885 sec)
Rows matched: 1000000 Changed: 1000000 Warnings: 0
 
MariaDB [CS0373101]> alter table test2 modify column `output` json;
Query OK, 1000000 rows affected (5.297 sec)
Records: 1000000 Duplicates: 0 Warnings: 0
 
MariaDB [CS0373101]> show create table test2;
+-------+---------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+---------------------------------------------------------------------------------------------------------------------------------------------+
| test2 | CREATE TABLE `test2` (
`output` text CHARACTER SET latin1 DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci |
+-------+---------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.001 sec)
 
MariaDB [CS0373101]> select * from test2 limit 10;
+------------------------------------------------------------------+
| output |
+------------------------------------------------------------------+
| {"name": "Eric Miller", "email": "doylecharlotte@singh.com"} |
| {"name": "Sarah Cooper", "email": "sbaker@molina-rodriguez.com"} |
| {"name": "John Williams", "email": "shepardsteven@griffith.com"} |
| {"name": "James Scott", "email": "james43@taylor.info"} |
| {"name": "Crystal West", "email": "garciasean@hotmail.com"} |
| {"name": "Theresa Bush", "email": "julie61@gmail.com"} |
| {"name": "Gerald Hamilton", "email": "lori32@gmail.com"} |
| {"name": "Derek Conway", "email": "svasquez@becker-neal.com"} |
| {"name": "Natalie Mitchell", "email": "dyermario@yahoo.com"} |
| {"name": "Taylor Smith", "email": "deborahjenkins@brown.com"} |
+------------------------------------------------------------------+
10 rows in set (0.001 sec)
 
MariaDB [CS0373101]> alter table test2 modify if exists output json;
Query OK, 1000000 rows affected (5.687 sec)
Records: 1000000 Duplicates: 0 Warnings: 0

Conversely, I created another similar table, but this time with the data type of text and the recreation/copying of the table does not occur as shown below:

MariaDB [CS0373101]> show create table test3;
+-------+--------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+--------------------------------------------------------------------------------------------+
| test3 | CREATE TABLE `test3` (
`output` text DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------+--------------------------------------------------------------------------------------------+
1 row in set (0.000 sec)
 
MariaDB [CS0373101]> select * from test3 limit 10;
+---------------------------------------------------------------------+
| output |
+---------------------------------------------------------------------+
| {'name': 'Steven Carter', 'email': 'williamskaren@hotmail.com'} |
| {'name': 'David Paul', 'email': 'teresa62@becker.org'} |
| {'name': 'Nicole Miller', 'email': 'bryan03@gmail.com'} |
| {'name': 'Samantha Alexander', 'email': 'allentodd@yahoo.com'} |
| {'name': 'Eric Jacobs', 'email': 'greggblack@yahoo.com'} |
| {'name': 'Kimberly Ortiz', 'email': 'escott@mitchell-nelson.com'} |
| {'name': 'Margaret Mitchell', 'email': 'franklinrichard@gmail.com'} |
| {'name': 'Jennifer Roth', 'email': 'prollins@jones.com'} |
| {'name': 'Michael Rivera', 'email': 'roy15@yahoo.com'} |
| {'name': 'Amy Campbell', 'email': 'christina97@martin.com'} |
+---------------------------------------------------------------------+
10 rows in set (0.001 sec)
 
MariaDB [CS0373101]> alter table test3 modify if exists output text;
Query OK, 0 rows affected (0.020 sec)
Records: 0 Duplicates: 0 Warnings: 0
 
MariaDB [CS0373101]>

Therefore, there appears to be an issue for the 10.4.21 MariaDB Community edition where despite a field already being of a JSON data-type, the table is being recreated.

This is proving problematic for the customer, as they have very large tables with the JSON data-type column and the resultant table recreation when the DDL statement is ran, is causing delays in their environment.



 Comments   
Comment by Sergei Petrunia [ 2022-03-28 ]

Trying on 10.8:

create table t1 (col1 text);
insert into t1 select '{"foo":"bar"}' from seq_1_to_100000;

MariaDB [test]> alter table t1 modify if exists col1 json;
Query OK, 100000 rows affected (4.477 sec)             
Records: 100000  Duplicates: 0  Warnings: 0

It seems to have copied the table here.

MariaDB [test]> alter table t1 modify if exists col1 json;
Query OK, 0 rows affected (0.013 sec)
Records: 0  Duplicates: 0  Warnings: 0

If the column is already defined as JSON, it didn't.

Comment by Marko Mäkelä [ 2022-05-10 ]

I think that this needs to be fixed in a similar way to MDEV-15564, outside storage engines. The SQL layer should check if the encoding of the column is actually changing.

Comment by Alexander Barkov [ 2022-06-03 ]

The test for a possible ALTER fails here in 10.4:

    if (new_field)
    {
      /* Field is not dropped. Evaluate changes bitmap for it. */
 
      /*
        Check if type of column has changed.
      */
      bool is_equal= field->is_equal(*new_field);
      if (!is_equal)
      {
        if (field->can_be_converted_by_engine(*new_field))
        {

field->is_equal() returns:

  • false in 10.4
  • true in 10.5

The problem does not exist in 10.5. It was most likely fixed in 10.5 by this commit:

commit e4b302e436c0a1e0d477a8e8e8e513112fd3ec7f
 
    MDEV-27018 IF and COALESCE lose "json" property

Comment by Alexander Barkov [ 2022-06-07 ]

This script:

SET profiling=1;
SET profiling_history_size=1;
 
CREATE OR REPLACE TABLE stat (state TEXT, types TEXT);
 
CREATE OR REPLACE TABLE t1 (a TEXT);
ALTER TABLE t1 MODIFY a JSON;
INSERT INTO stat
SELECT state, '1. TEXT to JSON' AS types FROM INFORMATION_SCHEMA.PROFILING WHERE state LIKE 'Copy to tmp table';
 
ALTER TABLE t1 MODIFY a JSON;
INSERT INTO stat
SELECT state, '2. JSON to JSON' AS types FROM INFORMATION_SCHEMA.PROFILING WHERE state LIKE 'Copy to tmp table';
 
ALTER TABLE t1 MODIFY a TEXT;
INSERT INTO stat
SELECT state, '3. JSON to TEXT' AS types FROM INFORMATION_SCHEMA.PROFILING WHERE state LIKE 'Copy to tmp table';
 
ALTER TABLE t1 MODIFY a TEXT;
INSERT INTO stat
SELECT state, '4. TEXT to TEXT' AS types FROM INFORMATION_SCHEMA.PROFILING WHERE state LIKE 'Copy to tmp table';
 
SELECT state, types FROM stat;

in 10.4.26 returns:

+-------------------+-----------------+
| state             | types           |
+-------------------+-----------------+
| Copy to tmp table | 1. TEXT to JSON |
| Copy to tmp table | 1. TEXT to JSON |
| Copy to tmp table | 2. JSON to JSON |
| Copy to tmp table | 2. JSON to JSON |
| Copy to tmp table | 3. JSON to TEXT |
| Copy to tmp table | 3. JSON to TEXT |
+-------------------+-----------------

in 10.5.14 (also tested in 10.5.17) returns:

+-------------------+-----------------+
| state             | types           |
+-------------------+-----------------+
| copy to tmp table | 1. TEXT to JSON |
| copy to tmp table | 3. JSON to TEXT |
+-------------------+-----------------+

in 10.5.13 (an in 10.5.14 before the fix for MDEV-27018, e.g. as of 28e166d6435741c46e0ea789657b739f78eb0425) returns:

+-------------------+-----------------+
| state             | types           |
+-------------------+-----------------+
| copy to tmp table | 1. TEXT to JSON |
| copy to tmp table | 2. JSON to JSON |
| copy to tmp table | 3. JSON to TEXT |
+-------------------+-----------------+

A temporary table for JSON->JSON conversion:

  • is created in 10.4.26 and in 10.5.x before the patch for MDEV-27018)
  • is not created starting from 10.5.14 (after the patch for MDEV-27018)
Generated at Thu Feb 08 09:56:10 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.