[MDEV-24763] ALTER TABLE fails to rename a column in SYS_FIELDS Created: 2021-02-02  Updated: 2021-02-12  Resolved: 2021-02-12

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.0, 10.1, 10.3.28, 10.4.18, 10.5.9, 10.6.0, 10.2
Fix Version/s: 10.2.38, 10.3.29, 10.4.19, 10.5.10

Type: Bug
Priority: Critical
Reporter: Matthias Leich
Assignee: Marko Mäkelä
Resolution: Fixed
Votes: 0
Labels: corruption


 Description   

--source include/have_innodb.inc
CREATE TABLE t3 ( col1 INT, col_int INTEGER, col_text TEXT ) ENGINE = InnoDB;
ALTER TABLE t3 ADD PRIMARY KEY ( col_text(9), col_int ) ;
ALTER TABLE t3 ADD INDEX ( col_int, col1 ) ;
ALTER TABLE t3 ADD COLUMN col_int_g_copy INTEGER GENERATED ALWAYS AS (col_int)  ;
ALTER TABLE t3 CHANGE COLUMN IF EXISTS col1 otto INT ;
# The preceding ALTER TABLE statement crashes the server.
DROP TABLE t3;
 
Version: '10.6.0-MariaDB-debug-log'  socket: '/home/mleich/Server_bin/10.6_debug/mysql-test/var/tmp/mysqld.1.sock'  port: 16000  Source distribution
2021-02-02 12:58:25 4 [ERROR] InnoDB: No matching column for `col1` in index `col_int` of table `test`.`t3`
210202 12:58:25 [ERROR] mysqld got signal 11 ;
Query (0x7f4e54012dc0): ALTER TABLE t3 CHANGE COLUMN IF EXISTS col1 otto INT
 
Connection ID (thread ID): 4
Status: NOT_KILLED
 
#0  __pthread_kill (threadid=<optimized out>, signo=11) at ../sysdeps/unix/sysv/linux/pthread_kill.c:57
#1  0x0000556d35791e13 in my_write_core (sig=11) at /home/mleich/Server/10.6/mysys/stacktrace.c:424
#2  0x0000556d34e83b08 in handle_fatal_signal (sig=11) at /home/mleich/Server/10.6/sql/signal_handler.cc:330
#3  <signal handler called>
#4  0x0000556d3533ea7c in dict_stats_try_drop_table (thd=0x7f4e54000cf8, name=..., table_name=...) at /home/mleich/Server/10.6/storage/innobase/handler/handler0alter.cc:10096
#5  0x0000556d3533ebf7 in innobase_reload_table (thd=0x7f4e54000cf8, table=0x0, table_name=..., ctx=...) at /home/mleich/Server/10.6/storage/innobase/handler/handler0alter.cc:10133
#6  0x0000556d353417fc in ha_innobase::commit_inplace_alter_table (this=0x7f4e541c87d0, altered_table=0x7f4e7d62d960, ha_alter_info=0x7f4e7d62d8c0, commit=true) at /home/mleich/Server/10.6/storage/innobase/handler/handler0alter.cc:11263
#7  0x0000556d34e94a3e in handler::ha_commit_inplace_alter_table (this=0x7f4e541c87d0, altered_table=0x7f4e7d62d960, ha_alter_info=0x7f4e7d62d8c0, commit=true) at /home/mleich/Server/10.6/sql/handler.cc:4855
#8  0x0000556d34c27462 in mysql_inplace_alter_table (thd=0x7f4e54000cf8, table_list=0x7f4e54012ed0, table=0x7f4e5419c2a8, altered_table=0x7f4e7d62d960, ha_alter_info=0x7f4e7d62d8c0, target_mdl_request=0x7f4e7d62dd30, alter_ctx=0x7f4e7d62e880) at /home/mleich/Server/10.6/sql/sql_table.cc:8138
#9  0x0000556d34c2ebd8 in mysql_alter_table (thd=0x7f4e54000cf8, new_db=0x7f4e54005768, new_name=0x7f4e54005b68, create_info=0x7f4e7d62f490, table_list=0x7f4e54012ed0, alter_info=0x7f4e7d62f3c0, order_num=0, order=0x0, ignore=false, if_exists=false) at /home/mleich/Server/10.6/sql/sql_table.cc:10683
#10 0x0000556d34cd4d71 in Sql_cmd_alter_table::execute (this=0x7f4e540136c0, thd=0x7f4e54000cf8) at /home/mleich/Server/10.6/sql/sql_alter.cc:539
#11 0x0000556d34b3034d in mysql_execute_command (thd=0x7f4e54000cf8) at /home/mleich/Server/10.6/sql/sql_parse.cc:5880
#12 0x0000556d34b365e1 in mysql_parse (thd=0x7f4e54000cf8, rawbuf=0x7f4e54012dc0 "ALTER TABLE t3 CHANGE COLUMN IF EXISTS col1 otto INT", length=52, parser_state=0x7f4e7d630520) at /home/mleich/Server/10.6/sql/sql_parse.cc:7906
#13 0x0000556d34b22af5 in dispatch_command (command=COM_QUERY, thd=0x7f4e54000cf8, packet=0x7f4e540087b9 "ALTER TABLE t3 CHANGE COLUMN IF EXISTS col1 otto INT ", packet_length=53) at /home/mleich/Server/10.6/sql/sql_parse.cc:1833
#14 0x0000556d34b2143d in do_command (thd=0x7f4e54000cf8) at /home/mleich/Server/10.6/sql/sql_parse.cc:1365
#15 0x0000556d34cca462 in do_handle_one_connection (connect=0x556d37f62df8, put_in_cache=true) at /home/mleich/Server/10.6/sql/sql_connect.cc:1410
#16 0x0000556d34cca1d0 in handle_one_connection (arg=0x556d37f50f08) at /home/mleich/Server/10.6/sql/sql_connect.cc:1312
#17 0x0000556d3520e213 in pfs_spawn_thread (arg=0x556d37ebab18) at /home/mleich/Server/10.6/storage/perfschema/pfs.cc:2201
#18 0x00007f4e88f8d7fc in start_thread (arg=0x7f4e7d631700) at pthread_create.c:465
#19 0x00007f4e881c3b5f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
 
10.6 8a495d7f90f64566d083d9ccd04cd95023a40931 2021-01-30T10:50:14+03:00
origin/10.5 7c052cdf0bc53f8ee3387186993710bd9754b65d 2021-01-17T13:21:55+01:00
origin/10.4 542d769ea1a22a7a6a87c9fe76ff911a162ade44 2021-01-28T07:39:34+02:00
origin/10.3 df1eefb2ad138846269d40372678af805589700a 2021-01-07T17:53:04+01:00
Not reproducible on a 10.2 development tree from mid-January.



 Comments   
Comment by Elena Stepanova [ 2021-02-02 ]

The failure started happening on 10.3 after this commit:

commit 9a645dae9e59ec398cfda33529c44002625ddc87
Author: Nikita Malyavin
Date:   Mon Dec 21 22:54:27 2020 +1000
 
    MDEV-23632 ALTER TABLE...ADD KEY creates corrupted index on virtual column

Comment by Marko Mäkelä [ 2021-02-02 ]

I believe that the root cause of this problem may affect 10.2 as well, but I am not sure of that yet. The provided test case does depend on MDEV-11369 (instant ADD COLUMN), which is only present starting with 10.3.

The table was not found because innobase_rename_column_try() is looking for SYS_FIELDS.POS=4<<16 instead of SYS_FIELDS.POS=1<<16. Hence, the column was not renamed in SYS_FIELDS.
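
To make the mismatch concrete, here is a minimal Python sketch of an exact-match lookup on (INDEX_ID, POS). The dictionary sys_fields, the helper rename_field, and the index ID 42 are hypothetical stand-ins and not InnoDB code; only the two key values, 4<<16 and 1<<16, and the column names come from this report.

```python
# Toy model of an exact-match UPDATE on SYS_FIELDS; this is not InnoDB code.
# Hypothetical index ID; the real value does not matter for the illustration.
INDEX_ID = 42

# One SYS_FIELDS-like record, keyed by (INDEX_ID, POS), holding the column name.
# 1 << 16 is the POS value that, per the analysis, should have been searched for.
sys_fields = {(INDEX_ID, 1 << 16): "col1"}


def rename_field(records, index_id, pos, new_name):
    """Rename the record at (index_id, pos); return how many records changed."""
    key = (index_id, pos)
    if key not in records:
        return 0  # no matching record: the rename silently does nothing
    records[key] = new_name
    return 1


# Search with the wrong key (4 << 16): nothing matches, the old name stays.
wrong_key_hits = rename_field(sys_fields, INDEX_ID, 4 << 16, "otto")
name_after_wrong_key = sys_fields[(INDEX_ID, 1 << 16)]

# Search with the intended key (1 << 16): the record is renamed.
right_key_hits = rename_field(sys_fields, INDEX_ID, 1 << 16, "otto")
name_after_right_key = sys_fields[(INDEX_ID, 1 << 16)]

print(4 << 16, 1 << 16)
print(wrong_key_hits, name_after_wrong_key)
print(right_key_hits, name_after_right_key)
```

In this toy model the two keys are the integers 262144 and 65536, so the first call changes no record and leaves the old name in place, which matches the symptom of a stale column name in the data dictionary.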

Debugging this with INFORMATION_SCHEMA.INNODB_SYS_FIELDS is not easy, because it will not display the actual value of the POS column unless the following patch is applied:

diff --git a/storage/innobase/dict/dict0load.cc b/storage/innobase/dict/dict0load.cc
index 386b99bcaad..ce77b04a32b 100644
--- a/storage/innobase/dict/dict0load.cc
+++ b/storage/innobase/dict/dict0load.cc
@@ -2072,8 +2072,9 @@ dict_load_field_low(
 		return("SYS_FIELDS.POS mismatch");
 	}
 
+	prefix_len = pos_and_prefix_len & 0xFFFFUL;
+	if (!index) position = pos_and_prefix_len; else
 	if (first_field || pos_and_prefix_len > 0xFFFFUL) {
-		prefix_len = pos_and_prefix_len & 0xFFFFUL;
 		position = (pos_and_prefix_len & 0xFFFF0000UL)  >> 16;
 	} else {
 		prefix_len = 0;

The prefix_len is not being displayed at all.
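
The unpatched decoding visible in the diff context can be sketched in Python as follows. The function name decode_pos is hypothetical, and the position assignment in the final branch is an assumption, because the diff is cut off right after "prefix_len = 0;".

```python
def decode_pos(pos_and_prefix_len, first_field):
    """Split a raw SYS_FIELDS.POS value into (position, prefix_len).

    Mirrors the unpatched branch shown in the diff of dict_load_field_low().
    """
    if first_field or pos_and_prefix_len > 0xFFFF:
        # High 16 bits: field position; low 16 bits: column prefix length.
        prefix_len = pos_and_prefix_len & 0xFFFF
        position = (pos_and_prefix_len & 0xFFFF0000) >> 16
    else:
        prefix_len = 0
        # Assumption: the diff does not show this assignment.
        position = pos_and_prefix_len
    return position, prefix_len


# Two different raw values decode to the same position.
plain = decode_pos(1, first_field=False)
shifted = decode_pos(1 << 16, first_field=False)
print(plain, shifted)
```

Under these assumptions, the raw values 1 and 1<<16 both decode to position 1 with prefix length 0, so a view that shows only the decoded position cannot reveal which of the two is actually stored.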

I am trying to find out why we look up the wrong field offset, and how to provide more robust error handling for a dictionary corruption scenario.

Comment by Marko Mäkelä [ 2021-02-02 ]

I do not think that we can easily fix the consequences of the bug. Even if innobase_reload_table() checked for the null pointer, we would crash in various places later due to m_prebuilt->table==NULL. The root cause is in innobase_rename_column_try(), and I think that it should affect MariaDB 10.0 and 10.1 as well as MySQL 5.6 and 5.7. I am working on a test case for 10.2.

Comment by Marko Mäkelä [ 2021-02-02 ]

I reproduced the root cause on 10.2:

--source include/have_innodb.inc
CREATE TABLE t1 (a INT, b TEXT, c INT, PRIMARY KEY(b(9)), INDEX(c,a))
ENGINE=InnoDB;
ALTER TABLE t1 CHANGE COLUMN a u INT;
SELECT * FROM information_schema.innodb_sys_fields where name='a';
DROP TABLE t1;

Without the fix, the SELECT will wrongly return one row for the old column name a. The column was supposed to be renamed from a to u. However, because it is not the first field in the secondary index, and because the primary key (but not the secondary index) contains a column prefix, the UPDATE of SYS_FIELDS uses the wrong search key (INDEX_ID,POS) and fails to find the record.

Comment by Matthias Leich [ 2021-02-03 ]

Results of RQG testing on
origin/10.3 59eda73eff1a22ac0373d818bc802c05e82b5449 2021-02-01T13:17:17+02:00
including a patch 
Test battery for broad range coverage: no new problems found.
