[MDEV-26141] Crash in instant ALTER TABLE after IMPORT TABLESPACE Created: 2021-07-14  Updated: 2023-12-11

Status: Confirmed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.6.3, 10.4, 10.5, 10.6, 10.7, 10.8, 10.9
Fix Version/s: 10.4, 10.5, 10.6

Type: Bug Priority: Major
Reporter: Matthias Leich Assignee: Marko Mäkelä
Resolution: Unresolved Votes: 3
Labels: None

Attachments: File ML_import5.test    
Issue Links:
Blocks
is blocked by MDEV-11658 Simpler, faster IMPORT of InnoDB tables Open
Problem/Incident
is caused by MDEV-15562 Instant DROP COLUMN or changing the o... Closed
Relates
relates to MDEV-18322 Assertion `!"wrong page type"' or Ass... Closed
relates to MDEV-26022 InnoDB page data size mismatch crash Closed
relates to MDEV-28786 InnoDB crash leads to pagesize compar... Closed

 Description   

SET SESSION sql_mode = 'NO_ENGINE_SUBSTITUTION' ;
SET DEFAULT_STORAGE_ENGINE= 'InnoDB' ;
USE test ;
CREATE TABLE `t1` (
`col_int` int,
`col_varchar_255` varchar(255),
`col_text` text) ENGINE=innodb ;
INSERT /*! IGNORE */ INTO t1 VALUES 
(1028915200, 'c', 'image') ,
(NULL, 'j', 'historic') ,
(6, 'loyal', 'retreat') ,
(NULL, 'depth', 'male') ,
(1, 'squad', 'too') ,
(-1846607872, 'roof', 'head') ,
(NULL, 'o', 'turkey') ,
(NULL, 'w', 'deal') ,
(3, 'f', 'cousin') ,
(-996605952, 'i', 'online') ;
FLUSH TABLES t1 FOR EXPORT ;
UNLOCK TABLES ;
CREATE TABLE imp_t1 LIKE t1 ;
ALTER TABLE imp_t1 DISCARD TABLESPACE ;
'copy ibd file around'
ALTER TABLE imp_t1 MODIFY COLUMN col_varchar_255 VARCHAR(255) FIRST ;
ALTER TABLE imp_t1 DISCARD TABLESPACE ;
Warnings:
Warning 1814    Tablespace has been discarded for table `imp_t1`
Warning 1812    Tablespace is missing for table 'test/imp_t1'
ALTER TABLE imp_t1 IMPORT TABLESPACE ;
Warnings:
Warning 1810    IO Read error: (2, No such file or directory) Error opening './test/imp_t1.cfg', will attempt to import without schema verification
ALTER TABLE imp_t1 MODIFY COLUMN col_int INT FIRST ;
ALTER TABLE imp_t1 MODIFY COLUMN col_varchar_255 VARCHAR(255) FIRST ;
ERROR HY000: Lost connection to server during query
DROP TABLE t1;
ERROR HY000: Server has gone away
 
2021-07-14  4:26:54 4 [Note] InnoDB: Resuming purge
2021-07-14  4:26:54 4 [Note] InnoDB: Sync to disk
2021-07-14  4:26:54 4 [Note] InnoDB: Sync to disk - done!
2021-07-14  4:26:54 4 [Note] InnoDB: Phase I - Update all pages
2021-07-14  4:26:54 4 [Note] InnoDB: Sync to disk
2021-07-14  4:26:54 4 [Note] InnoDB: Sync to disk - done!
2021-07-14  4:26:54 4 [Note] InnoDB: Phase III - Flush changes to disk
2021-07-14  4:26:54 4 [Note] InnoDB: Phase IV - Flush complete
2021-07-14  4:26:54 4 [ERROR] [FATAL] InnoDB: Page old data size 418 new data size 408, page old max ins size 15828 new max ins size 15838
210714  4:26:54 [ERROR] mysqld got signal 6 ;
...
Query (0x7fb554010650): ALTER TABLE imp_t1 MODIFY COLUMN col_varchar_255 VARCHAR(255) FIRST
Connection ID (thread ID): 4
Status: NOT_KILLED
 
Thread 1 (Thread 0x7fb581463700 (LWP 1597530)):
#0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
#1  0x000055c312396c7f in my_write_core (sig=sig@entry=6) at /data/Server/10.6G/mysys/stacktrace.c:424
#2  0x000055c311f16010 in handle_fatal_signal (sig=6) at /data/Server/10.6G/sql/signal_handler.cc:344
#3  <signal handler called>
#4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#5  0x00007fb582e91859 in __GI_abort () at abort.c:79
#6  0x000055c311bf1d2b in ib::fatal::~fatal (this=<optimized out>, __in_chrg=<optimized out>) at /data/Server/10.6G/storage/innobase/ut/ut0ut.cc:519
#7  0x000055c311bf255f in btr_page_reorganize_low (index=0x7fb554073838, mtr=<optimized out>, cursor=<optimized out>, cursor=<optimized out>) at /data/Server/10.6G/storage/innobase/include/ut0ut.h:327
#8  0x000055c31228366b in btr_page_reorganize (cursor=cursor@entry=0x7fb58145e2d8, index=index@entry=0x7fb554073838, mtr=mtr@entry=0x7fb58145e640) at /data/Server/10.6G/storage/innobase/btr/btr0btr.cc:1609
#9  0x000055c3122a2da1 in btr_cur_pessimistic_update (flags=flags@entry=10, cursor=cursor@entry=0x7fb58145e2d0, offsets=offsets@entry=0x7fb58145e2b8, offsets_heap=offsets_heap@entry=0x7fb58145e2c0, entry_heap=<optimized out>, big_rec=big_rec@entry=0x7fb58145e2c8, update=<optimized out>, cmpl_info=1, thr=0x7fb554055878, trx_id=40, mtr=0x7fb58145e640) at /data/Server/10.6G/storage/innobase/btr/btr0cur.cc:5134
#10 0x000055c3121b1d70 in innobase_instant_try (altered_table=<optimized out>, altered_table=<optimized out>, table=<optimized out>, trx=<optimized out>, ctx=<optimized out>, ha_alter_info=<optimized out>) at /data/Server/10.6G/storage/innobase/handler/handler0alter.cc:5933
#11 commit_try_norebuild (ha_alter_info=ha_alter_info@entry=0x7fb58145eda0, ctx=ctx@entry=0x7fb554011f88, altered_table=altered_table@entry=0x7fb58145ee60, old_table=<optimized out>, trx=trx@entry=0x7fb58285a1f8, table_name=<optimized out>) at /data/Server/10.6G/storage/innobase/handler/handler0alter.cc:10363
#12 0x000055c3121a77c2 in ha_innobase::commit_inplace_alter_table (this=<optimized out>, altered_table=<optimized out>, ha_alter_info=<optimized out>, commit=<optimized out>) at /data/Server/10.6G/storage/innobase/handler/handler0alter.cc:10999
#13 0x000055c311dac3d9 in mysql_inplace_alter_table (target_mdl_request=0x7fb58145f720, alter_ctx=0x7fb581460870, trigger_param=0x7fb58145f230, ddl_log_state=0x7fb58145ed40, ha_alter_info=0x7fb58145eda0, altered_table=0x7fb58145ee60, table=0x7fb554082028, table_list=0x7fb554010780, thd=0x7fb554000c58) at /data/Server/10.6G/sql/sql_table.cc:7439
#14 mysql_alter_table (thd=thd@entry=0x7fb554000c58, new_db=new_db@entry=0x7fb5540054a8, new_name=new_name@entry=0x7fb5540058c0, create_info=create_info@entry=0x7fb581461670, table_list=<optimized out>, table_list@entry=0x7fb554010780, alter_info=alter_info@entry=0x7fb581461580, order_num=0, order=0x0, ignore=false, if_exists=false) at /data/Server/10.6G/sql/sql_table.cc:10202
#15 0x000055c311e0a4da in Sql_cmd_alter_table::execute (this=<optimized out>, thd=0x7fb554000c58) at /data/Server/10.6G/sql/structs.h:568
#16 0x000055c311d0b626 in mysql_execute_command (thd=0x7fb554000c58, is_called_from_prepared_stmt=<optimized out>) at /data/Server/10.6G/sql/sql_parse.cc:5995
#17 0x000055c311cfc446 in mysql_parse (rawbuf=<optimized out>, length=<optimized out>, parser_state=<optimized out>, thd=0x7fb554000c58) at /data/Server/10.6G/sql/sql_parse.cc:8028
#18 mysql_parse (thd=0x7fb554000c58, rawbuf=<optimized out>, length=<optimized out>, parser_state=<optimized out>) at /data/Server/10.6G/sql/sql_parse.cc:7950
#19 0x000055c311d07e12 in dispatch_command (command=COM_QUERY, thd=0x7fb554000c58, packet=<optimized out>, packet_length=<optimized out>, blocking=<optimized out>) at /data/Server/10.6G/sql/sql_class.h:1340
#20 0x000055c311d09a98 in do_command (thd=0x7fb554000c58, blocking=blocking@entry=true) at /data/Server/10.6G/sql/sql_parse.cc:1406
#21 0x000055c311e05837 in do_handle_one_connection (connect=<optimized out>, connect@entry=0x55c314e0ea58, put_in_cache=put_in_cache@entry=true) at /data/Server/10.6G/sql/sql_connect.cc:1410
#22 0x000055c311e05b2d in handle_one_connection (arg=0x55c314e0ea58) at /data/Server/10.6G/sql/sql_connect.cc:1312
#23 0x00007fb5833ba609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#24 0x00007fb582f8e293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
 
origin/10.6, 10.6 315380a4d16ddb16461d906a23be341e354c30ed 2021-07-01T19:06:53+03:00
 
origin/bb-10.6-MDEV-26131 ffc41e1cf56809f82e97daad4869fecac4b46fed 2021-07-13T23:31:16+05:30  
which fixes MDEV-26131 shows the same problem.



 Comments   
Comment by Eugene Kosov (Inactive) [ 2021-09-28 ]

MTR test case looks like this:

--source include/have_innodb.inc
 
--let MYSQLD_DATADIR= `select @@datadir`
 
SET SESSION sql_mode = 'NO_ENGINE_SUBSTITUTION' ;
SET DEFAULT_STORAGE_ENGINE= 'InnoDB' ;
USE test ;
CREATE TABLE `t1` (
`col_int` int,
`col_varchar_255` varchar(255),
`col_text` text) ENGINE=innodb ;
INSERT /*! IGNORE */ INTO t1 VALUES 
(1028915200, 'c', 'image') ,
(NULL, 'j', 'historic') ,
(6, 'loyal', 'retreat') ,
(NULL, 'depth', 'male') ,
(1, 'squad', 'too') ,
(-1846607872, 'roof', 'head') ,
(NULL, 'o', 'turkey') ,
(NULL, 'w', 'deal') ,
(3, 'f', 'cousin') ,
(-996605952, 'i', 'online') ;
FLUSH TABLES t1 FOR EXPORT ;
UNLOCK TABLES ;
 
--copy_file $MYSQLD_DATADIR/test/t1.ibd $MYSQLD_DATADIR/test/tmp.ibd
 
CREATE TABLE imp_t1 LIKE t1 ;
ALTER TABLE imp_t1 DISCARD TABLESPACE ;
ALTER TABLE imp_t1 MODIFY COLUMN col_varchar_255 VARCHAR(255) FIRST ;
ALTER TABLE imp_t1 DISCARD TABLESPACE ;
--copy_file $MYSQLD_DATADIR/test/tmp.ibd $MYSQLD_DATADIR/test/imp_t1.ibd
ALTER TABLE imp_t1 IMPORT TABLESPACE ;
ALTER TABLE imp_t1 MODIFY COLUMN col_int INT FIRST ;
ALTER TABLE imp_t1 MODIFY COLUMN col_varchar_255 VARCHAR(255) FIRST ;

Comment by Marko Mäkelä [ 2021-11-22 ]

kevg, thank you. Nitpick: the first copy_file should be between FLUSH TABLES…FOR EXPORT and UNLOCK TABLES.

The test case appears to simulate a user error: importing a table where the columns are in the wrong order, while not using a .cfg file (so that this metadata mismatch could be caught).

Comment by Alice Sherepa [ 2022-06-09 ]

with Eugene's test:

10.6 892c426371b4be558d32fdeb

 
2022-06-09 17:18:49 4 [ERROR] InnoDB: Page old data size 418 new data size 408, page old max ins size 15828 new max ins size 15838
mariadbd: /10.6/src/storage/innobase/include/rem0rec.h:274: rec_comp_status_t rec_get_status(const rec_t*): Assertion `bits <= REC_STATUS_INSTANT' failed.
220609 17:18:49 [ERROR] mysqld got signal 6 ;
 
Server version: 10.6.9-MariaDB-debug-log
 
??:0(__assert_fail)[0x7f1a6c49bfd6]
include/rem0rec.h:275(rec_get_status(unsigned char const*))[0x558e4b7e7e58]
rem/rem0rec.cc:851(rec_get_offsets_func(unsigned char const*, dict_index_t const*, unsigned short*, unsigned long, unsigned long, char const*, unsigned int, mem_block_info_t**))[0x558e4ba1a7e1]
handler/handler0alter.cc:5988(innobase_instant_try(Alter_inplace_info const*, ha_innobase_inplace_ctx*, TABLE const*, TABLE const*, trx_t*))[0x558e4b7b1686]
handler/handler0alter.cc:10448(commit_try_norebuild(Alter_inplace_info*, ha_innobase_inplace_ctx*, TABLE*, TABLE const*, trx_t*, char const*))[0x558e4b8055fb]
handler/handler0alter.cc:11196(ha_innobase::commit_inplace_alter_table(TABLE*, Alter_inplace_info*, bool))[0x558e4b7de091]
sql/handler.cc:5220(handler::ha_commit_inplace_alter_table(TABLE*, Alter_inplace_info*, bool))[0x558e4ac2c550]
sql/sql_table.cc:7471(mysql_inplace_alter_table(THD*, TABLE_LIST*, TABLE*, TABLE*, Alter_inplace_info*, MDL_request*, st_ddl_log_state*, TRIGGER_RENAME_PARAM*, Alter_table_ctx*))[0x558e4a632a6a]
sql/sql_table.cc:10278(mysql_alter_table(THD*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, HA_CREATE_INFO*, TABLE_LIST*, Alter_info*, unsigned int, st_order*, bool, bool))[0x558e4a647e7c]
sql/sql_alter.cc:542(Sql_cmd_alter_table::execute(THD*))[0x558e4a8098c8]
sql/sql_parse.cc:5996(mysql_execute_command(THD*, bool))[0x558e4a3a176c]
sql/sql_parse.cc:8029(mysql_parse(THD*, char*, unsigned int, Parser_state*))[0x558e4a3aee48]
sql/sql_parse.cc:1898(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool))[0x558e4a38509c]
sql/sql_parse.cc:1409(do_command(THD*, bool))[0x558e4a381e11]
sql/sql_connect.cc:1418(do_handle_one_connection(CONNECT*, bool))[0x558e4a7ec100]
sql/sql_connect.cc:1314(handle_one_connection)[0x558e4a7eb98c]
perfschema/pfs.cc:2203(pfs_spawn_thread)[0x558e4b51ce36]
nptl/pthread_create.c:478(start_thread)[0x7f1a6c9b6609]
addr2line: DWARF error: section .debug_info is larger than its filesize! (0x93ef57 vs 0x530ea0)
??:0(clone)[0x7f1a6c587133]
 
Query (0x62b0000c42a8): ALTER TABLE imp_t1 MODIFY COLUMN col_varchar_255 VARCHAR(255) FIRST

on non-debug:

2022-06-09 17:24:10 4 [ERROR] [FATAL] InnoDB: Page old data size 418 new data size 408, page old max ins size 15828 new max ins size 15838
220609 17:24:10 [ERROR] mysqld got signal 6 ;
 
Server version: 10.6.7-MariaDB
 
??:0(gsignal)[0x7fcd5581d00b]
??:0(abort)[0x7fcd557fc859]
ut/ut0vec.cc:38(ib_vector_create(ib_alloc_t*, unsigned long, unsigned long))[0x5621dfc43f70]
btr/btr0btr.cc:1266(btr_page_reorganize_low(page_cur_t*, dict_index_t*, mtr_t*) [clone .isra.88])[0x5621dfc4a46c]
btr/btr0btr.cc:1565(btr_page_reorganize(page_cur_t*, dict_index_t*, mtr_t*))[0x5621dfc4a877]
btr/btr0cur.cc:5126(btr_cur_pessimistic_update(unsigned long, btr_cur_t*, unsigned short**, mem_block_info_t**, mem_block_info_t*, big_rec_t**, upd_t*, unsigned long, que_thr_t*, unsigned long, mtr_t*))[0x5621dfc6bf24]
handler/handler0alter.cc:5904(commit_try_norebuild(Alter_inplace_info*, ha_innobase_inplace_ctx*, TABLE*, TABLE const*, trx_t*, char const*))[0x5621dfb23030]
handler/handler0alter.cc:11067(ha_innobase::commit_inplace_alter_table(TABLE*, Alter_inplace_info*, bool))[0x5621dfb1b0be]
sql/sql_table.cc:7469(mysql_alter_table(THD*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, HA_CREATE_INFO*, TABLE_LIST*, Alter_info*, unsigned int, st_order*, bool, bool))[0x5621df662058]
sql/sql_alter.cc:548(Sql_cmd_alter_table::execute(THD*))[0x5621df6c7b6e]
sql/sql_parse.cc:5996(mysql_execute_command(THD*, bool))[0x5621df5c4d75]
sql/sql_parse.cc:8030(mysql_parse(THD*, char*, unsigned int, Parser_state*))[0x5621df5c959b]
sql/sql_parse.cc:1955(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool))[0x5621df5cb7e7]
sql/sql_parse.cc:1406(do_command(THD*, bool))[0x5621df5cce83]
sql/sql_connect.cc:1418(do_handle_one_connection(CONNECT*, bool))[0x5621df6c2be7]
sql/sql_connect.cc:1318(handle_one_connection)[0x5621df6c2e84]
perfschema/pfs.cc:2204(pfs_spawn_thread)[0x5621dfa560cc]
nptl/pthread_create.c:478(start_thread)[0x7fcd55d0d609]
 
 
Query (0x7fccf8012210): ALTER TABLE imp_t1 MODIFY COLUMN col_varchar_255 VARCHAR(255) FIRST

Comment by Marko Mäkelä [ 2022-06-10 ]

alice, this looks like a duplicate of MDEV-18519. Can you please double-check the revision that you tested? 892c426371b4be558d32fdeb should not crash there, but issue an error message instead.

Comment by Alice Sherepa [ 2022-06-10 ]

repeated on 10.6 51ca5d517ef67f97e3c38d57174f8

Comment by Marko Mäkelä [ 2022-07-26 ]

alice, 10.6 51ca5d517ef67f97e3c38d57174f8 is after MDEV-13542, which should have replaced the crash with an error report. Can you please post the details for the crash that you reproduced?

Comment by Alice Sherepa [ 2022-07-26 ]

marko, I ran mtr test, that was provided in the first comment by Eugene Kosov. Repeated now also on 10.6 b9eb63618edcad6206, built with -DWITH_ASAN=ON

2022-07-26 15:18:00 4 [ERROR] InnoDB: Page old data size 418 new data size 408, page old max ins size 15828 new max ins size 15838
mariadbd: /10.6/src/storage/innobase/include/rem0rec.h:274: rec_comp_status_t rec_get_status(const rec_t*): Assertion `bits <= REC_STATUS_INSTANT' failed.
220726 15:18:00 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.6.9-MariaDB-debug-log
key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=1
max_threads=153
thread_count=1
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 63859 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x62b0000bd288
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7fec4f852c50 thread_stack 0x100000
sanitizer_common/sanitizer_common_interceptors.inc:4022(__interceptor_backtrace.part.0)[0x7fec5ec55d40]
mysys/stacktrace.c:212(my_print_stacktrace)[0x56111fdf281f]
sql/signal_handler.cc:236(handle_fatal_signal)[0x56111e972496]
sigaction.c:0(__restore_rt)[0x7fec5e6e7420]
??:0(gsignal)[0x7fec5e1d000b]
??:0(abort)[0x7fec5e1af859]
/lib/x86_64-linux-gnu/libc.so.6(+0x22729)[0x7fec5e1af729]
??:0(__assert_fail)[0x7fec5e1c0fd6]
include/rem0rec.h:275(rec_get_status(unsigned char const*))[0x56111f4a224e]
rem/rem0rec.cc:851(rec_get_offsets_func(unsigned char const*, dict_index_t const*, unsigned short*, unsigned long, unsigned long, char const*, unsigned int, mem_block_info_t**))[0x56111f6d9b19]
handler/handler0alter.cc:6115(innobase_instant_try(Alter_inplace_info const*, ha_innobase_inplace_ctx*, TABLE const*, TABLE const*, trx_t*))[0x56111f46bd3a]
handler/handler0alter.cc:10548(commit_try_norebuild(Alter_inplace_info*, ha_innobase_inplace_ctx*, TABLE*, TABLE const*, trx_t*, char const*))[0x56111f4c1027]
handler/handler0alter.cc:11297(ha_innobase::commit_inplace_alter_table(TABLE*, Alter_inplace_info*, bool))[0x56111f497edf]
sql/handler.cc:5220(handler::ha_commit_inplace_alter_table(TABLE*, Alter_inplace_info*, bool))[0x56111e9a0478]
sql/sql_table.cc:7467(mysql_inplace_alter_table(THD*, TABLE_LIST*, TABLE*, TABLE*, Alter_inplace_info*, MDL_request*, st_ddl_log_state*, TRIGGER_RENAME_PARAM*, Alter_table_ctx*))[0x56111e3a5e92]
sql/sql_table.cc:10274(mysql_alter_table(THD*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, HA_CREATE_INFO*, TABLE_LIST*, Alter_info*, unsigned int, st_order*, bool, bool))[0x56111e3bb2a4]
sql/sql_alter.cc:542(Sql_cmd_alter_table::execute(THD*))[0x56111e57ccc2]
sql/sql_parse.cc:5996(mysql_execute_command(THD*, bool))[0x56111e114b0a]
sql/sql_parse.cc:8029(mysql_parse(THD*, char*, unsigned int, Parser_state*))[0x56111e1221e6]
sql/sql_parse.cc:1898(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool))[0x56111e0f843a]
sql/sql_parse.cc:1409(do_command(THD*, bool))[0x56111e0f51af]
sql/sql_connect.cc:1418(do_handle_one_connection(CONNECT*, bool))[0x56111e55f4fa]
sql/sql_connect.cc:1314(handle_one_connection)[0x56111e55ed86]
perfschema/pfs.cc:2203(pfs_spawn_thread)[0x56111f1d787e]
nptl/pthread_create.c:478(start_thread)[0x7fec5e6db609]
??:0(clone)[0x7fec5e2ac133]
 
Query (0x62b0000c42a8): ALTER TABLE imp_t1 MODIFY COLUMN col_varchar_255 VARCHAR(255) FIRST

Comment by Hugo Wen [ 2022-07-26 ]

Another occurrence of this issue:

The first crash happened during an UPDATE DML: `update t set a=2,b=now() where id=38479845`
Error log of the first crash:

2022-07-26 11:14:25 22804036 [ERROR] [FATAL] InnoDB: Page old data size 15162 new data size 6230, page old max ins size 867 new max ins size 9799
220726 11:14:25 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.5.13-MariaDB-log
key_buffer_size=16777216
read_buffer_size=262144
max_used_connections=460
max_threads=2626
thread_count=358
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 6091296 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x400c87333758
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x400c3f0c6060 thread_stack 0x40000
/bin/mysqld(my_print_stacktrace+0x2c)[0xaaaabaf2bfb8]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x400c8735b2f0): update t set a=2,b=now() where id=38479845
 
Connection ID (thread ID): 22804036
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off
 
The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
information that should help you find out what is causing the crash.
Writing a core file...
Working directory at /db
Resource Limits:
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            unlimited            unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             unlimited            unlimited            processes 
Max open files            65535                65535                files     
Max locked memory         65536                65536                bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       252382               252382               signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us        
Core pattern: /tmp/core-%e-%p

Then the server stuck in crash loop and couldn't start:
Stack trace of the crashing thread now, caller functions are different with the one in Jira description, but both in function `btr_page_reorganize`:

 
Thread 40 (Thread 0x400c5d7fe910 (LWP 25198)):
#0  0x000040001e91d4f0 in raise () from /lib64/libc.so.6
#1  0x000040001e91ed34 in abort () from /lib64/libc.so.6
#2  0x0000aaaaddb0837c in ib::fatal::~fatal (this=<optimized out>, __in_chrg=<optimized out>) at /MariaDB/storage/innobase/ut/ut0ut.cc:576
#3  0x0000aaaaddb0ed04 in btr_page_reorganize_low (index=index@entry=0x400c5d808800, mtr=mtr@entry=0x400c5d7faf08, cursor=<optimized out>, cursor=<optimized out>) at /MariaDB/storage/innobase/btr/btr0btr.cc:1411
#4  0x0000aaaaddb0f044 in btr_page_reorganize (cursor=cursor@entry=0x400c5d7f9fa0, index=index@entry=0x400c5d808800, mtr=mtr@entry=0x400c5d7faf08) at /MariaDB/storage/innobase/btr/btr0btr.cc:1629
#5  0x0000aaaadda07c7c in ibuf_insert_to_index_page_low (entry=0x400c5d808610, entry@entry=0x0, block=block@entry=0x4001337cc300, index=index@entry=0x400c5d808800, offsets=0x400c5d7f9f28, offsets@entry=0x400c5d7f9fe8, heap=<optimized out>, mtr=mtr@entry=0x400c5d7faf08, 
    page_cur=page_cur@entry=0x400c5d7f9fa0) at /MariaDB/storage/innobase/ibuf/ibuf0ibuf.cc:3680
#6  0x0000aaaadda0ca0c in ibuf_insert_to_index_page (mtr=0x400c5d7fafc8, index=0x400c5d808800, block=0x4001337cc300, entry=0x0) at /MariaDB/storage/innobase/ibuf/ibuf0ibuf.cc:3881
#7  ibuf_merge_or_delete_for_page (block=block@entry=0x4001337cc300, page_id=..., zip_size=0, zip_size@entry=187650851803136) at /MariaDB/storage/innobase/ibuf/ibuf0ibuf.cc:4425
#8  0x0000aaaaddb3d66c in buf_page_get_low (page_id=..., zip_size=187650851803136, zip_size@entry=0, rw_latch=70372876156928, rw_latch@entry=2, guess=<optimized out>, mode=70372828062976, file=0x15 <error: Cannot access memory at address 0x15>, line=43690, 
    mtr=0x400c5d7fb6e0, err=err@entry=0x400c5d7fb8b4, allow_ibuf_merge=allow_ibuf_merge@entry=true) at /MariaDB/storage/innobase/buf/buf0buf.cc:3382
#9  0x0000aaaaddb3d6d4 in buf_page_get_gen (page_id=page_id@entry=..., zip_size=0, zip_size@entry=187650835869280, rw_latch=rw_latch@entry=2, guess=guess@entry=0x0, mode=mode@entry=10, file=file@entry=0xaaaadd69e9e8 <extent_split_impl+96> "\373\003", line=line@entry=360, 
    mtr=mtr@entry=0x9, err=0x400c5d7fb8b4, err@entry=0x400c5d7fba80, allow_ibuf_merge=<optimized out>) at /MariaDB/storage/innobase/buf/buf0buf.cc:3464
#10 0x0000aaaaddb204f4 in btr_cur_search_to_nth_level_func (index=index@entry=0x40001f368000, level=70421852437120, level@entry=0, tuple=0x3a0efff, tuple@entry=0x400c5d838080, mode=PAGE_CUR_UNSUPP, mode@entry=PAGE_CUR_LE, latch_mode=latch_mode@entry=2, 
    cursor=0x400c5d7fba80, cursor@entry=0x400c5d7fc760, ahi_latch=0x4001337a8f08, ahi_latch@entry=0x0, file=0xaaaadd69e9e8 <extent_split_impl+96> "\373\003", 
    file@entry=0xaaaaddfe9dc0 "/MariaDB/storage/innobase/row/row0row.cc", line=line@entry=1303, mtr=0x400c5d7fc9c8, autoinc=autoinc@entry=0)
    at /MariaDB/storage/innobase/btr/btr0cur.cc:1616
#11 0x0000aaaaddaa21c8 in btr_pcur_open_low (index=index@entry=0x40001f368000, tuple=tuple@entry=0x400c5d838080, latch_mode=latch_mode@entry=2, cursor=cursor@entry=0x400c5d7fc760, line=line@entry=1303, mtr=mtr@entry=0x400c5d7fc9c8, autoinc=0, 
    file=0xaaaaddfe9dc0 "/MariaDB/storage/innobase/row/row0row.cc", mode=PAGE_CUR_LE, level=0) at /MariaDB/storage/innobase/include/btr0pcur.ic:441
#12 0x0000aaaaddaa2318 in row_search_index_entry (index=index@entry=0x40001f368000, entry=entry@entry=0x400c5d838080, mode=2, pcur=pcur@entry=0x400c5d7fc760, mtr=mtr@entry=0x400c5d7fc9c8)
    at /MariaDB/storage/innobase/row/row0row.cc:1303
#13 0x0000aaaaddbfba0c in row_undo_mod_del_mark_or_remove_sec_low (node=node@entry=0x400c5d814080, thr=thr@entry=0x400c5d8083b0, index=index@entry=0x40001f368000, entry=entry@entry=0x400c5d838080, mode=<optimized out>, mode@entry=2)
    at /MariaDB/storage/innobase/row/row0umod.cc:565
#14 0x0000aaaaddbfd84c in row_undo_mod_del_mark_or_remove_sec (entry=0x400c5d838080, index=<optimized out>, thr=0x400c5d8083b0, node=0x400c5d814080) at /MariaDB/storage/innobase/row/row0umod.cc:670
#15 row_undo_mod_upd_exist_sec (thr=0x400c5d8083b0, node=0x400c5d814080) at /MariaDB/storage/innobase/row/row0umod.cc:1148
#16 row_undo_mod (node=node@entry=0x400c5d814080, thr=thr@entry=0x400c5d8083b0) at /MariaDB/storage/innobase/row/row0umod.cc:1363
#17 0x0000aaaaddaaafec in row_undo (thr=0x400c5d8083b0, node=0x400c5d814080) at /MariaDB/storage/innobase/row/row0undo.cc:424
#18 row_undo_step (thr=thr@entry=0x400c5d8083b0) at /MariaDB/storage/innobase/row/row0undo.cc:476
#19 0x0000aaaadda5d4c8 in que_thr_step (thr=0x400c5d8083b0) at /MariaDB/storage/innobase/que/que0que.cc:944
#20 que_run_threads_low (thr=0x400c5d8083b0) at /MariaDB/storage/innobase/que/que0que.cc:1008
#21 que_run_threads (thr=0x400c5d8083b0) at /MariaDB/storage/innobase/que/que0que.cc:1048
#22 0x0000aaaaddae43b0 in trx_rollback_active (trx=trx@entry=0x40002b386e50) at /MariaDB/storage/innobase/trx/trx0roll.cc:628
#23 0x0000aaaaddae497c in trx_rollback_recovered (all=all@entry=true) at /MariaDB/storage/innobase/trx/trx0roll.cc:783
#24 0x0000aaaaddae5010 in trx_rollback_all_recovered () at /MariaDB/storage/innobase/trx/trx0roll.cc:837
#25 0x000040001e8bd22c in start_thread () from /lib64/libpthread.so.0
#26 0x000040001e9bde5c in thread_start () from /lib64/libc.so.6
 

Comment by Marko Mäkelä [ 2022-08-01 ]

Thank you, alice, I can repeat this with that test.

10.6 63478e72de6940abae6bb2d3b35c9b1ffa7180d9

mysqltest: At line 35: query 'ALTER TABLE imp_t1 MODIFY COLUMN col_varchar_255 VARCHAR(255) FIRST ' failed: <Unknown> (2013): Lost connection to server during query
#3  0x00007ffb80658242 in __GI___assert_fail (assertion=0x562e98781360 "bits <= REC_STATUS_INSTANT", file=0x562e987813a0 "/mariadb/10.6/storage/innobase/include/rem0rec.h", line=253, 
    function=0x562e98781400 "rec_comp_status_t rec_get_status(const rec_t*)") at assert.c:101
#4  0x0000562e97a71655 in rec_get_status (rec=rec@entry=0x7ffb7dd781f3 "") at /mariadb/10.6/storage/innobase/include/rem0rec.h:253
#5  0x0000562e97c28eec in rec_get_offsets_func (rec=0x7ffb7dd781f3 "", index=index@entry=0x61600097b108, offsets=offsets@entry=0x615000034688, n_core=n_core@entry=6, 
    n_fields=n_fields@entry=18446744073709551615, file=file@entry=0x562e9877a2e0 "/mariadb/10.6/storage/innobase/handler/handler0alter.cc", line=6123, heap=0x7ffb766c9eb0)
    at /mariadb/10.6/storage/innobase/rem/rem0rec.cc:851
#6  0x0000562e97a98ff2 in innobase_instant_try (ha_alter_info=ha_alter_info@entry=0x7ffb766cbc50, ctx=ctx@entry=0x62b0000b7ca8, altered_table=altered_table@entry=0x7ffb766cc7d0, 
    table=table@entry=0x6190000d7a98, trx=trx@entry=0x7ffb7e0fce40) at /mariadb/10.6/storage/innobase/handler/handler0alter.cc:6123

The rec address passed to rec_get_status() would have been valid (the hidden metadata record for instant ADD/DROP) before btr_page_reorganize_low() rebuilt the page during btr_cur_pessimistic_update(). After page_copy_rec_list_end_no_locks(), the metadata record would be at byte offset 0x71 from the start of the page, instead of the old offset 0x1f3. The cursor position was not adjusted due to the detected and reported size mismatch.

If we ignore the root cause of this bug (the size mismatch) for a while, in 10.6 there is a problem that the error handling after MDEV-13542 could be improved. With a couple of minor things fixed, it would still crash when attempting to roll back the ALTER TABLE transaction:

10.6 34cdc0062882ddfaf48bad0c079266d8c00c486c

2022-08-01 16:26:25 4 [ERROR] InnoDB: Page old data size 418 new data size 408, page old max ins size 15828 new max ins size 15838
2022-08-01 16:26:25 0x7f2160d60640  InnoDB: Assertion failure in file /mariadb/10.6/storage/innobase/btr/btr0cur.cc line 4520
InnoDB: Failing assertion: !rec_offs_any_null_extern(rec, *offsets) || thr_get_trx(thr) == trx_roll_crash_recv_trx

wenhug, the stack trace in your crash must be something else. It looks like ROLLBACK of some DML. Please note that disabling the change buffer will not instantly fix all dormant corruption due to the change buffer; it will only prevent the future use of the change buffer. See MDEV-27734 and related bugs.

This scenario of this bug is strictly related to MDEV-15562.

Comment by Marko Mäkelä [ 2022-08-01 ]

Here is a simpler test case, better illustrating a design problem that MariaDB inherited from MySQL 4.1:

--source include/have_innodb.inc
--let MYSQLD_DATADIR= `select @@datadir`
CREATE TABLE t1 (a INT, b varchar(255), c text) ENGINE=innodb ;
INSERT INTO t1 VALUES(NULL,'','');
FLUSH TABLES t1 FOR EXPORT;
--copy_file $MYSQLD_DATADIR/test/t1.ibd $MYSQLD_DATADIR/test/tmp.ibd
--copy_file $MYSQLD_DATADIR/test/t1.cfg $MYSQLD_DATADIR/test/tmp.cfg
UNLOCK TABLES;
CREATE TABLE imp_t1 LIKE t1 ;
ALTER TABLE imp_t1 DISCARD TABLESPACE ;
ALTER TABLE imp_t1 MODIFY COLUMN b VARCHAR(255) FIRST;
--move_file $MYSQLD_DATADIR/test/tmp.ibd $MYSQLD_DATADIR/test/imp_t1.ibd
#--move_file $MYSQLD_DATADIR/test/tmp.cfg $MYSQLD_DATADIR/test/imp_t1.cfg
ALTER TABLE imp_t1 IMPORT TABLESPACE ;
ALTER TABLE imp_t1 MODIFY COLUMN a INT FIRST;
ALTER TABLE imp_t1 MODIFY COLUMN b VARCHAR(255) FIRST;

If the INSERT or the MODIFY COLUMN a is removed, the test will not crash. Because we are not providing a .cfg file like we should, the metadata mismatch that was deliberately caused between DISCARD and {IMPORT}} cannot be detected at IMPORT time. If the last move_file is enabled, the test will fail like this:

10.6 84d26f98c7f2007d9414c33dcbfdddd3a24c9a60

mysqltest: At line 14: query 'ALTER TABLE imp_t1 IMPORT TABLESPACE ' failed: ER_TABLE_SCHEMA_MISMATCH (1808): Schema mismatch (Column b ordinal value mismatch, it's at 0 in the table and 1 in the tablespace meta-data file)

We really need MDEV-11658 and have to get rid of the broken-by-design DISCARD TABLESPACE workflow.

Comment by Roel Van de Paar [ 2023-04-08 ]

Possibly related to MDEV-18322

Comment by Marko Mäkelä [ 2023-12-11 ]

MDEV-18322 is something different, not related to importing a tablespace that is inconsistent with the table definition.

Generated at Thu Feb 08 09:43:03 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.