[MDEV-29158] DB restarting in a loop due to corrupted file-level record Created: 2022-07-22  Updated: 2022-08-24  Resolved: 2022-08-24

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB, Storage Engine - RocksDB
Affects Version/s: 10.6.5
Fix Version/s: N/A

Type: Bug
Priority: Major
Reporter: Michael Qin (Inactive)
Assignee: Marko Mäkelä
Resolution: Incomplete
Votes: 0
Labels: None


 Description   

Initial error log and stack trace:

2022-07-21 20:29:32 0x2ab31fd10700  InnoDB: Assertion failure in file /local/p4clients/pkgbuild-B8mzg/workspace/src/RDSMariaDB/storage/innobase/trx/trx0trx.cc line 918
InnoDB: Failing assertion: trx->lock.table_locks.empty()
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mariadbd startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
InnoDB: about forcing recovery.
220721 20:29:32 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.6.5-MariaDB
key_buffer_size=4096
read_buffer_size=262144
max_used_connections=4
max_threads=10002
thread_count=5
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 23289949 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x2acbc600f258
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x2ab31fd0e5b8 thread_stack 0x40000
Can't start addr2line
/rdsdbbin/mysql/bin/mysqld(my_print_stacktrace+0x2e)[0x5582024733ae]
/rdsdbbin/mysql/bin/mysqld(handle_fatal_signal+0x51b)[0x558201c6d78b]
/lib64/libpthread.so.0(+0xf100)[0x2ab320ad9100]
/lib64/libc.so.6(gsignal+0x37)[0x2ab320d1b5f7]
/lib64/libc.so.6(abort+0x148)[0x2ab320d1cce8]
/rdsdbbin/mysql/bin/mysqld(+0x74bf8a)[0x55820196ff8a]
/rdsdbbin/mysql/bin/mysqld(+0x10b5f81)[0x5582022d9f81]
/rdsdbbin/mysql/bin/mysqld(+0x1025d54)[0x558202249d54]
/rdsdbbin/mysql/bin/mysqld(+0x10260a0)[0x55820224a0a0]
/rdsdbbin/mysql/bin/mysqld(+0x105b81c)[0x55820227f81c]
/rdsdbbin/mysql/bin/mysqld(+0xf91454)[0x5582021b5454]
/rdsdbbin/mysql/bin/mysqld(+0xf9f0bd)[0x5582021c30bd]
/rdsdbbin/mysql/bin/mysqld(_Z18mysql_rename_tableP10handlertonPK25st_mysql_const_lex_stringS3_S3_S3_P34st_mysql_const_unsigned_lex_stringj+0x160)[0x558201af07a0]
/rdsdbbin/mysql/bin/mysqld(_Z17mysql_alter_tableP3THDPK25st_mysql_const_lex_stringS3_P14HA_CREATE_INFOP10TABLE_LISTP10Alter_infojP8st_orderbb+0x54e1)[0x558201afc7c1]
/rdsdbbin/mysql/bin/mysqld(_ZN19Sql_cmd_alter_table7executeEP3THD+0x3c9)[0x558201b5d929]
/rdsdbbin/mysql/bin/mysqld(_Z21mysql_execute_commandP3THDb+0x2b1d)[0x558201a6768d]
/rdsdbbin/mysql/bin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x1c9)[0x558201a6b139]
/rdsdbbin/mysql/bin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcjb+0xcea)[0x558201a622da]
/rdsdbbin/mysql/bin/mysqld(_Z10do_commandP3THDb+0xf7)[0x558201a61417]
/rdsdbbin/mysql/bin/mysqld(_Z24do_handle_one_connectionP7CONNECTb+0x3c3)[0x558201b58ed3]
/rdsdbbin/mysql/bin/mysqld(handle_one_connection+0x34)[0x558201b591f4]
/rdsdbbin/mysql/bin/mysqld(+0xf0f2fc)[0x5582021332fc]
/lib64/libpthread.so.0(+0x7dc5)[0x2ab320ad1dc5]
/lib64/libc.so.6(clone+0x6d)[0x2ab320ddcc9d]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x2acbc604fd70): ALTER TABLE `x`.`y` ENGINE=InnoDB, ALGORITHM=COPY
 
Connection ID (thread ID): 109723
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off
 
The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
information that should help you find out what is causing the crash.
Writing a core file...
Working directory at /rdsdbdata/db
Resource Limits:
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            unlimited            unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             unlimited            unlimited            processes 
Max open files            1048576              1048576              files     
Max locked memory         65536                65536                bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       509814               509814               signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us        
Core pattern: /rdsdbdata/tmp/core-%e-%p

Engine error log looping on the following:

2022-07-22 18:18:19 0 [Warning] option 'group_concat_max_len': unsigned value 18446744073709547520 adjusted to 4294967295
2022-07-22 18:18:19 0 [Note] /rdsdbbin/mysql/bin/mysqld (server 10.6.5-MariaDB) starting as process 15986 ...
2022-07-22 18:18:19 0 [Warning] You need to use --log-bin to make --log-slave-updates work.
2022-07-22 18:18:19 0 [Note] RocksDB: 2 column families found
2022-07-22 18:18:19 0 [Note] RocksDB: Column Families at start:
2022-07-22 18:18:19 0 [Note]   cf=default
2022-07-22 18:18:19 0 [Note]     write_buffer_size=67108864
2022-07-22 18:18:19 0 [Note]     target_file_size_base=67108864
2022-07-22 18:18:19 0 [Note]   cf=__system__
2022-07-22 18:18:19 0 [Note]     write_buffer_size=67108864
2022-07-22 18:18:19 0 [Note]     target_file_size_base=67108864
2022-07-22 18:18:24 0 [Note] RocksDB: Table_store: loaded DDL data for 0 tables
2022-07-22 18:18:24 0 [Note] RocksDB: global statistics using get_sched_indexer_t indexer
2022-07-22 18:18:24 0 [Note] MyRocks storage engine plugin has been successfully initialized.
2022-07-22 18:18:24 0 [Warning] InnoDB: innodb_open_files 4294967295 should not be greater than the open_files_limit 810047
2022-07-22 18:18:24 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2022-07-22 18:18:24 0 [Note] InnoDB: Number of pools: 1
2022-07-22 18:18:24 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
2022-07-22 18:18:24 0 [Note] InnoDB: Using Linux native AIO
2022-07-22 18:18:24 0 [Note] InnoDB: Initializing buffer pool, total size = 99857989632, chunk size = 134217728
2022-07-22 18:18:25 0 [Note] InnoDB: Completed initialization of buffer pool
2022-07-22 18:18:25 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=9760157309095,9760157309095
2022-07-22 18:18:40 0 [Note] InnoDB: Read redo log up to LSN=9760942364672
2022-07-22 18:18:46 0 [ERROR] InnoDB: Corrupted file-level record; set innodb_force_recovery=1 to ignore.
2022-07-22 18:18:46 0 [Note] InnoDB: Dump from the start of the mini-transaction (LSN=9761571365197) to 100 bytes after the record:
 len 104; hex a03a00002f7264736462646174612f64622f696e6e6f64622f69626461746131002e2f6d656d6265725f3138313534362f2373716c2d616c7465722d373237342d31616339622e69626400004900e10a97b62804089bb40d0107e1b483c207e1b782e402df02c26f; asc  :  /rdsdbdata/db/innodb/ibdata1 ./x/#sql-alter-7274-1ac9b.ibd  I     (                    o;
2022-07-22 18:18:46 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
2022-07-22 18:18:46 0 [Note] InnoDB: Starting shutdown...
2022-07-22 18:18:47 0 [ERROR] Plugin 'InnoDB' init function returned error.
2022-07-22 18:18:47 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2022-07-22 18:18:47 0 [ERROR] Unknown/unsupported storage engine: InnoDB
2022-07-22 18:18:47 0 [ERROR] Aborting

Is this also related to https://jira.mariadb.org/browse/MDEV-13542?



 Comments   
Comment by Marko Mäkelä [ 2022-07-25 ]

This is not directly related to MDEV-13542, which is mainly about preventing crashes when encountering corrupted data. There is no crash in this case.

The 104-byte hexadecimal dump is for a FILE_RENAME record (0xa0) for tablespace 0x3a, for renaming /rdsdbdata/db/innodb/ibdata1 to ./member_181546/#sql-alter-7274-1ac9b.ibd. This is very strange, because typically the file name ibdata1 is associated with the InnoDB system tablespace, which carries the tablespace identifier 0. InnoDB should really never rename the system tablespace, nor write a FILE_RENAME record about renaming it.
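
The two file names quoted above can be recovered from the hex dump in the error log with a small script. The following is only a sketch: it scans for runs of printable ASCII bytes, much like strings(1), and does not parse the InnoDB redo log record format. The names DUMP_HEX and extract_names are hypothetical and chosen for illustration; the first two bytes are printed as-is, and reading them as the record type and the tablespace identifier is the interpretation given in the comment above, not something the script verifies.

```python
import re

# Hex dump copied verbatim from the "len 104; hex ..." line of the error log.
DUMP_HEX = (
    "a03a0000"
    "2f7264736462646174612f64622f696e6e6f64622f69626461746131"
    "00"
    "2e2f6d656d6265725f3138313534362f2373716c2d616c7465722d"
    "373237342d31616339622e696264"
    "00004900e10a97b62804089bb40d0107e1b483c207e1b782e402df02c26f"
)


def extract_names(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of at least min_len printable ASCII bytes (hypothetical helper)."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


data = bytes.fromhex(DUMP_HEX)
old_name, new_name = extract_names(data)

print(len(data))                    # total length of the dump in bytes
print(hex(data[0]), hex(data[1]))   # first two bytes of the record
print(old_name, "->", new_name)     # the two embedded file names
```

The decoded second name contains the database directory member_181546, while the asc rendering in the log shows ./x/ in its place.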

It seems to me that the database member_181546 contains a table that had been created while innodb_file_per_table=0 was in effect, and an ALTER TABLE statement was rebuilding the table. With the default innodb_file_per_table=1, the data for all InnoDB tables (or partitions) will be stored in .ibd files.

Could this have been fixed by MDEV-28752 in the upcoming 10.6.9 release? I tried the following test on the current 10.6 development snapshot as well as on a snapshot from right before that fix, but in both cases the file ibdata1 still existed after running the test.

--source include/have_innodb.inc
SET GLOBAL innodb_file_per_table=0;
CREATE TABLE t(a INT PRIMARY KEY, b INT) ENGINE=InnoDB;
INSERT INTO t VALUES(1,0),(2,0);
SET GLOBAL innodb_file_per_table=1;
--error ER_DUP_ENTRY
ALTER TABLE t FORCE, ADD UNIQUE INDEX(b);

Can you provide the minimal SQL statements for repeating the error?

Generated at Thu Feb 08 10:06:21 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.