Details
-
Bug
-
Status: Closed (View Workflow)
-
Critical
-
Resolution: Cannot Reproduce
-
10.6.5
-
None
-
OpenSUSE 15.3
Description
Hello there,
A few weeks ago, I started upgrading our MariaDB machines from OpenSUSE 15.2 to 15.3. As part of the upgrade, MariaDB got updated from (I am fairly sure) 10.5.10 to 10.6.5.
At first I upgraded just one slave machine, which has been running fine.
Today, I upgraded the remaining two slaves as well as the master (in that order).
Then, suddenly, 2 hours after the upgrade, one of the slaves crashed and started restarting and crashing on start:
2022-03-02 16:28:21 4 [Note] Slave: connected to master 'repl@spire:3306',replication resumed in log 'mysql-bin.004769' at position 88403470
|
2022-03-02 18:13:49 0 [ERROR] InnoDB: Unable to find a record to delete-mark
|
InnoDB: tuple DATA TUPLE: 4 fields;
|
0: len 9; hex 617070735f706f7374; asc apps_post;;
|
1: len 7; hex 7075626c697368; asc publish;;
|
2: len 5; hex 99ab64aa13; asc d ;;
|
3: len 8; hex 00000000002dc510; asc - ;;
|
|
InnoDB: record PHYSICAL RECORD: n_fields 4; compact format; info bits 0
|
0: len 9; hex 617070735f706f7374; asc apps_post;;
|
1: len 7; hex 7075626c697368; asc publish;;
|
2: len 5; hex 99ab645980; asc dY ;;
|
3: len 8; hex 00000000002e4c4c; asc .LL;;
|
2022-03-02 18:13:49 0 [ERROR] InnoDB: page [page id: space=59276, page number=92474] (240 records, index id 44529).
|
2022-03-02 18:13:49 0 [ERROR] InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
|
This repeats over and over until I see this:
2022-03-02 18:13:49 0 [ERROR] InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
|
2022-03-02 18:13:53 0 [ERROR] InnoDB: Corruption of an index tree: table `***********`.`wp_posts` index `post_modified_gmt_type_status`, father ptr page no 32786, child page no 92474
|
PHYSICAL RECORD: n_fields 4; compact format; info bits 0
|
0: len 5; hex 617070735f; asc apps_;;
|
1: len 9; hex 706f73747075626c69; asc postpubli;;
|
2: len 7; hex 736899ab637282; asc sh cr ;;
|
3: len 8; hex 00000000002db631; asc - 1;;
|
2022-03-02 18:13:53 0 [Note] InnoDB: n_owned: 0; heap_no: 2; next rec: 163
|
PHYSICAL RECORD: n_fields 5; compact format; info bits 16
|
0: len 5; hex 8000000000; asc ;;
|
1: len 9; hex 61706b5f73706c6974; asc apk_split;;
|
2: len 5; hex 6472616674; asc draft;;
|
3: len 8; hex 00000000000f62f0; asc b ;;
|
4: len 4; hex 00008012; asc ;;
|
2022-03-02 18:13:53 0 [Note] InnoDB: n_owned: 0; heap_no: 2; next rec: 165
|
2022-03-02 18:13:53 0 [ERROR] [FATAL] InnoDB: You should dump + drop + reimport the table to fix the corruption. If the crash happens at database startup. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery. Then dump + drop + reimport.
|
220302 18:13:53 [ERROR] mysqld got signal 6 ;
|
This could be because you hit a bug. It is also possible that this binary
|
or one of the libraries it was linked against is corrupt, improperly built,
|
or misconfigured. This error can also be caused by malfunctioning hardware.
|
|
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
|
|
We will try our best to scrape up some info that will hopefully help
|
diagnose the problem, but since we have already crashed,
|
something is definitely wrong and this may fail.
|
|
Server version: 10.6.5-MariaDB-log
|
key_buffer_size=536870912
|
read_buffer_size=262144
|
max_used_connections=232
|
max_threads=2050
|
thread_count=25
|
It is possible that mysqld could use up to
|
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 5300842 K bytes of memory
|
Hope that's ok; if not, decrease some variables in the equation.
|
|
Thread pointer: 0x560f8bc5f328
|
Attempting backtrace. You can use the following information to find out
|
where mysqld died. If you see no messages after this, something went
|
terribly wrong...
|
stack_bottom = 0x7fad76ffcb78 thread_stack 0x49000
|
??:0(my_print_stacktrace)[0x560f88883ddd]
|
??:0(handle_fatal_signal)[0x560f883648f5]
|
??:0(sem_open)[0x7fb451453f80]
|
??:0(__gconv_transform_internal_ascii)[0x7fb45056c18b]
|
??:0(__gconv_transform_utf8_internal)[0x7fb45056d585]
|
??:0(std::pair<std::_Rb_tree_iterator<std::pair<unsigned long const, bool> >, bool> std::_Rb_tree<unsigned long, std::pair<unsigned long const, bool>, std::_Select1st<std::pair<unsigned long const, bool> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, bool> > >::_M_emplace_unique<unsigned long&, bool>(unsigned long&, bool&&))[0x560f88740fd1]
|
??:0(std::pair<std::_Rb_tree_iterator<std::pair<unsigned long const, bool> >, bool> std::_Rb_tree<unsigned long, std::pair<unsigned long const, bool>, std::_Select1st<std::pair<unsigned long const, bool> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, bool> > >::_M_emplace_unique<unsigned long&, bool>(unsigned long&, bool&&))[0x560f8874580d]
|
??:0(std::pair<std::_Rb_tree_iterator<std::pair<unsigned long const, bool> >, bool> std::_Rb_tree<unsigned long, std::pair<unsigned long const, bool>, std::_Select1st<std::pair<unsigned long const, bool> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, bool> > >::_M_emplace_unique<unsigned long&, bool>(unsigned long&, bool&&))[0x560f887458e6]
|
??:0(std::pair<std::_Rb_tree_iterator<std::pair<unsigned long const, bool> >, bool> std::_Rb_tree<unsigned long, std::pair<unsigned long const, bool>, std::_Select1st<std::pair<unsigned long const, bool> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, bool> > >::_M_emplace_unique<unsigned long&, bool>(unsigned long&, bool&&))[0x560f8874b051]
|
??:0(std::pair<std::_Rb_tree_iterator<std::pair<unsigned long const, bool> >, bool> std::_Rb_tree<unsigned long, std::pair<unsigned long const, bool>, std::_Select1st<std::pair<unsigned long const, bool> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, bool> > >::_M_emplace_unique<unsigned long&, bool>(unsigned long&, bool&&))[0x560f8875d906]
|
??:0(std::pair<std::_Rb_tree_iterator<std::pair<unsigned long const, bool> >, bool> std::_Rb_tree<unsigned long, std::pair<unsigned long const, bool>, std::_Select1st<std::pair<unsigned long const, bool> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, bool> > >::_M_emplace_unique<unsigned long&, bool>(unsigned long&, bool&&))[0x560f8876561f]
|
??:0(std::unique_lock<std::mutex>::unlock())[0x560f886e6cba]
|
??:0(std::unique_lock<std::mutex>::unlock())[0x560f886e7e97]
|
??:0(std::unique_lock<std::mutex>::unlock())[0x560f886e852a]
|
??:0(std::unique_lock<std::mutex>::unlock())[0x560f886a662d]
|
??:0(std::_Rb_tree<unsigned int, unsigned int, std::_Identity<unsigned int>, std::less<unsigned int>, std::allocator<unsigned int> >::_M_erase(std::_Rb_tree_node<unsigned int>*))[0x560f88718b2f]
|
??:0(std::unique_lock<std::mutex>::unlock())[0x560f88707adf]
|
??:0(tpool::task_group::execute(tpool::task*))[0x560f88825921]
|
??:0(tpool::thread_pool_generic::worker_main(tpool::worker_data*))[0x560f888246ef]
|
??:0(__gnu_debug::_Safe_sequence_base::_M_revalidate_singular())[0x7fb450f38cd4]
|
/lib64/libpthread.so.0(+0x894a)[0x7fb45144894a]
|
??:0(eval_expr)[0x7fb450632d0f]
|
|
Trying to get some variables.
|
Some pointers may be invalid and cause the dump to abort.
|
Query (0x0): (null)
|
Connection ID (thread ID): 0
|
Status: NOT_KILLED
|
|
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off
|
|
The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
|
information that should help you find out what is causing the crash.
|
|
We think the query pointer is invalid, but we will try to print it anyway.
|
Query:
|
|
Writing a core file...
|
Working directory at /var/lib/mysql
|
Resource Limits:
|
Limit Soft Limit Hard Limit Units
|
Max cpu time unlimited unlimited seconds
|
Max file size unlimited unlimited bytes
|
Max data size unlimited unlimited bytes
|
Max stack size 8388608 unlimited bytes
|
Max core file size unlimited unlimited bytes
|
Max resident set unlimited unlimited bytes
|
Max processes 128328 128328 processes
|
Max open files 1048576 1048576 files
|
Max locked memory 8388608 8388608 bytes
|
Max address space unlimited unlimited bytes
|
Max file locks unlimited unlimited locks
|
Max pending signals 128328 128328 signals
|
Max msgqueue size 819200 819200 bytes
|
Max nice priority 0 0
|
Max realtime priority 0 0
|
Max realtime timeout unlimited unlimited us
|
Core pattern: |/bin/false
|
|
2022-03-02 18:14:03 0 [Warning] You need to use --log-bin to make --binlog-format work.
|
2022-03-02 18:14:03 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
|
2022-03-02 18:14:03 0 [Note] InnoDB: Number of pools: 1
|
2022-03-02 18:14:03 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
|
2022-03-02 18:14:03 0 [Note] InnoDB: Using Linux native AIO
|
2022-03-02 18:14:03 0 [Note] InnoDB: Initializing buffer pool, total size = 25769803776, chunk size = 134217728
|
2022-03-02 18:14:03 0 [Note] InnoDB: Completed initialization of buffer pool
|
2022-03-02 18:14:03 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=17030859767851,17030859785816
|
2022-03-02 18:14:03 0 [Note] InnoDB: 1 transaction(s) which must be rolled back or cleaned up in total 1 row operations to undo
|
2022-03-02 18:14:03 0 [Note] InnoDB: Trx id counter is 215031286094
|
2022-03-02 18:14:03 0 [Note] InnoDB: Starting final batch to recover 12213 pages from redo log.
|
2022-03-02 18:14:05 0 [Note] InnoDB: Last binlog file './mysql-bin.000524', position 142764333
|
2022-03-02 18:14:05 0 [Note] InnoDB: 128 rollback segments are active.
|
2022-03-02 18:14:05 0 [Note] InnoDB: Starting in background the rollback of recovered transactions
|
2022-03-02 18:14:05 0 [Note] InnoDB: Rolled back recovered transaction 215031286093
|
2022-03-02 18:14:05 0 [Note] InnoDB: Rollback of non-prepared transactions completed
|
2022-03-02 18:14:05 0 [Note] InnoDB: Removed temporary tablespace data file: "./ibtmp1"
|
2022-03-02 18:14:05 0 [Note] InnoDB: Creating shared tablespace for temporary tables
|
2022-03-02 18:14:05 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
|
2022-03-02 18:14:05 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
|
2022-03-02 18:14:05 0 [Note] InnoDB: 10.6.5 started; log sequence number 17030939799231; transaction id 215031286096
|
2022-03-02 18:14:05 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
|
2022-03-02 18:14:05 0 [Note] Plugin 'FEEDBACK' is disabled.
|
2022-03-02 18:14:05 0 [Note] Server socket created on IP: '192.168.202.72'.
|
2022-03-02 18:14:05 4 [Note] Slave I/O thread: Start asynchronous replication to master 'repl@spire:3306' in log 'mysql-bin.004771' at position 484601512
|
2022-03-02 18:14:05 5 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.004771' at position 483941269, relay log './mysqld-relay-bin.012431' position: 483931147
|
2022-03-02 18:14:05 0 [Note] /usr/sbin/mysqld: ready for connections.
|
Version: '10.6.5-MariaDB-log' socket: '/var/run/mysql/mysql.sock' port: 3306 MariaDB package
|
2022-03-02 18:14:05 4 [Note] Slave I/O thread: connected to master 'repl@spire:3306',replication started in log 'mysql-bin.004771' at position 484601512
|
2022-03-02 18:14:05 0 [ERROR] InnoDB: Corruption of an index tree: table `***********`.`wp_posts` index `post_modified_gmt_type_status`, father ptr page no 32786, child page no 92474
|
I found https://jira.mariadb.org/browse/MDEV-22373 and related issues, but they're all marked as fixed.
Any ideas? I'm dreading this happening again to other slaves and taking down our db setup (can it also happen to the master and crash it?.
Attachments
Issue Links
- relates to
-
MDEV-25796 Failing assertion: !cursor->index->is_committed() in row0ins.cc
- Open
-
MDEV-27734 Set innodb_change_buffering=none by default
- Closed
-
MDEV-22373 Unable to find a record to delete-mark ends up crashing mysqld process after upgrading from 10.1.43 to 10.4
- Closed