We ran into a situation where we could not start nodes any more, because the SST fails with a stack trace after a "prepared transactions" error.
I do not know how we got into this state, but it happened after restarting nodes and after the upgrade from 10.4.17 to 10.4.18. Note that the system was previously not set up 100% correctly.
2021-03-15 20:12:11 0 [Note] WSREP: Last wsrep seqno to be recovered 5636369
2021-03-15 20:12:11 0 [ERROR] Found 1 prepared transactions! It means that mysqld was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start mysqld with --tc-heuristic-recover switch to commit or rollback pending transactions.
2021-03-15 20:12:11 0 [ERROR] Aborting
terminate called after throwing an instance of 'wsrep::runtime_error'
what(): State wait was interrupted
210315 20:12:11 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Server version: 10.4.18-MariaDB-1:10.4.18+maria~focal-log
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1074864 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x7fb6b0002118
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
stack_bottom = 0x7fb6cbffdb78 thread_stack 0x49000
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x0): (null)
Connection ID (thread ID): 3
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on
The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
information that should help you find out what is causing the crash.
We think the query pointer is invalid, but we will try to print it anyway.
Writing a core file...
Working directory at /var/lib/mysql
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             87696                87696                processes
Max open files            200000               200000               files
Max locked memory         200000               200000               bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       87696                87696                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
Core pattern: |/usr/share/apport/apport %p %s %c %d %P %E
Workaround: stop the node, remove the contents of the datadir, and run mysql_install_db. After that, the SST worked.
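
For anyone hitting the same thing, the workaround steps could be scripted roughly as follows. This is only a sketch under assumptions, not the exact commands we ran: the systemd service name `mariadb`, the `--user=mysql` option, and the `DRY_RUN` switch are illustrative, and the datadir is taken from the working directory shown in the crash log. It wipes the local data, so it only makes sense when another cluster node can act as SST donor.

```shell
#!/bin/sh
# Sketch of the workaround: wipe the node's datadir and let it rejoin via SST.
# Assumptions (not from the report): systemd service name "mariadb",
# the DRY_RUN switch, and a healthy donor node elsewhere in the cluster.

DATADIR="${DATADIR:-/var/lib/mysql}"   # working directory shown in the crash log
SERVICE="${SERVICE:-mariadb}"          # assumed service name
DRY_RUN="${DRY_RUN:-1}"                # 1 = only print the commands, 0 = run them

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

rebuild_node() {
    run systemctl stop "$SERVICE" || return 1
    # Remove the contents of the datadir but keep the directory itself.
    run find "$DATADIR" -mindepth 1 -delete || return 1
    # Recreate the system tables so the server can start and request an SST.
    run mysql_install_db --user=mysql --datadir="$DATADIR" || return 1
    run systemctl start "$SERVICE"
}

rebuild_node
```

By default the script only prints the four commands; setting `DRY_RUN=0` would actually run them, so check the datadir path and service name against your own setup first.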