[MDEV-32561] WSREP FSM failure: no such a transition REPLICATING -> COMMITTED Created: 2023-10-24  Updated: 2023-12-12  Resolved: 2023-11-21

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.4.31
Fix Version/s: 11.3.1, 10.4.33, 10.5.24, 10.6.17, 10.11.7, 11.0.5, 11.1.4, 11.2.3

Type: Bug Priority: Critical
Reporter: Julius Goryavsky Assignee: Jan Lindström
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Blocks
blocks MDEV-30172 Galera test case cleanup Stalled
Issue split
split from MDEV-32024 Galera library 26.4.16 fails with eve... Closed
Relates
relates to MDEV-24061 Galera stack smashing detected during... Closed
relates to MDEV-24091 Galera crashes with "WSREP: FSM: no s... Closed

 Description   

WSREP FSM failure "no such a transition REPLICATING -> COMMITTED" occurs when executing the galera_sequences mtr test: https://buildbot.mariadb.net/buildbot/builders/kvm-rpm-centos74-amd64/builds/36412

2023-10-21 19:09:30 2 [Note] WSREP: Server centos74-amd64 synced with group
2023-10-21 19:09:30 2 [ERROR] Slave SQL: Error 'This version of MariaDB doesn't yet support 'non-InnoDB sequences in Galera cluster'' on query. Default database: 'test'. Query: 'ALTER TABLE t ENGINE=MyISAM', Internal MariaDB error code: 1235
2023-10-21 19:09:30 2 [Warning] WSREP: Ignoring error 'This version of MariaDB doesn't yet support 'non-InnoDB sequences in Galera cluster'' on query. Default database: 'test'. Query: 'ALTER TABLE t ENGINE=MyISAM', Error_code: 1235
2023-10-21 19:09:30 16 [ERROR] WSREP: FSM: no such a transition REPLICATING -> COMMITTED
231021 19:09:30 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.4.32-MariaDB-log source revision: 0cda037e87f05a73429fbc5dbc43f3bf1e9a88e7
key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=1
max_threads=153
thread_count=9
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 63559 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x5587fbe68008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f2ee9ebfc40 thread_stack 0x49000
mysys/stacktrace.c:175(my_print_stacktrace)[0x5587f265d7ce]
sql/signal_handler.cc:238(handle_fatal_signal)[0x5587f20ae867]
sigaction.c:0(__restore_rt)[0x7f2ef2c015e0]
/lib64/libc.so.6(gsignal+0x37)[0x7f2ef20561f7]
/lib64/libc.so.6(abort+0x148)[0x7f2ef20578e8]
src/fsm.hpp:56(galera::FSM<galera::TrxHandle::State, galera::TrxHandle::Transition>::shift_to(galera::TrxHandle::State, int))[0x7f2eeeacfcda]
src/replicator_smm.cpp:1423(galera::ReplicatorSMM::commit_order_leave(galera::TrxHandleSlave&, wsrep_buf const*))[0x7f2eeeadf4bb]
detail/shared_count.hpp:371(galera_commit_order_leave)[0x7f2eeeacb468]
src/wsrep_provider_v26.cpp:969(wsrep::wsrep_provider_v26::commit_order_leave(wsrep::ws_handle const&, wsrep::ws_meta const&, wsrep::mutable_buffer const&))[0x5587f26eb1a1]
src/transaction.cpp:579(wsrep::transaction::ordered_commit())[0x5587f26e5080]
sql/log.cc:7824(MYSQL_BIN_LOG::queue_for_group_commit(MYSQL_BIN_LOG::group_commit_entry*))[0x5587f219a1eb]
sql/log.cc:7856(MYSQL_BIN_LOG::write_transaction_to_binlog_events(MYSQL_BIN_LOG::group_commit_entry*))[0x5587f219eb3a]
sql/log.cc:7482(MYSQL_BIN_LOG::write_transaction_to_binlog(THD*, binlog_cache_mngr*, Log_event*, bool, bool, bool))[0x5587f219ef60]
sql/log.cc:516(binlog_cache_mngr::reset(bool, bool))[0x5587f219f11d]
sql/log.cc:1814(binlog_commit_flush_stmt_cache(THD*, bool, binlog_cache_mngr*))[0x5587f219f344]
sql/log.cc:2092(binlog_rollback(handlerton*, THD*, bool))[0x5587f219f52f]
sql/handler.cc:1955(ha_rollback_trans(THD*, bool))[0x5587f20b28cb]
sql/handler.cc:1746(ha_commit_trans(THD*, bool))[0x5587f20b3434]
sql/transaction.cc:438(trans_commit_stmt(THD*))[0x5587f1fb19df]
sql/sql_class.h:4028(THD::get_stmt_da())[0x5587f1eb2aaa]
sql/sql_parse.cc:8013(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x5587f1eba87b]
sql/sql_class.h:4028(THD::get_stmt_da())[0x5587f1ebb0e6]
sql/sql_parse.cc:1843(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x5587f1ebe0b6]
sql/sql_parse.cc:1379(do_command(THD*))[0x5587f1ebe732]
sql/sql_connect.cc:1420(do_handle_one_connection(CONNECT*))[0x5587f1fa3172]
sql/sql_connect.cc:1326(handle_one_connection)[0x5587f1fa325d]
perfschema/pfs.cc:1872(pfs_spawn_thread)[0x5587f233018d]
pthread_create.c:0(start_thread)[0x7f2ef2bf9e25]
/lib64/libc.so.6(clone+0x6d)[0x7f2ef211934d]
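
For readers unfamiliar with the failure mode, here is a minimal sketch of a state machine that rejects transitions missing from its table. It is not Galera's code. The state names REPLICATING and COMMITTED come from the log above; the other states, the transition table, and the names `TrxState`, `Fsm`, and `FsmError` are assumptions made for illustration. The sketch raises an exception, whereas the trace above shows the real `shift_to` ending in `abort()`.

```python
from enum import Enum, auto


class TrxState(Enum):
    # REPLICATING and COMMITTED appear in the log; the rest are illustrative.
    REPLICATING = auto()
    CERTIFYING = auto()
    APPLYING = auto()
    COMMITTING = auto()
    COMMITTED = auto()


# Assumed, simplified transition table (the real one is larger).
ALLOWED = {
    (TrxState.REPLICATING, TrxState.CERTIFYING),
    (TrxState.CERTIFYING, TrxState.APPLYING),
    (TrxState.APPLYING, TrxState.COMMITTING),
    (TrxState.COMMITTING, TrxState.COMMITTED),
}


class FsmError(RuntimeError):
    """Raised when a transition is not in the table."""


class Fsm:
    def __init__(self, state):
        self.state = state

    def shift_to(self, new_state):
        # Refuse any (current, new) pair that the table does not list.
        if (self.state, new_state) not in ALLOWED:
            raise FsmError(
                f"FSM: no such a transition {self.state.name} -> {new_state.name}"
            )
        self.state = new_state
```

Under these assumptions, a transaction that is still REPLICATING cannot jump straight to COMMITTED, which is the shape of the error reported here.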



 Comments   
Comment by Jan Lindström [ 2023-11-21 ]

This is fixed in MDEV-25089: the offending ALTER is now refused before TOI starts, so it is never executed on other nodes (i.e. in appliers).
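
The ordering described in this comment can be sketched roughly as follows. This is an illustration only, not the MDEV-25089 patch: `run_ddl`, `NotSupported`, and the two callback parameters are hypothetical names, and the check is reduced to the one condition visible in the log (a sequence being moved to a non-InnoDB engine).

```python
class NotSupported(Exception):
    """Stands in for MariaDB error 1235 seen in the log above."""


def run_ddl(is_sequence, target_engine, start_toi, apply_locally):
    """Validate a DDL statement before it is replicated.

    start_toi and apply_locally are hypothetical hooks for
    "replicate to the cluster" and "execute on this node".
    """
    # Check first: a refused statement never reaches other nodes,
    # so appliers never see it.
    if is_sequence and target_engine.lower() != "innodb":
        raise NotSupported("non-InnoDB sequences in Galera cluster")
    start_toi()
    apply_locally()
```

With the check placed before `start_toi()`, the failing `ALTER TABLE t ENGINE=MyISAM` would be rejected on the originating node only, instead of being replicated and then failing on an applier.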

Generated at Thu Feb 08 10:32:17 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.