[MDEV-30372] Assertion `state() == s_executing || state() == s_preparing || state() == s_prepared || state() == s_must_abort || state() == s_aborting || state() == s_cert_failed || state() == s_must_replay' failed Created: 2023-01-10  Updated: 2023-11-23  Resolved: 2023-10-09

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.4, 10.5, 10.6
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Ramesh Sivaraman Assignee: Jan Lindström
Resolution: Fixed Votes: 0
Labels: None

Attachments: Text File bt_all.txt     File node1.err     File transactions-bkp.yy     File transactions.zz    
Issue Links:
Issue split
split to MDEV-31285 Assertion `state() == s_executing || ... Closed
Problem/Incident
is caused by MDEV-29293 MariaDB stuck on starting commit stat... Closed

 Description   

Testcase

perl runall-new.pl \
   --basedir=/test/mtest/MDEV-29684/10.6/mariadb-10.6.12-linux-x86_64 \
   --vardir=/home/ramesh/galera-rqg-test \
   --mysqld=--wsrep-provider=/test/mtest/MDEV-29684/10.6/mariadb-10.6.12-linux-x86_64/lib/libgalera_smm.so \
   --gendata=conf/transactions/transactions.zz \
   --grammar=conf/transactions/transactions-bkp.yy \
   --threads=16 \
   --galera=mms \
   --mysqld=--wsrep_sst_method=rsync \
   --mysqld=--core \
   --mysqld=--general-log \
   --mysqld=--general-log-file=queries.log \
   --mysqld=--log-output=file \
   --mysqld=--wsrep-debug=0 \
   --mysqld=--wsrep-sync-wait=15 \
   --mysqld=--wsrep_retry_autocommit=0 \
   --mysqld=--wsrep_slave_threads=12 \
   --mysqld=--wsrep_log_conflicts=1 \
   --mysqld=--wsrep_on=1 \
   --mysqld=--default-storage-engine=innodb \
   --mysqld=--sort_buffer_size=200M \
   --mysqld=--innodb-lock-wait-timeout=1 \
   --mysqld=--gtid_domain_id=10 \
   --mysqld=--wsrep_gtid_domain_id=100 \
   --mysqld=--wsrep_gtid_mode=ON \
   --mysqld=--wsrep_slave_threads=4 \
   --mysqld=--slave_parallel_threads=4 \
   --mysqld=--server_id=11 \
   --mysqld=--gtid_strict_mode=1 \
   --mysqld=--log_slave_updates=ON \
   --mysqld=--log_bin=binlog \
   --mysqld=--binlog_format=ROW \
   --mysqld=--master_info_repository=TABLE \
   --mysqld=--relay_log_info_repository=TABLE 
   

Leads to

#0  __pthread_kill (threadid=<optimized out>, signo=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
#1  0x0000556371aa433f in my_write_core (sig=6) at /test/mtest/MDEV-29684/10.6/mysys/stacktrace.c:424
#2  0x00005563711c42d9 in handle_fatal_signal (sig=6) at /test/mtest/MDEV-29684/10.6/sql/signal_handler.cc:355
#3  <signal handler called>
#4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#5  0x000014f5e2382859 in __GI_abort () at abort.c:79
#6  0x000014f5e2382729 in __assert_fail_base (fmt=0x14f5e2518588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=0x556372173c20 "state() == s_executing || state() == s_preparing || state() == s_prepared || state() == s_must_abort || state() == s_aborting || state() == s_cert_failed || state() == s_must_replay", file=0x556372172ed8 "/test/mtest/MDEV-29684/10.6/wsrep-lib/src/transaction.cpp", line=661, function=<optimized out>) at assert.c:92
#7  0x000014f5e2393fd6 in __GI___assert_fail (
    assertion=0x556372173c20 "state() == s_executing || state() == s_preparing || state() == s_prepared || state() == s_must_abort || state() == s_aborting || state() == s_cert_failed || state() == s_must_replay", file=0x556372172ed8 "/test/mtest/MDEV-29684/10.6/wsrep-lib/src/transaction.cpp", line=661, function=0x556372173bf0 "int wsrep::transaction::before_rollback()") at assert.c:101
#8  0x0000556371b6273b in wsrep::transaction::before_rollback (this=0x14f4300075b0) at /test/mtest/MDEV-29684/10.6/wsrep-lib/src/transaction.cpp:661
#9  0x00005563711e33c7 in wsrep::client_state::before_rollback (this=0x14f430007548) at /test/mtest/MDEV-29684/10.6/wsrep-lib/include/wsrep/client_state.hpp:526
#10 0x00005563711c6768 in wsrep_before_rollback (thd=0x14f430000db8, all=false) at /test/mtest/MDEV-29684/10.6/sql/wsrep_trans_observer.h:402
#11 0x00005563711cb8ea in ha_rollback_trans (thd=0x14f430000db8, all=false) at /test/mtest/MDEV-29684/10.6/sql/handler.cc:2187
#12 0x00005563711caf38 in ha_commit_trans (thd=0x14f430000db8, all=false) at /test/mtest/MDEV-29684/10.6/sql/handler.cc:1966
#13 0x0000556370ffc7b7 in trans_commit_stmt (thd=0x14f430000db8) at /test/mtest/MDEV-29684/10.6/sql/transaction.cc:472
#14 0x0000556370e2d111 in mysql_execute_command (thd=0x14f430000db8, is_called_from_prepared_stmt=false) at /test/mtest/MDEV-29684/10.6/sql/sql_parse.cc:6060
#15 0x0000556370e32bac in mysql_parse (thd=0x14f430000db8, 
    rawbuf=0x14f4300141a0 "INSERT INTO `table20_int_autoinc` ( `pk`, `col_int_key`, `col_int`) VALUES ( NULL, 100, 100 ), ( NULL, 100, 100 )  /* QNO 443 CON_ID 26 */", length=138, parser_state=0x14f58094d320)
    at /test/mtest/MDEV-29684/10.6/sql/sql_parse.cc:8019
#16 0x0000556370e322a9 in wsrep_mysql_parse (thd=0x14f430000db8, 
    rawbuf=0x14f4300141a0 "INSERT INTO `table20_int_autoinc` ( `pk`, `col_int_key`, `col_int`) VALUES ( NULL, 100, 100 ), ( NULL, 100, 100 )  /* QNO 443 CON_ID 26 */", length=138, parser_state=0x14f58094d320)
    at /test/mtest/MDEV-29684/10.6/sql/sql_parse.cc:7831
#17 0x0000556370e1ef38 in dispatch_command (command=COM_QUERY, thd=0x14f430000db8, 
    packet=0x14f43000b8c9 " INSERT INTO `table20_int_autoinc` ( `pk`, `col_int_key`, `col_int`) VALUES ( NULL, 100, 100 ), ( NULL, 100, 100 )  /* QNO 443 CON_ID 26 */ ", packet_length=140, blocking=true)
    at /test/mtest/MDEV-29684/10.6/sql/sql_parse.cc:1883
#18 0x0000556370e1d9e6 in do_command (thd=0x14f430000db8, blocking=true) at /test/mtest/MDEV-29684/10.6/sql/sql_parse.cc:1409
#19 0x0000556370fdfaef in do_handle_one_connection (connect=0x5563746820c8, put_in_cache=true) at /test/mtest/MDEV-29684/10.6/sql/sql_connect.cc:1416
#20 0x0000556370fdf84b in handle_one_connection (arg=0x5563746820c8) at /test/mtest/MDEV-29684/10.6/sql/sql_connect.cc:1318
#21 0x000055637151510b in pfs_spawn_thread (arg=0x5563740c4cb8) at /test/mtest/MDEV-29684/10.6/storage/perfschema/pfs.cc:2201
#22 0x000014f5e28ae609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#23 0x000014f5e247f133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95



 Comments   
Comment by Ramesh Sivaraman [ 2023-01-13 ]

Attached full stack and error log(wsrep-debug=1) node1.err bt_all.txt

Comment by Ramesh Sivaraman [ 2023-02-08 ]

seppo CTAS with system version table command crashes Galera secondary node with similar assertion.

CREATE TABLE t WITH SYSTEM VERSIONING AS SELECT 1 AS i;

Leads to

11.0.1 6252a281b5020d28685777c11ee27688a9db3d66 (Debug)

Core was generated by `/test/GAL_MD270123-mariadb-11.0.1-linux-x86_64-dbg/bin/mysqld --defaults-file=/'.
Program terminated with signal SIGABRT, Aborted.
#0  __pthread_kill (threadid=<optimized out>, signo=6)
    at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
[Current thread is 1 (Thread 0x1465bc0fa700 (LWP 3419933))]
(gdb) bt
#0  __pthread_kill (threadid=<optimized out>, signo=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
#1  0x000055f942ac83d0 in my_write_core (sig=6) at /test/11.0/mysys/stacktrace.c:424
#2  0x000055f942229ce8 in handle_fatal_signal (sig=6) at /test/11.0/sql/signal_handler.cc:357
#3  <signal handler called>
#4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#5  0x0000146609612859 in __GI_abort () at abort.c:79
#6  0x0000146609612729 in __assert_fail_base (fmt=0x1466097a8588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=0x55f9432463f0 "state() == s_executing || state() == s_preparing || state() == s_prepared || state() == s_must_abort || state() == s_aborting || state() == s_cert_failed || state() == s_must_replay", file=0x55f9432456b8 "/test/11.0/wsrep-lib/src/transaction.cpp", line=661, function=<optimized out>) at assert.c:92
#7  0x0000146609623fd6 in __GI___assert_fail (
    assertion=0x55f9432463f0 "state() == s_executing || state() == s_preparing || state() == s_prepared || state() == s_must_abort || state() == s_aborting || state() == s_cert_failed || state() == s_must_replay", file=0x55f9432456b8 "/test/11.0/wsrep-lib/src/transaction.cpp", line=661, function=0x55f9432463c0 "int wsrep::transaction::before_rollback()") at assert.c:101
#8  0x000055f942ba477f in wsrep::transaction::before_rollback (this=0x146594007950) at /test/11.0/wsrep-lib/src/transaction.cpp:661
#9  0x000055f942248d6b in wsrep::client_state::before_rollback (this=0x1465940078e8) at /test/11.0/wsrep-lib/include/wsrep/client_state.hpp:526
#10 0x000055f94222c17b in wsrep_before_rollback (thd=0x146594000ff8, all=true) at /test/11.0/sql/wsrep_trans_observer.h:404
#11 0x000055f9422312fa in ha_rollback_trans (thd=0x146594000ff8, all=true) at /test/11.0/sql/handler.cc:2185
#12 0x000055f942050389 in trans_rollback (thd=0x146594000ff8) at /test/11.0/sql/transaction.cc:372
#13 0x000055f9425cfcf0 in Wsrep_high_priority_service::rollback (this=0x1465bc0f9bc0, ws_handle=..., ws_meta=...) at /test/11.0/sql/wsrep_high_priority_service.cc:382
#14 0x000055f942b96335 in apply_write_set (server_state=..., high_priority_service=..., ws_handle=..., ws_meta=..., data=...) at /test/11.0/wsrep-lib/src/server_state.cpp:342
#15 0x000055f942b9a0f0 in wsrep::server_state::on_apply (this=0x55f9460924d0, high_priority_service=..., ws_handle=..., ws_meta=..., data=...) at /test/11.0/wsrep-lib/src/server_state.cpp:1119
#16 0x000055f942bb7c79 in wsrep::high_priority_service::apply (this=0x1465bc0f9bc0, ws_handle=..., ws_meta=..., data=...) at /test/11.0/wsrep-lib/include/wsrep/high_priority_service.hpp:47
#17 0x000055f942bb48bd in (anonymous namespace)::apply_cb (ctx=0x1465bc0f9bc0, wsh=0x1465bc0f8ba0, flags=65, buf=0x1465bc0f8bb0, meta=0x1465bc0f8e60, exit_loop=0x1465bc0f8e1f)
    at /test/11.0/wsrep-lib/src/wsrep_provider_v26.cpp:507
#18 0x0000146609091ee5 in galera::TrxHandleSlave::apply (this=this@entry=0x146594022390, recv_ctx=recv_ctx@entry=0x1465bc0f9bc0, 
    apply_cb=0x55f942bb468a <(anonymous namespace)::apply_cb(void*, wsrep_ws_handle_t const*, uint32_t, wsrep_buf_t const*, wsrep_trx_meta_t const*, wsrep_bool_t*)>, meta=..., 
    exit_loop=exit_loop@entry=@0x1465bc0f8e1f: false) at /test/galera_4x_dbg/galera/src/trx_handle.cpp:396
#19 0x00001466090a86ae in galera::ReplicatorSMM::apply_trx (this=0x55f9460a7d20, recv_ctx=0x1465bc0f9bc0, ts=...) at /test/galera_4x_dbg/galera/src/replicator_smm.cpp:516
#20 0x00001466090ac908 in galera::ReplicatorSMM::process_trx (this=0x55f9460a7d20, recv_ctx=0x1465bc0f9bc0, ts_ptr=...) at /test/galera_4x_dbg/galera/src/replicator_smm.cpp:2138
#21 0x00001466090de319 in galera::GcsActionSource::process_writeset (this=0x55f9460a0870, recv_ctx=0x1465bc0f9bc0, act=..., exit_loop=@0x1465bc0f979f: false)
    at /test/galera_4x_dbg/galera/src/gcs_action_source.cpp:62
#22 0x00001466090ded63 in galera::GcsActionSource::dispatch (this=0x55f9460a0870, recv_ctx=0x1465bc0f9bc0, act=..., exit_loop=<optimized out>) at /test/galera_4x_dbg/galera/src/gcs_action_source.cpp:110
#23 0x00001466090df322 in galera::GcsActionSource::process (this=0x55f9460a0870, recv_ctx=0x1465bc0f9bc0, exit_loop=@0x1465bc0f979f: false) at /test/galera_4x_dbg/galera/src/gcs_action_source.cpp:186
#24 0x00001466090a9d50 in galera::ReplicatorSMM::async_recv (this=0x55f9460a7d20, recv_ctx=0x1465bc0f9bc0) at /test/galera_4x_dbg/galera/src/replicator_smm.cpp:402
#25 0x000014660907e641 in galera_recv (gh=<optimized out>, recv_ctx=<optimized out>) at /test/galera_4x_dbg/galera/src/wsrep_provider.cpp:264
#26 0x000055f942bb5da4 in wsrep::wsrep_provider_v26::run_applier (this=0x55f945febee0, applier_ctx=0x1465bc0f9bc0) at /test/11.0/wsrep-lib/src/wsrep_provider_v26.cpp:858
#27 0x000055f9425f5ee7 in wsrep_replication_process (thd=0x146594000ff8, arg=0x55f94682c040) at /test/11.0/sql/wsrep_thd.cc:57
#28 0x000055f9425e4530 in start_wsrep_THD (arg=0x55f94682c040) at /test/11.0/sql/wsrep_mysqld.cc:3653
#29 0x000055f94254d0f9 in pfs_spawn_thread (arg=0x55f946825238) at /test/11.0/storage/perfschema/pfs.cc:2201
#30 0x0000146609b3e609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#31 0x000014660970f133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) 

Comment by Jan Lindström [ 2023-05-16 ]

I will split this to two MDEVs because second issue happens on applier and looks different so it will be handled on MDEV-31285.

Generated at Thu Feb 08 10:15:45 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.