[MDEV-30550] Assertion `state() == s_executing || state() == s_prepared || state() == s_committed || state() == s_aborted || state() == s_must_replay' failed Created: 2023-02-02  Updated: 2024-01-03

Status: Confirmed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.4, 10.5, 10.6
Fix Version/s: 10.4, 10.5, 10.6

Type: Bug Priority: Major
Reporter: Ramesh Sivaraman Assignee: Daniele Sciascia
Resolution: Unresolved Votes: 0
Labels: None

Attachments: n1.cnf, n2.cnf
Issue Links:
Relates
relates to MDEV-24981 LOAD INDEX may cause rollback of prep... Open
relates to MDEV-33129 Crash in wsrep::wsrep_provider_v26::r... Open

 Description   

CREATE TABLE t1 (c1 BIGINT NOT NULL);
INSERT INTO t1  (c1) VALUES(10);
SET GLOBAL wsrep_on = OFF;
XA START 't';
SET @@session.query_prealloc_size   = 0;
SET SESSION max_session_mem_used = 8192;
LOAD INDEX INTO CACHE t1 IGNORE LEAVES;
SET SESSION wsrep_dirty_reads=1;
SET GLOBAL wsrep_on = TRUE;
SET SESSION wsrep_trx_fragment_unit = 'statements';
SET SESSION wsrep_trx_fragment_size = 3;
SET GLOBAL wsrep_cluster_address='gcomm://';
SAVEPOINT my_sp;
SELECT 1;
CREATE TABLE tbl2(c1 VARCHAR(20)) engine=InnoDB;

Leads to:

10.6.12 (Debug)

mysqld: /test/10.6_dbg/wsrep-lib/src/transaction.cpp:883: int wsrep::transaction::after_statement(): Assertion `state() == s_executing || state() == s_prepared || state() == s_committed || state() == s_aborted || state() == s_must_replay' failed.

10.6.12 (Debug)

Core was generated by `/test/GAL_MD270123-mariadb-10.6.12-linux-x86_64-dbg/bin/mysqld --defaults-file='.
Program terminated with signal SIGABRT, Aborted.
#0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6)
    at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
[Current thread is 1 (Thread 0x14d2b9440700 (LWP 366714))]
(gdb) bt
#0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
#1  0x000055ab79c41987 in my_write_core (sig=sig@entry=6) at /test/10.6_dbg/mysys/stacktrace.c:424
#2  0x000055ab794f43d7 in handle_fatal_signal (sig=6) at /test/10.6_dbg/sql/signal_handler.cc:357
#3  <signal handler called>
#4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#5  0x000014d2ef249859 in __GI_abort () at abort.c:79
#6  0x000014d2ef249729 in __assert_fail_base (fmt=0x14d2ef3df588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x55ab7a31ae98 "state() == s_executing || state() == s_prepared || state() == s_committed || state() == s_aborted || state() == s_must_replay", file=0x55ab7a319698 "/test/10.6_dbg/wsrep-lib/src/transaction.cpp", line=883, function=<optimized out>) at assert.c:92
#7  0x000014d2ef25afd6 in __GI___assert_fail (assertion=assertion@entry=0x55ab7a31ae98 "state() == s_executing || state() == s_prepared || state() == s_committed || state() == s_aborted || state() == s_must_replay", file=file@entry=0x55ab7a319698 "/test/10.6_dbg/wsrep-lib/src/transaction.cpp", line=line@entry=883, function=function@entry=0x55ab7a31adb0 "int wsrep::transaction::after_statement()") at assert.c:101
#8  0x000055ab79da1db7 in wsrep::transaction::after_statement (this=this@entry=0x14d284007340) at /test/10.6_dbg/wsrep-lib/include/wsrep/transaction.hpp:64
#9  0x000055ab79d87b7a in wsrep::client_state::after_statement (this=this@entry=0x14d2840072d8) at /test/10.6_dbg/wsrep-lib/src/client_state.cpp:281
#10 0x000055ab7922e3c2 in wsrep_after_statement (thd=0x14d284000d48) at /test/10.6_dbg/sql/sql_class.h:5429
#11 wsrep_mysql_parse (thd=thd@entry=0x14d284000d48, rawbuf=0x14d2840844f0 "CREATE TABLE tbl2(c1 VARCHAR(20)) engine=InnoDB", length=47, parser_state=parser_state@entry=0x14d2b943f310) at /test/10.6_dbg/sql/sql_parse.cc:7861
#12 0x000055ab7923c200 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x14d284000d48, packet=packet@entry=0x14d28400ac79 "CREATE TABLE tbl2(c1 VARCHAR(20)) engine=InnoDB", packet_length=packet_length@entry=47, blocking=blocking@entry=true) at /test/10.6_dbg/sql/sql_class.h:1365
#13 0x000055ab7923e62c in do_command (thd=0x14d284000d48, blocking=blocking@entry=true) at /test/10.6_dbg/sql/sql_parse.cc:1409
#14 0x000055ab7938320a in do_handle_one_connection (connect=<optimized out>, connect@entry=0x55ab7c3b4e28, put_in_cache=put_in_cache@entry=true) at /test/10.6_dbg/sql/sql_connect.cc:1416
#15 0x000055ab793836dc in handle_one_connection (arg=0x55ab7c3b4e28) at /test/10.6_dbg/sql/sql_connect.cc:1318
#16 0x000014d2ef75a609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#17 0x000014d2ef346133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

10.5.19 (Debug)

mysqld: /test/10.5_dbg/wsrep-lib/src/transaction.cpp:883: int wsrep::transaction::after_statement(): Assertion `state() == s_executing || state() == s_prepared || state() == s_committed || state() == s_aborted || state() == s_must_replay' failed.

10.5.19 (Debug)

Core was generated by `/test/GAL_MD270123-mariadb-10.5.19-linux-x86_64-dbg/bin/mysqld --defaults-file='.
Program terminated with signal SIGABRT, Aborted.
#0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6)
    at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
[Current thread is 1 (Thread 0x147b8415a700 (LWP 308211))]
(gdb) bt
#0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
#1  0x000055c9e263d22c in my_write_core (sig=sig@entry=6) at /test/10.5_dbg/mysys/stacktrace.c:424
#2  0x000055c9e1e7d6c5 in handle_fatal_signal (sig=6) at /test/10.5_dbg/sql/signal_handler.cc:356
#3  <signal handler called>
#4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#5  0x0000147ba7cdc859 in __GI_abort () at abort.c:79
#6  0x0000147ba7cdc729 in __assert_fail_base (fmt=0x147ba7e72588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x55c9e2d0a898 "state() == s_executing || state() == s_prepared || state() == s_committed || state() == s_aborted || state() == s_must_replay", file=0x55c9e2d09098 "/test/10.5_dbg/wsrep-lib/src/transaction.cpp", line=883, function=<optimized out>) at assert.c:92
#7  0x0000147ba7cedfd6 in __GI___assert_fail (assertion=assertion@entry=0x55c9e2d0a898 "state() == s_executing || state() == s_prepared || state() == s_committed || state() == s_aborted || state() == s_must_replay", file=file@entry=0x55c9e2d09098 "/test/10.5_dbg/wsrep-lib/src/transaction.cpp", line=line@entry=883, function=function@entry=0x55c9e2d0a7b0 "int wsrep::transaction::after_statement()") at assert.c:101
#8  0x000055c9e2799b1f in wsrep::transaction::after_statement (this=this@entry=0x147af4007070) at /test/10.5_dbg/wsrep-lib/include/wsrep/transaction.hpp:64
#9  0x000055c9e277f8e2 in wsrep::client_state::after_statement (this=this@entry=0x147af4007008) at /test/10.5_dbg/wsrep-lib/src/client_state.cpp:281
#10 0x000055c9e1bd951a in wsrep_after_statement (thd=0x147af4000d48) at /test/10.5_dbg/sql/sql_class.h:5174
#11 wsrep_mysql_parse (thd=thd@entry=0x147af4000d48, rawbuf=0x147af4082ab0 "CREATE TABLE tbl2(c1 VARCHAR(20)) engine=InnoDB", length=47, parser_state=parser_state@entry=0x147b84159310, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /test/10.5_dbg/sql/sql_parse.cc:7920
#12 0x000055c9e1be779c in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x147af4000d48, packet=packet@entry=0x147af400a9a9 "CREATE TABLE tbl2(c1 VARCHAR(20)) engine=InnoDB", packet_length=packet_length@entry=47, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /test/10.5_dbg/sql/sql_class.h:1297
#13 0x000055c9e1be9feb in do_command (thd=0x147af4000d48) at /test/10.5_dbg/sql/sql_parse.cc:1375
#14 0x000055c9e1d2566b in do_handle_one_connection (connect=<optimized out>, connect@entry=0x55c9e5c49e68, put_in_cache=put_in_cache@entry=true) at /test/10.5_dbg/sql/sql_connect.cc:1416
#15 0x000055c9e1d25b3c in handle_one_connection (arg=0x55c9e5c49e68) at /test/10.5_dbg/sql/sql_connect.cc:1318
#16 0x0000147ba81ed609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#17 0x0000147ba7dd9133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
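
For readers unfamiliar with the check, the assertion that fails in wsrep::transaction::after_statement() accepts only five transaction states. Below is a minimal C++ sketch of that predicate. It uses only the state names that appear in the assertion text. The enum layout, the function name allowed_after_statement and the placeholder s_other are hypothetical and are not taken from the wsrep-lib sources.

```cpp
// Illustrative sketch only, not wsrep-lib code.
// Enumerator names are copied from the assertion message in this report.
enum class trx_state
{
    s_executing,
    s_prepared,
    s_committed,
    s_aborted,
    s_must_replay,
    s_other // hypothetical placeholder for any state the assertion does not list
};

// Returns true when the state is one of the five the assertion accepts.
constexpr bool allowed_after_statement(trx_state s)
{
    return s == trx_state::s_executing || s == trx_state::s_prepared ||
           s == trx_state::s_committed || s == trx_state::s_aborted ||
           s == trx_state::s_must_replay;
}

// Compile-time checks: a listed state passes, an unlisted one does not.
static_assert(allowed_after_statement(trx_state::s_must_replay),
              "listed state is accepted");
static_assert(!allowed_after_statement(trx_state::s_other),
              "unlisted state is rejected");
```

In the crash above, the transaction reached the end of the CREATE TABLE statement in a state outside this set, so the debug build aborted. The report does not say which state that was.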



 Comments   
Comment by Ramesh Sivaraman [ 2023-02-03 ]

Attached the cnf files from the Galera nodes: n1.cnf and n2.cnf.

Comment by Daniele Sciascia [ 2023-03-09 ]

I believe that the root cause of this issue is related to MDEV-24981.
If we simplify the test a little:

CREATE TABLE t1 (c1 BIGINT NOT NULL);
INSERT INTO t1 VALUES(10);
SET GLOBAL wsrep_on = OFF;
XA START 't';
SET @@session.query_prealloc_size   = 0;
Warnings:
Warning 1292    Truncated incorrect query_prealloc_size value: '0'
SET SESSION max_session_mem_used = 8192;
LOAD INDEX INTO CACHE t1 IGNORE LEAVES;
Table   Op      Msg_type        Msg_text
test.t1 preload_keys    Error   The MariaDB server is running with the --max-session-mem-used=8192 option so it cannot execute this statement
test.t1 preload_keys    status  Operation failed
SELECT @@session.in_transaction;
@@session.in_transaction
0
CREATE TABLE t2 (f1 INTEGER);
galera.MDEV-30550 'innodb'               [ fail ]
        Test ended at 2023-03-09 10:21:25
 
CURRENT_TEST: galera.MDEV-30550
mysqltest: At line 33: query 'CREATE TABLE t2 (f1 INTEGER)' failed: 1399: XAER_RMFAIL: The command cannot be executed when global transaction is in the  ACTIVE state

After the LOAD INDEX statement, in_transaction is 0, which means that the transaction started by XA START was rolled back. However, the final CREATE TABLE fails with XAER_RMFAIL, which shows that the XA transaction is still in the ACTIVE state. These two observations contradict each other, so the transaction was most likely not rolled back correctly. We already know of issues related to LOAD INDEX and the rollback of XA transactions (MDEV-24981). I would wait for MDEV-24981 to be fixed before having a look at this.

Comment by Daniele Sciascia [ 2023-03-09 ]

ramesh, I wonder whether you should avoid the combination of Galera + XA + LOAD INDEX in your testing. We already know that XA + LOAD INDEX does not work properly, even without Galera (MDEV-24981).
Many tickets involving XA + LOAD INDEX + Galera have been created already, and in my opinion more tickets with the same combination are not useful at this point.
First, make sure that XA + LOAD INDEX works correctly. Once it does, we can resume testing Galera + XA + LOAD INDEX.

Comment by Ramesh Sivaraman [ 2023-03-09 ]

sciascid Sure, I will remove the combination of Galera + XA + LOAD INDEX from QA runs.
