[MDEV-25535] Assertion `server_id.is_undefined() == false' failed in int wsrep::transaction::certify_fragment(wsrep::unique_lock<wsrep::mutex>&) Created: 2021-04-27  Updated: 2022-02-22  Resolved: 2022-02-22

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.4, 10.5, 10.6
Fix Version/s: 10.4.23, 10.5.14, 10.6.6

Type: Bug Priority: Major
Reporter: Ramesh Sivaraman Assignee: Ramesh Sivaraman
Resolution: Fixed Votes: 0
Labels: not-10.2, not-10.3


 Description   

SET autocommit=OFF;
SET GLOBAL log_output=4;
CREATE TABLE t2 (user_str TEXT);
SET GLOBAL general_log=on;
INSERT INTO t2 VALUES (4978+0.75);
SET GLOBAL wsrep_cluster_address='';
SET SESSION wsrep_trx_fragment_size=1;
INSERT INTO t2 VALUES (10);
SAVEPOINT event_logging_1;
CREATE TABLE IF NOT EXISTS t3 (id INT) ENGINE=InnoDB;

Leads to:

10.4.19 57caff245cbdcbdfda6f022de7f354f05a790656 (Debug)

mysqld: /test/10.4_dbg/wsrep-lib/src/transaction.cpp:1486: int wsrep::transaction::certify_fragment(wsrep::unique_lock<wsrep::mutex>&): Assertion `server_id.is_undefined() == false' failed.

10.4.19 57caff245cbdcbdfda6f022de7f354f05a790656 (Debug)

Core was generated by `/test/GAL_MD200421-mariadb-10.4.19-linux-x86_64-dbg/bin/mysqld --defaults-file='.
Program terminated with signal SIGABRT, Aborted.
#0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6)
    at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
[Current thread is 1 (Thread 0x1465b80a0700 (LWP 3644910))]
(gdb) bt
#0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
#1  0x000055847f3b019d in my_write_core (sig=sig@entry=6) at /test/10.4_dbg/mysys/stacktrace.c:386
#2  0x000055847eab8cda in handle_fatal_signal (sig=6) at /test/10.4_dbg/sql/signal_handler.cc:356
#3  <signal handler called>
#4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#5  0x00001465cc139859 in __GI_abort () at abort.c:79
#6  0x00001465cc139729 in __assert_fail_base (fmt=0x1465cc2cf588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x55847f4e3378 "server_id.is_undefined() == false", file=0x55847f9da038 "/test/10.4_dbg/wsrep-lib/src/transaction.cpp", line=1486, function=<optimized out>) at assert.c:92
#7  0x00001465cc14af36 in __GI___assert_fail (assertion=assertion@entry=0x55847f4e3378 "server_id.is_undefined() == false", file=file@entry=0x55847f9da038 "/test/10.4_dbg/wsrep-lib/src/transaction.cpp", line=line@entry=1486, function=function@entry=0x55847f9daba0 "int wsrep::transaction::certify_fragment(wsrep::unique_lock<wsrep::mutex>&)") at assert.c:101
#8  0x000055847f44cf76 in wsrep::transaction::certify_fragment (this=this@entry=0x146544006fc8, lock=@0x1465b809e4e0: {_M_device = 0x146544006f28, _M_owns = false}) at /test/10.4_dbg/wsrep-lib/src/transaction.cpp:1487
#9  0x000055847f44da1b in wsrep::transaction::streaming_step (this=this@entry=0x146544006fc8, lock=@0x1465b809e4e0: {_M_device = 0x146544006f28, _M_owns = false}, force=force@entry=false) at /test/10.4_dbg/wsrep-lib/src/transaction.cpp:1391
#10 0x000055847f44dac7 in wsrep::transaction::after_row (this=this@entry=0x146544006fc8) at /test/10.4_dbg/wsrep-lib/src/transaction.cpp:265
#11 0x000055847eabbbd3 in wsrep::client_state::after_row (this=0x146544006f60) at /test/10.4_dbg/wsrep-lib/include/wsrep/streaming_context.hpp:94
#12 wsrep_after_row (thd=0x146544000d90) at /test/10.4_dbg/sql/wsrep_trans_observer.h:172
#13 wsrep_after_row (thd=0x146544000d90) at /test/10.4_dbg/sql/handler.cc:6568
#14 0x000055847eacb95f in handler::ha_write_row (this=0x14654439f698, buf=0x1465440380f8 "`\207\262\317\001\265\220\031") at /test/10.4_dbg/sql/handler.cc:6766
#15 0x000055847ec102b3 in Log_to_csv_event_handler::log_general (this=<optimized out>, thd=0x146544000d90, event_time=<optimized out>, user_host=0x1465b809f0e0 "root[root] @ localhost []", user_host_len=25, thread_id_arg=17, command_type=0x55847f48c7d7 "Query", command_type_len=5, sql_text=0x146544013228 "CREATE TABLE IF NOT EXISTS t3 (id INT) ENGINE=InnoDB", sql_text_len=52, client_cs=0x55847fd4a6c0 <my_charset_utf8_general_ci>) at /test/10.4_dbg/sql/log.cc:807
#16 0x000055847ec12a2b in LOGGER::general_log_write (this=this@entry=0x55847fd8c4a0 <logger>, thd=thd@entry=0x146544000d90, command=command@entry=COM_QUERY, query=query@entry=0x146544013228 "CREATE TABLE IF NOT EXISTS t3 (id INT) ENGINE=InnoDB", query_length=query_length@entry=52) at /test/10.4_dbg/sql/log.cc:1400
#17 0x000055847ec12c90 in general_log_write (thd=thd@entry=0x146544000d90, command=command@entry=COM_QUERY, query=0x146544013228 "CREATE TABLE IF NOT EXISTS t3 (id INT) ENGINE=InnoDB", query_length=52) at /test/10.4_dbg/sql/log.cc:6677
#18 0x000055847e7ef330 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x146544000d90, packet=packet@entry=0x14654401b661 "CREATE TABLE IF NOT EXISTS t3 (id INT) ENGINE=InnoDB", packet_length=packet_length@entry=52, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /test/10.4_dbg/sql/sql_class.h:1183
#19 0x000055847e7f2cde in do_command (thd=0x146544000d90) at /test/10.4_dbg/sql/sql_parse.cc:1373
#20 0x000055847e92d88f in do_handle_one_connection (connect=connect@entry=0x558482bd5db0) at /test/10.4_dbg/sql/sql_connect.cc:1412
#21 0x000055847e92d9ae in handle_one_connection (arg=0x558482bd5db0) at /test/10.4_dbg/sql/sql_connect.cc:1316
#22 0x00001465cc6f6609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#23 0x00001465cc236293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Bug confirmed present in:
MariaDB: 10.4.19 (dbg)



 Comments   
Comment by Ramesh Sivaraman [ 2021-05-04 ]

Found similar crash with slightly different stack. The testcase crashes 10.5 and 10.6

CREATE TABLE t1 AS SELECT 1 AS c1;
SET GLOBAL wsrep_ignore_apply_errors=0;
SET SESSION wsrep_trx_fragment_size=1;
SET SESSION wsrep_trx_fragment_unit='statements';
SET AUTOCOMMIT=0;
CREATE TABLE t1 (id INT);
SAVEPOINT SVP001;

Leads to:

10.6.0 8751aa7397b2e698fa0b46ec3e60abb9e2fd7e1b (Debug)

mysqld: /test/10.6_dbg/wsrep-lib/src/transaction.cpp:1486: int wsrep::transaction::certify_fragment(wsrep::unique_lock<wsrep::mutex>&): Assertion `server_id.is_undefined() == false' failed.

10.6.0 8751aa7397b2e698fa0b46ec3e60abb9e2fd7e1b (Debug)

Core was generated by `/test/GAL_MD200421-mariadb-10.6.0-linux-x86_64-dbg/bin/mysqld --defaults-file=/'.
Program terminated with signal SIGABRT, Aborted.
#0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6)
    at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
[Current thread is 1 (Thread 0x1525a811f700 (LWP 3055503))]
(gdb) bt
#0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
#1  0x0000558ccd90670f in my_write_core (sig=sig@entry=6) at /test/10.6_dbg/mysys/stacktrace.c:424
#2  0x0000558ccd09bc86 in handle_fatal_signal (sig=6) at /test/10.6_dbg/sql/signal_handler.cc:343
#3  <signal handler called>
#4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#5  0x00001525dc90d859 in __GI_abort () at abort.c:79
#6  0x00001525dc90d729 in __assert_fail_base (fmt=0x1525dcaa3588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x558ccdcafbf0 "server_id.is_undefined() == false", file=0x558ccdfff3f8 "/test/10.6_dbg/wsrep-lib/src/transaction.cpp", line=1486, function=<optimized out>) at assert.c:92
#7  0x00001525dc91ef36 in __GI___assert_fail (assertion=assertion@entry=0x558ccdcafbf0 "server_id.is_undefined() == false", file=file@entry=0x558ccdfff3f8 "/test/10.6_dbg/wsrep-lib/src/transaction.cpp", line=line@entry=1486, function=function@entry=0x558ccdffff60 "int wsrep::transaction::certify_fragment(wsrep::unique_lock<wsrep::mutex>&)") at assert.c:101
#8  0x0000558ccda34b04 in wsrep::transaction::certify_fragment (this=this@entry=0x1525640073f8, lock=@0x1525a811e1e0: {_M_device = 0x152564007358, _M_owns = false}) at /test/10.6_dbg/wsrep-lib/src/transaction.cpp:1487
#9  0x0000558ccda355a9 in wsrep::transaction::streaming_step (this=this@entry=0x1525640073f8, lock=@0x1525a811e1e0: {_M_device = 0x152564007358, _M_owns = false}, force=force@entry=false) at /test/10.6_dbg/wsrep-lib/src/transaction.cpp:1391
#10 0x0000558ccda37165 in wsrep::transaction::after_statement (this=this@entry=0x1525640073f8) at /test/10.6_dbg/wsrep-lib/src/transaction.cpp:822
#11 0x0000558ccda1df1a in wsrep::client_state::after_statement (this=this@entry=0x152564007390) at /test/10.6_dbg/wsrep-lib/src/client_state.cpp:276
#12 0x0000558cccdc4c84 in wsrep_after_statement (thd=0x152564000db8) at /test/10.6_dbg/sql/sql_class.h:5331
#13 wsrep_mysql_parse (thd=thd@entry=0x152564000db8, rawbuf=0x152564013bb0 "SAVEPOINT SVP001", length=16, parser_state=parser_state@entry=0x1525a811e410) at /test/10.6_dbg/sql/sql_parse.cc:7861
#14 0x0000558cccdd426b in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x152564000db8, packet=packet@entry=0x15256400b369 "SAVEPOINT SVP001", packet_length=packet_length@entry=16, blocking=blocking@entry=true) at /test/10.6_dbg/sql/sql_class.h:1331
#15 0x0000558cccdd7653 in do_command (thd=0x152564000db8, blocking=blocking@entry=true) at /test/10.6_dbg/sql/sql_parse.cc:1406
#16 0x0000558cccf3226d in do_handle_one_connection (connect=<optimized out>, connect@entry=0x558ccfb55078, put_in_cache=put_in_cache@entry=true) at /test/10.6_dbg/sql/sql_connect.cc:1410
#17 0x0000558cccf32875 in handle_one_connection (arg=arg@entry=0x558ccfb55078) at /test/10.6_dbg/sql/sql_connect.cc:1312
#18 0x0000558ccd3e32f7 in pfs_spawn_thread (arg=0x558ccfbcc9a8) at /test/10.6_dbg/storage/perfschema/pfs.cc:2201
#19 0x00001525dce1b609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#20 0x00001525dca0a293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Comment by Daniele Sciascia [ 2022-02-10 ]

Looks like a duplicate of MDEV-27615, which was recently fixed in 10.4.
Assigning back to Ramesh to verify that this has been fixed and close the issue.

Comment by Ramesh Sivaraman [ 2022-02-14 ]

sciascid Cluster becomes unstable when the following test case executed on the bug fix build. Is this expected?

CREATE TABLE t1 AS SELECT 1 AS c1;
SET GLOBAL wsrep_ignore_apply_errors=0;
SET SESSION wsrep_trx_fragment_size=1;
SET SESSION wsrep_trx_fragment_unit='statements';
SET AUTOCOMMIT=0;
CREATE TABLE t1 (id INT);

Error log

10.8.3 8274c207dfd1bdac10e71f17ce3acccf330c8db6 (Debug)

2022-02-14  7:51:24 12 [Note] WSREP: wsrep_thd_binlog_reset
2022-02-14  7:51:24 12 [Note] WSREP: TOI Begin: CREATE TABLE t1 (id INT)
2022-02-14  7:51:24 12 [Note] WSREP: wsrep_TOI_begin for CREATE TABLE t1 (id INT)
2022-02-14  7:51:24 12 [Note] WSREP: mysql_execute_command for CREATE TABLE t1 (id INT)
2022-02-14  7:51:24 12 [Note] WSREP: TO END: 5: CREATE TABLE t1 (id INT)
2022-02-14  7:51:24 12 [Note] WSREP: Set WSREPXid for InnoDB:  f4b1f97d-8d59-11ec-bb9c-3b648b89cc2b:5
2022-02-14  7:51:24 12 [Note] WSREP: Error buffer for thd 12 seqno 5, 46 bytes: ' Table 't1' already exists, Error_code: 1050;'
2022-02-14  7:51:24 0 [Note] WSREP: Member 0(galapq) initiates vote on f4b1f97d-8d59-11ec-bb9c-3b648b89cc2b:5,9dfcfc5408ec1bed:  Table 't1' already exists, Error_code: 1050;
2022-02-14  7:51:24 0 [Note] WSREP: Votes over f4b1f97d-8d59-11ec-bb9c-3b648b89cc2b:5:
   9dfcfc5408ec1bed:   1/2
Waiting for more votes.
2022-02-14  7:51:24 0 [Note] WSREP: Member 1(galapq) responds to vote on f4b1f97d-8d59-11ec-bb9c-3b648b89cc2b:5,0000000000000000: Success
2022-02-14  7:51:24 0 [Note] WSREP: Votes over f4b1f97d-8d59-11ec-bb9c-3b648b89cc2b:5:
   0000000000000000:   1/2
   9dfcfc5408ec1bed:   1/2
Winner: 0000000000000000
2022-02-14  7:51:24 12 [ERROR] WSREP: Inconsistency detected: Inconsistent by consensus on f4b1f97d-8d59-11ec-bb9c-3b648b89cc2b:5
	 at /test/mtest/10.8_galera/galera/src/replicator_smm.cpp:process_apply_error():1355
2022-02-14  7:51:24 12 [Note] WSREP: Closing send monitor...
2022-02-14  7:51:24 12 [Note] WSREP: Closed send monitor.
2022-02-14  7:51:24 12 [Note] WSREP: gcomm: terminating thread
2022-02-14  7:51:24 12 [Note] WSREP: gcomm: joining thread
2022-02-14  7:51:24 12 [Note] WSREP: gcomm: closing backend
2022-02-14  7:51:25 12 [Note] WSREP: view(view_id(NON_PRIM,f4b166c8-b3f1,2) memb {
	f4b166c8-b3f1,0
} joined {
} left {
} partitioned {
	f54eee10-ac91,0
})
2022-02-14  7:51:25 12 [Note] WSREP: PC protocol downgrade 1 -> 0
2022-02-14  7:51:25 12 [Note] WSREP: view((empty))
2022-02-14  7:51:25 12 [Note] WSREP: gcomm: closed
2022-02-14  7:51:25 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2022-02-14  7:51:25 0 [Note] WSREP: Flow-control interval: [16, 16]
2022-02-14  7:51:25 0 [Note] WSREP: Received NON-PRIMARY.
2022-02-14  7:51:25 0 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 5)
2022-02-14  7:51:25 0 [Note] WSREP: New SELF-LEAVE.
2022-02-14  7:51:25 0 [Note] WSREP: Flow-control interval: [0, 0]
2022-02-14  7:51:25 0 [Note] WSREP: Received SELF-LEAVE. Closing connection.
2022-02-14  7:51:25 0 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 5)
2022-02-14  7:51:25 0 [Note] WSREP: RECV thread exiting 0: Success

Comment by Daniele Sciascia [ 2022-02-22 ]

Setting wsrep_ignore_apply_errors=0 causes node 1 to initiate a error voting round when the second attempt at creating t1 fails (t1 already exists, and the last CREATE TABLE fails, as expected). When CREATE TABLE is replicated to node 2 it also finds an error, but it is ignored, because node 2 has default value of wsrep_ignore_apply_errors. During error voting the two nodes do not agree and the outcome of CREATE TABLE, and therefore one of them drops out of the cluster. This is expected, and unrelated to the issue in this ticket.

Comment by Daniele Sciascia [ 2022-02-22 ]

Assigning back to Ramesh, to verify / close the ticket.

Comment by Ramesh Sivaraman [ 2022-02-22 ]

sciascidthanks for the info, closing this issue as fixed.

Generated at Thu Feb 08 09:38:26 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.