[MDEV-27317] Assertion `xid_seqno > wsrep_seqno' failed in void trx_rseg_update_wsrep_checkpoint(buf_block_t*, const XID*, mtr_t*) Created: 2021-12-20  Updated: 2023-03-31  Resolved: 2023-03-31

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.4.28, 10.6, 10.8
Fix Version/s: 11.1.0, 10.11.3, 10.4.29, 10.5.20, 10.6.13, 10.8.8, 10.9.6, 10.10.4

Type: Bug Priority: Major
Reporter: Ramesh Sivaraman Assignee: Julius Goryavsky
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-29086 Assertion `xid_seqno > wsrep_seqno' f... Open

 Description   

10.8.0 a44ba7aa3b5de62b08ebcf46bea2b7da17587e7b

Core was generated by `/test/mtest/GAL_MD161221-mariadb-10.8.0-linux-x86_64-dbg-uca/bin/mysqld --defau'.
Program terminated with signal SIGABRT, Aborted.
#0  0x0000000070000002 in ?? ()
[Current thread is 1 (Thread 0x5d0513986700 (LWP 3782116))]
(gdb) bt
#0  0x0000000070000002 in ?? ()
#1  0x00003aa02eed2766 in _raw_syscall () at /home/roc/rr/rr/src/preload/raw_syscall.S:120
#2  0x00003aa02eece04e in traced_raw_syscall (call=<optimized out>) at /home/roc/rr/rr/src/preload/syscallbuf.c:272
#3  0x00003aa02eed14d1 in syscall_hook_internal (call=0x14ac97500fa0) at /home/roc/rr/rr/src/preload/syscallbuf.c:3295
#4  syscall_hook (call=0x14ac97500fa0) at /home/roc/rr/rr/src/preload/syscallbuf.c:3329
#5  0x00003aa02eecde50 in _syscall_hook_trampoline () at /home/roc/rr/rr/src/preload/syscall_hook.S:313
#6  0x00003aa02eecdeaf in __morestack () at /home/roc/rr/rr/src/preload/syscall_hook.S:458
#7  0x00003aa02eecdf08 in _syscall_hook_trampoline_89_c2_f7_da () at /home/roc/rr/rr/src/preload/syscall_hook.S:504
#8  0x000062901f855f0c in __pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
#9  0x0000564a73981a6f in my_write_core (sig=sig@entry=6) at /test/mtest/10.8_dbg/mysys/stacktrace.c:424
#10 0x0000564a7315e4a5 in handle_fatal_signal (sig=6) at /test/mtest/10.8_dbg/sql/signal_handler.cc:345
#11 <signal handler called>
#12 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#13 0x0000564a724d7859 in __GI_abort () at abort.c:79
#14 0x0000564a724d7729 in __assert_fail_base (fmt=0x564a7266d588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x564a73de4556 "xid_seqno > wsrep_seqno", 
    file=0x564a73de4648 "/test/mtest/10.8_dbg/storage/innobase/trx/trx0rseg.cc", line=124, function=<optimized out>) at assert.c:92
#15 0x0000564a724e8f36 in __GI___assert_fail (assertion=assertion@entry=0x564a73de4556 "xid_seqno > wsrep_seqno", file=file@entry=0x564a73de4648 "/test/mtest/10.8_dbg/storage/innobase/trx/trx0rseg.cc", 
    line=line@entry=124, function=function@entry=0x564a73de4800 "void trx_rseg_update_wsrep_checkpoint(buf_block_t*, const XID*, mtr_t*)") at assert.c:101
#16 0x0000564a737494fb in trx_rseg_update_wsrep_checkpoint (rseg_header=rseg_header@entry=0x7ffdcc5cd430, xid=xid@entry=0x51575da612a0, mtr=mtr@entry=0x5d0513983700)
    at /test/mtest/10.8_dbg/storage/innobase/trx/trx0rseg.cc:124
#17 0x0000564a73729a70 in trx_purge_add_undo_to_history (trx=trx@entry=0x51575da601e0, undo=@0x51575da61238: 0x145d502dc968, mtr=mtr@entry=0x5d0513983700)
    at /test/mtest/10.8_dbg/storage/innobase/trx/trx0purge.cc:303
#18 0x0000564a73758149 in trx_write_serialisation_history (mtr=0x5d0513983700, trx=0x51575da601e0) at /test/mtest/10.8_dbg/storage/innobase/trx/trx0trx.cc:1049
#19 trx_t::commit_low (this=this@entry=0x51575da601e0, mtr=0x5d0513983700) at /test/mtest/10.8_dbg/storage/innobase/trx/trx0trx.cc:1424
#20 0x0000564a73759baf in trx_t::commit_persist (this=this@entry=0x51575da601e0) at /test/mtest/10.8_dbg/storage/innobase/trx/trx0trx.cc:1462
#21 0x0000564a73759d81 in trx_t::commit (this=this@entry=0x51575da601e0) at /test/mtest/10.8_dbg/storage/innobase/trx/trx0trx.cc:1468
#22 0x0000564a73759fbe in trx_commit_for_mysql (trx=trx@entry=0x51575da601e0) at /test/mtest/10.8_dbg/storage/innobase/trx/trx0trx.cc:1584
#23 0x0000564a7350efeb in innobase_commit_low (trx=trx@entry=0x51575da601e0) at /test/mtest/10.8_dbg/storage/innobase/handler/ha_innodb.cc:4308
#24 0x0000564a7350f06c in innobase_commit_ordered_2 (trx=trx@entry=0x51575da601e0, thd=thd@entry=0x145d50000ff8) at /test/mtest/10.8_dbg/storage/innobase/handler/ha_innodb.cc:4414
#25 0x0000564a7350f351 in innobase_commit (hton=<optimized out>, thd=0x145d50000ff8, commit_trx=<optimized out>) at /test/mtest/10.8_dbg/storage/innobase/handler/ha_innodb.cc:4515
#26 0x0000564a73163f00 in commit_one_phase_2 (thd=thd@entry=0x145d50000ff8, all=all@entry=true, trans=trans@entry=0x145d50004978, is_real_trans=is_real_trans@entry=true)
    at /test/mtest/10.8_dbg/sql/handler.cc:2052
#27 0x0000564a73164219 in ha_commit_one_phase (thd=thd@entry=0x145d50000ff8, all=all@entry=true) at /test/mtest/10.8_dbg/sql/handler.cc:2005
#28 0x0000564a731784c7 in ha_commit_trans (thd=thd@entry=0x145d50000ff8, all=all@entry=true) at /test/mtest/10.8_dbg/sql/handler.cc:1799
#29 0x0000564a72ff01b8 in trans_commit (thd=thd@entry=0x145d50000ff8) at /test/mtest/10.8_dbg/sql/transaction.cc:266
#30 0x0000564a734c790c in Wsrep_high_priority_service::commit (this=0x5d0513985d10, ws_handle=..., ws_meta=...) at /test/mtest/10.8_dbg/sql/wsrep_high_priority_service.cc:329
#31 0x0000564a73adab6a in apply_write_set (server_state=..., high_priority_service=..., ws_handle=..., ws_meta=..., data=...) at /test/mtest/10.8_dbg/wsrep-lib/src/server_state.cpp:332
#32 0x0000564a73adba8b in wsrep::server_state::on_apply (this=<optimized out>, high_priority_service=..., ws_handle=..., ws_meta=..., data=...) at /test/mtest/10.8_dbg/wsrep-lib/src/server_state.cpp:1145
#33 0x0000564a73aec215 in wsrep::high_priority_service::apply (data=..., ws_meta=..., ws_handle=..., this=0x5d0513985d10) at /test/mtest/10.8_dbg/wsrep-lib/include/wsrep/high_priority_service.hpp:47
#34 (anonymous namespace)::apply_cb (ctx=ctx@entry=0x5d0513985d10, wsh=wsh@entry=0x5d0513984d20, flags=flags@entry=65, buf=buf@entry=0x5d0513984d30, meta=meta@entry=0x5d0513984fe0, 
    exit_loop=exit_loop@entry=0x5d0513984f9f) at /test/mtest/10.8_dbg/wsrep-lib/src/wsrep_provider_v26.cpp:504
#35 0x000062901f5fa3b5 in galera::TrxHandleSlave::apply (this=this@entry=0x7d8f6c4ef7a0, recv_ctx=recv_ctx@entry=0x5d0513985d10, 
    apply_cb=0x564a73aec092 <(anonymous namespace)::apply_cb(void*, wsrep_ws_handle_t const*, uint32_t, wsrep_buf_t const*, wsrep_trx_meta_t const*, wsrep_bool_t*)>, meta=..., 
    exit_loop=exit_loop@entry=@0x5d0513984f9f: false) at /test/mtest/10.8_galera_dbg/galera/src/trx_handle.cpp:396
#36 0x000062901f610d9e in galera::ReplicatorSMM::apply_trx (this=0x564a75840d40, recv_ctx=0x5d0513985d10, ts=...) at /test/mtest/10.8_galera_dbg/galera/src/replicator_smm.cpp:508
#37 0x000062901f614fb8 in galera::ReplicatorSMM::process_trx (this=0x564a75840d40, recv_ctx=0x5d0513985d10, ts_ptr=...) at /test/mtest/10.8_galera_dbg/galera/src/replicator_smm.cpp:2128
#38 0x000062901f646259 in galera::GcsActionSource::process_writeset (this=0x564a7584c300, recv_ctx=0x5d0513985d10, act=..., exit_loop=@0x5d051398591f: false)
--Type <RET> for more, q to quit, c to continue without paging--
    at /test/mtest/10.8_galera_dbg/galera/src/gcs_action_source.cpp:62
#39 0x000062901f646cb3 in galera::GcsActionSource::dispatch (this=0x564a7584c300, recv_ctx=0x5d0513985d10, act=..., exit_loop=<optimized out>) at /test/mtest/10.8_galera_dbg/galera/src/gcs_action_source.cpp:110
#40 0x000062901f647272 in galera::GcsActionSource::process (this=0x564a7584c300, recv_ctx=0x5d0513985d10, exit_loop=@0x5d051398591f: false) at /test/mtest/10.8_galera_dbg/galera/src/gcs_action_source.cpp:186
#41 0x000062901f612440 in galera::ReplicatorSMM::async_recv (this=0x564a75840d40, recv_ctx=0x5d0513985d10) at /test/mtest/10.8_galera_dbg/galera/src/replicator_smm.cpp:394
#42 0x000062901f5e6881 in galera_recv (gh=<optimized out>, recv_ctx=<optimized out>) at /test/mtest/10.8_galera_dbg/galera/src/wsrep_provider.cpp:261
#43 0x0000564a73aec676 in wsrep::wsrep_provider_v26::run_applier (this=<optimized out>, applier_ctx=<optimized out>) at /test/mtest/10.8_dbg/wsrep-lib/src/wsrep_provider_v26.cpp:796
#44 0x0000564a734e62dd in wsrep_replication_process (thd=0x145d50000ff8, arg=<optimized out>) at /test/mtest/10.8_dbg/sql/wsrep_server_state.h:51
#45 0x0000564a734d369d in start_wsrep_THD (arg=arg@entry=0x564a75879fb0) at /test/mtest/10.8_dbg/sql/wsrep_mysqld.h:575
#46 0x0000564a7345df65 in pfs_spawn_thread (arg=0x564a75868e88) at /test/mtest/10.8_dbg/storage/perfschema/pfs.cc:2201
#47 0x000062901f84d609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#48 0x0000564a725d4293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95



 Comments   
Comment by Ramesh Sivaraman [ 2022-03-22 ]

Reproduced the crash using RQG scripts, this issue is slightly sporadic. Sometime node1 crashes similar to MDEV-19973

Testcase

Start node1
Initiate RQG data load
#before initiating data load set innodb dirty page flushing  percentage to 0
SET GLOBAL innodb_max_dirty_pages_pct=0;
SET GLOBAL innodb_max_dirty_pages_pct_lwm=0;
 
perl gendata.pl --dsn=dbi:mysql:host=127.0.0.1:port=3306:socket=/tmp/mysql.sock:user=root:database=test --spec=conf/mariadb/oltp.zz
perl gendata.pl --dsn=dbi:mysql:host=127.0.0.1:port=3306:socket=/tmp/mysql.sock:user=root:database=test --spec=conf/engines/innodb/full_text_search.zz
 
Initiate RQG OLTP run 
 
perl gentest.pl --dsn=dbi:mysql:host=127.0.0.1:port=3306:socket=/tmp/mysql.sock:user=root:database=test --grammar=conf/engines/innodb/full_text_search.yy --threads=32 --duration=300 --queries=10000
Start node2 after initiating RQG OLTP run (Node 2 crashes with the following stack) 

#0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
#1  0x00005635bffa03ad in my_write_core (sig=sig@entry=6) at /test/mtest/10.6_dbg/mysys/stacktrace.c:424
#2  0x00005635bf750e39 in handle_fatal_signal (sig=6) at /test/mtest/10.6_dbg/sql/signal_handler.cc:345
#3  <signal handler called>
#4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#5  0x0000145b8519c859 in __GI_abort () at abort.c:79
#6  0x0000145b8519c729 in __assert_fail_base (fmt=0x145b85332588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5635c03f9711 "xid_seqno > wsrep_seqno", 
    file=0x5635c03f9880 "/test/mtest/10.6_dbg/storage/innobase/trx/trx0rseg.cc", line=108, function=<optimized out>) at assert.c:92
#7  0x0000145b851adf36 in __GI___assert_fail (assertion=assertion@entry=0x5635c03f9711 "xid_seqno > wsrep_seqno", file=file@entry=0x5635c03f9880 "/test/mtest/10.6_dbg/storage/innobase/trx/trx0rseg.cc", 
    line=line@entry=108, function=function@entry=0x5635c03f9d88 "void trx_rseg_update_wsrep_checkpoint(buf_block_t*, const XID*, mtr_t*)") at assert.c:101
#8  0x00005635bfd65548 in trx_rseg_update_wsrep_checkpoint (rseg_header=rseg_header@entry=0x145b39d55b60, xid=xid@entry=0x1459c1cbc680, mtr=mtr@entry=0x1459c1cbc010)
    at /test/mtest/10.6_dbg/storage/innobase/trx/trx0rseg.cc:108
#9  0x00005635bfd663b0 in trx_rseg_update_wsrep_checkpoint (xid=xid@entry=0x1459c1cbc680, mtr=mtr@entry=0x1459c1cbc010) at /test/mtest/10.6_dbg/storage/innobase/trx/trx0rseg.cc:146
#10 0x00005635bfd675d7 in trx_rseg_update_wsrep_checkpoint (xid=xid@entry=0x1459c1cbc680) at /test/mtest/10.6_dbg/storage/innobase/trx/trx0rseg.cc:176
#11 0x00005635bfb1e3c3 in innobase_wsrep_set_checkpoint (hton=0x5635c20d5158, xid=0x1459c1cbc680) at /test/mtest/10.6_dbg/storage/innobase/handler/ha_innodb.cc:18747
#12 0x00005635bfae670c in set_SE_checkpoint (unused=unused@entry=0x0, plugin=<optimized out>, arg=arg@entry=0x1459c1cbc680) at /test/mtest/10.6_dbg/sql/wsrep_xid.cc:127
#13 0x00005635bf4730cd in plugin_foreach_with_mask (thd=thd@entry=0x0, func=func@entry=0x5635bfae667d <set_SE_checkpoint(THD*, plugin_ref, void*)>, type=type@entry=1, state_mask=state_mask@entry=8, 
    arg=arg@entry=0x1459c1cbc680) at /test/mtest/10.6_dbg/sql/sql_plugin.cc:2511
#14 0x00005635bfae6855 in wsrep_set_SE_checkpoint (xid=...) at /test/mtest/10.6_dbg/sql/wsrep_xid.cc:134
#15 0x00005635bfae6895 in wsrep_set_SE_checkpoint (wsgtid=..., gtid=...) at /test/mtest/10.6_dbg/sql/wsrep_xid.cc:142
#16 0x00005635bfae4898 in Wsrep_high_priority_service::log_dummy_write_set (this=0x1459c1cbdd10, ws_handle=..., ws_meta=..., err=...) at /test/mtest/10.6_dbg/wsrep-lib/include/wsrep/provider.hpp:143
#17 0x00005635c00ee8ed in apply_write_set (server_state=..., high_priority_service=..., ws_handle=..., ws_meta=..., data=...) at /test/mtest/10.6_dbg/wsrep-lib/src/server_state.cpp:283
#18 0x00005635c00efefd in wsrep::server_state::on_apply (this=<optimized out>, high_priority_service=..., ws_handle=..., ws_meta=..., data=...) at /test/mtest/10.6_dbg/wsrep-lib/src/server_state.cpp:1114
#19 0x00005635c010033a in wsrep::high_priority_service::apply (data=..., ws_meta=..., ws_handle=..., this=0x1459c1cbdd10) at /test/mtest/10.6_dbg/wsrep-lib/include/wsrep/high_priority_service.hpp:47
#20 (anonymous namespace)::apply_cb (ctx=ctx@entry=0x1459c1cbdd10, wsh=wsh@entry=0x1459c1cbcd20, flags=flags@entry=67, buf=buf@entry=0x1459c1cbcd30, meta=meta@entry=0x1459c1cbcfe0, 
    exit_loop=exit_loop@entry=0x1459c1cbcf9f) at /test/mtest/10.6_dbg/wsrep-lib/src/wsrep_provider_v26.cpp:507
#21 0x0000145b84bf9945 in galera::TrxHandleSlave::apply (this=this@entry=0x145978d024f0, recv_ctx=recv_ctx@entry=0x1459c1cbdd10, 
    apply_cb=0x5635c01001b7 <(anonymous namespace)::apply_cb(void*, wsrep_ws_handle_t const*, uint32_t, wsrep_buf_t const*, wsrep_trx_meta_t const*, wsrep_bool_t*)>, meta=..., 
    exit_loop=exit_loop@entry=@0x1459c1cbcf9f: false) at /test/mtest/10.6_galera/galera/src/trx_handle.cpp:396
#22 0x0000145b84c100de in galera::ReplicatorSMM::apply_trx (this=0x5635c1fb46b0, recv_ctx=0x1459c1cbdd10, ts=...) at /test/mtest/10.6_galera/galera/src/replicator_smm.cpp:516
#23 0x0000145b84c14348 in galera::ReplicatorSMM::process_trx (this=0x5635c1fb46b0, recv_ctx=0x1459c1cbdd10, ts_ptr=...) at /test/mtest/10.6_galera/galera/src/replicator_smm.cpp:2136
#24 0x0000145b84c45e39 in galera::GcsActionSource::process_writeset (this=0x5635c1fc1b00, recv_ctx=0x1459c1cbdd10, act=..., exit_loop=@0x1459c1cbd91f: false)
    at /test/mtest/10.6_galera/galera/src/gcs_action_source.cpp:62
#25 0x0000145b84c46893 in galera::GcsActionSource::dispatch (this=0x5635c1fc1b00, recv_ctx=0x1459c1cbdd10, act=..., exit_loop=<optimized out>) at /test/mtest/10.6_galera/galera/src/gcs_action_source.cpp:110
#26 0x0000145b84c46e52 in galera::GcsActionSource::process (this=0x5635c1fc1b00, recv_ctx=0x1459c1cbdd10, exit_loop=@0x1459c1cbd91f: false) at /test/mtest/10.6_galera/galera/src/gcs_action_source.cpp:186
#27 0x0000145b84c11780 in galera::ReplicatorSMM::async_recv (this=0x5635c1fb46b0, recv_ctx=0x1459c1cbdd10) at /test/mtest/10.6_galera/galera/src/replicator_smm.cpp:402
#28 0x0000145b84be5e71 in galera_recv (gh=<optimized out>, recv_ctx=<optimized out>) at /test/mtest/10.6_galera/galera/src/wsrep_provider.cpp:263
#29 0x00005635c0101012 in wsrep::wsrep_provider_v26::run_applier (this=<optimized out>, applier_ctx=<optimized out>) at /test/mtest/10.6_dbg/wsrep-lib/src/wsrep_provider_v26.cpp:858
#30 0x00005635bfb00acb in wsrep_replication_process (thd=0x145978000ff8, arg=<optimized out>) at /test/mtest/10.6_dbg/sql/wsrep_server_state.h:51
#31 0x00005635bfaee165 in start_wsrep_THD (arg=arg@entry=0x5635c2295770) at /test/mtest/10.6_dbg/sql/wsrep_mysqld.h:575
#32 0x00005635bfa7f495 in pfs_spawn_thread (arg=0x5635c2426d58) at /test/mtest/10.6_dbg/storage/perfschema/pfs.cc:2201
#33 0x0000145b856ab609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#34 0x0000145b85299293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Comment by Julius Goryavsky [ 2023-03-31 ]

Fix merged with head revision as https://github.com/MariaDB/server/commit/cadc3efcddb078fe8fe19a6121cc2a95be9a97de

Generated at Thu Feb 08 09:52:00 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.