Details
-
Bug
-
Status: Closed (View Workflow)
-
Critical
-
Resolution: Fixed
-
N/A
Description
When the following SQL is executed in a binlogging server:
SET pseudo_slave_mode=1; |
XA START 'a'; |
XA END 'a'; |
XA PREPARE 'a'; |
XA ROLLBACK 'a'; |
The server will log into the binary log only the XA ROLLBACK part:
| master-bin.000001 | 4 | Format_desc | 1 | 256 | Server ver: 10.6.16-MariaDB-debug-log, Binlog ver: 4 |
|
| master-bin.000001 | 256 | Start_encryption | 1 | 296 | |
|
| master-bin.000001 | 296 | Gtid_list | 1 | 325 | [] |
|
| master-bin.000001 | 325 | Binlog_checkpoint | 1 | 369 | master-bin.000001 |
|
| master-bin.000001 | 369 | Gtid | 1 | 412 | GTID 0-1-1 |
|
| master-bin.000001 | 412 | Query | 1 | 502 | XA ROLLBACK X'61',X'',1
|
Which should not been done. This type of read-only XA should not produce any events, as well as its even prepared XID should not be recovered after reconnect or the server crash-recovery.
Previous info: this also leads to a crash on the slave in a standard master/slave setup:
10.6.14-9-enterprise-MDEV-30165-30619-31949 57b39cb8e1e4b374e0e138b286a0f9cb0669e401 (Optimized) |
2023-09-26 13:17:23 6 [Note] Slave SQL thread initialized, starting replication in log 'FIRST' at position 4, relay log './relaylog.000001' position: 4; GTID position ''
|
2023-09-26 13:18:01 69 [ERROR] XA COMMIT of GTID 0-1-%could not complete after having been logged into binary log
|
10.6.14-9-enterprise-MDEV-30165-30619-31949 57b39cb8e1e4b374e0e138b286a0f9cb0669e401 (Optimized) |
Core was generated by `/test/PATCH_EMD260923-mariadb-10.6.14-9-linux-x86_64-opt/bin/mariadbd --no-defa'.
|
Program terminated with signal SIGSEGV, Segmentation fault.
|
#0 std::__atomic_base<int>::fetch_or (__m=std::memory_order_relaxed,
|
__i=536870912, this=0x0) at /usr/include/c++/11/bits/atomic_base.h:646
|
646 fetch_or(__int_type __i,
|
[Current thread is 1 (Thread 0x154f3d4b1640 (LWP 1079862))]
|
(gdb) bt
|
#0 std::__atomic_base<int>::fetch_or (__m=std::memory_order_relaxed, __i=536870912, this=0x0) at /usr/include/c++/11/bits/atomic_base.h:646
|
#1 XID_cache_element::acquired_to_recovered (this=0x0) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_opt/sql/xa.cc:119
|
#2 trans_xa_rollback (thd=thd@entry=0x154e08000c68) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_opt/sql/xa.cc:1081
|
#3 0x00005577e38265e4 in mysql_execute_command (thd=0x154e08000c68, is_called_from_prepared_stmt=<optimized out>) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_opt/sql/sql_parse.cc:5897
|
#4 0x00005577e3829156 in mysql_parse (thd=0x154e08000c68, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x154f3d4b08f0) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_opt/sql/sql_parse.cc:8063
|
#5 0x00005577e3bacac3 in Query_log_event::do_apply_event (this=0x154f0005e648, rgi=0x154f0005d1d0, query_arg=0x154f0005e7b3 "XA ROLLBACK X'61',X'',1", q_len_arg=<optimized out>) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_opt/sql/sql_class.h:242
|
#6 0x00005577e375e086 in Log_event::apply_event (rgi=0x154f0005d1d0, this=<optimized out>) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_opt/sql/log_event.h:1499
|
#7 apply_event_and_update_pos_apply (ev=ev@entry=0x154f0005e648, thd=thd@entry=0x154e08000c68, rgi=rgi@entry=0x154f0005d1d0, reason=<optimized out>) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_opt/sql/slave.cc:3892
|
#8 0x00005577e3764962 in apply_event_and_update_pos_for_parallel (ev=ev@entry=0x154f0005e648, thd=thd@entry=0x154e08000c68, rgi=rgi@entry=0x154f0005d1d0) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_opt/sql/slave.cc:4078
|
#9 0x00005577e39b4148 in rpt_handle_event (qev=0x154f0005cd78, rpt=<optimized out>) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_opt/sql/rpl_parallel.cc:134
|
#10 0x00005577e39b874c in handle_rpl_parallel_thread (arg=arg@entry=0x154f000233b8) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_opt/sql/rpl_parallel.cc:1476
|
#11 0x00005577e3cf5e91 in pfs_spawn_thread (arg=0x154f000349d8) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_opt/storage/perfschema/pfs.cc:2201
|
#12 0x0000154f63094b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
|
#13 0x0000154f63126a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
|
Debug:
10.6.14-9-enterprise-MDEV-30165-30619-31949 57b39cb8e1e4b374e0e138b286a0f9cb0669e401 (Debug) |
2023-09-26 13:11:31 6 [Note] Slave SQL thread initialized, starting replication in log 'FIRST' at position 4, relay log './relaylog.000001' position: 4; GTID position ''
|
mariadbd: /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_dbg/sql/log.cc:2043: bool acquire_xid(THD*): Assertion `thd->is_killed()' failed.
|
10.6.14-9-enterprise-MDEV-30165-30619-31949 57b39cb8e1e4b374e0e138b286a0f9cb0669e401 (Debug) |
Core was generated by `/test/PATCH_EMD260923-mariadb-10.6.14-9-linux-x86_64-dbg/bin/mariadbd --no-defa'.
|
Program terminated with signal SIGABRT, Aborted.
|
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=22889389626944)
|
at ./nptl/pthread_kill.c:44
|
[Current thread is 1 (Thread 0x14d159f01640 (LWP 3999773))]
|
(gdb) bt
|
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=22889389626944) at ./nptl/pthread_kill.c:44
|
#1 __pthread_kill_internal (signo=6, threadid=22889389626944) at ./nptl/pthread_kill.c:78
|
#2 __GI___pthread_kill (threadid=22889389626944, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
|
#3 0x000014d17f442476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
|
#4 0x000014d17f4287f3 in __GI_abort () at ./stdlib/abort.c:79
|
#5 0x000014d17f42871b in __assert_fail_base (fmt=0x14d17f5dd150 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x55d688113b8d "thd->is_killed()", file=0x55d688114498 "/test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_dbg/sql/log.cc", line=2043, function=<optimized out>) at ./assert/assert.c:92
|
#6 0x000014d17f439e96 in __GI___assert_fail (assertion=0x55d688113b8d "thd->is_killed()", file=0x55d688114498 "/test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_dbg/sql/log.cc", line=2043, function=0x55d688113b9e "bool acquire_xid(THD*)") at ./assert/assert.c:101
|
#7 0x000055d6877845ac in acquire_xid (thd=thd@entry=0x14cff8000f98) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_dbg/sql/log.cc:2043
|
#8 0x000055d687798a03 in binlog_rollback_by_xid (hton=0x55d6893c0988, xid=<optimized out>) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_dbg/sql/log.cc:2130
|
#9 0x000055d68764a3e9 in ha_commit_or_rollback_by_xid (xid=<optimized out>, commit=commit@entry=false) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_dbg/sql/handler.cc:2355
|
#10 0x000055d6875b28a5 in wsrep_ha_commit_or_rollback_by_xid (thd=thd@entry=0x14cff8000f98, commit=commit@entry=false) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_dbg/sql/xa.cc:1445
|
#11 0x000055d6875b5e54 in trans_xa_rollback (thd=thd@entry=0x14cff8000f98) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_dbg/sql/xa.cc:1063
|
#12 0x000055d68737c134 in mysql_execute_command (thd=thd@entry=0x14cff8000f98, is_called_from_prepared_stmt=is_called_from_prepared_stmt@entry=false) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_dbg/sql/sql_parse.cc:5897
|
#13 0x000055d68737e1f3 in mysql_parse (thd=0x14cff8000f98, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x14d159f007f0) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_dbg/sql/sql_parse.cc:8063
|
#14 0x000055d6877b50b6 in Query_log_event::do_apply_event (this=0x14d10006bc98, rgi=0x14d100066ab0, query_arg=0x14d10006be03 "XA ROLLBACK X'61',X'',1", q_len_arg=<optimized out>) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_dbg/sql/sql_class.h:242
|
#15 0x000055d6877b5a0c in Query_log_event::do_apply_event (this=<optimized out>, rgi=<optimized out>) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_dbg/sql/log_event_server.cc:1605
|
#16 0x000055d6872a0714 in Log_event::apply_event (rgi=0x14d100066ab0, this=0x14d10006bc98) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_dbg/sql/log_event.h:1499
|
#17 apply_event_and_update_pos_apply (ev=ev@entry=0x14d10006bc98, thd=thd@entry=0x14cff8000f98, rgi=rgi@entry=0x14d100066ab0, reason=reason@entry=0) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_dbg/sql/slave.cc:3892
|
#18 0x000055d6872aa7d9 in apply_event_and_update_pos_for_parallel (ev=ev@entry=0x14d10006bc98, thd=thd@entry=0x14cff8000f98, rgi=rgi@entry=0x14d100066ab0) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_dbg/sql/slave.cc:4078
|
#19 0x000055d687555383 in rpt_handle_event (qev=qev@entry=0x14d100066658, rpt=rpt@entry=0x14d100027e50) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_dbg/sql/rpl_parallel.cc:134
|
#20 0x000055d687559b08 in handle_rpl_parallel_thread (arg=arg@entry=0x14d100027e50) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_dbg/sql/rpl_parallel.cc:1476
|
#21 0x000055d687919e72 in pfs_spawn_thread (arg=0x14d100039578) at /test/git-bisect/10.6.14-9-enterprise-MDEV-30165-30619-31949_dbg/storage/perfschema/pfs.cc:2201
|
#22 0x000014d17f494b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
|
#23 0x000014d17f526a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
|
Attachments
Issue Links
- blocks
-
MDEV-31949 slow parallel replication of user xa
-
- Stalled
-
- is blocked by
-
MDEV-32272 lock_release_on_prepare_try() does not release lock if supremum bit is set along with other bits set in lock's bitmap
-
- Closed
-
- is caused by
-
MDEV-31949 slow parallel replication of user xa
-
- Stalled
-
- relates to
-
MDEV-30941 ASAN: heap-use-after-free in std::__atomic_base from trx_t::commit_tables on XA COMMIT with pseudo_slave_mode enabled
-
- Confirmed
-
Roel, to your question, the server segfault is tacked by the 31949. The impossible logging of the "orphan" XA-rollback - is done in the 10.6.
In hindsight perhaps it'd be more proper to have two tickets reported, with the (bug) 31949 requiring the (bug) 10.6 (though the 31949 can be also reached by other path:s like the slave misconfiguration).