[MDEV-32372] Assertion `thd->transaction->stmt.is_empty() || thd->in_sub_stmt || (thd->state_flags & Open_tables_state::BACKUPS_AVAIL)' failed in close_thread_tables upon ALTER Created: 2023-10-07  Updated: 2023-10-07

Status: Open
Project: MariaDB Server
Component/s: Data Definition - Alter Table, Replication, Storage Engine - InnoDB
Affects Version/s: 10.4, 11.3
Fix Version/s: 10.4

Type: Bug Priority: Major
Reporter: Roel Van de Paar Assignee: Andrei Elkin
Resolution: Unresolved Votes: 0
Labels: sporadic

Issue Links:
Relates

 Description   

The following SQL, when executed in a m/s setup and using CLI on the master:

# mysqld options required for replay: --slave-parallel-threads=65
ALTER TABLE mysql.gtid_slave_pos DROP PRIMARY KEY;
CREATE TABLE t (c1 INT) ENGINE=InnoDB;
INSERT INTO t VALUES (1);
ALTER TABLE t ADD COLUMN c INT;

Will, sporadically, lead to a crash on the slave, but only when --slave-parallel-threads is set.

11.3.0 905c3d61e18ae6222d0d195c43d335046eec65d9 (Debug)

Core was generated by `/test/MD270923-mariadb-11.3.0-linux-x86_64-dbg/bin/mariadbd --no-defaults --sla'.
Program terminated with signal SIGABRT, Aborted.
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=22820648896064)
    at ./nptl/pthread_kill.c:44
[Current thread is 1 (Thread 0x14c158abc640 (LWP 1360026))]
(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=22820648896064) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=22820648896064) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=22820648896064, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x000014c181c42476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x000014c181c287f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x000014c181c2871b in __assert_fail_base (fmt=0x14c181ddd150 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5616d920d850 "thd->transaction->stmt.is_empty() || thd->in_sub_stmt || (thd->state_flags & Open_tables_state::BACKUPS_AVAIL)", file=0x5616d920d1d0 "/test/11.3_dbg/sql/sql_base.cc", line=833, function=<optimized out>) at ./assert/assert.c:92
#6  0x000014c181c39e96 in __GI___assert_fail (assertion=0x5616d920d850 "thd->transaction->stmt.is_empty() || thd->in_sub_stmt || (thd->state_flags & Open_tables_state::BACKUPS_AVAIL)", file=0x5616d920d1d0 "/test/11.3_dbg/sql/sql_base.cc", line=833, function=0x5616d920f9c7 "int close_thread_tables(THD*)") at ./assert/assert.c:101
#7  0x00005616d8599495 in close_thread_tables (thd=thd@entry=0x14bfc0000f98) at /test/11.3_dbg/sql/sql_base.cc:833
#8  0x00005616d85994e1 in close_thread_tables_for_query (thd=thd@entry=0x14bfc0000f98) at /test/11.3_dbg/sql/sql_base.cc:792
#9  0x00005616d8624b23 in mysql_execute_command (thd=thd@entry=0x14bfc0000f98, is_called_from_prepared_stmt=is_called_from_prepared_stmt@entry=false) at /test/11.3_dbg/sql/sql_parse.cc:5824
#10 0x00005616d86255f7 in mysql_parse (thd=0x14bfc0000f98, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x14c158abb690) at /test/11.3_dbg/sql/sql_parse.cc:7744
#11 0x00005616d8a9ebc3 in Query_log_event::do_apply_event (this=0x14c0d006fd38, rgi=0x14c0d006f2e0, query_arg=0x14c0d006f253 "ALTER TABLE t ADD COLUMN c INT", q_len_arg=<optimized out>) at /test/11.3_dbg/sql/sql_class.h:247
#12 0x00005616d8a9f45c in Query_log_event::do_apply_event (this=<optimized out>, rgi=<optimized out>) at /test/11.3_dbg/sql/log_event_server.cc:1577
#13 0x00005616d8a8f9f5 in Log_event::apply_event (this=this@entry=0x14c0d006fd38, rgi=rgi@entry=0x14c0d006f2e0) at /test/11.3_dbg/sql/log_event.cc:3875
#14 0x00005616d8547c3d in apply_event_and_update_pos_apply (ev=ev@entry=0x14c0d006fd38, thd=thd@entry=0x14bfc0000f98, rgi=rgi@entry=0x14c0d006f2e0, reason=reason@entry=0) at /test/11.3_dbg/sql/slave.cc:3888
#15 0x00005616d8551da7 in apply_event_and_update_pos_for_parallel (ev=ev@entry=0x14c0d006fd38, thd=thd@entry=0x14bfc0000f98, rgi=rgi@entry=0x14c0d006f2e0) at /test/11.3_dbg/sql/slave.cc:4085
#16 0x00005616d8822536 in rpt_handle_event (qev=qev@entry=0x14c0d006fe98, rpt=rpt@entry=0x14c0d0026ce0) at /test/11.3_dbg/sql/rpl_parallel.cc:64
#17 0x00005616d88270da in handle_rpl_parallel_thread (arg=arg@entry=0x14c0d0026ce0) at /test/11.3_dbg/sql/rpl_parallel.cc:1506
#18 0x00005616d8bcc20a in pfs_spawn_thread (arg=0x14c0d00389d8) at /test/11.3_dbg/storage/perfschema/pfs.cc:2201
#19 0x000014c181c94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#20 0x000014c181d26a40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

It is observed somewhat regularly in replicating testing. The following testcase variation seems to be reproduce the issue a bit less sporadically on 10.4 and 10.5, but it is hard to say due to the sporadic nature of the bug.

# mysqld options required for replay: --slave-parallel-threads=65
ALTER TABLE mysql.gtid_slave_pos DROP PRIMARY KEY;
CREATE TABLE t(c1 INT) DEFAULT CHARSET=ujis;
INSERT INTO t VALUES (0);
ALTER TABLE t ADD COLUMN c INT;

10.4.32 47f0135d7aba57d160d1d95ff7f1f6b7bb07c2fb (Debug)

Core was generated by `/test/MD270923-mariadb-10.4.32-linux-x86_64-dbg/bin/mariadbd --no-defaults --gt'.
Program terminated with signal SIGABRT, Aborted.
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=23430516848192)
    at ./nptl/pthread_kill.c:44
[Current thread is 1 (Thread 0x154f57a23640 (LWP 2459976))]
(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=23430516848192) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=23430516848192) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=23430516848192, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x0000154fcd642476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x0000154fcd6287f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x0000154fcd62871b in __assert_fail_base (fmt=0x154fcd7dd150 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x55f19c2d78e0 "thd->transaction.stmt.is_empty() || thd->in_sub_stmt || (thd->state_flags & Open_tables_state::BACKUPS_AVAIL)", file=0x55f19c2d7230 "/test/10.4_dbg/sql/sql_base.cc", line=897, function=<optimized out>) at ./assert/assert.c:92
#6  0x0000154fcd639e96 in __GI___assert_fail (assertion=0x55f19c2d78e0 "thd->transaction.stmt.is_empty() || thd->in_sub_stmt || (thd->state_flags & Open_tables_state::BACKUPS_AVAIL)", file=0x55f19c2d7230 "/test/10.4_dbg/sql/sql_base.cc", line=897, function=0x55f19c2d78c0 "void close_thread_tables(THD*)") at ./assert/assert.c:101
#7  0x000055f19b6a3261 in close_thread_tables (thd=thd@entry=0x154e58000d38) at /test/10.4_dbg/sql/sql_base.cc:897
#8  0x000055f19b723949 in mysql_execute_command (thd=thd@entry=0x154e58000d38) at /test/10.4_dbg/sql/sql_parse.cc:6283
#9  0x000055f19b7252ce in mysql_parse (thd=0x154e58000d38, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x154f57a22830, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /test/10.4_dbg/sql/sql_parse.cc:8012
#10 0x000055f19baffc44 in Query_log_event::do_apply_event (this=0x154f64030078, rgi=0x154f6402f550, query_arg=0x154f640301b3 "ALTER TABLE t ADD COLUMN c INT", q_len_arg=<optimized out>) at /test/10.4_dbg/sql/sql_class.h:227
#11 0x000055f19bb0060a in Query_log_event::do_apply_event (this=<optimized out>, rgi=<optimized out>) at /test/10.4_dbg/sql/log_event.cc:5422
#12 0x000055f19b649fa6 in Log_event::apply_event (rgi=0x154f6402f550, this=0x154f64030078) at /test/10.4_dbg/sql/log_event.h:1492
#13 apply_event_and_update_pos_apply (ev=ev@entry=0x154f64030078, thd=thd@entry=0x154e58000d38, rgi=rgi@entry=0x154f6402f550, reason=reason@entry=0) at /test/10.4_dbg/sql/slave.cc:3820
#14 0x000055f19b6549f9 in apply_event_and_update_pos_for_parallel (ev=ev@entry=0x154f64030078, thd=thd@entry=0x154e58000d38, rgi=rgi@entry=0x154f6402f550) at /test/10.4_dbg/sql/slave.cc:4006
#15 0x000055f19b8afb09 in rpt_handle_event (qev=qev@entry=0x154f64030248, rpt=rpt@entry=0x154f64012bb8) at /test/10.4_dbg/sql/rpl_parallel.cc:64
#16 0x000055f19b8b5a26 in handle_rpl_parallel_thread (arg=0x154f64012bb8) at /test/10.4_dbg/sql/rpl_parallel.cc:1480
#17 0x0000154fcd694ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#18 0x0000154fcd726a40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

In spite of trying various ways, the issue is not reproducible in MTR. Somehow the timing seems different in MTR. The following testcase for example:

--source include/have_binlog_format_row.inc
--source include/master-slave.inc
--source include/have_innodb.inc
--connection slave
STOP SLAVE;
SET GLOBAL slave_parallel_threads=65;
START SLAVE;
SELECT SLEEP(2);
--connection master
ALTER TABLE mysql.gtid_slave_pos DROP PRIMARY KEY;
CREATE TABLE t(c1 INT)DEFAULT CHARSET=ujis;
INSERT INTO t VALUES(0);
ALTER TABLE t ADD COLUMN c INT;
--source include/rpl_end.inc

Simply results in the expected

11.3.0 905c3d61e18ae6222d0d195c43d335046eec65d9

2023-10-07 16:16:23 77 [ERROR] Incorrect definition of table mysql.gtid_slave_pos: missing primary key.
2023-10-07 16:16:23 75 [ERROR] Incorrect definition of table mysql.gtid_slave_pos: missing primary key.
2023-10-07 16:16:23 75 [ERROR] Incorrect definition of table mysql.gtid_slave_pos: missing primary key.
2023-10-07 16:16:23 75 [Warning] Slave: Failed to open mysql.gtid_slave_pos Error_code: 1944
2023-10-07 16:16:23 77 [Warning] Slave: Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964

Without the crash. Passing the parallel threads option directly to mtr makes no difference, and MTR does not seem to have a way to specify slave-only mariadbd options. UniqueID's/stacks observed thus far:

SIGABRT|close_thread_tables|close_thread_tables_for_query|mysql_execute_command|mysql_parse  # 11.3
SIGABRT|close_thread_tables|mysql_execute_command|mysql_parse|Query_log_event::do_apply_event  # 10.4


Generated at Thu Feb 08 10:30:51 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.