Details
Bug
Status: Open
Critical
Resolution: Unresolved
10.6.21, 10.11, 11.4, 11.8, 12.0(EOL)
Can result in hang or crash
Q4/2025 Server Maintenance
Description
A parallel replica (optimistic parallel mode) can crash while a backup is being taken (via mariadb-backup). On initial inspection, it appears that thd->backup_commit_lock->ticket is being nullified by another thread while wait_for_commit::wait_for_prior_commit2 is trying to release it.
The call stack of the crashing thread:
#17 0x000055a6d18c433d in handle_fatal_signal (sig=11) at /usr/src/debug/MariaDB-/src_0/sql/signal_handler.cc:227
#18 <signal handler called>
No symbol table info available.
#19 0x000055a6d1791f04 in inline_mysql_prlock_wrlock (src_line=1846, src_file=0x55a6d1eef8d0 "/home/jenkins/workspace/Build-Package/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX_ON_ES_BACKUP_DEBUGSOURCE/sql/mdl.cc", that=0xb2020db2ea080243) at /usr/src/debug/MariaDB-/src_0/include/mysql/psi/mysql_thread.h:946
No locals.
#20 MDL_lock::remove_ticket (this=0xb2020db2ea0800b3, pins=0x7ef72c020f88, list=&MDL_lock::m_granted, ticket=0x7efb94e2e0a8) at /usr/src/debug/MariaDB-/src_0/sql/mdl.cc:1846
No locals.
#21 0x000055a6d1792845 in MDL_context::release_lock (this=<optimized out>, duration=<optimized out>, ticket=0x7efb94e2e0a8) at /usr/src/debug/MariaDB-/src_0/sql/mdl.cc:2915
#22 0x000055a6d179287d in MDL_context::release_lock (this=<optimized out>, ticket=<optimized out>) at /usr/src/debug/MariaDB-/src_0/sql/mdl.cc:2935
No locals.
#23 0x000055a6d160944f in wait_for_commit::wait_for_prior_commit2 (this=this@entry=0x7ef647fc0ce8, thd=thd@entry=0x7efb1c004418, allow_kill=allow_kill@entry=true) at /usr/src/debug/MariaDB-/src_0/sql/sql_class.cc:8336
#24 0x000055a6d17ff426 in wait_for_commit::wait_for_prior_commit (allow_kill=true, thd=0x7efb1c004418, this=0x7ef647fc0ce8) at /usr/src/debug/MariaDB-/src_0/sql/sql_class.h:2408
No locals.
#25 THD::wait_for_prior_commit (allow_kill=true, this=0x7efb1c004418) at /usr/src/debug/MariaDB-/src_0/sql/sql_class.h:5346
No locals.
#26 retry_event_group (rgi=<optimized out>, rpt=<optimized out>, orig_qev=<optimized out>) at /usr/src/debug/MariaDB-/src_0/sql/rpl_parallel.cc:955
#27 0x000055a6d180270b in handle_rpl_parallel_thread (arg=arg@entry=0x7efb081af568) at /usr/src/debug/MariaDB-/src_0/sql/rpl_parallel.cc:1561
#28 0x000055a6d1b19329 in pfs_spawn_thread (arg=0x7efb081b0708) at /usr/src/debug/MariaDB-/src_0/storage/perfschema/pfs.cc:2201
#29 0x00007f005aebb1ca in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#30 0x00007f005a1fb8d3 in clone () from /lib64/libc.so.6
No symbol table info available.
The release_lock call is made while wait_for_prior_commit2 releases the backup commit lock (note that it is reached via retry_event_group):
int
wait_for_commit::wait_for_prior_commit2(THD *thd, bool allow_kill)
{
  ...
  /*
    Release MDL_BACKUP_COMMIT LOCK while waiting for other threads to commit
    This is needed to avoid deadlock between the other threads (which not
    yet have the MDL_BACKUP_COMMIT_LOCK) and any threads using
    BACKUP LOCK BLOCK_COMMIT.
  */
  if (thd->backup_commit_lock && thd->backup_commit_lock->ticket)
  {
    backup_lock_released= true;
    thd->mdl_context.release_lock(thd->backup_commit_lock->ticket);
    thd->backup_commit_lock->ticket= 0;
  }
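To illustrate why the check-then-release pattern in the excerpt is fragile if a second thread can touch the same ticket pointer (which is only the initial hypothesis of this report, not a confirmed root cause), here is a minimal standalone C++ sketch. All names in it (`Ticket`, `LockRequest`, `release_once`, `run_release_race`) are hypothetical stand-ins and not MariaDB types or APIs. The sketch does not reproduce the crash and is not the server fix; it only shows one generic way to make "test the pointer, release, clear the pointer" a single step, so that at most one thread ever performs the release.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for an MDL ticket; not the MariaDB type.
struct Ticket { int id; };

// Hypothetical stand-in for the request object that owns the ticket pointer.
struct LockRequest {
  std::atomic<Ticket*> ticket{nullptr};
};

// Counts how many times the release was actually performed.
static std::atomic<int> release_count{0};

// Stand-in for a release routine; it only counts calls.
static void release_lock(Ticket *t)
{
  assert(t != nullptr);            // a claimed ticket must never be null
  release_count.fetch_add(1);
}

// Claim the ticket with one atomic exchange, then release it.
// Returns true only for the single caller that won the claim.
static bool release_once(LockRequest &req)
{
  Ticket *t = req.ticket.exchange(nullptr);
  if (!t)
    return false;                  // someone else already claimed it
  release_lock(t);
  return true;
}

// Let n_threads threads race to release the same ticket and
// return how many releases actually happened.
static int run_release_race(int n_threads)
{
  Ticket ticket{1};
  LockRequest req;
  req.ticket.store(&ticket);
  release_count.store(0);

  std::vector<std::thread> threads;
  for (int i = 0; i < n_threads; i++)
    threads.emplace_back([&req] { release_once(req); });
  for (auto &t : threads)
    t.join();
  return release_count.load();
}
```

Calling `run_release_race(8)` starts eight threads against one ticket; because the claim is a single atomic exchange, only one of them reaches `release_lock`. Whether an approach like this fits the real code depends on which threads can legitimately access `backup_commit_lock`, which this report has not yet established.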
Prior to the crash, we can see the backup locks taken (via metadata lock info):
+-----------+-------------------+---------------+---------------------+--------------+---------------------+
| THREAD_ID | LOCK_MODE | LOCK_DURATION | LOCK_TYPE | TABLE_SCHEMA | TABLE_NAME |
+-----------+-------------------+---------------+---------------------+--------------+---------------------+
| 14856913 | MDL_BACKUP_START | NULL | Backup lock | | |
| 2277384 | MDL_BACKUP_COMMIT | NULL | Backup lock | | |
| 2277384 | MDL_SHARED_WRITE | NULL | Table metadata lock | <somedb> | <some_t1> |
| 2277384 | MDL_SHARED_WRITE | NULL | Table metadata lock | mysql | gtid_slave_pos |
| 14861424 | MDL_SHARED_READ | NULL | Table metadata lock | <somedb> | <some_t2> |
| 14861480 | MDL_SHARED_READ | NULL | Table metadata lock | <somedb> | <some_t2> |
| 14861424 | MDL_SHARED_READ | NULL | Table metadata lock | <somedb> | <some_t3> |
| 14861480 | MDL_SHARED_READ | NULL | Table metadata lock | <somedb> | <some_t3> |
| 14861424 | MDL_SHARED_READ | NULL | Table metadata lock | <somedb> | <some_t4> |
| 14861480 | MDL_SHARED_READ | NULL | Table metadata lock | <somedb> | <some_t4> |
| 14861424 | MDL_SHARED_READ | NULL | Table metadata lock | <somedb> | <some_t5> |
| 14861480 | MDL_SHARED_READ | NULL | Table metadata lock | <somedb> | <some_t5> |
| 14861424 | MDL_SHARED_READ | NULL | Table metadata lock | <somedb> | <some_t6> |
| 14861480 | MDL_SHARED_READ | NULL | Table metadata lock | <somedb> | <some_t6> |
| 14861424 | MDL_SHARED_READ | NULL | Table metadata lock | <somedb> | <some_t7> |
| 14861480 | MDL_SHARED_READ | NULL | Table metadata lock | <somedb> | <some_t7> |
+-----------+-------------------+---------------+---------------------+--------------+---------------------+
Issue Links
is caused by: MDEV-23586 Mariabackup: GTID saved for replication in 10.4.14 is wrong (Closed)