MariaDB Server / MDEV-36403

metadata deadlock on restore system table

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version: 10.11.11

    Description

      Backing up the mysql system tables with mydumper and restoring them on a replica can deadlock on a metadata lock with our workload.

      This is done by our MariaDB orchestrator, and by others in the cloud that we take inspiration from.

      When log_output=TABLE, we regularly run CREATE TABLE IF NOT EXISTS slow_log_<datetime>, INSERT from slow_log into slow_log_<datetime>, then TRUNCATE TABLE slow_log, to keep that table from growing forever.
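      The rotation described above might look roughly like this (a sketch only; the datetime suffix and the use of CREATE TABLE ... LIKE are assumptions, not the reporter's exact statements):

```sql
-- Sketch of the slow_log rotation described above, assuming log_output=TABLE
-- so that mysql.slow_log is actively written to. The suffix is illustrative.
CREATE TABLE IF NOT EXISTS mysql.slow_log_20250401 LIKE mysql.slow_log;
INSERT INTO mysql.slow_log_20250401 SELECT * FROM mysql.slow_log;
TRUNCATE TABLE mysql.slow_log;
```

      The CREATE/INSERT/TRUNCATE statements each take metadata locks on mysql.slow_log, which is why this workload can collide with DDL issued by a concurrent restore.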

      We can code a workaround in our case, but it would be nice to fix this, as a metadata deadlock should not happen by design.

      Using mysqldump --system=ALL does not deadlock, so this is probably caused by the DROP TABLE issued by myloader.

      Attachments

      Issue Links

      Activity

            I can't reproduce it now. I found a defunct mariadb client that had been used to restore a dump; it was probably holding the metadata lock.

            stephane@skysql.com VAROQUI Stephane added a comment

            Thanks for the report, stephane@skysql.com! It looks like myloader is backing up and restoring mysql.gtid_slave_pos when using the option --database=mysql. Are you specifically omitting that table from your system backup?

            It sounds like you have a replica actively running while restoring the system database tables (based on our Zulip conversation). If mysql.gtid_slave_pos is included in that recovery, I think that would be unsafe (knielsen may have more input on that though).

            When mysqldump backs up the mysql system database, it ignores mysql.gtid_slave_pos by default. If one wants to back up and restore the replication state, one can use the options --dump-slave, --master-data, or --gtid. Even then, mysqldump's restore won't modify mysql.gtid_slave_pos directly; rather, the backup saves the GTID position of the replica, and then restores it using SET GLOBAL gtid_slave_pos='<cached gtid position>' while the slave threads aren't running.
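            A minimal sketch of that restore sequence, assuming MariaDB GTID replication (the position value here is illustrative, not from the actual backup):

```sql
-- Restore the replica's saved GTID position while the slave threads are stopped,
-- rather than writing to mysql.gtid_slave_pos directly.
STOP SLAVE;
SET GLOBAL gtid_slave_pos = '0-1-100';        -- illustrative cached position
CHANGE MASTER TO MASTER_USE_GTID = slave_pos;
START SLAVE;
```

            Letting the server update mysql.gtid_slave_pos itself avoids the concurrent writes to that table that a direct restore would introduce.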

            bnestere Brandon Nesterenko added a comment

            Thanks Brandon. Indeed, we did not exclude mysql.gtid_slave_pos from the mydumper backup.

            Changing now our default options:

             flags.StringVar(&conf.BackupMyDumperRegex, "backup-mydumper-regex", `^(?!(sys\.|performance_schema\.|information_schema\.|replication_manager_schema\.jobs|mysql\.gtid_slave_pos$))`, "Mydumper regex for backup")
            

            stephane@skysql.com VAROQUI Stephane added a comment

            "this is probably caused by the DROP TABLE from the myloader"

            This sounds like it could be the case.
            I think the following could happen, with two transactions T1 and T2 in optimistic parallel replication and a user-initiated (in this case mydumper) DROP TABLE or other DDL:
            1. T2 updates mysql.gtid_slave_pos, so has an active transaction with a metadata lock on the table, goes to wait_for_prior_commit on T1.
            2. DDL requests exclusive metadata lock on mysql.gtid_slave_pos, goes to wait on T2.
            3. T1 requests a shared metadata lock on mysql.gtid_slave_pos, goes to wait on the DDL.
            At this point the system is deadlocked until lock_wait_timeout triggers.

            If this is what happens, it is somewhat similar to MDEV-36137, but more tricky since a user-initiated DDL is involved.

            I'm wondering, however, how this is supposed to work in general. I don't see how a similar deadlock is prevented for plain user queries.

            In fact, I was able to create the same deadlock using two transactions and a DROP TABLE without any replication like below. This is the general problem that we do not have a general deadlock detection on the server layer, so deadlock cycles that cross between subsystems can go undetected:

            create table t1 (a INT PRIMARY KEY, b INT);
            insert into t1(a) values (1),(2),(3),(4),(5);
            create table t2 (a INT PRIMARY KEY, b INT);
            insert into t2(a) values (1),(2),(3),(4),(5);
             
            # T2:
            begin;
            update t2 set b=2 where a=4;
             
            # T1:
            begin;
            update t1 set b=1 where a=1;
            update t2 set b=2 where a=4;
             
            # T3
            DROP TABLE t1;
             
            # T2:
            update t1 set b=1 where a=4;
             
            # At this point a deadlock occurs until lock wait timeout.
            

            knielsen Kristian Nielsen added a comment

            People

              Assignee: Unassigned
              Reporter: stephane@skysql.com VAROQUI Stephane
              Votes: 0
              Watchers: 3

