[MDEV-25590] Deadlock during ongoing transaction and RSU Created: 2021-05-04  Updated: 2022-02-14

Status: Open
Project: MariaDB Server
Component/s: wsrep
Affects Version/s: 10.5.9
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Kamil Holubicki Assignee: Unassigned
Resolution: Unresolved Votes: 3
Labels: None
Environment:


 Description   

Test case

Galera provider loaded, one cluster node is enough.

session_1:
create database test;
use test;
create table t1(id int primary key auto_increment, k int);
insert into t1(k) values (1),(2),(3),(101),(102),(103);
begin;
update t1 set k=k+1 where id<100;

session_2:

use test;
set wsrep_OSU_method=RSU;
alter table t1 add key(k);

session_1:

commit;

Result:
Both sessions block forever (deadlock).

Expected result:
When session_1 commits, session_2 should continue and perform ALTER TABLE.

I investigated this a little and here are my findings:
1. session_1 holds the MDL lock on t1.
2. session_2: RSU is started, which causes a Galera desync followed by a pause. The pause acquires the LocalOrder lock with seqno N.
3. session_2 then blocks on the MDL lock held by session_1.
4. session_1 commits; the commit tries to replicate, but session_2 still holds the LocalOrder lock.
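Steps 1-4 form a classic wait-for cycle: session_1 holds the MDL and wants the LocalOrder lock, while session_2 holds the LocalOrder lock and wants the MDL. A minimal sketch of that cycle (the names are illustrative, not actual server symbols):

```python
# Model of the wait-for cycle described above.
# Lock and session names are illustrative, not real server internals.

holds = {
    "session_1": {"MDL"},         # open transaction holds the MDL on t1
    "session_2": {"LocalOrder"},  # RSU pause acquired the LocalOrder lock
}
wants = {
    "session_1": "LocalOrder",    # COMMIT must replicate -> needs LocalOrder
    "session_2": "MDL",           # ALTER TABLE -> needs the MDL on t1
}

def find_deadlock(holds, wants):
    """Return a wait-for cycle as a list of sessions, or None."""
    # Edge s -> t whenever s wants a lock that t currently holds.
    wait_for = {}
    for s, lock in wants.items():
        for t, owned in holds.items():
            if t != s and lock in owned:
                wait_for[s] = t
    # Walk the graph from each session looking for a cycle.
    for start in wait_for:
        seen, cur = [], start
        while cur in wait_for and cur not in seen:
            seen.append(cur)
            cur = wait_for[cur]
        if cur in seen:
            return seen[seen.index(cur):]
    return None

print(find_deadlock(holds, wants))  # -> ['session_1', 'session_2']
```

Neither session can make progress until one of them releases its lock, which matches the observed hang.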

I also see that commit 37deed3f37561f264f65e162146bbc2ad35fb1a2 introduced Galera 4. With Galera 3, when we called wsrep_to_isolation_begin(), we set thd->wsrep_exec_mode = TOTAL_ORDER regardless of whether TOI or RSU was in use. When session_2 then detected the deadlock, the abort action was taken if thd->wsrep_exec_mode == TOTAL_ORDER, so the abort was performed for both TOI and RSU.

With Galera 4 we no longer have wsrep_exec_mode. Instead, wsrep::client_state::m_toi and wsrep::client_state::m_rsu were introduced, and wsrep_to_isolation_begin() sets them accordingly. The logic that previously checked thd->wsrep_exec_mode was refactored to call wsrep_thd_is_toi(), which returns true only for wsrep::client_state::m_toi.
It looks like this was a mostly mechanical refactoring in which the wsrep::client_state::m_rsu case was simply overlooked.
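A toy model of the mode check makes the regression visible. The enum values echo the wsrep-lib names, but everything below is a hypothetical sketch, not the real implementation; should_abort_fixed in particular is my own illustrative helper, not an existing function:

```python
from enum import Enum

# Toy model of wsrep::client_state modes; not the actual wsrep-lib code.
class Mode(Enum):
    m_local = 0
    m_toi = 1   # Total Order Isolation
    m_rsu = 2   # Rolling Schema Upgrade

def should_abort_galera3(exec_mode_is_total_order):
    # Galera 3: wsrep_exec_mode == TOTAL_ORDER for both TOI and RSU,
    # so the conflicting local transaction is aborted in either case.
    return exec_mode_is_total_order

def should_abort_galera4(mode):
    # Galera 4 after the refactoring: only the TOI mode is checked,
    # so the RSU case falls through and the deadlock is never broken.
    return mode == Mode.m_toi

def should_abort_fixed(mode):
    # Hypothetical fix: treat RSU the same way as TOI again.
    return mode in (Mode.m_toi, Mode.m_rsu)

print(should_abort_galera4(Mode.m_rsu))  # False: the deadlock stays
print(should_abort_fixed(Mode.m_rsu))    # True: deadlock would be broken
```

Under this reading, restoring the pre-Galera-4 behaviour would mean extending the abort condition to cover the RSU mode as well as TOI.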



 Comments   
Comment by Arthur van Kleef [ 2021-05-28 ]

I experience the same issue in 10.4.14 and 10.4.19.

It's easy to reproduce using the metadata locking example and setting wsrep_osu_method=RSU in the session running the DDL.

Metadata locks after starting the transaction in session 1:

MariaDB [(none)]> SELECT * FROM information_schema.METADATA_LOCK_INFO;
+-----------+------------------+---------------+---------------------+--------------+------------+
| THREAD_ID | LOCK_MODE        | LOCK_DURATION | LOCK_TYPE           | TABLE_SCHEMA | TABLE_NAME |
+-----------+------------------+---------------+---------------------+--------------+------------+
|        19 | MDL_SHARED_WRITE | NULL          | Table metadata lock | test         | t          |
+-----------+------------------+---------------+---------------------+--------------+------------+
1 row in set (0.000 sec)

And after starting the DDL operation in session 2 and doing COMMIT in session 1:

MariaDB [(none)]> SELECT * FROM information_schema.METADATA_LOCK_INFO;
+-----------+-------------------------+---------------+----------------------+--------------+------------+
| THREAD_ID | LOCK_MODE               | LOCK_DURATION | LOCK_TYPE            | TABLE_SCHEMA | TABLE_NAME |
+-----------+-------------------------+---------------+----------------------+--------------+------------+
|        20 | MDL_BACKUP_ALTER_COPY   | NULL          | Backup lock          |              |            |
|        19 | MDL_SHARED_WRITE        | NULL          | Table metadata lock  | test         | t          |
|        20 | MDL_SHARED_UPGRADABLE   | NULL          | Table metadata lock  | test         | t          |
|        20 | MDL_INTENTION_EXCLUSIVE | NULL          | Schema metadata lock | test         |            |
+-----------+-------------------------+---------------+----------------------+--------------+------------+
4 rows in set (0.000 sec)

Now both sessions wait forever, until I abort the DDL (ctrl+c) in session 2. Then the COMMIT in session 1 succeeds immediately.

Comment by Arthur van Kleef [ 2022-01-19 ]

It looks like this issue was fixed downstream in Percona; would it be possible to apply the same fix (PXC-3645) here?

Comment by Nanne Huiges [ 2022-02-14 ]

This is a bug that comes up from time to time when running migrations, and it is quite a problem. It would be good to get the Percona fix in!

Generated at Thu Feb 08 09:38:50 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.