[MDEV-10962] Deadlock with 3 concurrent DELETEs by unique key - Jira

Details

Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Affects Version/s: 5.5(EOL), 10.0(EOL), 10.1(EOL), 10.2(EOL), 10.3(EOL), 10.4(EOL)
Fix Version/s: 10.4.31, 10.5.22, 10.6.15, 10.9.8, 10.10.6, 10.11.5, 11.0.3, 11.1.2, 11.2.1
Component/s: Storage Engine - InnoDB
Labels:
- upstream

Description

As explained in the upstream bug reports (which have all the details on the test case):

http://bugs.mysql.com/bug.php?id=82127
https://bugs.launchpad.net/percona-server/+bug/1598822

there is a deadlock scenario with 3 concurrent DELETEs by UNIQUE key that cannot be explained by the manual:

CREATE TABLE `tu`(`id` int(11), `a` int(11) DEFAULT NULL, `b` varchar(10) DEFAULT NULL, `c` varchar(10) DEFAULT NULL, PRIMARY KEY(`id`), UNIQUE KEY `u`(`a`,`b`)) ENGINE=InnoDB DEFAULT CHARSET=latin1 STATS_PERSISTENT=0;

insert into tu values(1,1,'a','a'),(2,9999,'xxxx','x'),(3,10000,'b','b'),(4,4,'c','c');

mysqlslap -uroot --concurrency=3 --create-schema=test --no-drop --number-of-queries=1000 --query="delete from tu where a = 9999 and b = 'xxxx'"

mysqlslap: Cannot run query delete from tu where a = 9999 and b = 'xxxx' ERROR : Deadlock found when trying to get lock; try restarting transaction

The deadlock happens both with the triggers mentioned in those bug reports and without them (just less often).

The problem was originally noted by a customer on MariaDB 5.5.24, but it certainly affects all released versions up to those based on InnoDB from 5.7.x.

As there is no visible progress on the upstream bugs, I am creating this bug report for MariaDB to decide whether there is anything to fix here or to document clearly in the knowledge base.

Issue Links

is blocked by

MDEV-27025 insert-intention lock conflicts with waiting ORDINARY lock

Closed

relates to

MDEV-14589 InnoDB should not lock a delete-marked record

Closed

MDEV-16232 Use fewer mini-transactions

Stalled

MDEV-16402 Support Index Condition Pushdown for clustered PK scans

Confirmed

MDEV-16406 Refactor the InnoDB record locks

Open

MDEV-18706 ER_LOCK_DEADLOCK on concurrent read and insert into already locked gap

In Review

MDEV-31833 replication breaks when using optimistic replication and replica is a galera node

Closed

MDEV-20605 Awaken transaction can miss inserted by other transaction records due to wrong persistent cursor restoration

Closed

MDEV-22698 Deadlock on concurrent acquisition from multiple indexes

Open

MDEV-23560 Deadlock detected on SELECT when only one record being processed

Open

MDEV-28800 SIGABRT due to running out of memory for InnoDB locks

Closed

links to

Bug #82127 Deadlock with 3 concurrent DELETEs by UNIQUE key

lp:1598822 Deadlock with 3 concurrent DELETEs by unique key and triggers reading the same row

Activity

Valerii Kravchuk created issue - 2016-10-06 07:10

Elena Stepanova made changes - 2016-10-06 13:08

Field	Original Value	New Value
Affects Version/s		5.5 [ 15800 ]
Affects Version/s		10.0 [ 16000 ]
Affects Version/s		10.1 [ 16100 ]

Sergei Golubchik made changes - 2016-10-10 18:11

Remote Link

This issue links to "Bug #82127 Deadlock with 3 concurrent DELETEs by UNIQUE key (Web Link)" [ 27622 ]

Sergei Golubchik made changes - 2016-10-10 18:12

Remote Link

This issue links to "lp:1598822 Deadlock with 3 concurrent DELETEs by unique key and triggers reading the same row (Web Link)" [ 27623 ]

Jan Lindström (Inactive) made changes - 2016-11-01 09:44

Assignee

Jan Lindström [ jplindst ]

Sergei Golubchik made changes - 2016-11-01 19:37

Fix Version/s		5.5 [ 15800 ]
Fix Version/s		10.0 [ 16000 ]

Sergei Golubchik made changes - 2016-11-01 19:37

Fix Version/s

10.1 [ 16100 ]

Sergei Golubchik made changes - 2016-11-01 19:46

Priority	Major [ 3 ]	Critical [ 2 ]

Jan Lindström (Inactive) added a comment - 2016-11-02 06:18

I will first comment on the case where we do not have any triggers, just three concurrent transactions. For a deadlock we need only two, and I will mark them as trx(1) and trx(2) to identify them. trx(1) is doing delete from tu where a = 9999 and b = 'xxxx' and takes an X-lock on the record where a = 9999 and b = 'xxxx'. trx(2) does delete from tu where a = 9999 and b = 'xxxx' and tries to obtain an X-lock on the same record; it cannot have it, so it needs to wait. trx(1) continues, finds a delete-marked row, continues to the next record, which is the end of the search range, and tries to obtain an X-lock with GAP. This cannot be granted, as the lock wait queue already holds a waiting lock request for the same record made by trx(2). Thus we have trx(1) -> trx(2) -> trx(1), where -> means waiting, and this is naturally a deadlock, as it means that trx(1) -> trx(1).

Why does trx(1) take a GAP-lock? When trx(1) searches for matching rows, it first finds the matching row and naturally places an X-lock on that row to protect it. Then trx(1) tries to find more matching rows and finds a delete-marked row, which is naturally skipped. Finally, trx(1) finds an index entry that does not match, and in this case it takes a GAP-lock. This is an end-of-range GAP-lock, which ensures that no concurrent transaction can INSERT or UPDATE a key here.

Now the question is: is this really necessary? For non-unique secondary indexes, absolutely yes if the isolation level is REPEATABLE READ or higher. For a unique index, if only a part of the multi-column key is used or any of the key part conditions is not an exact match, absolutely yes if the isolation level is REPEATABLE READ or higher; in the above case that would mean a query like delete from tu where a = 9999 or delete from tu where a = 9999 and b > 'XXX'.

Let's consider INSERT ... VALUES(9999, 'xxxx'), i.e. a value that would cause a duplicate key if the delete is rolled back. Insert does not take row locks; it takes an insert intention gap-lock on the index record to avoid concurrent INSERTs into the same index gap. Then, for unique indexes, this insert would need to find any duplicate index entries, and for that it needs to take S-locks, thus the insert would wait. Similarly, an UPDATE that would change a unique index entry would need to do a duplicate check using at least S-locks, forcing it to wait for the delete. A DELETE, as we have already seen, would try to take an X-lock on the index entry, forcing it to wait for the first delete.

Honestly, I do not see why an exact query using all key parts on a unique index would take a gap-lock even in the case when there is a delete-marked index entry. However, this is not a bug; this is the current implementation, so it works as designed.
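The wait cycle described in this comment can be illustrated with a small wait-for graph. This is only a sketch of the general idea, not InnoDB's deadlock detector: the function name `find_cycle` and the dictionary layout are made up for this illustration, and the edges are typed in by hand from the description above.

```python
from typing import Dict, List, Optional


def find_cycle(waits_for: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle of the wait-for graph as a list of nodes, or None."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the DFS stack, finished
    colour = {node: WHITE for node in waits_for}
    stack: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        colour[node] = GREY
        stack.append(node)
        for blocker in waits_for.get(node, []):
            state = colour.get(blocker, WHITE)
            if state == GREY:
                # The blocker is already on the stack: we walked in a circle.
                return stack[stack.index(blocker):] + [blocker]
            if state == WHITE:
                found = visit(blocker)
                if found:
                    return found
        stack.pop()
        colour[node] = BLACK
        return None

    for node in list(waits_for):
        if colour[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# trx(2) waits for the record X-lock held by trx(1); trx(1) then has to
# queue its gap-covering request behind the waiting request of trx(2).
edges = {"trx(1)": ["trx(2)"], "trx(2)": ["trx(1)"]}
cycle = find_cycle(edges)
print(" -> ".join(cycle) if cycle else "no deadlock")
```

With the two edges above, the search reports the same trx(1) -> trx(2) -> trx(1) chain that the comment derives by hand.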

Jan Lindström (Inactive) added a comment - 2016-11-02 08:32

The trigger case is clearer. A SELECT inside a trigger needs a consistent view, i.e. the result set for that SELECT should remain consistent during transaction execution. The trigger will inherit the lock mode from the original query firing it, i.e. in this case an X-lock. To maintain a consistent view, the SELECT will take gap locks.

Jan Lindström (Inactive) added a comment - 2016-12-01 10:57

Delete contains two stages: (1) a select, which naturally needs to keep the result set consistent, and (2) the delete, where the actual index records and the clustered index record are marked deleted.

Jan Lindström (Inactive) added a comment - 2016-12-01 10:58

The current implementation works as designed. Gap-locks are taken to keep the result set from the select phase consistent.

Jan Lindström (Inactive) made changes - 2016-12-01 10:58

Fix Version/s		N/A [ 14700 ]
Fix Version/s	5.5 [ 15800 ]
Fix Version/s	10.0 [ 16000 ]
Fix Version/s	10.1 [ 16100 ]
Resolution		Not a Bug [ 6 ]
Status	Open [ 1 ]	Closed [ 6 ]

Marko Mäkelä made changes - 2018-06-05 10:33

Link

This issue relates to MDEV-16232 [ MDEV-16232 ]

Marko Mäkelä made changes - 2018-06-05 10:33

Link

This issue relates to ~~MDEV-14589~~ [ ~~MDEV-14589~~ ]

Marko Mäkelä added a comment - 2018-06-05 10:33

Before MDEV-16232, an UPDATE or DELETE operation (including an UPDATE that is executed as part of REPLACE or INSERT…ON DUPLICATE KEY UPDATE) inside InnoDB consists of multiple mini-transactions. If we used a single mini-transaction for searching and updating a row, we could rely on implicit record locks, just like INSERT does. This might remove the need for gap locks in many cases.

For locking reads (SET TRANSACTION ISOLATION LEVEL SERIALIZABLE, SELECT…LOCK IN SHARE MODE or SELECT…FOR UPDATE), gap locks are unavoidable. As noted in MDEV-16402, index condition pushdown would help avoid unnecessary gap locks.

With the partial fix of ~~MDEV-14589~~ in MariaDB Server 10.1.34, 10.2.12, 10.3.3, the READ COMMITTED and READ UNCOMMITTED isolation levels (or the non-default setting innodb_locks_unsafe_for_binlog=1) will avoid locking committed delete-marked records. That should help in the scenario of this bug, but not when using the default REPEATABLE READ isolation level.

Marko Mäkelä made changes - 2018-06-05 10:33

Link

This issue relates to MDEV-16402 [ MDEV-16402 ]

Marko Mäkelä added a comment - 2018-06-05 10:34

I think that this is a valid bug that we can do something about.

Marko Mäkelä made changes - 2018-06-05 10:34

Assignee	Jan Lindström [ jplindst ]	Marko Mäkelä [ marko ]
Resolution	Not a Bug [ 6 ]
Status	Closed [ 6 ]	Stalled [ 10000 ]

Marko Mäkelä added a comment - 2018-06-05 11:57 - edited

Using the following test, I checked which locks are being acquired in a recent MariaDB Server 10.3 b50685af82508ca1cc83e1743dff527770e6e64b:

--source include/have_innodb.inc
CREATE TABLE t1(id INT PRIMARY KEY, a INT, UNIQUE KEY u(a)) ENGINE=InnoDB;
INSERT INTO t1 VALUES(1,1),(2,9999),(3,10000),(4,4);
DELETE FROM t1 WHERE a = 9999;
DROP TABLE t1;

Observed by a breakpoint on lock_rec_lock(), the locking ha_innobase::index_read() will acquire LOCK_X | LOCK_REC_NOT_GAP on the secondary index record (9999,2) and the primary key record (2). Then, as part of the delete-mark operation in ha_innobase::delete_row(), LOCK_X | LOCK_REC_NOT_GAP will be reacquired on the secondary index record (9999,2).

I also tested with a multi-column unique secondary key (with b CHAR(1)), and it made no difference.

I also tested with MariaDB 10.1 3b7da8a44c8a0ff4b40b37e4db01f7e397aefab5, and it acquired the same locks (reacquiring the lock on the secondary index record).

I tested one more variant, with a non-unique secondary index:

--source include/have_innodb.inc
CREATE TABLE t1(id INT PRIMARY KEY, a INT, b CHAR(1), KEY u(a,b)) ENGINE=InnoDB;
INSERT INTO t1 VALUES(1,1,'a'),(2,9999,'b'),(3,10000,'c'),(4,4,'d');
DELETE FROM t1 WHERE a = 9999 AND b='b';
DROP TABLE t1;

This will lock both the record and the preceding gap (LOCK_ORDINARY) as follows:

(9999,'b',2) (LOCK_X | LOCK_ORDINARY)
(2) (LOCK_X | LOCK_REC_NOT_GAP)
(9999,'b',2) (LOCK_X | LOCK_REC_NOT_GAP) (covered by the first one)
(10000,'c',3) (LOCK_X | LOCK_GAP)

This looks reasonable to me. The gap-only lock on (10000,'c',3) should not conflict with other deletes. I have the feeling that the gap-lock is like a shared lock, but combined with LOCK_INSERT_INTENTION it becomes an exclusive lock for that gap. To simplify the management of record locks, MDEV-16406 could use a single bitmap of record locks per page, using multiple bits per record.

The only potential for deadlock that I can see here is that one of the DELETE operations chooses a different query plan, performing the locking read on the PRIMARY KEY first, and then doing the ha_innobase::delete_row() first on the PRIMARY KEY and then on the secondary index. In that way, one DELETE would hold a lock on the secondary index record and the other on the PRIMARY KEY record, and the deadlock would be detected when each of them tries to acquire the lock that the other one is holding. I can imagine that as records are deleted, query plans could change.

Marko Mäkelä made changes - 2018-06-05 12:21

Link

This issue relates to MDEV-16406 [ MDEV-16406 ]

Marko Mäkelä made changes - 2018-06-13 14:39

Affects Version/s		10.2 [ 14601 ]
Affects Version/s		10.3 [ 22126 ]

Marko Mäkelä made changes - 2018-06-13 14:39

Fix Version/s		10.4 [ 22408 ]
Fix Version/s	N/A [ 14700 ]

Marko Mäkelä made changes - 2019-03-28 11:29

Fix Version/s	10.4 [ 22408 ]
NRE Projects		RM_long_term
Affects Version/s		10.4 [ 22408 ]
Priority	Critical [ 2 ]	Major [ 3 ]

Sergei Golubchik made changes - 2019-04-01 18:59

Fix Version/s

10.4 [ 22408 ]

Marko Mäkelä made changes - 2019-09-11 06:35

Link

This issue relates to MDEV-18706 [ MDEV-18706 ]

Marko Mäkelä added a comment - 2019-09-11 06:35

MDEV-18706 documents another scenario where a deadlock seems to be unnecessarily reported for a gap lock.

While reviewing that scenario, I came to the conclusion that InnoDB supports two modes of gap locks on records. In both cases, the gap covers the range that is between the preceding record and the anchor record of the gap (excluding the anchor record itself).

A comment in lock_rec_has_to_wait() suggests that insert intention locks never conflicted with each other. The comment was originally added in MySQL 4.0.3 and later modified in MySQL 4.0.5.

So, insert intention gap locks do not conflict with each other, but they do conflict with gap locks that are set by readers. There appear to be no exclusive gap locks, other than the mutual exclusion between the two groups of gap locks.
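The rule summarised above can be written down as a tiny compatibility check. This is a simplified model for illustration only, not the code of lock_rec_has_to_wait(): the class `GapLock` and the function `must_wait` are invented names, and the sketch additionally assumes that waiting is decided on the requesting side, so that only an insert intention request can ever be made to wait.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GapLock:
    trx: str
    insert_intention: bool = False   # False: gap lock set by a locking read


def must_wait(requested: GapLock, held: GapLock) -> bool:
    """Decide whether `requested` has to wait for `held` on the same gap."""
    if requested.trx == held.trx:
        return False                  # a transaction never waits for itself
    if not requested.insert_intention:
        return False                  # assumption: reader gap locks never wait
    # An insert intention request waits only for a reader's gap lock,
    # never for another insert intention lock.
    return not held.insert_intention


reader = GapLock("trx_a")
other_reader = GapLock("trx_b")
inserter = GapLock("trx_c", insert_intention=True)
other_inserter = GapLock("trx_d", insert_intention=True)

print(must_wait(other_reader, reader))      # two reader gap locks coexist
print(must_wait(inserter, reader))          # insert is blocked by the reader
print(must_wait(other_inserter, inserter))  # two inserts into one gap coexist
```

In this model the reader gap locks behave like shared locks among themselves, which matches the observation that there appear to be no exclusive gap locks.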

Marko Mäkelä made changes - 2020-04-21 07:37

Link

This issue relates to ~~MDEV-20605~~ [ ~~MDEV-20605~~ ]

Aleksey Midenkov made changes - 2020-05-10 10:00

Assignee	Marko Mäkelä [ marko ]	Aleksey Midenkov [ midenok ]

Aleksey Midenkov made changes - 2020-05-20 13:52

Status	Stalled [ 10000 ]	In Progress [ 3 ]

Aleksey Midenkov made changes - 2020-05-25 14:46

Link

This issue relates to MDEV-22698 [ MDEV-22698 ]

Aleksey Midenkov made changes - 2020-05-25 18:16

Assignee	Aleksey Midenkov [ midenok ]	Marko Mäkelä [ marko ]
Status	In Progress [ 3 ]	In Review [ 10002 ]

Marko Mäkelä added a comment - 2020-08-12 12:17

Can you please rebase the fix to a recent main branch? I’d review it after a successful CI run. I think that it should also be extensively tested with RQG. For that purpose, it would be nice to get a version for the most recent main branch as well. Locking was changed somewhat by the trx_sys refactoring in 10.3.

Marko Mäkelä made changes - 2020-08-12 12:17

Assignee	Marko Mäkelä [ marko ]	Aleksey Midenkov [ midenok ]
Status	In Review [ 10002 ]	Stalled [ 10000 ]

Marko Mäkelä made changes - 2020-08-24 14:43

Link

This issue relates to MDEV-23560 [ MDEV-23560 ]

Aleksey Midenkov added a comment - 2020-08-25 09:50

bb-10.2-midenok

Aleksey Midenkov made changes - 2020-08-25 09:50

Assignee	Aleksey Midenkov [ midenok ]	Marko Mäkelä [ marko ]
Status	Stalled [ 10000 ]	In Review [ 10002 ]

Marko Mäkelä added a comment - 2020-11-05 15:59

I updated bb-10.2-midenok-innodb today with the latest 10.2. The old results were no longer available in the grid view, but in the cross-reference I can see that the test galera.MW-328D was failing also for earlier versions of the branch. That test last failed in any other 10.2-based branch in 2018. So, it looks like this (or MDEV-18706, which was present in the same branch) will need some more work.

Marko Mäkelä made changes - 2020-11-05 15:59

Assignee	Marko Mäkelä [ marko ]	Aleksey Midenkov [ midenok ]
Status	In Review [ 10002 ]	Stalled [ 10000 ]

Aleksey Midenkov made changes - 2021-06-22 10:36

Status	Stalled [ 10000 ]	In Progress [ 3 ]

Aleksey Midenkov made changes - 2021-08-27 08:01

Status	In Progress [ 3 ]	Stalled [ 10000 ]

Sergei Golubchik made changes - 2021-12-06 21:35

Workflow	MariaDB v3 [ 77682 ]	MariaDB v4 [ 143490 ]

Marko Mäkelä added a comment - 2022-01-04 09:54

I wonder if the fix of ~~MDEV-27025~~ would address this scenario.

Marko Mäkelä made changes - 2022-01-04 09:54

Link

This issue is blocked by ~~MDEV-27025~~ [ ~~MDEV-27025~~ ]

Aleksey Midenkov added a comment - 2023-04-14 15:34

MySQL patch is available:

commit 16d84704097d5ce086eac0a3a1f2dbca0e6fa80c
Author: Jakub Łopuszański
Date: Tue Jun 11 12:36:53 2019 +0200

    Bug #23755664 DEADLOCK WITH 3 CONCURRENT DELETES BY UNIQUE KEY

    PROBLEM:
    A deadlock was possible when a transaction tried to "upgrade" an already
    held Record Lock to Next Key Lock.

    SOLUTION:
    This patch is based on observations that:
    (1) a Next Key Lock is equivalent to Record Lock combined with Gap Lock
    (2) a GAP Lock never has to wait for any other lock
    In case we request a Next Key Lock, we check if we already own a Record
    Lock of equal or stronger mode, and if so, then we either upgrade it to
    Next Key Lock, or if it is not possible (because the single lock_t struct
    is shared by more than one row) we change the requested lock type to GAP
    Lock, which we either already have, or can be granted immediately.
    (I don't consider Insert Intention Locks a Gap Lock in above statements).

    Reviewed-by: Debarun Banerjee
    RB:19879
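The decision described in the commit message can be sketched as a small function. This is an illustration of the logic as worded above, not the MySQL patch itself: the names `Mode` and `plan_next_key_request`, the string results, and the boolean `lock_struct_shared` (standing in for "the single lock_t struct is shared by more than one row") are all assumptions made for this example.

```python
from enum import IntEnum
from typing import Optional


class Mode(IntEnum):
    S = 1   # shared
    X = 2   # exclusive, stronger than S


def plan_next_key_request(requested: Mode,
                          held_record_lock: Optional[Mode],
                          lock_struct_shared: bool) -> str:
    """Choose what to do when a Next Key Lock is requested on a record.

    held_record_lock is the mode of a record-only lock that the same
    transaction already owns on this record, or None if it owns none.
    """
    if held_record_lock is None or held_record_lock < requested:
        # Nothing suitable is held: the ordinary request path applies
        # and the request may have to wait.
        return "request NEXT_KEY"
    if not lock_struct_shared:
        # Record lock + gap lock is equivalent to a next-key lock, so the
        # held lock can be widened in place without queueing.
        return "upgrade held lock to NEXT_KEY"
    # The lock object also covers other rows, so it cannot be widened;
    # ask for the gap part only, which never has to wait.
    return "request GAP"


print(plan_next_key_request(Mode.X, Mode.X, lock_struct_shared=False))
print(plan_next_key_request(Mode.X, Mode.X, lock_struct_shared=True))
print(plan_next_key_request(Mode.X, Mode.S, lock_struct_shared=False))
```

Only the last case, where the held lock is weaker than the requested one, still goes through the normal queue; the first two avoid placing a new waiting request behind other waiters.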

Aleksey Midenkov made changes - 2023-04-14 15:34

Assignee	Aleksey Midenkov [ midenok ]	Marko Mäkelä [ marko ]

Marko Mäkelä made changes - 2023-05-09 13:41

Assignee	Marko Mäkelä [ marko ]	Vladislav Lesin [ vlad.lesin ]

Julien Fritsch made changes - 2023-05-17 07:25

Priority	Major [ 3 ]	Critical [ 2 ]

Vladislav Lesin made changes - 2023-06-26 10:28

Status	Stalled [ 10000 ]	In Progress [ 3 ]

Vladislav Lesin added a comment - 2023-06-26 11:45

It looks very similar to ~~MDEV-27025~~ and ~~MDEV-27992~~.

Vladislav Lesin added a comment - 2023-06-27 08:38 - edited

Can't repeat it with the following test:

--source include/have_innodb.inc
--source include/have_debug.inc
--source include/have_debug_sync.inc

--connect(dont_purge, localhost,root,,)
START TRANSACTION WITH CONSISTENT SNAPSHOT;

--connection default

# There are various scenarios in which a transaction already holds "half"
# of a record lock (for example, a lock on the record but not on the gap)
# and wishes to "upgrade it" to a full lock (i.e. on both gap and record).
# This is often a cause for a deadlock, if there is another transaction
# which is already waiting for the lock being blocked by us:
# 1. our granted lock for one half
# 2. her waiting lock for the same half
# 3. our waiting lock for the whole
#
# SCENARIO 1
#
# In this scenario, three different threads try to delete the same row,
# identified by a secondary index key.
# This kind of operation (besides LOCK_IX on a table) requires
# an LOCK_REC_NOT_GAP|LOCK_REC|LOCK_X lock on a secondary index
# 1. `deleter` is the first to get the required lock
# 2. `holder` enqueues a waiting lock
# 3. `waiter` enqueues right after `holder`
# 4. `deleter` commits, releasing the lock, and granting it to `holder`
# 5. `holder` now observes that the row was deleted, so it needs to
#    "seal the gap", by obtaining a LOCK_X|LOCK_REC, but..
# 6. this causes a deadlock between `holder` and `waiter`

CREATE TABLE `t`(
  `id` INT,
  `a` INT DEFAULT NULL,
  PRIMARY KEY(`id`),
  UNIQUE KEY `u`(`a`)
) ENGINE=InnoDB;

INSERT INTO t (`id`,`a`) VALUES
  (1,1),
  (2,9999),
  (3,10000);

--connect(deleter,localhost,root,,)
--connect(holder,localhost,root,,)
--connect(waiter,localhost,root,,)

--connection deleter
  SET DEBUG_SYNC =
    'lock_sec_rec_read_check_and_lock_has_locked
      SIGNAL deleter_has_locked
      WAIT_FOR waiter_has_locked';
  --send DELETE FROM t WHERE a = 9999

--connection holder
  SET DEBUG_SYNC=
    'now WAIT_FOR deleter_has_locked';
  SET DEBUG_SYNC=
    'lock_sec_rec_read_check_and_lock_has_locked SIGNAL holder_has_locked';
  --send DELETE FROM t WHERE a = 9999

--connection waiter
  SET DEBUG_SYNC=
    'now WAIT_FOR holder_has_locked';
  SET DEBUG_SYNC=
    'lock_sec_rec_read_check_and_lock_has_locked SIGNAL waiter_has_locked';
  --send DELETE FROM t WHERE a = 9999

--connection deleter
  --reap

--connection holder
  --reap

--connection waiter
  --reap

--connection default

--disconnect deleter
--disconnect holder
--disconnect waiter
--disconnect dont_purge

DROP TABLE `t`;
SET DEBUG_SYNC='reset';

and the following sync point:

diff --git a/storage/innobase/lock/lock0lock.cc b/storage/innobase/lock/lock0lock.cc
index 26388ad95e2..bc59d824ff2 100644
--- a/storage/innobase/lock/lock0lock.cc
+++ b/storage/innobase/lock/lock0lock.cc
@@ -5763,6 +5763,8 @@ lock_sec_rec_modify_check_and_lock(
        return(err);
 }
 
+#include "scope.h"
+
 /*********************************************************************//**
 Like lock_clust_rec_read_check_and_lock(), but reads a
 secondary index record.
@@ -5791,6 +5793,10 @@ lock_sec_rec_read_check_and_lock(
        dberr_t err;
        ulint   heap_no;
 
+        SCOPE_EXIT([]() {
+          DEBUG_SYNC_C("lock_sec_rec_read_check_and_lock_has_locked");
+        });
+
        ut_ad(!dict_index_is_clust(index));
        ut_ad(!dict_index_is_online_ddl(index));
        ut_ad(block->frame == page_align(rec));

from

commit bfba840dfa7794b988c59c94658920dbe556075d
Author: Jakub Łopuszański <jakub.lopuszanski@oracle.com>
Date:   Tue Jun 11 12:36:53 2019 +0200

    Bug #23755664 DEADLOCK WITH 3 CONCURRENT DELETES BY UNIQUE KEY

on the latest 10.4, as well as with the sequence of steps described in https://bugs.mysql.com/bug.php?id=82127.

I have not yet understood why.
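For reference, the queueing that the SCENARIO 1 comments in the test expect can be modelled with a toy lock queue. This only mirrors the six steps as they are written in those comments; it says nothing about what a given server version really does, which is exactly what the test is probing. `Request`, `enqueue` and `release_all` are invented names, and the conflict rule (every X request waits for every earlier request of another transaction, granted or waiting) is an assumption of the model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    trx: str
    kind: str             # "REC_NOT_GAP" or "NEXT_KEY"; every request is X mode
    waiting: bool = False


def enqueue(queue: List[Request], trx: str, kind: str) -> List[str]:
    """Append an X request and return the transactions it must wait for."""
    blockers: List[str] = []
    for earlier in queue:
        # Model assumption: earlier requests of other transactions block us,
        # no matter whether they are granted or still waiting.
        if earlier.trx != trx and earlier.trx not in blockers:
            blockers.append(earlier.trx)
    queue.append(Request(trx, kind, waiting=bool(blockers)))
    return blockers


def release_all(queue: List[Request], trx: str) -> None:
    """Drop the requests of a committed transaction, then grant what is free."""
    queue[:] = [r for r in queue if r.trx != trx]
    for position, request in enumerate(queue):
        if all(e.trx == request.trx for e in queue[:position]):
            request.waiting = False


queue: List[Request] = []
enqueue(queue, "deleter", "REC_NOT_GAP")                 # step 1: granted
enqueue(queue, "holder", "REC_NOT_GAP")                  # step 2: waits
enqueue(queue, "waiter", "REC_NOT_GAP")                  # step 3: waits
release_all(queue, "deleter")                            # step 4: holder granted
holder_blockers = enqueue(queue, "holder", "NEXT_KEY")   # step 5: "seal the gap"
waiter_request = next(r for r in queue if r.trx == "waiter")
# Step 6: holder waits behind waiter while waiter still waits for holder.
deadlock = "waiter" in holder_blockers and waiter_request.waiting
print("holder waits for:", holder_blockers, "deadlock:", deadlock)
```

In this model the cycle appears only because the second request of `holder` is queued behind the waiting request of `waiter`.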


Julius Goryavsky made changes - 2023-06-27 09:59

Link

This issue relates to MENT-1815 [ MENT-1815 ]

Vladislav Lesin added a comment - 2023-06-27 12:42 - edited

The reason I can't reproduce it on 10.4.28 is the MDEV-30225 fix.


Vladislav Lesin added a comment - 2023-06-27 17:53

MDEV-30225 does not fix the bug, it just hides it. In the test above, the 'holder' does not need to "seal the gap" after 'deleter' commits, because the gap was sealed from the start: after the MDEV-30225 fix the 'holder' initially requests a next-key lock.

The following test from MySQL commit bfba840dfa7794b988c59c94658920dbe556075d shows the issue:

# SCENARIO 2

# Here, we form a situation in which con1 has LOCK_REC_NOT_GAP on rows 1 and 2

# con2 waits for lock on row 1, and then con1 wants to upgrade the lock on row 1,

# which might cause a deadlock, unless con1 properly notices that even though the

# lock on row 1 can not be upgraded, a separate LOCK_GAP can be obtained easily.

CREATE TABLE `t`(

  `id` INT NOT NULL PRIMARY KEY

) ENGINE=InnoDB;

INSERT INTO t (`id`) VALUES (1), (2);

--connect(holder,localhost,root,,)

--connect(waiter,localhost,root,,)

--connection holder

  BEGIN;

  SELECT id FROM t WHERE id=1 FOR UPDATE;

  SELECT id FROM t WHERE id=2 FOR UPDATE;

--connection waiter

  SET DEBUG_SYNC=

    'lock_wait_suspend_thread_enter SIGNAL waiter_will_wait';

  --send SELECT id FROM t WHERE id = 1 FOR UPDATE

--connection holder

  SET DEBUG_SYNC=

    'now WAIT_FOR waiter_will_wait';

  SELECT * FROM t FOR UPDATE;

  COMMIT;

--connection waiter

  --reap

--connection default

--disconnect holder

--disconnect waiter

DROP TABLE `t`;
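The situation in SCENARIO 2 can be sketched with a toy lock queue. This is only a simplified model under stated assumptions, not InnoDB's lock manager: all locks are exclusive, two requests conflict only when both cover the record itself, a new request must wait behind any conflicting lock of another transaction even if that lock is itself still waiting, and `Queue`, `request`, `request_split` and `scenario_deadlocks` are hypothetical names. It shows why asking for a whole next-key lock deadlocks, while asking only for the missing gap part does not.

```cpp
#include <cassert>
#include <vector>

// Lock parts in this toy model: the record itself and the gap before it.
enum : unsigned { REC = 1, GAP = 2, NEXT_KEY = REC | GAP };

struct Lock {
  int trx;
  unsigned mode;
  bool waiting;
};

// Assumption: only the record parts of two exclusive locks conflict;
// a pure gap request never has to wait.
static bool conflicts(unsigned a, unsigned b) { return (a & REC) && (b & REC); }

struct Queue {
  std::vector<Lock> locks;

  // Parts already granted to this transaction.
  unsigned held(int trx) const {
    unsigned h = 0;
    for (const Lock& l : locks)
      if (l.trx == trx && !l.waiting) h |= l.mode;
    return h;
  }

  // Returns the id of the transaction we must wait for, or -1 if granted.
  int request(int trx, unsigned mode) {
    for (const Lock& l : locks) {
      if (l.trx != trx && conflicts(l.mode, mode)) {
        int blocker = l.trx;  // copy before push_back invalidates l
        locks.push_back({trx, mode, true});
        return blocker;
      }
    }
    locks.push_back({trx, mode, false});
    return -1;
  }

  // Request only the parts the transaction does not hold yet.
  int request_split(int trx, unsigned mode) {
    unsigned missing = mode & ~held(trx);
    return missing == 0 ? -1 : request(trx, missing);
  }
};

// holder owns the record lock, waiter queues behind it, then holder
// wants the record plus the gap. Returns true if the two wait on each other.
bool scenario_deadlocks(bool split) {
  Queue q;
  const int holder = 1, waiter = 2;
  q.request(holder, REC);
  int waiter_blocked_by = q.request(waiter, REC);
  int holder_blocked_by = split ? q.request_split(holder, NEXT_KEY)
                                : q.request(holder, NEXT_KEY);
  return waiter_blocked_by == holder && holder_blocked_by == waiter;
}
```

In this model `scenario_deadlocks(false)` reports a wait cycle between the two transactions, while `scenario_deadlocks(true)` does not, because the separate gap request is granted immediately.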


Vladislav Lesin added a comment - 2023-06-30 11:06

This commit suffers from MDEV-27992; see this comment for details.


Vladislav Lesin made changes - 2023-07-05 08:00

Assignee	Vladislav Lesin [ vlad.lesin ]	Marko Mäkelä [ marko ]
Status	In Progress [ 3 ]	In Review [ 10002 ]

Marko Mäkelä made changes - 2023-07-05 10:34

Link

This issue relates to MDEV-28800 [ MDEV-28800 ]

Marko Mäkelä added a comment - 2023-07-05 14:23

Thank you, this looks good to me.


Marko Mäkelä made changes - 2023-07-05 14:23

Assignee	Marko Mäkelä [ marko ]	Vladislav Lesin [ vlad.lesin ]
Status	In Review [ 10002 ]	Stalled [ 10000 ]

Vladislav Lesin made changes - 2023-07-06 16:04

Fix Version/s		10.4.31 [ 29010 ]
Fix Version/s		10.5.22 [ 29011 ]
Fix Version/s		10.6.15 [ 29013 ]
Fix Version/s		10.9.8 [ 29015 ]
Fix Version/s		10.10.6 [ 29017 ]
Fix Version/s		10.11.5 [ 29019 ]
Fix Version/s		11.0.3 [ 28920 ]
Fix Version/s		11.1.2 [ 28921 ]
Fix Version/s		11.2.1 [ 29034 ]
Fix Version/s	10.4 [ 22408 ]
Resolution		Fixed [ 1 ]
Status	Stalled [ 10000 ]	Closed [ 6 ]

Rob Schwyzer (Inactive) made changes - 2023-07-20 18:33

Remote Link

This issue links to "Page (MariaDB Confluence)" [ 35637 ]

Jira Automation (IT) made changes - 2024-07-04 08:35

Zendesk Related Tickets		201658
Zendesk active tickets		201658

People

Assignee: Vladislav Lesin

Reporter: Valerii Kravchuk

Votes: 0

Watchers: 12

Dates

Created: 2016-10-06 07:10

Updated: 2024-09-06 09:59

Resolved: 2023-07-06 16:04
