[MDEV-10962] Deadlock with 3 concurrent DELETEs by unique key Created: 2016-10-06 Updated: 2023-09-01 Resolved: 2023-07-06 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 5.5, 10.0, 10.1, 10.2, 10.3, 10.4 |
| Fix Version/s: | 10.4.31, 10.5.22, 10.6.15, 10.9.8, 10.10.6, 10.11.5, 11.0.3, 11.1.2, 11.2.1 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Valerii Kravchuk | Assignee: | Vladislav Lesin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | upstream | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||||||||||||||||||
| Description |
|
As explained in upstream bug reports (that has all the details on the test case): http://bugs.mysql.com/bug.php?id=82127 there is a deadlock scenario with 3 concurrent DELETEs by UNIQUE key that can not be explained by the manual:
Deadlock happens both with triggers mentioned in that bug reports and without them (just less often). The problem was originally noted by customer on MariaDB 5.5.24, but affects all released versions up to those based on InnoDB from 5.7.x for sure. As there is no visible progress on upstream bugs, I create this bug report for MariaDB to decide if there is anything to fix here or to document clearly in the knowledge base. |
| Comments |
| Comment by Jan Lindström (Inactive) [ 2016-11-02 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I will firstly comment on case where we do not have any triggers, just tree concurrent transactions. For deadlock we need only two and I will mark then as trx(1) and trx(2) to identify them. Thus, trx (1) is doing delete t where a = 9999 and b = 'xxxx' and takes X-lock for record where a = 9999 and b = 'xxxx' , trx(2) does delete t where a = 9999 and b = 'xxxx' and tries to obtain X-lock for the same record, it can't have it so it needs to wait, trx(1) continues finds a delete marked row, continues to next record that is end of search range and tries to obtain X-lock with GAP, this can't be granted as we already have waiting lock request for the same record done by trx(1) in lock wait queue. Thus, we have trx(1) -> trx(2) -> trx(1) where -> means waiting and this is naturally a deadlock as it means that trx(1) -> trx(1). Why trx(1) takes a GAP-lock? When trx(1) searches matching rows it firstly finds the matching row and naturally places X-lock for that row to protect it. Then trx(1) tries to find more matching rows and finds a delete-marked row that is naturally skipped. Finally, trx(1) find a index-entry that does not match, in this case it takes GAP-lock. This is end of range like GAP-lock protecting the fact that any concurrent transactions can't INSERT or UPDATE a key here. Now question is this really necessary ? For non-unique secondary indexes, absolutely yes if isolation level is REPEATABLE READ or higher. For unique-index if only a part of the multi-key index is used or any of the key part condition is not exact query, absolutely yes if isolation level is REPEATABLE READ or higher, in above case that would mean query like delete from tu where a = 9999 or delete from tu where a = 9999 and b > 'XXX'. Lets consider INSERT ... VALUES(9999, 'xxxx' ) i.e. value that would cause duplicate key if delete is rolled back. Insert does not take row locks, it takes insert intention gap-lock to index record to avoid concurrent INSERTs to same index gap. Then, for unique-indexes this insert would need to find any duplicate index entries and for that it needs to take S-locks, thus insert would wait. Similarly for UPDATE that would change unique-index entry, it would need to do duplicate check using at least S-locks forcing it to wait for delete. DELETE as we already have seen would try to take X-lock to index entry forcing it to wait for first delete. Honestly, I do not see why exact query using all key parts on unique index would take gap-lock even in case when there is delete marked index entry. However, this is not a bug, this is current implementation so it works as designed. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Jan Lindström (Inactive) [ 2016-11-02 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Trigger case is more clear. Select inside a trigger needs consistent view i.e. result set for that select should remain consistent during transaction execution. Trigger will inherit lock mode from original query firing it i.e. in this case X-lock. To maintain consistent view select will take gap locks. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Jan Lindström (Inactive) [ 2016-12-01 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Delete contains two stages (1) select that naturally needs to keep the result set consistent and (2) delete where actual index records and clustered index record are marked deleted. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Jan Lindström (Inactive) [ 2016-12-01 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Current implementation works as designed. Gap-locks are taken to keep the result set from select phase consistent. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2018-06-05 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Before MDEV-16232, an UPDATE or DELETE operation (including an UPDATE that is executed as part of REPLACE or INSERT…ON DUPLICATE KEY UPDATE) inside InnoDB consists of multiple mini-transactions. If we used a single mini-transaction for searching and updating a row, we could rely on implicit record locks, just like INSERT does. This might remove the need for gap locks in many cases. For locking reads (SET TRANSACTION ISOLATION LEVEL SERIALIZABLE, SELECT…LOCK IN SHARE MODE or SELECT…FOR UPDATE), gap locks are unavoidable. As noted in MDEV-16402, index condition pushdown would help avoid unnecessary gap locks. With the partial fix of | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2018-06-05 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I think that this is a valid bug that we can do something about. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2018-06-05 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Using the following test, I checked which locks are being acquired in a recent MariaDB Server 10.3 b50685af82508ca1cc83e1743dff527770e6e64b:
Observed by a breakpoint on lock_rec_lock(), the locking ha_innobase::index_read() will acquire LOCK_X | LOCK_REC_NOT_GAP on the secondary index record (9999,2) and the primary key record (2). Then, as part of the delete-mark operation in ha_innobase::delete_row(), LOCK_X | LOCK_REC_NOT_GAP will be reacquired on the secondary index record (9999,2). I also tested with a multi-column unique secondary key (with b CHAR(1)), and it made no difference. I also tested with MariaDB 10.1 3b7da8a44c8a0ff4b40b37e4db01f7e397aefab5, and it acquired the same locks (reacquiring the lock on the secondary index record). I tested one more variant, with a non-unique secondary index:
This will lock both the record and the preceding gap (LOCK_ORDINARY) as follows:
This looks reasonable to me. The gap-only lock on (10000,'c',3) should not conflict with other deletes. I have the feeling that the gap-lock is like a shared lock, but combined with LOCK_INSERT_INTENTION it becomes an exclusive lock for that gap. To simplify the management of record locks, MDEV-16406 could use a single bitmap of record locks per page, using multiple bits per record. The only potential for deadlock that I can see here is that one of the DELETE operations chooses a different query plan, performing the locking read on the PRIMARY KEY first, and then doing the ha_innobase::delete_row() first on the PRIMARY KEY and then on the secondary index. In that way, one DELETE would hold a lock on the secondary index record and the other on the PRIMARY KEY record, and the deadlock would be detected when each of them try to acquire the locks that the other one is holding. I can imagine that as records are deleted, query plans could change. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2019-09-11 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
MDEV-18706 documents another scenario where a deadlock seems to be unnecessarily reported for a gap lock. While reviewing that scenario, I came to the conclusion that InnoDB supports two modes of gap locks on records. In both cases, the gap covers the range that is between the preceding record and the anchor record of the gap (excluding the anchor record itself). A comment in lock_rec_has_to_wait() suggests that insert intention locks never conflicted with each other. The comment was originally added in MySQL 4.0.3 and later modified in MySQL 4.0.5. So, insert intention locks gap do not conflict with each other, but they do conflict with gap locks that are set by readers. There appear to be no exclusive gap locks, other than the mutual exclusion between the two groups of gap locks. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2020-08-12 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Can you please rebase the fix to a recent main branch? I’d review it after a successful CI run. I think that it should also be extensively tested with RQG. For that purpose, it would be nice to get a version for the most recent main branch as well. Locking was changed somewhat by the trx_sys refactoring in 10.3. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Aleksey Midenkov [ 2020-08-25 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
bb-10.2-midenok | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2020-11-05 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I updated bb-10.2-midenok-innodb today with the latest 10.2. The old results were no longer available in the grid view, but in the cross-reference I can see that the test galera.MW-328D was failing also for earlier versions of the branch. That test last failed in any other 10.2-based branch in 2018. So, it looks like this (or MDEV-18706, which was present in the same branch) will need some more work. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2022-01-04 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I wonder if the fix of | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Aleksey Midenkov [ 2023-04-14 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
MySQL patch is available: commit 16d84704097d5ce086eac0a3a1f2dbca0e6fa80c Author: Jakub Łopuszański Bug #23755664 DEADLOCK WITH 3 CONCURRENT DELETES BY UNIQUE KEY PROBLEM: A deadlock was possible SOLUTION: This patch is based on observations that: (1) a Next | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Lesin [ 2023-06-26 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
It looks very similar to | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Lesin [ 2023-06-27 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Can't repeat it with the following test:
and the following sync point:
from
on the latest 10.4, as well as with the sequence of steps described in https://bugs.mysql.com/bug.php?id=82127. Have not understood yet why. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Lesin [ 2023-06-27 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
The cause of why I can't reproduce it on 10.4.28 is | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Lesin [ 2023-06-27 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
The following test from bfba840dfa7794b988c59c94658920dbe556075d mysql commit shows the issue:
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Lesin [ 2023-06-30 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
This commit suffers from | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-07-05 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Thank you, this looks good to me. |