Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-10962

Deadlock with 3 concurrent DELETEs by unique key

Details

    Description

      As explained in upstream bug reports (that has all the details on the test case):

      http://bugs.mysql.com/bug.php?id=82127
      https://bugs.launchpad.net/percona-server/+bug/1598822

      there is a deadlock scenario with 3 concurrent DELETEs by UNIQUE key that can not be explained by the manual:

      CREATE TABLE `tu`(`id` int(11), `a` int(11) DEFAULT NULL, `b` varchar(10) DEFAULT NULL, `c` varchar(10) DEFAULT NULL, PRIMARY KEY(`id`), UNIQUE KEY `u`(`a`,`b`)) ENGINE=InnoDB DEFAULT CHARSET=latin1 STATS_PERSISTENT=0;
       
      insert into tu values(1,1,'a','a'),(2,9999,'xxxx','x'),(3,10000,'b','b'),(4,4,'c','c');
       
      mysqlslap -uroot --concurrency=3 --create-schema=test --no-drop --number-of-queries=1000 --query="delete from tu where a = 9999 and b = 'xxxx'"
      mysqlslap: Cannot run query delete from tu where a = 9999 and b = 'xxxx' ERROR : Deadlock found when trying to get lock; try restarting transaction
      

      Deadlock happens both with triggers mentioned in that bug reports and without them (just less often).

      The problem was originally noted by customer on MariaDB 5.5.24, but affects all released versions up to those based on InnoDB from 5.7.x for sure.

      As there is no visible progress on upstream bugs, I create this bug report for MariaDB to decide if there is anything to fix here or to document clearly in the knowledge base.

      Attachments

        Issue Links

          Activity

            I will firstly comment on case where we do not have any triggers, just tree concurrent transactions. For deadlock we need only two and I will mark then as trx(1) and trx(2) to identify them. Thus, trx (1) is doing delete t where a = 9999 and b = 'xxxx' and takes X-lock for record where a = 9999 and b = 'xxxx' , trx(2) does delete t where a = 9999 and b = 'xxxx' and tries to obtain X-lock for the same record, it can't have it so it needs to wait, trx(1) continues finds a delete marked row, continues to next record that is end of search range and tries to obtain X-lock with GAP, this can't be granted as we already have waiting lock request for the same record done by trx(1) in lock wait queue. Thus, we have trx(1) -> trx(2) -> trx(1) where -> means waiting and this is naturally a deadlock as it means that trx(1) -> trx(1). Why trx(1) takes a GAP-lock? When trx(1) searches matching rows it firstly finds the matching row and naturally places X-lock for that row to protect it. Then trx(1) tries to find more matching rows and finds a delete-marked row that is naturally skipped. Finally, trx(1) find a index-entry that does not match, in this case it takes GAP-lock. This is end of range like GAP-lock protecting the fact that any concurrent transactions can't INSERT or UPDATE a key here. Now question is this really necessary ? For non-unique secondary indexes, absolutely yes if isolation level is REPEATABLE READ or higher. For unique-index if only a part of the multi-key index is used or any of the key part condition is not exact query, absolutely yes if isolation level is REPEATABLE READ or higher, in above case that would mean query like delete from tu where a = 9999 or delete from tu where a = 9999 and b > 'XXX'. Lets consider INSERT ... VALUES(9999, 'xxxx' ) i.e. value that would cause duplicate key if delete is rolled back. Insert does not take row locks, it takes insert intention gap-lock to index record to avoid concurrent INSERTs to same index gap. Then, for unique-indexes this insert would need to find any duplicate index entries and for that it needs to take S-locks, thus insert would wait. Similarly for UPDATE that would change unique-index entry, it would need to do duplicate check using at least S-locks forcing it to wait for delete. DELETE as we already have seen would try to take X-lock to index entry forcing it to wait for first delete. Honestly, I do not see why exact query using all key parts on unique index would take gap-lock even in case when there is delete marked index entry. However, this is not a bug, this is current implementation so it works as designed.

            jplindst Jan Lindström (Inactive) added a comment - I will firstly comment on case where we do not have any triggers, just tree concurrent transactions. For deadlock we need only two and I will mark then as trx(1) and trx(2) to identify them. Thus, trx (1) is doing delete t where a = 9999 and b = 'xxxx' and takes X-lock for record where a = 9999 and b = 'xxxx' , trx(2) does delete t where a = 9999 and b = 'xxxx' and tries to obtain X-lock for the same record, it can't have it so it needs to wait, trx(1) continues finds a delete marked row, continues to next record that is end of search range and tries to obtain X-lock with GAP, this can't be granted as we already have waiting lock request for the same record done by trx(1) in lock wait queue. Thus, we have trx(1) -> trx(2) -> trx(1) where -> means waiting and this is naturally a deadlock as it means that trx(1) -> trx(1). Why trx(1) takes a GAP-lock? When trx(1) searches matching rows it firstly finds the matching row and naturally places X-lock for that row to protect it. Then trx(1) tries to find more matching rows and finds a delete-marked row that is naturally skipped. Finally, trx(1) find a index-entry that does not match, in this case it takes GAP-lock. This is end of range like GAP-lock protecting the fact that any concurrent transactions can't INSERT or UPDATE a key here. Now question is this really necessary ? For non-unique secondary indexes, absolutely yes if isolation level is REPEATABLE READ or higher. For unique-index if only a part of the multi-key index is used or any of the key part condition is not exact query, absolutely yes if isolation level is REPEATABLE READ or higher, in above case that would mean query like delete from tu where a = 9999 or delete from tu where a = 9999 and b > 'XXX'. Lets consider INSERT ... VALUES(9999, 'xxxx' ) i.e. value that would cause duplicate key if delete is rolled back. Insert does not take row locks, it takes insert intention gap-lock to index record to avoid concurrent INSERTs to same index gap. Then, for unique-indexes this insert would need to find any duplicate index entries and for that it needs to take S-locks, thus insert would wait. Similarly for UPDATE that would change unique-index entry, it would need to do duplicate check using at least S-locks forcing it to wait for delete. DELETE as we already have seen would try to take X-lock to index entry forcing it to wait for first delete. Honestly, I do not see why exact query using all key parts on unique index would take gap-lock even in case when there is delete marked index entry. However, this is not a bug, this is current implementation so it works as designed.

            Trigger case is more clear. Select inside a trigger needs consistent view i.e. result set for that select should remain consistent during transaction execution. Trigger will inherit lock mode from original query firing it i.e. in this case X-lock. To maintain consistent view select will take gap locks.

            jplindst Jan Lindström (Inactive) added a comment - Trigger case is more clear. Select inside a trigger needs consistent view i.e. result set for that select should remain consistent during transaction execution. Trigger will inherit lock mode from original query firing it i.e. in this case X-lock. To maintain consistent view select will take gap locks.

            Delete contains two stages (1) select that naturally needs to keep the result set consistent and (2) delete where actual index records and clustered index record are marked deleted.

            jplindst Jan Lindström (Inactive) added a comment - Delete contains two stages (1) select that naturally needs to keep the result set consistent and (2) delete where actual index records and clustered index record are marked deleted.

            Current implementation works as designed. Gap-locks are taken to keep the result set from select phase consistent.

            jplindst Jan Lindström (Inactive) added a comment - Current implementation works as designed. Gap-locks are taken to keep the result set from select phase consistent.

            Before MDEV-16232, an UPDATE or DELETE operation (including an UPDATE that is executed as part of REPLACE or INSERT…ON DUPLICATE KEY UPDATE) inside InnoDB consists of multiple mini-transactions. If we used a single mini-transaction for searching and updating a row, we could rely on implicit record locks, just like INSERT does. This might remove the need for gap locks in many cases.

            For locking reads (SET TRANSACTION ISOLATION LEVEL SERIALIZABLE, SELECT…LOCK IN SHARE MODE or SELECT…FOR UPDATE), gap locks are unavoidable. As noted in MDEV-16402, index condition pushdown would help avoid unnecessary gap locks.

            With the partial fix of MDEV-14589 in MariaDB Server 10.1.34, 10.2.12, 10.3.3, the READ COMMITTED and READ UNCOMMITTED isolation levels (or the non-default setting innodb_locks_unsafe_for_binlog=1) will avoid locking committed delete-marked records. That should help in the scenario of this bug, but not when using the default REPEATABLE READ isolation level.

            marko Marko Mäkelä added a comment - Before MDEV-16232 , an UPDATE or DELETE  operation (including an UPDATE  that is executed as part of REPLACE or INSERT…ON DUPLICATE KEY UPDATE ) inside InnoDB consists of multiple mini-transactions. If we used a single mini-transaction for searching and updating a row, we could rely on implicit record locks, just like INSERT does. This might remove the need for gap locks in many cases. For locking reads ( SET TRANSACTION ISOLATION LEVEL SERIALIZABLE , SELECT…LOCK IN SHARE MODE or SELECT…FOR UPDATE ), gap locks are unavoidable. As noted in MDEV-16402 , index condition pushdown would help avoid unnecessary gap locks. With the partial fix of MDEV-14589 in MariaDB Server 10.1.34, 10.2.12, 10.3.3, the READ COMMITTED and READ UNCOMMITTED isolation levels (or the non-default setting innodb_locks_unsafe_for_binlog=1 ) will avoid locking committed delete-marked records. That should help in the scenario of this bug, but not when using the default REPEATABLE READ isolation level.

            I think that this is a valid bug that we can do something about.

            marko Marko Mäkelä added a comment - I think that this is a valid bug that we can do something about.
            marko Marko Mäkelä added a comment - - edited

            Using the following test, I checked which locks are being acquired in a recent MariaDB Server 10.3 b50685af82508ca1cc83e1743dff527770e6e64b:

            --source include/have_innodb.inc
            CREATE TABLE t1(id INT PRIMARY KEY, a INT, UNIQUE KEY u(a)) ENGINE=InnoDB;
            INSERT INTO t1 VALUES(1,1),(2,9999),(3,10000),(4,4);
            DELETE FROM t1 WHERE a = 9999;
            DROP TABLE t1;
            

            Observed by a breakpoint on lock_rec_lock(), the locking ha_innobase::index_read() will acquire LOCK_X | LOCK_REC_NOT_GAP on the secondary index record (9999,2) and the primary key record (2). Then, as part of the delete-mark operation in ha_innobase::delete_row(), LOCK_X | LOCK_REC_NOT_GAP will be reacquired on the secondary index record (9999,2).

            I also tested with a multi-column unique secondary key (with b CHAR(1)), and it made no difference.

            I also tested with MariaDB 10.1 3b7da8a44c8a0ff4b40b37e4db01f7e397aefab5, and it acquired the same locks (reacquiring the lock on the secondary index record).

            I tested one more variant, with a non-unique secondary index:

            --source include/have_innodb.inc
            CREATE TABLE t1(id INT PRIMARY KEY, a INT, b CHAR(1), KEY u(a,b)) ENGINE=InnoDB;
            INSERT INTO t1 VALUES(1,1,'a'),(2,9999,'b'),(3,10000,'c'),(4,4,'d');
            DELETE FROM t1 WHERE a = 9999 AND b='b';
            DROP TABLE t1;
            

            This will lock both the record and the preceding gap (LOCK_ORDINARY) as follows:

            1. (9999,'b',2) (LOCK_X | LOCK_ORDINARY)
            2. (2) (LOCK_X | LOCK_REC_NOT_GAP)
            3. (9999,'b',2) (LOCK_X | LOCK_REC_NOT_GAP) (covered by the first one)
            4. (10000,'c',3) (LOCK_X | LOCK_GAP)

            This looks reasonable to me. The gap-only lock on (10000,'c',3) should not conflict with other deletes. I have the feeling that the gap-lock is like a shared lock, but combined with LOCK_INSERT_INTENTION it becomes an exclusive lock for that gap. To simplify the management of record locks, MDEV-16406 could use a single bitmap of record locks per page, using multiple bits per record.

            The only potential for deadlock that I can see here is that one of the DELETE operations chooses a different query plan, performing the locking read on the PRIMARY KEY first, and then doing the ha_innobase::delete_row() first on the PRIMARY KEY and then on the secondary index. In that way, one DELETE would hold a lock on the secondary index record and the other on the PRIMARY KEY record, and the deadlock would be detected when each of them try to acquire the locks that the other one is holding. I can imagine that as records are deleted, query plans could change.

            marko Marko Mäkelä added a comment - - edited Using the following test, I checked which locks are being acquired in a recent MariaDB Server 10.3 b50685af82508ca1cc83e1743dff527770e6e64b: --source include/have_innodb.inc CREATE TABLE t1(id INT PRIMARY KEY , a INT , UNIQUE KEY u(a)) ENGINE=InnoDB; INSERT INTO t1 VALUES (1,1),(2,9999),(3,10000),(4,4); DELETE FROM t1 WHERE a = 9999; DROP TABLE t1; Observed by a breakpoint on lock_rec_lock() , the locking ha_innobase::index_read() will acquire LOCK_X | LOCK_REC_NOT_GAP on the secondary index record (9999,2) and the primary key record (2). Then, as part of the delete-mark operation in ha_innobase::delete_row() , LOCK_X | LOCK_REC_NOT_GAP will be reacquired on the secondary index record (9999,2). I also tested with a multi-column unique secondary key (with b CHAR(1) ), and it made no difference. I also tested with MariaDB 10.1 3b7da8a44c8a0ff4b40b37e4db01f7e397aefab5, and it acquired the same locks (reacquiring the lock on the secondary index record). I tested one more variant, with a non-unique secondary index: --source include/have_innodb.inc CREATE TABLE t1(id INT PRIMARY KEY , a INT , b CHAR (1), KEY u(a,b)) ENGINE=InnoDB; INSERT INTO t1 VALUES (1,1, 'a' ),(2,9999, 'b' ),(3,10000, 'c' ),(4,4, 'd' ); DELETE FROM t1 WHERE a = 9999 AND b= 'b' ; DROP TABLE t1; This will lock both the record and the preceding gap ( LOCK_ORDINARY ) as follows: (9999,'b',2) ( LOCK_X | LOCK_ORDINARY ) (2) ( LOCK_X | LOCK_REC_NOT_GAP ) (9999,'b',2) ( LOCK_X | LOCK_REC_NOT_GAP ) (covered by the first one) (10000,'c',3) ( LOCK_X | LOCK_GAP ) This looks reasonable to me. The gap-only lock on (10000,'c',3) should not conflict with other deletes. I have the feeling that the gap-lock is like a shared lock, but combined with LOCK_INSERT_INTENTION it becomes an exclusive lock for that gap. To simplify the management of record locks, MDEV-16406 could use a single bitmap of record locks per page, using multiple bits per record. The only potential for deadlock that I can see here is that one of the DELETE operations chooses a different query plan, performing the locking read on the PRIMARY KEY first, and then doing the ha_innobase::delete_row() first on the PRIMARY KEY and then on the secondary index. In that way, one DELETE would hold a lock on the secondary index record and the other on the PRIMARY KEY record, and the deadlock would be detected when each of them try to acquire the locks that the other one is holding. I can imagine that as records are deleted, query plans could change.

            MDEV-18706 documents another scenario where a deadlock seems to be unnecessarily reported for a gap lock.

            While reviewing that scenario, I came to the conclusion that InnoDB supports two modes of gap locks on records. In both cases, the gap covers the range that is between the preceding record and the anchor record of the gap (excluding the anchor record itself).

            A comment in lock_rec_has_to_wait() suggests that insert intention locks never conflicted with each other. The comment was originally added in MySQL 4.0.3 and later modified in MySQL 4.0.5.

            So, insert intention locks gap do not conflict with each other, but they do conflict with gap locks that are set by readers. There appear to be no exclusive gap locks, other than the mutual exclusion between the two groups of gap locks.

            marko Marko Mäkelä added a comment - MDEV-18706 documents another scenario where a deadlock seems to be unnecessarily reported for a gap lock. While reviewing that scenario, I came to the conclusion that InnoDB supports two modes of gap locks on records. In both cases, the gap covers the range that is between the preceding record and the anchor record of the gap (excluding the anchor record itself). A comment in lock_rec_has_to_wait() suggests that insert intention locks never conflicted with each other. The comment was originally added in MySQL 4.0.3 and later modified in MySQL 4.0.5 . So, insert intention locks gap do not conflict with each other, but they do conflict with gap locks that are set by readers. There appear to be no exclusive gap locks, other than the mutual exclusion between the two groups of gap locks.

            Can you please rebase the fix to a recent main branch? I’d review it after a successful CI run. I think that it should also be extensively tested with RQG. For that purpose, it would be nice to get a version for the most recent main branch as well. Locking was changed somewhat by the trx_sys refactoring in 10.3.

            marko Marko Mäkelä added a comment - Can you please rebase the fix to a recent main branch? I’d review it after a successful CI run. I think that it should also be extensively tested with RQG. For that purpose, it would be nice to get a version for the most recent main branch as well. Locking was changed somewhat by the trx_sys refactoring in 10.3.

            bb-10.2-midenok

            midenok Aleksey Midenkov added a comment - bb-10.2-midenok

            I updated bb-10.2-midenok-innodb today with the latest 10.2. The old results were no longer available in the grid view, but in the cross-reference I can see that the test galera.MW-328D was failing also for earlier versions of the branch. That test last failed in any other 10.2-based branch in 2018. So, it looks like this (or MDEV-18706, which was present in the same branch) will need some more work.

            marko Marko Mäkelä added a comment - I updated bb-10.2-midenok-innodb today with the latest 10.2. The old results were no longer available in the grid view, but in the cross-reference I can see that the test galera.MW-328D was failing also for earlier versions of the branch. That test last failed in any other 10.2-based branch in 2018. So, it looks like this (or MDEV-18706 , which was present in the same branch) will need some more work.

            I wonder if the fix of MDEV-27025 would address this scenario.

            marko Marko Mäkelä added a comment - I wonder if the fix of MDEV-27025 would address this scenario.

            MySQL patch is available:

            commit 16d84704097d5ce086eac0a3a1f2dbca0e6fa80c Author: Jakub Łopuszański
            Date: Tue Jun 11 12:36:53 2019 +0200

            Bug #23755664 DEADLOCK WITH 3 CONCURRENT DELETES BY UNIQUE KEY

            PROBLEM: A deadlock was possible
            when a transaction tried to "upgrade" an already held Record Lock to Next
            Key Lock.

            SOLUTION: This patch is based on observations that: (1) a Next
            Key Lock is equivalent to Record Lock combined with Gap Lock (2) a GAP
            Lock never has to wait for any other lock In case we request a Next Key
            Lock, we check if we already own a Record Lock of equal or stronger mode,
            and if so, then we either upgrade it to Next Key Lock, or if it is not
            possible (because the single lock_t struct is shared by more than one row)
            we change the requested lock type to GAP Lock, which we either already
            have, or can be granted immediately. (I don't consider Insert Intention
            Locks a Gap Lock in above statements). Reviewed-by: Debarun Banerjee
            RB:19879

            midenok Aleksey Midenkov added a comment - MySQL patch is available: commit 16d84704097d5ce086eac0a3a1f2dbca0e6fa80c Author: Jakub Łopuszański Date: Tue Jun 11 12:36:53 2019 +0200 Bug #23755664 DEADLOCK WITH 3 CONCURRENT DELETES BY UNIQUE KEY PROBLEM: A deadlock was possible when a transaction tried to "upgrade" an already held Record Lock to Next Key Lock. SOLUTION: This patch is based on observations that: (1) a Next Key Lock is equivalent to Record Lock combined with Gap Lock (2) a GAP Lock never has to wait for any other lock In case we request a Next Key Lock, we check if we already own a Record Lock of equal or stronger mode, and if so, then we either upgrade it to Next Key Lock, or if it is not possible (because the single lock_t struct is shared by more than one row) we change the requested lock type to GAP Lock, which we either already have, or can be granted immediately. (I don't consider Insert Intention Locks a Gap Lock in above statements). Reviewed-by: Debarun Banerjee RB:19879

            It looks very similar to MDEV-27025 and MDEV-27992.

            vlad.lesin Vladislav Lesin added a comment - It looks very similar to MDEV-27025 and MDEV-27992 .
            vlad.lesin Vladislav Lesin added a comment - - edited

            Can't repeat it with the following test:

            --source include/have_innodb.inc
            --source include/have_debug.inc
            --source include/have_debug_sync.inc
             
            --connect(dont_purge, localhost,root,,)
            START TRANSACTION WITH CONSISTENT SNAPSHOT;
             
            --connection default
            # There are various scenarious in which a transaction already holds "half"
            # of a record lock (for example, a lock on the record but not on the gap)
            # and wishes to "upgrade it" to a full lock (i.e. on both gap and record).
            # This is often a cause for a deadlock, if there is another transaction
            # which is already waiting for the lock being blocked by us:
            # 1. our granted lock for one half
            # 2. her waiting lock for the same half
            # 3. our waiting lock for the whole
             
            #
            # SCENARIO 1
            #
            # In this scenario, three different threads try to delete the same row,
            # identified by a secondary index key.
            # This kind of operation (besides LOCK_IX on a table) requires
            # an LOCK_REC_NOT_GAP|LOCK_REC|LOCK_X lock on a secondary index
            # 1. `deleter` is the first to get the required lock
            # 2. `holder` enqueues a waiting lock
            # 3. `waiter` enqueues right after `holder`
            # 4. `deleter` commits, releasing the lock, and granting it to `holder`
            # 5. `holder` now observes that the row was deleted, so it needs to
            #    "seal the gap", by obtaining a LOCK_X|LOCK_REC, but..
            # 6. this causes a deadlock between `holder` and `waiter`
             
            CREATE TABLE `t`(
              `id` INT,
              `a` INT DEFAULT NULL,
              PRIMARY KEY(`id`),
              UNIQUE KEY `u`(`a`)
            ) ENGINE=InnoDB;
             
            INSERT INTO t (`id`,`a`) VALUES
              (1,1),
              (2,9999),
              (3,10000);
             
            --connect(deleter,localhost,root,,)
            --connect(holder,localhost,root,,)
            --connect(waiter,localhost,root,,)
             
             
            --connection deleter
              SET DEBUG_SYNC =
                'lock_sec_rec_read_check_and_lock_has_locked
                  SIGNAL deleter_has_locked
                  WAIT_FOR waiter_has_locked';
              --send DELETE FROM t WHERE a = 9999
             
            --connection holder
              SET DEBUG_SYNC=
                'now WAIT_FOR deleter_has_locked';
              SET DEBUG_SYNC=
                'lock_sec_rec_read_check_and_lock_has_locked SIGNAL holder_has_locked';
              --send DELETE FROM t WHERE a = 9999
             
            --connection waiter
              SET DEBUG_SYNC=
                'now WAIT_FOR holder_has_locked';
              SET DEBUG_SYNC=
                'lock_sec_rec_read_check_and_lock_has_locked SIGNAL waiter_has_locked';
              --send DELETE FROM t WHERE a = 9999
             
            --connection deleter
              --reap
             
            --connection holder
              --reap
             
            --connection waiter
              --reap
             
            --connection default
             
            --disconnect deleter
            --disconnect holder
            --disconnect waiter
            --disconnect dont_purge
             
             
            DROP TABLE `t`;
            SET DEBUG_SYNC='reset';
            

            and the following sync point:

            diff --git a/storage/innobase/lock/lock0lock.cc b/storage/innobase/lock/lock0lock.cc
            index 26388ad95e2..bc59d824ff2 100644
            --- a/storage/innobase/lock/lock0lock.cc
            +++ b/storage/innobase/lock/lock0lock.cc
            @@ -5763,6 +5763,8 @@ lock_sec_rec_modify_check_and_lock(
                    return(err);
             }
             
            +#include "scope.h"
            +
             /*********************************************************************//**
             Like lock_clust_rec_read_check_and_lock(), but reads a
             secondary index record.
            @@ -5791,6 +5793,10 @@ lock_sec_rec_read_check_and_lock(
                    dberr_t err;
                    ulint   heap_no;
             
            +        SCOPE_EXIT([]() {
            +          DEBUG_SYNC_C("lock_sec_rec_read_check_and_lock_has_locked");
            +        });
            +
                    ut_ad(!dict_index_is_clust(index));
                    ut_ad(!dict_index_is_online_ddl(index));
                    ut_ad(block->frame == page_align(rec));
            
            

            from

            commit bfba840dfa7794b988c59c94658920dbe556075d
            Author: Jakub Łopuszański <jakub.lopuszanski@oracle.com>
            Date:   Tue Jun 11 12:36:53 2019 +0200
             
                Bug #23755664 DEADLOCK WITH 3 CONCURRENT DELETES BY UNIQUE KEY
            

            on the latest 10.4, as well as with the sequence of steps described in https://bugs.mysql.com/bug.php?id=82127.

            Have not understood yet why.

            vlad.lesin Vladislav Lesin added a comment - - edited Can't repeat it with the following test: --source include/have_innodb.inc --source include/have_debug.inc --source include/have_debug_sync.inc   --connect(dont_purge, localhost,root,,) START TRANSACTION WITH CONSISTENT SNAPSHOT;   --connection default # There are various scenarious in which a transaction already holds "half" # of a record lock ( for example, a lock on the record but not on the gap) # and wishes to "upgrade it" to a full lock (i.e. on both gap and record). # This is often a cause for a deadlock, if there is another transaction # which is already waiting for the lock being blocked by us: # 1 . our granted lock for one half # 2 . her waiting lock for the same half # 3 . our waiting lock for the whole   # # SCENARIO 1 # # In this scenario, three different threads try to delete the same row, # identified by a secondary index key. # This kind of operation (besides LOCK_IX on a table) requires # an LOCK_REC_NOT_GAP|LOCK_REC|LOCK_X lock on a secondary index # 1 . `deleter` is the first to get the required lock # 2 . `holder` enqueues a waiting lock # 3 . `waiter` enqueues right after `holder` # 4 . `deleter` commits, releasing the lock, and granting it to `holder` # 5 . `holder` now observes that the row was deleted, so it needs to # "seal the gap" , by obtaining a LOCK_X|LOCK_REC, but.. # 6 . this causes a deadlock between `holder` and `waiter`   CREATE TABLE `t`( `id` INT, `a` INT DEFAULT NULL, PRIMARY KEY(`id`), UNIQUE KEY `u`(`a`) ) ENGINE=InnoDB;   INSERT INTO t (`id`,`a`) VALUES ( 1 , 1 ), ( 2 , 9999 ), ( 3 , 10000 );   --connect(deleter,localhost,root,,) --connect(holder,localhost,root,,) --connect(waiter,localhost,root,,)     --connection deleter SET DEBUG_SYNC = 'lock_sec_rec_read_check_and_lock_has_locked SIGNAL deleter_has_locked WAIT_FOR waiter_has_locked'; --send DELETE FROM t WHERE a = 9999   --connection holder SET DEBUG_SYNC= 'now WAIT_FOR deleter_has_locked' ; SET DEBUG_SYNC= 'lock_sec_rec_read_check_and_lock_has_locked SIGNAL holder_has_locked' ; --send DELETE FROM t WHERE a = 9999   --connection waiter SET DEBUG_SYNC= 'now WAIT_FOR holder_has_locked' ; SET DEBUG_SYNC= 'lock_sec_rec_read_check_and_lock_has_locked SIGNAL waiter_has_locked' ; --send DELETE FROM t WHERE a = 9999   --connection deleter --reap   --connection holder --reap   --connection waiter --reap   --connection default   --disconnect deleter --disconnect holder --disconnect waiter --disconnect dont_purge     DROP TABLE `t`; SET DEBUG_SYNC= 'reset' ; and the following sync point: diff --git a/storage/innobase/lock/lock0lock.cc b/storage/innobase/lock/lock0lock.cc index 26388ad95e2..bc59d824ff2 100644 --- a/storage/innobase/lock/lock0lock.cc +++ b/storage/innobase/lock/lock0lock.cc @@ - 5763 , 6 + 5763 , 8 @@ lock_sec_rec_modify_check_and_lock( return (err); } +#include "scope.h" + /*********************************************************************/ /** Like lock_clust_rec_read_check_and_lock(), but reads a secondary index record. @@ - 5791 , 6 + 5793 , 10 @@ lock_sec_rec_read_check_and_lock( dberr_t err; ulint heap_no; + SCOPE_EXIT([]() { + DEBUG_SYNC_C( "lock_sec_rec_read_check_and_lock_has_locked" ); + }); + ut_ad(!dict_index_is_clust(index)); ut_ad(!dict_index_is_online_ddl(index)); ut_ad(block->frame == page_align(rec)); from commit bfba840dfa7794b988c59c94658920dbe556075d Author: Jakub Łopuszański <jakub.lopuszanski@oracle.com> Date: Tue Jun 11 12:36:53 2019 +0200   Bug #23755664 DEADLOCK WITH 3 CONCURRENT DELETES BY UNIQUE KEY on the latest 10.4, as well as with the sequence of steps described in https://bugs.mysql.com/bug.php?id=82127 . Have not understood yet why.
            vlad.lesin Vladislav Lesin added a comment - - edited

            The cause of why I can't reproduce it on 10.4.28 is MDEV-30225 fix.

            vlad.lesin Vladislav Lesin added a comment - - edited The cause of why I can't reproduce it on 10.4.28 is MDEV-30225 fix.

            MDEV-30225 does not fix the bug, but just hides it. If we take a look the test above, the 'holder' does not "seal the gap" after 'deleter' was committed because it was initially sealed, as after MDEV-30225 fix the 'holder' initially requests next-key lock.

            The following test from bfba840dfa7794b988c59c94658920dbe556075d mysql commit shows the issue:

            # SCENARIO 2
            #
            # Here, we form a situation in which con1 has LOCK_REC_NOT_GAP on rows 1 and 2
            # con2 waits for lock on row 1, and then con1 wants to upgrade the lock on row 1,
            # which might cause a deadlock, unless con1 properly notices that even though the
            # lock on row 1 can not be upgraded, a separate LOCK_GAP can be obtaied easily.
             
            CREATE TABLE `t`(
              `id` INT NOT NULL PRIMARY KEY
            ) ENGINE=InnoDB;
             
            INSERT INTO t (`id`) VALUES (1), (2);
             
            --connect(holder,localhost,root,,)
            --connect(waiter,localhost,root,,)
             
            --connection holder
              BEGIN;
              SELECT id FROM t WHERE id=1 FOR UPDATE;
              SELECT id FROM t WHERE id=2 FOR UPDATE;
             
            --connection waiter
              SET DEBUG_SYNC=
                'lock_wait_suspend_thread_enter SIGNAL waiter_will_wait';
              --send SELECT id FROM t WHERE id = 1 FOR UPDATE
             
            --connection holder
              SET DEBUG_SYNC=
                'now WAIT_FOR waiter_will_wait';
              SELECT * FROM t FOR UPDATE;
              COMMIT;
             
            --connection waiter
              --reap
             
            --connection default
             
            --disconnect holder
            --disconnect waiter
             
            DROP TABLE `t`;
            

            vlad.lesin Vladislav Lesin added a comment - MDEV-30225 does not fix the bug, but just hides it. If we take a look the test above, the 'holder' does not "seal the gap" after 'deleter' was committed because it was initially sealed, as after MDEV-30225 fix the 'holder' initially requests next-key lock. The following test from bfba840dfa7794b988c59c94658920dbe556075d mysql commit shows the issue: # SCENARIO 2 # # Here, we form a situation in which con1 has LOCK_REC_NOT_GAP on rows 1 and 2 # con2 waits for lock on row 1 , and then con1 wants to upgrade the lock on row 1 , # which might cause a deadlock, unless con1 properly notices that even though the # lock on row 1 can not be upgraded, a separate LOCK_GAP can be obtaied easily.   CREATE TABLE `t`( `id` INT NOT NULL PRIMARY KEY ) ENGINE=InnoDB;   INSERT INTO t (`id`) VALUES ( 1 ), ( 2 );   --connect(holder,localhost,root,,) --connect(waiter,localhost,root,,)   --connection holder BEGIN; SELECT id FROM t WHERE id= 1 FOR UPDATE; SELECT id FROM t WHERE id= 2 FOR UPDATE;   --connection waiter SET DEBUG_SYNC= 'lock_wait_suspend_thread_enter SIGNAL waiter_will_wait' ; --send SELECT id FROM t WHERE id = 1 FOR UPDATE   --connection holder SET DEBUG_SYNC= 'now WAIT_FOR waiter_will_wait' ; SELECT * FROM t FOR UPDATE; COMMIT;   --connection waiter --reap   --connection default   --disconnect holder --disconnect waiter   DROP TABLE `t`;

            This commit suffers from MDEV-27992, see this comment for details.

            vlad.lesin Vladislav Lesin added a comment - This commit suffers from MDEV-27992 , see this comment for details.

            Thank you, this looks good to me.

            marko Marko Mäkelä added a comment - Thank you, this looks good to me.

            People

              vlad.lesin Vladislav Lesin
              valerii Valerii Kravchuk
              Votes:
              0 Vote for this issue
              Watchers:
              12 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.