Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-23560

Deadlock detected on SELECT when only one record being processed

Details

    • Bug
    • Status: Open (View Workflow)
    • Major
    • Resolution: Unresolved
    • 10.4.14
    • None
    • None
    • None
    • Vanilla Docker image of MariaDb

    Description

      I have a complex situation that I have boiled down to a relatively simple testcase. I have been unable to use any MySQL Unit Test infrastructure as the issue involves having 3 simultaneous clients. I am attaching a JDBC program which can be used as described below to demonstrate the problem. I have confirmed this is not a JDBC issue as I have an equivalent ODBC program that exhibits the same behaviour.

      I have one client session that has successfully done a 'SELECT ... FOR UPDATE' on a record (with autocommit off), and I have two other sessions (with autocommit off) that are doing the same select on the same key. These two other sessions are blocked, waiting for a lock. The session that has the lock, then does a delete of the selected record (in JDBC, it does a resultSet.deleteRow(), in ODBC it does a SQLSetPos(..., SQL_DELETE)). That is fine. Finally, that session does a commit. At this point the two blocked sessions come to life. I am expecting both of the 'SELECT' statements to return successfully with 'no rows found'. The bug is that instead of both doing that, one of them returns successfully with 'no rows found', but the other one produces a 'Deadlock found when trying to get lock; try restarting transaction'.
      The session that produces the 'deadlock' error is always the second item in the waiter queue.
      [Please see at the bottom for more specifics about when the problem occurs and when it doesn't]

      Setup:
      ====
      I have a simple table defined as

      MariaDB [clouddb]> describe BUG_DEMO;
      +-------+---------+------+-----+---------+-------+
      | Field | Type    | Null | Key | Default | Extra |
      +-------+---------+------+-----+---------+-------+
      | C1    | int(11) | NO   | PRI | NULL    |       |
      | C2    | int(11) | NO   | UNI | NULL    |       |
      +-------+---------+------+-----+---------+-------+
      2 rows in set (0.001 sec)
      

      created using 'CREATE TABLE BUG_DEMO (C1 INTEGER NOT NULL PRIMARY KEY, C2 INTEGER NOT NULL UNIQUE KEY)'

      I then insert a single row into this table using for example "INSERT INTO BUG_DEMO(C1,C2) VALUES(12345, 12345)"

      (These two operations can be done with my attached program by running 'make create' and 'make insert'.

      Reproduce:
      =======
      Create 3 client Unix terminal windows. In each of them type 'make delete' and press enter. The 3 windows each connect to the database and await user action. In all three windows, now press enter. This issues the 'SELECT' in all 3 sessions. The first will execute immediately, and the other two will block. The first of the windows now prompts for user action; and you should press enter. This then does a delete and does another prompt. Notice the other windows remain unchanged. Then pressing enter on the active window causes a commit() to occur. At this point the two 'block'ed windows come to like and exhibit the undesired behaviour.

      Problem Specifics:
      The code in the given reproduces the problem. Some slight variations do change things.
      a) The datatypes of the two columns seems to have no effect on the issue.
      b) Having more columns does not seem to affect the issue
      c) If I remove the 'UNIQUE KEY' from C2, the problem does not occur.
      d) If the select is done on C1 (the primary key) instead of C2, the problem does not occur.
      So, it seems to be specific to when a SELECT is done on a UNIQUE KEYed column that is NOT the primary key.

      Attachments

        Issue Links

          Activity

            Where is the client program? Could you repeat this with something like

            yes 'SELECT * FROM BUG_DEMO …;'|mysql -uroot test &
            yes 'INSERT …;'|mysql -uroot test &
            

            If you are using InnoDB, then this might be working as designed. InnoDB does not have a concept of ‘table row locks’, but instead it has ‘index record’ locks. If the SELECT is accessing the secondary UNIQUE index first, it can easily deadlock with an INSERT that would first lock (and modify) the primary key index before locking the secondary indexes, one by one. This report could actually be a duplicate of MDEV-10962. As you may know, both DELETE and UPDATE internally invoke a locking read first, and such deadlocks are possible when using a secondary index in the locking read.

            marko Marko Mäkelä added a comment - Where is the client program? Could you repeat this with something like yes 'SELECT * FROM BUG_DEMO …;'|mysql -uroot test & yes 'INSERT …;'|mysql -uroot test & If you are using InnoDB, then this might be working as designed. InnoDB does not have a concept of ‘table row locks’, but instead it has ‘index record’ locks. If the SELECT is accessing the secondary UNIQUE index first, it can easily deadlock with an INSERT that would first lock (and modify) the primary key index before locking the secondary indexes, one by one. This report could actually be a duplicate of MDEV-10962 . As you may know, both DELETE and UPDATE internally invoke a locking read first, and such deadlocks are possible when using a secondary index in the locking read.

            Sorry for the delay in putting my test program on - my network died. .. it is there now.

            stevem Steve Millington added a comment - Sorry for the delay in putting my test program on - my network died. .. it is there now.

            Marko,

            Thanks for getting back.

            I think that 'If the SELECT is accessing the secondary UNIQUE index first, it can easily deadlock with an INSERT that would first lock (and modify) the primary key index before locking the secondary indexes, one by one.' is not the situation I am in. My situation is purely with SELECT ... / DELETE. The only INSERT is prior to the issue setting up the database.

            This does look a bit similar to 10962, but it is really hard for me to tell.

            stevem Steve Millington added a comment - Marko, Thanks for getting back. I think that 'If the SELECT is accessing the secondary UNIQUE index first, it can easily deadlock with an INSERT that would first lock (and modify) the primary key index before locking the secondary indexes, one by one.' is not the situation I am in. My situation is purely with SELECT ... / DELETE. The only INSERT is prior to the issue setting up the database. This does look a bit similar to 10962, but it is really hard for me to tell.

            It is intuitively odd that the exact same select issues by two clients would produce deadlocks. If any locks were taken, one would expect them to be taken in the same order and therefore no deadlock to occur! However. I am aware that my intuition is not necessarily what would occur!

            stevem Steve Millington added a comment - It is intuitively odd that the exact same select issues by two clients would produce deadlocks. If any locks were taken, one would expect them to be taken in the same order and therefore no deadlock to occur! However. I am aware that my intuition is not necessarily what would occur!
            stevem Steve Millington added a comment - - edited

            Here is the output of innodb status

            MariaDB [(none)]> show engine innodb status\G
            *************************** 1. row ***************************
              Type: InnoDB
              Name: 
            Status: 
            =====================================
            2020-08-24 15:53:29 0x7fab642ee700 INNODB MONITOR OUTPUT
            =====================================
            Per second averages calculated from the last 23 seconds
            -----------------
            BACKGROUND THREAD
            -----------------
            srv_master_thread loops: 1 srv_active, 0 srv_shutdown, 7975 srv_idle
            srv_master_thread log flush and writes: 7975
            ----------
            SEMAPHORES
            ----------
            OS WAIT ARRAY INFO: reservation count 52
            OS WAIT ARRAY INFO: signal count 35
            RW-shared spins 55, rounds 1336, OS waits 44
            RW-excl spins 0, rounds 0, OS waits 0
            RW-sx spins 0, rounds 0, OS waits 0
            Spin rounds per wait: 24.29 RW-shared, 0.00 RW-excl, 0.00 RW-sx
            ------------------------
            LATEST DETECTED DEADLOCK
            ------------------------
            2020-08-24 14:47:47 0x7fab64258700
            *** (1) TRANSACTION:
            TRANSACTION 69, ACTIVE 4 sec starting index read
            mysql tables in use 1, locked 1
            LOCK WAIT 2 lock struct(s), heap size 1128, 1 row lock(s)
            MySQL thread id 30, OS thread handle 140374096008960, query id 112 192.168.7.158 vsamdso Statistics
            SELECT C1,C2 FROM BUG_DEMO WHERE C2 = 12345 FOR UPDATE WAIT 30
            *** (1) WAITING FOR THIS LOCK TO BE GRANTED:
            RECORD LOCKS space id 6 page no 4 n bits 72 index C2 of table `clouddb`.`BUG_DEMO` trx id 69 lock_mode X locks rec but not gap waiting
            Record lock, heap no 2 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
             0: len 4; hex 80003039; asc   09;;
             1: len 4; hex 80003039; asc   09;;
             
            *** (2) TRANSACTION:
            TRANSACTION 68, ACTIVE 5 sec starting index read
            mysql tables in use 1, locked 1
            3 lock struct(s), heap size 1128, 2 row lock(s)
            MySQL thread id 28, OS thread handle 140374096316160, query id 111 192.168.7.158 vsamdso Statistics
            SELECT C1,C2 FROM BUG_DEMO WHERE C2 = 12345 FOR UPDATE WAIT 30
            *** (2) HOLDS THE LOCK(S):
            RECORD LOCKS space id 6 page no 4 n bits 72 index C2 of table `clouddb`.`BUG_DEMO` trx id 68 lock_mode X locks rec but not gap
            Record lock, heap no 2 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
             0: len 4; hex 80003039; asc   09;;
             1: len 4; hex 80003039; asc   09;;
             
            *** (2) WAITING FOR THIS LOCK TO BE GRANTED:
            RECORD LOCKS space id 6 page no 4 n bits 72 index C2 of table `clouddb`.`BUG_DEMO` trx id 68 lock_mode X waiting
            Record lock, heap no 2 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
             0: len 4; hex 80003039; asc   09;;
             1: len 4; hex 80003039; asc   09;;
             
            *** WE ROLL BACK TRANSACTION (1)
            ------------
            TRANSACTIONS
            ------------
            Trx id counter 75
            Purge done for trx's n:o < 75 undo n:o < 0 state: running
            History list length 31
            LIST OF TRANSACTIONS FOR EACH SESSION:
            ---TRANSACTION 421849074450776, not started
            0 lock struct(s), heap size 1128, 0 row lock(s)
            ---TRANSACTION 421849074446520, not started
            0 lock struct(s), heap size 1128, 0 row lock(s)
            --------
            FILE I/O
            --------
            I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
            I/O thread 1 state: waiting for completed aio requests (log thread)
            I/O thread 2 state: waiting for completed aio requests (read thread)
            I/O thread 3 state: waiting for completed aio requests (read thread)
            I/O thread 4 state: waiting for completed aio requests (read thread)
            I/O thread 5 state: waiting for completed aio requests (read thread)
            I/O thread 6 state: waiting for completed aio requests (write thread)
            I/O thread 7 state: waiting for completed aio requests (write thread)
            I/O thread 8 state: waiting for completed aio requests (write thread)
            I/O thread 9 state: waiting for completed aio requests (write thread)
            Pending normal aio reads: [0, 0, 0, 0] , aio writes: [0, 0, 0, 0] ,
             ibuf aio reads:, log i/o's:, sync i/o's:
            Pending flushes (fsync) log: 0; buffer pool: 0
            192 OS file reads, 350 OS file writes, 131 OS fsyncs
            0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
            -------------------------------------
            INSERT BUFFER AND ADAPTIVE HASH INDEX
            -------------------------------------
            Ibuf: size 1, free list len 0, seg size 2, 0 merges
            merged operations:
             insert 0, delete mark 0, delete 0
            discarded operations:
             insert 0, delete mark 0, delete 0
            Hash table size 69257, node heap has 0 buffer(s)
            Hash table size 69257, node heap has 0 buffer(s)
            Hash table size 69257, node heap has 0 buffer(s)
            Hash table size 69257, node heap has 0 buffer(s)
            Hash table size 69257, node heap has 0 buffer(s)
            Hash table size 69257, node heap has 0 buffer(s)
            Hash table size 69257, node heap has 0 buffer(s)
            Hash table size 69257, node heap has 0 buffer(s)
            0.00 hash searches/s, 0.00 non-hash searches/s
            ---
            LOG
            ---
            Log sequence number 86093
            Log flushed up to   86093
            Pages flushed up to 86093
            Last checkpoint at  86084
            0 pending log flushes, 0 pending chkp writes
            87 log i/o's done, 0.00 log i/o's/second
            ----------------------
            BUFFER POOL AND MEMORY
            ----------------------
            Total large memory allocated 285212672
            Dictionary memory allocated 32968
            Buffer pool size   16046
            Free buffers       15711
            Database pages     335
            Old database pages 0
            Modified db pages  0
            Percent of dirty pages(LRU & free pages): 0.000
            Max dirty pages percent: 75.000
            Pending reads 0
            Pending writes: LRU 0, flush list 0, single page 0
            Pages made young 0, not young 0
            0.00 youngs/s, 0.00 non-youngs/s
            Pages read 176, created 159, written 256
            0.00 reads/s, 0.00 creates/s, 0.00 writes/s
            No buffer pool page gets since the last printout
            Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
            LRU len: 335, unzip_LRU len: 0
            I/O sum[0]:cur[0], unzip sum[0]:cur[0]
            --------------
            ROW OPERATIONS
            --------------
            0 queries inside InnoDB, 0 queries in queue
            0 read views open inside InnoDB
            Process ID=1, Main thread ID=140373336721152, state: sleeping
            Number of rows inserted 4, updated 0, deleted 3, read 6
            0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
            Number of system rows inserted 0, updated 0, deleted 0, read 0
            0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
            ----------------------------
            END OF INNODB MONITOR OUTPUT
            ============================
             
            1 row in set (0.001 sec)
            

            stevem Steve Millington added a comment - - edited Here is the output of innodb status MariaDB [(none)]> show engine innodb status\G *************************** 1. row *************************** Type: InnoDB Name : Status: ===================================== 2020-08-24 15:53:29 0x7fab642ee700 INNODB MONITOR OUTPUT ===================================== Per second averages calculated from the last 23 seconds ----------------- BACKGROUND THREAD ----------------- srv_master_thread loops: 1 srv_active, 0 srv_shutdown, 7975 srv_idle srv_master_thread log flush and writes: 7975 ---------- SEMAPHORES ---------- OS WAIT ARRAY INFO: reservation count 52 OS WAIT ARRAY INFO: signal count 35 RW-shared spins 55, rounds 1336, OS waits 44 RW-excl spins 0, rounds 0, OS waits 0 RW-sx spins 0, rounds 0, OS waits 0 Spin rounds per wait: 24.29 RW-shared, 0.00 RW-excl, 0.00 RW-sx ------------------------ LATEST DETECTED DEADLOCK ------------------------ 2020-08-24 14:47:47 0x7fab64258700 *** (1) TRANSACTION : TRANSACTION 69, ACTIVE 4 sec starting index read mysql tables in use 1, locked 1 LOCK WAIT 2 lock struct(s), heap size 1128, 1 row lock(s) MySQL thread id 30, OS thread handle 140374096008960, query id 112 192.168.7.158 vsamdso Statistics SELECT C1,C2 FROM BUG_DEMO WHERE C2 = 12345 FOR UPDATE WAIT 30 *** (1) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 6 page no 4 n bits 72 index C2 of table `clouddb`.`BUG_DEMO` trx id 69 lock_mode X locks rec but not gap waiting Record lock, heap no 2 PHYSICAL RECORD: n_fields 2; compact format; info bits 32 0: len 4; hex 80003039; asc 09;; 1: len 4; hex 80003039; asc 09;;   *** (2) TRANSACTION : TRANSACTION 68, ACTIVE 5 sec starting index read mysql tables in use 1, locked 1 3 lock struct(s), heap size 1128, 2 row lock(s) MySQL thread id 28, OS thread handle 140374096316160, query id 111 192.168.7.158 vsamdso Statistics SELECT C1,C2 FROM BUG_DEMO WHERE C2 = 12345 FOR UPDATE WAIT 30 *** (2) HOLDS THE LOCK(S): RECORD LOCKS space id 6 page no 4 n bits 72 index C2 of table `clouddb`.`BUG_DEMO` trx id 68 lock_mode X locks rec but not gap Record lock, heap no 2 PHYSICAL RECORD: n_fields 2; compact format; info bits 32 0: len 4; hex 80003039; asc 09;; 1: len 4; hex 80003039; asc 09;;   *** (2) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 6 page no 4 n bits 72 index C2 of table `clouddb`.`BUG_DEMO` trx id 68 lock_mode X waiting Record lock, heap no 2 PHYSICAL RECORD: n_fields 2; compact format; info bits 32 0: len 4; hex 80003039; asc 09;; 1: len 4; hex 80003039; asc 09;;   *** WE ROLL BACK TRANSACTION (1) ------------ TRANSACTIONS ------------ Trx id counter 75 Purge done for trx 's n:o < 75 undo n:o < 0 state: running History list length 31 LIST OF TRANSACTIONS FOR EACH SESSION: ---TRANSACTION 421849074450776, not started 0 lock struct(s), heap size 1128, 0 row lock(s) ---TRANSACTION 421849074446520, not started 0 lock struct(s), heap size 1128, 0 row lock(s) -------- FILE I/O -------- I/O thread 0 state: waiting for completed aio requests (insert buffer thread) I/O thread 1 state: waiting for completed aio requests (log thread) I/O thread 2 state: waiting for completed aio requests (read thread) I/O thread 3 state: waiting for completed aio requests (read thread) I/O thread 4 state: waiting for completed aio requests (read thread) I/O thread 5 state: waiting for completed aio requests (read thread) I/O thread 6 state: waiting for completed aio requests (write thread) I/O thread 7 state: waiting for completed aio requests (write thread) I/O thread 8 state: waiting for completed aio requests (write thread) I/O thread 9 state: waiting for completed aio requests (write thread) Pending normal aio reads: [0, 0, 0, 0] , aio writes: [0, 0, 0, 0] , ibuf aio reads:, log i/o' s:, sync i/o 's: Pending flushes (fsync) log: 0; buffer pool: 0 192 OS file reads, 350 OS file writes, 131 OS fsyncs 0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s ------------------------------------- INSERT BUFFER AND ADAPTIVE HASH INDEX ------------------------------------- Ibuf: size 1, free list len 0, seg size 2, 0 merges merged operations: insert 0, delete mark 0, delete 0 discarded operations: insert 0, delete mark 0, delete 0 Hash table size 69257, node heap has 0 buffer(s) Hash table size 69257, node heap has 0 buffer(s) Hash table size 69257, node heap has 0 buffer(s) Hash table size 69257, node heap has 0 buffer(s) Hash table size 69257, node heap has 0 buffer(s) Hash table size 69257, node heap has 0 buffer(s) Hash table size 69257, node heap has 0 buffer(s) Hash table size 69257, node heap has 0 buffer(s) 0.00 hash searches/s, 0.00 non-hash searches/s --- LOG --- Log sequence number 86093 Log flushed up to 86093 Pages flushed up to 86093 Last checkpoint at 86084 0 pending log flushes, 0 pending chkp writes 87 log i/o' s done, 0.00 log i/o's/ second ---------------------- BUFFER POOL AND MEMORY ---------------------- Total large memory allocated 285212672 Dictionary memory allocated 32968 Buffer pool size 16046 Free buffers 15711 Database pages 335 Old database pages 0 Modified db pages 0 Percent of dirty pages(LRU & free pages): 0.000 Max dirty pages percent: 75.000 Pending reads 0 Pending writes: LRU 0, flush list 0, single page 0 Pages made young 0, not young 0 0.00 youngs/s, 0.00 non-youngs/s Pages read 176, created 159, written 256 0.00 reads/s, 0.00 creates/s, 0.00 writes/s No buffer pool page gets since the last printout Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s LRU len: 335, unzip_LRU len: 0 I/O sum [0]:cur[0], unzip sum [0]:cur[0] -------------- ROW OPERATIONS -------------- 0 queries inside InnoDB, 0 queries in queue 0 read views open inside InnoDB Process ID=1, Main thread ID=140373336721152, state: sleeping Number of rows inserted 4, updated 0, deleted 3, read 6 0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s Number of system rows inserted 0, updated 0, deleted 0, read 0 0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s ---------------------------- END OF INNODB MONITOR OUTPUT ============================   1 row in set (0.001 sec)

            I know nothing of this, but to my unknowing eyes, it looks like the deadlock is with itself!

            stevem Steve Millington added a comment - I know nothing of this, but to my unknowing eyes, it looks like the deadlock is with itself!

            People

              Unassigned Unassigned
              stevem Steve Millington
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.