Details
-
Bug
-
Status: Open (View Workflow)
-
Major
-
Resolution: Unresolved
-
10.2.18, 10.3.10
-
Windows 7 Pro x64, Ubuntu Xenial
Description
This is a surprisingly simple scenario leading to a (to me) surprising deadlock. Since I'm not an expert on InnoDB I have already tried other ways to find out whether it is a bug (see https://dba.stackexchange.com/questions/216450/is-this-surprising-deadlock-reasonable-or-a-mariadb-bug), but unfortunately didn't get any helpful responses, so I'm filing it now.
Here is the setup. The isolation level is left at its default (REPEATABLE_READ):
create database deadlock;
use deadlock;
create table foo(id int primary key) engine=InnoDB;
insert into foo values (10);
Now we can start. Let's open two connections A and B to the database.
-- Connection A
begin;
select * from foo for update;
-- Connection B
begin;
select * from foo for update;
At this point, the select issued by B is waiting. This is as expected because A is holding the relevant locks. Note that the StackExchange question used "delete from foo;" instead in both transactions - it makes no difference to the result, but a select seems easier to analyze.
-- Connection A
insert into foo values (9);
And at this point, mariadb kills the waiting query of B due to a deadlock. This is very surprising to me, because it would mean that B has already managed to acquire some locks.
As an additional point of information, if you use the id 11 instead of 9 in the last statement, there is no deadlock.
This test is a simplification of a production scenario where this deadlock kept occurring - there was a transaction that first deleted "invalid" records before inserting a new one. Whether a record was "invalid" was determined in a way that did not use an index, so the entire table had to be scanned. I did not understand how the deadlock could occur, since my mental model was that the delete would essentially get an exclusive lock on everything relevant in the table, so a second transaction running the same queries would have to wait before being able to acquire any locks, and thus could not possibly contribute to a deadlock.
Attachments
Issue Links
- relates to
-
MDEV-16406 Refactor the InnoDB record locks
-
- Open
-
-
MDEV-24813 Locking full table scan fails to use table-level locking
-
- In Review
-
Here is the last deadlock info for the test case that you described. It is basically the same as the one you attached, but the attachment is for DELETE, so I'll get another one, to avoid confusion.
------------------------
LATEST DETECTED DEADLOCK
------------------------
2018-12-29 15:53:36 0x7ffb08d3c700
*** (1) TRANSACTION:
TRANSACTION 58, ACTIVE 15 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1160, 1 row lock(s)
MySQL thread id 9, OS thread handle 140716161918720, query id 54 localhost ::1 root Sending data
select * from foo for update
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 6 page no 3 n bits 72 index PRIMARY of table `deadlock`.`foo` trx id 58 lock_mode X waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
0: len 4; hex 8000000a; asc ;;
1: len 6; hex 000000000000; asc ;;
2: len 7; hex 80000000000000; asc ;;
*** (2) TRANSACTION:
TRANSACTION 57, ACTIVE 19 sec inserting
mysql tables in use 1, locked 1
3 lock struct(s), heap size 1160, 3 row lock(s)
MySQL thread id 10, OS thread handle 140716161615616, query id 57 localhost ::1 root Update
insert into foo values (9)
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 6 page no 3 n bits 72 index PRIMARY of table `deadlock`.`foo` trx id 57 lock_mode X
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
0: len 8; hex 73757072656d756d; asc supremum;;
Record lock, heap no 2 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
0: len 4; hex 8000000a; asc ;;
1: len 6; hex 000000000000; asc ;;
2: len 7; hex 80000000000000; asc ;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 6 page no 3 n bits 72 index PRIMARY of table `deadlock`.`foo` trx id 57 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
0: len 4; hex 8000000a; asc ;;
1: len 6; hex 000000000000; asc ;;
2: len 7; hex 80000000000000; asc ;;
*** WE ROLL BACK TRANSACTION (1)
Here Connection A is TRANSACTION(2), and Connection B is TRANSACTION(1)
So, to my understanding, in your case Connection A, by running SELECT .. FOR UPDATE, acquired an exclusive lock on the index record, and also a shared gap lock.
Connection B, by starting SELECT .. FOR UPDATE, also acquired a shared gap lock (nothing prevents it from doing so), and requested an exclusive lock on the index record, but couldn't get the record lock, hence waiting.
Then, Connection A, by starting INSERT, requested an exclusive gap lock, but couldn't get it, because Connection B is holding the shared one:
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 32882 page no 3 n bits 72 index PRIMARY of table `deadlock`.`foo` trx id 481731 lock_mode X locks gap before rec i
Hence the deadlock.
It's a bit inconvenient that there is no clear indication of gap lock holding in the LAST DEADLOCK info (or in TRANSACTIONS info, for that matter), so I'll assign it to marko to confirm that's what is happening here, and if so, clarify if it's possible to see acquired gap locks in innodb status (apart from counting rows and seeing that there is one too many in "row locks").