[MDEV-16406] Refactor the InnoDB record locks - Jira

Details

Type: Task
Status: Open (View Workflow)
Priority: Major
Resolution: Unresolved
Fix Version/s: None
Component/s: Storage Engine - InnoDB
Labels:
- lock
- performance

Description

Currently, lock_t is a union of table locks and record lock bitmaps. There can be multiple record lock bitmaps in a transaction. A record lock bitmap is specific to an index leaf page and a locking mode. It could be more efficient to use a single bitmap per page, with 4 bits per record heap number to represent all combinations of record and gap lock modes: {shared,exclusive,gap-read,gap-write}.

There are two types of gap locks (covering the open range of keys from the preceding record to the anchor record). A gap-read lock (acquired by UPDATE, DELETE, or a locking SELECT) prevents any subsequent INSERT to that range.

INSERT will acquire a gap-write or insert-intention lock (LOCK_INSERT_INTENTION). It conflicts with previously granted gap-read locks. Gap-read locks may be granted on the same lock if a gap-write lock has been granted to an active transaction.

An INSERT requires both a gap-write lock and an exclusive lock on the key that is being inserted. Any amount of gap-read locks may be held on the same gap by different transactions.

In lock_rec_has_to_wait(), gap-write locks do not conflict with gap-write locks of other transactions, even though lock_mode_compatible() would not hold.

A transaction can wait for at most one record. ~~MDEV-24671~~ already simplified lock waits by introducing the function lock_wait(). If we pack 2 record heap numbers per byte (4 bits per heap number) in the record lock bitmap, there is no space for a further LOCK_WAIT flag. We might want to create a special type of bitmap for a waiting lock request, to avoid allocating a large bitmap (if the record heap number is 3000, the size of a straightforward bitmap would be 1500 bytes). We might store the attributes of the single-record lock request in the trx_t object. Maybe we could just preallocate a record lock bitmap object in trx_t, to be used for all waiting requests?

Attachments

Issue Links

blocks

MDEV-21452 Use condition variables and normal mutexes instead of InnoDB os_event and mutex

Closed

relates to

MDEV-17512 Deadlock after locking all rows in table

Open

MDEV-20605 Awaken transaction can miss inserted by other transaction records due to wrong persistent cursor restoration

Closed

MDEV-21974 InnoDB DML under backup locks make buffer pool usage grow permanently

Open

MDEV-10962 Deadlock with 3 concurrent DELETEs by unique key

Closed

MDEV-11215 Several locks taken to same record inside a transaction.

Stalled

MDEV-18706 ER_LOCK_DEADLOCK on concurrent read and insert into already locked gap

In Review

MDEV-20612 Improve InnoDB lock_sys scalability

Closed

MDEV-26779 reduce lock_sys.wait_mutex contention by using spinloop construct

Closed

(4 relates to)

Activity

Ascending order - Click to sort in descending order

Marko Mäkelä added a comment - 2018-06-05 12:44 - edited

There are two modes of gap locks, sharable within each mode.
That is, multiple locking reads (possibly as part of DELETE or UPDATE) can acquire a read-gap-lock on the same gap. Also, multiple INSERT can acquire a write-gap-lock (called LOCK_INSERT_INTENTION in the code) on the same gap. But at most one type of gap lock may exist on the same gap at any given time.
The following test demonstrates this:

--source include/have_innodb.inc

CREATE TABLE t1(id INT PRIMARY KEY) ENGINE=InnoDB;

INSERT INTO t1(id) VALUES(1),(2),(3),(4);

BEGIN;

SELECT * FROM t1 WHERE id=0 FOR UPDATE;

connect(con1, localhost, root,,);

SELECT * FROM t1 WHERE id=0 FOR UPDATE;

disconnect con1;

connection default;

DROP TABLE t1;

Both SELECT will acquire a gap lock that precedes the successor of the non-matching record. In this case, the non-matching record would be the page infimum, and the successor would be the record id=1. Any INSERT with a value that is less than this would be prevented.

The example would also work without the INSERT; in that case, the gap lock would be attached to the page supremum, preventing any INSERT into the empty table while either of the two transactions hold the gap lock.

Marko Mäkelä added a comment - 2018-06-05 12:44 - edited There are two modes of gap locks, sharable within each mode. That is, multiple locking reads (possibly as part of DELETE or UPDATE ) can acquire a read-gap-lock on the same gap. Also, multiple INSERT can acquire a write-gap-lock (called LOCK_INSERT_INTENTION in the code) on the same gap. But at most one type of gap lock may exist on the same gap at any given time. The following test demonstrates this: --source include/have_innodb.inc CREATE TABLE t1(id INT PRIMARY KEY ) ENGINE=InnoDB; INSERT INTO t1(id) VALUES (1),(2),(3),(4); BEGIN ; SELECT * FROM t1 WHERE id=0 FOR UPDATE ; connect (con1, localhost, root,,); SELECT * FROM t1 WHERE id=0 FOR UPDATE ; disconnect con1; connection default ; DROP TABLE t1; Both SELECT will acquire a gap lock that precedes the successor of the non-matching record. In this case, the non-matching record would be the page infimum, and the successor would be the record id=1 . Any INSERT with a value that is less than this would be prevented. The example would also work without the INSERT ; in that case, the gap lock would be attached to the page supremum, preventing any INSERT into the empty table while either of the two transactions hold the gap lock.

Marko Mäkelä added a comment - 2018-12-31 09:07 - edited

As noted in MDEV-17512, it could be desirable to have a variation of a locking SELECT that acquires write-gap-locks, so that a later INSERT by the same transaction within the already searched range of keys will be guaranteed to succeed. Currently, FOR UPDATE only guarantees that UPDATE and DELETE will succeed.

Marko Mäkelä added a comment - 2018-12-31 09:07 - edited As noted in MDEV-17512 , it could be desirable to have a variation of a locking SELECT that acquires write-gap-locks, so that a later INSERT by the same transaction within the already searched range of keys will be guaranteed to succeed. Currently, FOR UPDATE only guarantees that UPDATE and DELETE will succeed.

Marko Mäkelä added a comment - 2019-02-23 18:58

In a discussion with monty, arubin and Peter Zaitsev today, the need for having secondary index record locks was questioned. The basic idea is that instead of checking if a lock exists on a secondary index record, we check the lock on the corresponding clustered index record.

We would still seem to need gap locks on secondary index records. A locking read on a secondary index must prevent an INSERT of the same value, for any primary key value:

connection con1;

BEGIN;

# returning any amount of rows, possibly including 0

SELECT COUNT(*) FROM t WHERE indexed_column BETWEEN 42 AND 64 LOCK IN SHARE MODE;

connection con2;

--error ER_LOCK_WAIT_TIMEOUT

INSERT INTO t SET pk=12345, indexed_column=50;

--error ER_LOCK_WAIT_TIMEOUT

UPDATE t SET indexed_column=50;

--error ER_LOCK_WAIT_TIMEOUT

DELETE FROM t WHERE indexed_column BETWEEN 30 AND 70;

connection con1;

# must return the same result!

SELECT COUNT(*) FROM t WHERE indexed_column BETWEEN 42 AND 64 LOCK IN SHARE MODE;

If the SELECT did not acquire gap locks that cover the whole range [42,64], then the INSERT, DELETE or UPDATE would not be blocked.

Note: Even if users do not use SERIALIZABLE isolation level or LOCK IN SHARE MODE or FOR UPDATE, InnoDB will internally acquire shared or exclusive locks in FOREIGN KEY checks or CASCADE or SET operations.

It would seem that we may not really need single-record locks on secondary indexes, but gap locks we do need.

There is a possible reason against relying explicitly on gap locks on secondary indexes. Gap locks are inherently nondeterministic (potentially causing problems with some forms of replication), since they can be attached to a delete-marked purgeable record. That is, purge can widen gaps at any time by removing those "invisible" keys. Because of this, it might be better to prefer single-record locks over gap locks in those cases where we want to lock a single secondary index key.

Marko Mäkelä added a comment - 2019-02-23 18:58 In a discussion with monty , arubin and Peter Zaitsev today, the need for having secondary index record locks was questioned. The basic idea is that instead of checking if a lock exists on a secondary index record, we check the lock on the corresponding clustered index record. We would still seem to need gap locks on secondary index records. A locking read on a secondary index must prevent an INSERT of the same value, for any primary key value: connection con1; BEGIN ; # returning any amount of rows , possibly including 0 SELECT COUNT (*) FROM t WHERE indexed_column BETWEEN 42 AND 64 LOCK IN SHARE MODE; connection con2; --error ER_LOCK_WAIT_TIMEOUT INSERT INTO t SET pk=12345, indexed_column=50; --error ER_LOCK_WAIT_TIMEOUT UPDATE t SET indexed_column=50; --error ER_LOCK_WAIT_TIMEOUT DELETE FROM t WHERE indexed_column BETWEEN 30 AND 70; connection con1; # must return the same result! SELECT COUNT (*) FROM t WHERE indexed_column BETWEEN 42 AND 64 LOCK IN SHARE MODE; If the SELECT did not acquire gap locks that cover the whole range [42,64], then the INSERT , DELETE or UPDATE would not be blocked. Note: Even if users do not use SERIALIZABLE isolation level or LOCK IN SHARE MODE or FOR UPDATE , InnoDB will internally acquire shared or exclusive locks in FOREIGN KEY checks or CASCADE or SET operations. It would seem that we may not really need single-record locks on secondary indexes, but gap locks we do need. There is a possible reason against relying explicitly on gap locks on secondary indexes. Gap locks are inherently nondeterministic (potentially causing problems with some forms of replication), since they can be attached to a delete-marked purgeable record. That is, purge can widen gaps at any time by removing those "invisible" keys. Because of this, it might be better to prefer single-record locks over gap locks in those cases where we want to lock a single secondary index key.

Marko Mäkelä added a comment - 2020-02-19 20:11

Gap locks should be deterministic, once we fix ~~MDEV-20605~~.

Marko Mäkelä added a comment - 2020-02-19 20:11 Gap locks should be deterministic, once we fix MDEV-20605 .

MariaDB Server

Refactor the InnoDB record locks

Details

Description

Attachments

Issue Links

Activity

People

Dates

Git Integration