[MDEV-36308] Unreasonable block in repeatable read - Jira

Details

Type: Bug
Status: Open (View Workflow)
Priority: Major
Resolution: Unresolved
Affects Version/s: 11.7.2
Fix Version/s: None
Component/s: Locking
Labels:
None
Environment:
Ubuntu 22.04

Description

When I tested the database repeatable read isolation level, I found a issue.
As shown in the use case below, the delete statement is blocked. But in my opinion there is no lock conflict here and it should not be blocked.

--- I see

--- session 0 to init

SET GLOBAL innodb_lock_wait_timeout = 20;

SET GLOBAL lock_wait_timeout = 20;

DROP TABLE IF EXISTS tuEkZFXW;

CREATE TABLE tuEkZFXW (ID INT, VAL INT, c0 DOUBLE  PRIMARY KEY, c1 DECIMAL(55, 11) , c2 BOOLEAN );

INSERT INTO tuEkZFXW (ID, VAL, c0, c1, c2) VALUES (460, 561, 6537.3885, 7569.5750, 1);

INSERT INTO tuEkZFXW (ID, VAL, c0, c1, c2) VALUES (464, 566, 4044.8689, 8585.5846, 1);

--- session 1

SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;

BEGIN;

--- session 2

SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;

BEGIN;

--- session 1

INSERT INTO tuEkZFXW (ID, VAL, c0, c2) VALUES ( 468, 570, 7105.4030, 0);

--- session 2

DELETE FROM tuEkZFXW WHERE ( tuEkZFXW.c2 > 0 OR tuEkZFXW.c2 < 0) AND (tuEkZFXW.c0 >= 100); ---blocked

--- session 1

SELECT ID, VAL, tuEkZFXW.c0, tuEkZFXW.c1, tuEkZFXW.c2 FROM tuEkZFXW WHERE ( tuEkZFXW.c2 > 0 OR tuEkZFXW.c2 < 0) AND (tuEkZFXW.c0 >= 100) ORDER BY tuEkZFXW.ID;

+------+------+-----------+------------------+------+

| ID   | VAL  | c0        | c1               | c2   |

+------+------+-----------+------------------+------+

|  460 |  561 | 6537.3885 | 7569.57500000000 |    1 |

|  464 |  566 | 4044.8689 | 8585.58460000000 |    1 |

+------+------+-----------+------------------+------+

2 rows in set (0.00 sec)

COMMIT;

--- session 2

COMMIT;

--- expected to see

--- I see in PG

...

--- session 2 (not block)

 DELETE FROM tuEkZFXW WHERE ( tuEkZFXW.c2 > 0 OR tuEkZFXW.c2 < 0) AND (tuEkZFXW.c0 >= 100);

DELETE 3

--- session 1

SELECT ID, VAL, tuEkZFXW.c0, tuEkZFXW.c1, tuEkZFXW.c2 FROM tuEkZFXW WHERE ( tuEkZFXW.c2 > 0 OR tuEkZFXW.c2 < 0) AND (tuEkZFXW.c0 >= 100) ORDER BY tuEkZFXW.ID;

 id  | val |    c0     |    c1    | c2

-----+-----+-----------+----------+----

 460 | 561 | 6537.3887 | 7569.575 |  1

 464 | 566 |  4044.869 | 8585.585 |  1

(2 rows)

...

Attachments

Issue Links

relates to

MDEV-16232 Use fewer mini-transactions

Stalled

MDEV-16402 Support Index Condition Pushdown for clustered PK scans

Confirmed

MDEV-35124 Set innodb_snapshot_isolation=ON by default

Closed

Activity

Ascending order - Click to sort in descending order

Marko Mäkelä added a comment - 5 days ago

I can reproduce the claimed blocking in a recent build of the 10.11 branch. I converted the described scenario to a .test file of the mtr tool:

--source include/have_innodb.inc

SET GLOBAL innodb_lock_wait_timeout = 20, lock_wait_timeout = 20;

CREATE TABLE tuEkZFXW (ID INT, VAL INT, c0 DOUBLE  PRIMARY KEY, c1 DECIMAL(55, 11) , c2 BOOLEAN ) ENGINE=InnoDB;

INSERT INTO tuEkZFXW (ID, VAL, c0, c1, c2)

VALUES (460, 561, 6537.3885, 7569.5750, 1),(464, 566, 4044.8689, 8585.5846, 1);

SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;

BEGIN;

connect(con2,localhost,root);

SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;

BEGIN;

connection default;

INSERT INTO tuEkZFXW (ID, VAL, c0, c2) VALUES ( 468, 570, 7105.4030, 0);

connection con2;

DELETE FROM tuEkZFXW WHERE ( tuEkZFXW.c2 > 0 OR tuEkZFXW.c2 < 0) AND (tuEkZFXW.c0 >= 100);

connection default;

SELECT ID, VAL, tuEkZFXW.c0, tuEkZFXW.c1, tuEkZFXW.c2 FROM tuEkZFXW WHERE ( tuEkZFXW.c2 > 0 OR tuEkZFXW.c2 < 0) AND (tuEkZFXW.c0 >= 100) ORDER BY tuEkZFXW.ID;

COMMIT;

connection con2;

COMMIT;

disconnect con2;

connection default;

DROP TABLE tuEkZFXW;

Like I expect, the DELETE statement will be blocked in lock_wait(). The condition c0>=100 should match all READ UNCOMMITTED rows in the table. Both DELETE and UPDATE will typically be preceded by a call to row_search_mvcc() with prebuilt->select_lock_type=LOCK_X. I see that the DELETE is invoking lock_rec_convert_impl_to_expl() on the record c0=7105.4030 that the conflicting INSERT statement executed.

Why do you think that this locking conflict is unreasonable? For the record, when I replaced the DELETE with the following statement (to take advantage of ~~MDEV-13115~~), the locked record was skipped in the output:

SELECT * FROM tuEkZFXW WHERE ( tuEkZFXW.c2 > 0 OR tuEkZFXW.c2 < 0) AND (tuEkZFXW.c0 >= 100) FOR UPDATE SKIP LOCKED;

Marko Mäkelä added a comment - 5 days ago I can reproduce the claimed blocking in a recent build of the 10.11 branch. I converted the described scenario to a .test file of the mtr tool: --source include/have_innodb.inc SET GLOBAL innodb_lock_wait_timeout = 20, lock_wait_timeout = 20; CREATE TABLE tuEkZFXW (ID INT , VAL INT , c0 DOUBLE PRIMARY KEY , c1 DECIMAL (55, 11) , c2 BOOLEAN ) ENGINE=InnoDB; INSERT INTO tuEkZFXW (ID, VAL, c0, c1, c2) VALUES (460, 561, 6537.3885, 7569.5750, 1),(464, 566, 4044.8689, 8585.5846, 1); SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ ; BEGIN ; connect (con2,localhost,root); SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ ; BEGIN ; connection default ; INSERT INTO tuEkZFXW (ID, VAL, c0, c2) VALUES ( 468, 570, 7105.4030, 0); connection con2; DELETE FROM tuEkZFXW WHERE ( tuEkZFXW.c2 > 0 OR tuEkZFXW.c2 < 0) AND (tuEkZFXW.c0 >= 100); connection default ; SELECT ID, VAL, tuEkZFXW.c0, tuEkZFXW.c1, tuEkZFXW.c2 FROM tuEkZFXW WHERE ( tuEkZFXW.c2 > 0 OR tuEkZFXW.c2 < 0) AND (tuEkZFXW.c0 >= 100) ORDER BY tuEkZFXW.ID; COMMIT ; connection con2; COMMIT ; disconnect con2; connection default ; DROP TABLE tuEkZFXW; Like I expect, the DELETE statement will be blocked in lock_wait() . The condition c0>=100 should match all READ UNCOMMITTED rows in the table. Both DELETE and UPDATE will typically be preceded by a call to row_search_mvcc() with prebuilt->select_lock_type=LOCK_X . I see that the DELETE is invoking lock_rec_convert_impl_to_expl() on the record c0=7105.4030 that the conflicting INSERT statement executed. Why do you think that this locking conflict is unreasonable? For the record, when I replaced the DELETE with the following statement (to take advantage of MDEV-13115 ), the locked record was skipped in the output: SELECT * FROM tuEkZFXW WHERE ( tuEkZFXW.c2 > 0 OR tuEkZFXW.c2 < 0) AND (tuEkZFXW.c0 >= 100) FOR UPDATE SKIP LOCKED;

huicong xu added a comment - 5 days ago

Thank you for your response.
I think a transaction shouldn't be able to see another transaction's inserts under repeatable read isolation level.
Also, the newly inserted row does not satisfy the condition “c0 < 0 OR c0 > 0”, so I don't think it should be locked.

huicong xu added a comment - 5 days ago Thank you for your response. I think a transaction shouldn't be able to see another transaction's inserts under repeatable read isolation level. Also, the newly inserted row does not satisfy the condition “c0 < 0 OR c0 > 0”, so I don't think it should be locked.

Marko Mäkelä added a comment - 5 days ago

If MDEV-16402 or MDEV-16232 were fixed, then it should be possible to filter out the freshly inserted record c0=0 before the record is locked. Currently, InnoDB will simply attempt to lock each visited index record.

For UPDATE and possibly also DELETE operations there would also be a tweak that was originally implemented by me to address MySQL Bug #3300. That is, once a record is visited and it is found to be not matching the condition, it would be unlocked. This tweak would not be applicable here, because we’d still lock the record for checking the condition, and then unlock it. We would still have a conflict with the uncommitted INSERT statement.

Locking operations in InnoDB always see the latest data. Starting with ~~MDEV-35124~~, InnoDB’s REPEATABLE READ is Snapshot Isolation by default (and ER_CHECKREAD errors are possible). Before that, the InnoDB REPEATABLE READ was something for which there is no nice formal definition.

Marko Mäkelä added a comment - 5 days ago If MDEV-16402 or MDEV-16232 were fixed, then it should be possible to filter out the freshly inserted record c0=0 before the record is locked. Currently, InnoDB will simply attempt to lock each visited index record. For UPDATE and possibly also DELETE operations there would also be a tweak that was originally implemented by me to address MySQL Bug #3300 . That is, once a record is visited and it is found to be not matching the condition, it would be unlocked. This tweak would not be applicable here, because we’d still lock the record for checking the condition, and then unlock it. We would still have a conflict with the uncommitted INSERT statement. Locking operations in InnoDB always see the latest data. Starting with MDEV-35124 , InnoDB’s REPEATABLE READ is Snapshot Isolation by default (and ER_CHECKREAD errors are possible). Before that, the InnoDB REPEATABLE READ was something for which there is no nice formal definition.

Marko Mäkelä added a comment - 5 days ago

Our documentation seems to be somewhat lacking in this area. https://mariadb.com/kb/en/innodb-lock-modes/ does not mention that visited index records would be locked, optionally in a secondary index and usually in the clustered index (except when a covering secondary index read is possible).

Marko Mäkelä added a comment - 5 days ago Our documentation seems to be somewhat lacking in this area. https://mariadb.com/kb/en/innodb-lock-modes/ does not mention that visited index records would be locked, optionally in a secondary index and usually in the clustered index (except when a covering secondary index read is possible).

MariaDB Server

Unreasonable block in repeatable read

Details

Description

Attachments

Issue Links

Activity

People

Dates

Git Integration