Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-36308

Unreasonable block in repeatable read

Details

    • Bug
    • Status: Open (View Workflow)
    • Major
    • Resolution: Unresolved
    • 11.7.2
    • None
    • Locking
    • None
    • Ubuntu 22.04

    Description

      When I tested the database repeatable read isolation level, I found a issue.
      As shown in the use case below, the delete statement is blocked. But in my opinion there is no lock conflict here and it should not be blocked.

      --- I see
      --- session 0 to init
      SET GLOBAL innodb_lock_wait_timeout = 20;
      SET GLOBAL lock_wait_timeout = 20;
      DROP TABLE IF EXISTS tuEkZFXW;
      CREATE TABLE tuEkZFXW (ID INT, VAL INT, c0 DOUBLE  PRIMARY KEY, c1 DECIMAL(55, 11) , c2 BOOLEAN );
      INSERT INTO tuEkZFXW (ID, VAL, c0, c1, c2) VALUES (460, 561, 6537.3885, 7569.5750, 1);
      INSERT INTO tuEkZFXW (ID, VAL, c0, c1, c2) VALUES (464, 566, 4044.8689, 8585.5846, 1);
       
      --- session 1
      SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
      BEGIN;
      --- session 2
      SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
      BEGIN;
      --- session 1
      INSERT INTO tuEkZFXW (ID, VAL, c0, c2) VALUES ( 468, 570, 7105.4030, 0);
      --- session 2
      DELETE FROM tuEkZFXW WHERE ( tuEkZFXW.c2 > 0 OR tuEkZFXW.c2 < 0) AND (tuEkZFXW.c0 >= 100); ---blocked
      --- session 1
      SELECT ID, VAL, tuEkZFXW.c0, tuEkZFXW.c1, tuEkZFXW.c2 FROM tuEkZFXW WHERE ( tuEkZFXW.c2 > 0 OR tuEkZFXW.c2 < 0) AND (tuEkZFXW.c0 >= 100) ORDER BY tuEkZFXW.ID;
      +------+------+-----------+------------------+------+
      | ID   | VAL  | c0        | c1               | c2   |
      +------+------+-----------+------------------+------+
      |  460 |  561 | 6537.3885 | 7569.57500000000 |    1 |
      |  464 |  566 | 4044.8689 | 8585.58460000000 |    1 |
      +------+------+-----------+------------------+------+
      2 rows in set (0.00 sec)
      COMMIT;
      --- session 2
      COMMIT;
       
      --- expected to see
      --- I see in PG
      ...
      --- session 2 (not block)
       DELETE FROM tuEkZFXW WHERE ( tuEkZFXW.c2 > 0 OR tuEkZFXW.c2 < 0) AND (tuEkZFXW.c0 >= 100);
      DELETE 3
      --- session 1 
      SELECT ID, VAL, tuEkZFXW.c0, tuEkZFXW.c1, tuEkZFXW.c2 FROM tuEkZFXW WHERE ( tuEkZFXW.c2 > 0 OR tuEkZFXW.c2 < 0) AND (tuEkZFXW.c0 >= 100) ORDER BY tuEkZFXW.ID;
       id  | val |    c0     |    c1    | c2 
      -----+-----+-----------+----------+----
       460 | 561 | 6537.3887 | 7569.575 |  1
       464 | 566 |  4044.869 | 8585.585 |  1
      (2 rows)
      ...
      

      Attachments

        Issue Links

          Activity

            marko Marko Mäkelä added a comment -

            I can reproduce the claimed blocking in a recent build of the 10.11 branch. I converted the described scenario to a .test file of the mtr tool:

            --source include/have_innodb.inc
            SET GLOBAL innodb_lock_wait_timeout = 20, lock_wait_timeout = 20;
            CREATE TABLE tuEkZFXW (ID INT, VAL INT, c0 DOUBLE  PRIMARY KEY, c1 DECIMAL(55, 11) , c2 BOOLEAN ) ENGINE=InnoDB;
            INSERT INTO tuEkZFXW (ID, VAL, c0, c1, c2)
            VALUES (460, 561, 6537.3885, 7569.5750, 1),(464, 566, 4044.8689, 8585.5846, 1);
             
            SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
            BEGIN;
             
            connect(con2,localhost,root);
            SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
            BEGIN;
             
            connection default;
            INSERT INTO tuEkZFXW (ID, VAL, c0, c2) VALUES ( 468, 570, 7105.4030, 0);
            connection con2;
            DELETE FROM tuEkZFXW WHERE ( tuEkZFXW.c2 > 0 OR tuEkZFXW.c2 < 0) AND (tuEkZFXW.c0 >= 100);
             
            connection default;
            SELECT ID, VAL, tuEkZFXW.c0, tuEkZFXW.c1, tuEkZFXW.c2 FROM tuEkZFXW WHERE ( tuEkZFXW.c2 > 0 OR tuEkZFXW.c2 < 0) AND (tuEkZFXW.c0 >= 100) ORDER BY tuEkZFXW.ID;
            COMMIT;
             
            connection con2;
            COMMIT;
            disconnect con2;
            connection default;
            DROP TABLE tuEkZFXW;
            

            Like I expect, the DELETE statement will be blocked in lock_wait(). The condition c0>=100 should match all READ UNCOMMITTED rows in the table. Both DELETE and UPDATE will typically be preceded by a call to row_search_mvcc() with prebuilt->select_lock_type=LOCK_X. I see that the DELETE is invoking lock_rec_convert_impl_to_expl() on the record c0=7105.4030 that the conflicting INSERT statement executed.

            Why do you think that this locking conflict is unreasonable? For the record, when I replaced the DELETE with the following statement (to take advantage of MDEV-13115), the locked record was skipped in the output:

            SELECT * FROM tuEkZFXW WHERE ( tuEkZFXW.c2 > 0 OR tuEkZFXW.c2 < 0) AND (tuEkZFXW.c0 >= 100) FOR UPDATE SKIP LOCKED;
            

            marko Marko Mäkelä added a comment - I can reproduce the claimed blocking in a recent build of the 10.11 branch. I converted the described scenario to a .test file of the mtr tool: --source include/have_innodb.inc SET GLOBAL innodb_lock_wait_timeout = 20, lock_wait_timeout = 20; CREATE TABLE tuEkZFXW (ID INT , VAL INT , c0 DOUBLE PRIMARY KEY , c1 DECIMAL (55, 11) , c2 BOOLEAN ) ENGINE=InnoDB; INSERT INTO tuEkZFXW (ID, VAL, c0, c1, c2) VALUES (460, 561, 6537.3885, 7569.5750, 1),(464, 566, 4044.8689, 8585.5846, 1);   SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ ; BEGIN ;   connect (con2,localhost,root); SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ ; BEGIN ;   connection default ; INSERT INTO tuEkZFXW (ID, VAL, c0, c2) VALUES ( 468, 570, 7105.4030, 0); connection con2; DELETE FROM tuEkZFXW WHERE ( tuEkZFXW.c2 > 0 OR tuEkZFXW.c2 < 0) AND (tuEkZFXW.c0 >= 100);   connection default ; SELECT ID, VAL, tuEkZFXW.c0, tuEkZFXW.c1, tuEkZFXW.c2 FROM tuEkZFXW WHERE ( tuEkZFXW.c2 > 0 OR tuEkZFXW.c2 < 0) AND (tuEkZFXW.c0 >= 100) ORDER BY tuEkZFXW.ID; COMMIT ;   connection con2; COMMIT ; disconnect con2; connection default ; DROP TABLE tuEkZFXW; Like I expect, the DELETE statement will be blocked in lock_wait() . The condition c0>=100 should match all READ UNCOMMITTED rows in the table. Both DELETE and UPDATE will typically be preceded by a call to row_search_mvcc() with prebuilt->select_lock_type=LOCK_X . I see that the DELETE is invoking lock_rec_convert_impl_to_expl() on the record c0=7105.4030 that the conflicting INSERT statement executed. Why do you think that this locking conflict is unreasonable? For the record, when I replaced the DELETE with the following statement (to take advantage of MDEV-13115 ), the locked record was skipped in the output: SELECT * FROM tuEkZFXW WHERE ( tuEkZFXW.c2 > 0 OR tuEkZFXW.c2 < 0) AND (tuEkZFXW.c0 >= 100) FOR UPDATE SKIP LOCKED;
            dlxue huicong xu added a comment -

            Thank you for your response.
            I think a transaction shouldn't be able to see another transaction's inserts under repeatable read isolation level.
            Also, the newly inserted row does not satisfy the condition “c0 < 0 OR c0 > 0”, so I don't think it should be locked.

            dlxue huicong xu added a comment - Thank you for your response. I think a transaction shouldn't be able to see another transaction's inserts under repeatable read isolation level. Also, the newly inserted row does not satisfy the condition “c0 < 0 OR c0 > 0”, so I don't think it should be locked.
            marko Marko Mäkelä added a comment -

            If MDEV-16402 or MDEV-16232 were fixed, then it should be possible to filter out the freshly inserted record c0=0 before the record is locked. Currently, InnoDB will simply attempt to lock each visited index record.

            For UPDATE and possibly also DELETE operations there would also be a tweak that was originally implemented by me to address MySQL Bug #3300. That is, once a record is visited and it is found to be not matching the condition, it would be unlocked. This tweak would not be applicable here, because we’d still lock the record for checking the condition, and then unlock it. We would still have a conflict with the uncommitted INSERT statement.

            Locking operations in InnoDB always see the latest data. Starting with MDEV-35124, InnoDB’s REPEATABLE READ is Snapshot Isolation by default (and ER_CHECKREAD errors are possible). Before that, the InnoDB REPEATABLE READ was something for which there is no nice formal definition.

            marko Marko Mäkelä added a comment - If MDEV-16402 or MDEV-16232 were fixed, then it should be possible to filter out the freshly inserted record c0=0 before the record is locked. Currently, InnoDB will simply attempt to lock each visited index record. For UPDATE and possibly also DELETE operations there would also be a tweak that was originally implemented by me to address MySQL Bug #3300 . That is, once a record is visited and it is found to be not matching the condition, it would be unlocked. This tweak would not be applicable here, because we’d still lock the record for checking the condition, and then unlock it. We would still have a conflict with the uncommitted INSERT statement. Locking operations in InnoDB always see the latest data. Starting with MDEV-35124 , InnoDB’s REPEATABLE READ is Snapshot Isolation by default (and ER_CHECKREAD errors are possible). Before that, the InnoDB REPEATABLE READ was something for which there is no nice formal definition.
            marko Marko Mäkelä added a comment -

            Our documentation seems to be somewhat lacking in this area. https://mariadb.com/kb/en/innodb-lock-modes/ does not mention that visited index records would be locked, optionally in a secondary index and usually in the clustered index (except when a covering secondary index read is possible).

            marko Marko Mäkelä added a comment - Our documentation seems to be somewhat lacking in this area. https://mariadb.com/kb/en/innodb-lock-modes/ does not mention that visited index records would be locked, optionally in a secondary index and usually in the clustered index (except when a covering secondary index read is possible).

            People

              greenman Ian Gilfillan
              dlxue huicong xu
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.