Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-26642

Weird SELECT view when a record is modified to the same value by two transactions

Details

    Description

      Under REPEATABLE-READ isolation level, if two transactions concurrently modify the same row to the same value, the transaction that modifies later does not see the modified content.

      /* init */ create table t(a int, b int);
      /* init */ insert into t values (0, 0), (1, 1), (2, 2);
       
      /* s1 */ begin;
      /* s1 */ select * from t; -- [(0, 0), (1, 1), (2, 2)]
      /* s2 */ begin;
      /* s2 */ update t set a = 10 where b = 1;
      /* s2 */ commit;
      /* s1 */ select * from t; -- [(0, 0), (1, 1), (2, 2)]
      /* s1 */ update t set a = 10 where true;
      /* s1 */ select * from t;  -- [(10, 0), (1, 1), (10, 2)]
      /* s1 */ commit;
      

      The result of final SELECT should be (10, 0), (10, 1), (10, 2).

      I think it is so weird for session 1 to see the second row is still (1, 1) after the successful execution of an UPDATE with the "WHERE TRUE" predicate.

      So I think it will be better for s1 to see all records it updates regardless of whether the values before and after the UPDATE are the same.

      Attachments

        Issue Links

          Activity

            dinary dinary added a comment - - edited

            This bug can also be reproduced in MySQL and has been verified by MySQL Verification Team.

            MySQL#100328

            dinary dinary added a comment - - edited This bug can also be reproduced in MySQL and has been verified by MySQL Verification Team. MySQL#100328

            I think that this must be addressed in the documentation. This is the impact of a design decision of the InnoDB REPEATABLE READ, probably present since the very first release (MySQL 3.23.49). If we changed this now after all these years, some applications could be broken.

            I wrote a simpler mtr test to demonstrate this:

            --source include/have_innodb.inc
             
            CREATE TABLE t(a INT) ENGINE=InnoDB;
            INSERT INTO t VALUES (0);
             
            connect (con1,localhost,root);
            #SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
            #SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
            #SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
            #SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
            START TRANSACTION WITH CONSISTENT SNAPSHOT;
             
            connection default;
            UPDATE t SET a=1;
             
            connection con1;
            SELECT * FROM t;
            UPDATE t SET a=1;
            SELECT * FROM t;
            COMMIT;
            disconnect con1;
             
            connection default;
            SELECT * FROM t;
            DROP TABLE t;
            

            Only for the default REPEATABLE READ isolation level, the second-but-last SELECT in the test will return 0 instead of 1.

            With READ COMMITTED, the read view will be reset at the start of each statement. So, that statement will observe the impact of the first UPDATE, which had been committed.

            READ UNCOMMITTED will display the latest state of the table. SERIALIZABLE will do that too, but with the difference that the accessed records will be locked first. (See MDEV-14589 for some discussion on isolation levels.)

            In InnoDB, a read view will observe the changes of all transactions that had been committed at the time the read view was created, as well as the changes that the transaction performed itself. In this case, the second UPDATE did not modify the record, and the first UPDATE had not been committed before the read view was created. So, there is no bug; it is working as designed.

            Here is a more complex test, with 2 columns, more similar to the described scenario. Note that after we actually modify the record in that transaction, then we will read back the "correct" result.

            --source include/have_innodb.inc
             
            CREATE TABLE t(a INT, b INT) ENGINE=InnoDB;
            INSERT INTO t VALUES (0,0);
             
            connect (con1,localhost,root);
            START TRANSACTION WITH CONSISTENT SNAPSHOT;
             
            connection default;
            UPDATE t SET a=1; # update to (1,0) is committed, but not visible in the above read view
             
            connection con1;
            SELECT * FROM t;
            UPDATE t SET a=1;
            SELECT * FROM t; # returns "unexpected" (0,0)
            UPDATE t SET b=1;
            SELECT * FROM t; # returns (1,1) because we modified the record
            ROLLBACK;
            disconnect con1;
             
            connection default;
            SELECT * FROM t; # returns (1,0)
            DROP TABLE t;
            

            marko Marko Mäkelä added a comment - I think that this must be addressed in the documentation. This is the impact of a design decision of the InnoDB REPEATABLE READ , probably present since the very first release (MySQL 3.23.49). If we changed this now after all these years, some applications could be broken. I wrote a simpler mtr test to demonstrate this: --source include/have_innodb.inc   CREATE TABLE t(a INT ) ENGINE=InnoDB; INSERT INTO t VALUES (0);   connect (con1,localhost,root); # SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED ; # SET TRANSACTION ISOLATION LEVEL READ COMMITTED ; # SET TRANSACTION ISOLATION LEVEL SERIALIZABLE ; # SET TRANSACTION ISOLATION LEVEL REPEATABLE READ ; START TRANSACTION WITH CONSISTENT SNAPSHOT;   connection default ; UPDATE t SET a=1;   connection con1; SELECT * FROM t; UPDATE t SET a=1; SELECT * FROM t; COMMIT ; disconnect con1;   connection default ; SELECT * FROM t; DROP TABLE t; Only for the default REPEATABLE READ isolation level, the second-but-last SELECT in the test will return 0 instead of 1 . With READ COMMITTED , the read view will be reset at the start of each statement. So, that statement will observe the impact of the first UPDATE , which had been committed. READ UNCOMMITTED will display the latest state of the table. SERIALIZABLE will do that too, but with the difference that the accessed records will be locked first. (See MDEV-14589 for some discussion on isolation levels.) In InnoDB, a read view will observe the changes of all transactions that had been committed at the time the read view was created, as well as the changes that the transaction performed itself. In this case, the second UPDATE did not modify the record, and the first UPDATE had not been committed before the read view was created. So, there is no bug; it is working as designed. Here is a more complex test, with 2 columns, more similar to the described scenario. Note that after we actually modify the record in that transaction, then we will read back the "correct" result. --source include/have_innodb.inc   CREATE TABLE t(a INT , b INT ) ENGINE=InnoDB; INSERT INTO t VALUES (0,0);   connect (con1,localhost,root); START TRANSACTION WITH CONSISTENT SNAPSHOT;   connection default ; UPDATE t SET a=1; # update to (1,0) is committed , but not visible in the above read view   connection con1; SELECT * FROM t; UPDATE t SET a=1; SELECT * FROM t; # returns "unexpected" (0,0) UPDATE t SET b=1; SELECT * FROM t; # returns (1,1) because we modified the record ROLLBACK ; disconnect con1;   connection default ; SELECT * FROM t; # returns (1,0) DROP TABLE t;
            dinary dinary added a comment -

            Thank you for your reply. I think your explanation is reasonable. And I agree with you on MDEV-26671 is a duplicate of this. In my opinion, if this issue will not be fixed due to compatibility, then the documentation notes are needed.

            dinary dinary added a comment - Thank you for your reply. I think your explanation is reasonable. And I agree with you on MDEV-26671 is a duplicate of this. In my opinion, if this issue will not be fixed due to compatibility, then the documentation notes are needed.
            jason.dou Jason added a comment - - edited

            I slightly modify the test case in the description, and obtain weird result.

            The modified test case:

            /* init */ create table t(a int, b int);
            /* init */ insert into t values (0, 0), (1, 1), (2, 2);
             
            /* s1 */ begin;
            /* s1 */ select * from t; -- [(0, 0), (1, 1), (2, 2)]
            /* s2 */ begin;
            /* s2 */ update t set *a = 11* where b = 1; 
            /* s2 */ commit;
            /* s1 */ select * from t; -- [(0, 0), (1, 1), (2, 2)]
            /* s1 */ update t set a = 10 where true;
            /* s1 */ select * from t;  -- [(10, 0), *(10, 1)*, (10, 2)]
            /* s1 */ commit;
            

            The original test case:

            /* init */ create table t(a int, b int);
            /* init */ insert into t values (0, 0), (1, 1), (2, 2);
             
            /* s1 */ begin;
            /* s1 */ select * from t; -- [(0, 0), (1, 1), (2, 2)]
            /* s2 */ begin;
            /* s2 */ update t set *a = 10* where b = 1; 
            /* s2 */ commit;
            /* s1 */ select * from t; -- [(0, 0), (1, 1), (2, 2)]
            /* s1 */ update t set a = 10 where true;
            /* s1 */ select * from t;  -- [(10, 0), *(1, 1)*, (10, 2)]
            /* s1 */ commit;
            

            In the modified version, we can see that, the last select of session s1 correctly returns the excepted result after the successful execution of an UPDATE with the "WHERE TRUE" predicate.

            However, in the original version, the return result seems wrong.

            So, the result of the last select of session s1 depends on the update's values in session s2.
            This is very weird.

            Should the result of the last select of session s1 be the same, no matter the values in the update of session s2?

            jason.dou Jason added a comment - - edited I slightly modify the test case in the description, and obtain weird result. The modified test case: /* init */ create table t(a int , b int ); /* init */ insert into t values (0, 0), (1, 1), (2, 2); /* s1 */ begin ; /* s1 */ select * from t; -- [(0, 0), (1, 1), (2, 2)] /* s2 */ begin ; /* s2 */ update t set *a = 11* where b = 1; /* s2 */ commit ; /* s1 */ select * from t; -- [(0, 0), (1, 1), (2, 2)] /* s1 */ update t set a = 10 where true ; /* s1 */ select * from t; -- [(10, 0), *(10, 1)*, (10, 2)] /* s1 */ commit ; The original test case: /* init */ create table t(a int , b int ); /* init */ insert into t values (0, 0), (1, 1), (2, 2); /* s1 */ begin ; /* s1 */ select * from t; -- [(0, 0), (1, 1), (2, 2)] /* s2 */ begin ; /* s2 */ update t set *a = 10* where b = 1; /* s2 */ commit ; /* s1 */ select * from t; -- [(0, 0), (1, 1), (2, 2)] /* s1 */ update t set a = 10 where true ; /* s1 */ select * from t; -- [(10, 0), *(1, 1)*, (10, 2)] /* s1 */ commit ; In the modified version, we can see that, the last select of session s1 correctly returns the excepted result after the successful execution of an UPDATE with the "WHERE TRUE" predicate. However, in the original version, the return result seems wrong. So, the result of the last select of session s1 depends on the update's values in session s2. This is very weird. Should the result of the last select of session s1 be the same, no matter the values in the update of session s2?

            With the crude fix that I posted in MDEV-32898, my first and second mtr test fail like this:

            bb-10.6-MDEV-32898-pkgtest c0db57157bdb87f10dbe7fab094bbc53f5abbb8f

            mysqltest: At line 18: query 'UPDATE t SET a=1' failed: ER_LOCK_DEADLOCK (1213): Deadlock found when trying to get lock; try restarting transaction
            mysqltest: At line 14: query 'UPDATE t SET a=1' failed: ER_LOCK_DEADLOCK (1213): Deadlock found when trying to get lock; try restarting transaction
            

            We might want to use a better error code than ER_LOCK_DEADLOCK for this.

            Readily built packages of my work-in-progress fix should (soon) be available for download at https://ci.mariadb.org/42468/. I am eager to see if you can reproduce more issues around this. I am also in contact with the author of https://jepsen.io/analyses/mysql-8.0.34; see this mailing list archive.

            marko Marko Mäkelä added a comment - With the crude fix that I posted in MDEV-32898 , my first and second mtr test fail like this: bb-10.6-MDEV-32898-pkgtest c0db57157bdb87f10dbe7fab094bbc53f5abbb8f mysqltest: At line 18: query 'UPDATE t SET a=1' failed: ER_LOCK_DEADLOCK (1213): Deadlock found when trying to get lock; try restarting transaction mysqltest: At line 14: query 'UPDATE t SET a=1' failed: ER_LOCK_DEADLOCK (1213): Deadlock found when trying to get lock; try restarting transaction We might want to use a better error code than ER_LOCK_DEADLOCK for this. Readily built packages of my work-in-progress fix should (soon) be available for download at https://ci.mariadb.org/42468/ . I am eager to see if you can reproduce more issues around this. I am also in contact with the author of https://jepsen.io/analyses/mysql-8.0.34 ; see this mailing list archive .

            This was fixed together with MDEV-32898. If SET innodb_snapshot_isolation=ON is executed before starting a transaction, then any transaction that uses read views (such as REPEATABLE READ transactions) will fail with ER_CHECKREAD in case the latest version of a being-locked record is not in the read view.

            marko Marko Mäkelä added a comment - This was fixed together with MDEV-32898 . If SET innodb_snapshot_isolation=ON is executed before starting a transaction, then any transaction that uses read views (such as REPEATABLE READ transactions) will fail with ER_CHECKREAD in case the latest version of a being-locked record is not in the read view.

            For what it is worth, below is a mtr version of the scenario of MySQL Bug #116067:

            --source include/have_innodb.inc
             
            CREATE TABLE simple_table (
                id VARCHAR(20) PRIMARY KEY,
                unique_col VARCHAR(20),
                status VARCHAR(20),
                UNIQUE INDEX idx_unique_col (unique_col)
            ) ENGINE=InnoDB;
             
            START TRANSACTION;
            SELECT * FROM simple_table WHERE unique_col = 'unique1';
             
            --connect (con1,localhost,root)
            START TRANSACTION;
             
            INSERT INTO simple_table (id, unique_col, status)
            VALUES ('id1', 'unique1', 'in_progress')
            ON DUPLICATE KEY UPDATE
                unique_col = VALUES(unique_col);
             
             COMMIT;
            --disconnect con1
            --connection default
             
            INSERT INTO simple_table (id, unique_col, status)
            VALUES ('id2', 'unique1', 'in_progress')
            ON DUPLICATE KEY UPDATE
                unique_col = VALUES(unique_col);
             
            # Fails to see just upserted record
            SELECT * FROM simple_table WHERE unique_col = 'unique1';
             
            COMMIT;
             
            SELECT * FROM simple_table WHERE unique_col = 'unique1';
             
            DROP TABLE simple_table;
            

            By default, the bug is reproduced, that is, the result of the last-but-one SELECT is empty. If save the above to mysql-test/main/bug-116067.test and run

            ./mtr --mysqld=--innodb-snapshot-isolation main.bug-116067
            

            it will fail like this:

            10.6 a74bea7ba99a8c215cf085ff7a9dd4bdabab51e8

            mysqltest: At line 25: query 'INSERT INTO simple_table (id, unique_col, status)
            VALUES ('id2', 'unique1', 'in_progress')
            ON DUPLICATE KEY UPDATE
            unique_col = VALUES(unique_col)' failed: ER_CHECKREAD (1020): Record has changed since last read in table 'simple_table'
            

            This is because we would notice that the transaction in con1 had modified the data, which would constitute a violation of Snapshot Isolation.

            marko Marko Mäkelä added a comment - For what it is worth, below is a mtr version of the scenario of MySQL Bug #116067 : --source include/have_innodb.inc   CREATE TABLE simple_table ( id VARCHAR (20) PRIMARY KEY , unique_col VARCHAR (20), status VARCHAR (20), UNIQUE INDEX idx_unique_col (unique_col) ) ENGINE=InnoDB;   START TRANSACTION ; SELECT * FROM simple_table WHERE unique_col = 'unique1' ;   --connect (con1,localhost,root) START TRANSACTION ;   INSERT INTO simple_table (id, unique_col, status) VALUES ( 'id1' , 'unique1' , 'in_progress' ) ON DUPLICATE KEY UPDATE unique_col = VALUES (unique_col);   COMMIT ; --disconnect con1 --connection default   INSERT INTO simple_table (id, unique_col, status) VALUES ( 'id2' , 'unique1' , 'in_progress' ) ON DUPLICATE KEY UPDATE unique_col = VALUES (unique_col);   # Fails to see just upserted record SELECT * FROM simple_table WHERE unique_col = 'unique1' ;   COMMIT ;   SELECT * FROM simple_table WHERE unique_col = 'unique1' ;   DROP TABLE simple_table; By default, the bug is reproduced, that is, the result of the last-but-one SELECT is empty. If save the above to mysql-test/main/bug-116067.test and run ./mtr --mysqld=--innodb-snapshot-isolation main.bug-116067 it will fail like this: 10.6 a74bea7ba99a8c215cf085ff7a9dd4bdabab51e8 mysqltest: At line 25: query 'INSERT INTO simple_table (id, unique_col, status) VALUES ('id2', 'unique1', 'in_progress') ON DUPLICATE KEY UPDATE unique_col = VALUES(unique_col)' failed: ER_CHECKREAD (1020): Record has changed since last read in table 'simple_table' This is because we would notice that the transaction in con1 had modified the data, which would constitute a violation of Snapshot Isolation.

            People

              marko Marko Mäkelä
              dinary dinary
              Votes:
              1 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.