[MDEV-26643] Inconsistent behaviors of UPDATE under RU & RC isolation level Created: 2021-09-19  Updated: 2024-01-16

Status: Open
Project: MariaDB Server
Component/s: Documentation, Locking, Storage Engine - InnoDB
Affects Version/s: 10.5.12
Fix Version/s: N/A

Type: Bug Priority: Critical
Reporter: dinary Assignee: Ian Gilfillan
Resolution: Unresolved Votes: 1
Labels: innodb
Environment:

OS: Ubuntu-20.04


Issue Links:
Relates
relates to MDEV-26642 Weird SELECT view when a record is mo... Confirmed
relates to MDEV-32898 Phantom rows caused by UPDATE of PRIM... Stalled
relates to MDEV-26671 Inconsistent SELECT View when modify ... Open

 Description   

Isolation Level: Read Uncommitted & Read Committed.
The behaviors of UPDATE for different rows are inconsistent. Just some of the rows are not updated while others are.

/* init */ drop table if exists t;
/* init */ create table t(a int, b int);
/* init */ insert into t values(null, 1), (2, 2), (null, null), (null, 3), (4, null);
 
/* s1 */ begin;
/* s1 */ update t set a = 10 where 1;
/* s2 */ begin;
/* s2 */ update t set b = 20 where a; -- blocked
/* s1 */ commit; -- s2 unblocked
/* s2 */ commit;
 
select * from t;
 
+------+------+
| a    | b    |
+------+------+
|   10 |    1 |
|   10 |   20 |
|   10 |   20 |
|   10 |   20 |
|   10 |   20 |
+------+------+

The field b of the first row is not updated but other rows are updated.

After several attempts, I found that the results depend on the distribution of the data in the table rather than the timing.

If the initial table is like:

+------+------+
| a    | b    |
+------+------+
|    1 |    1 |
|    2 |    2 |
| null | null |
| null |    3 |
|    4 | null |
+------+------+

then, the result will be

+------+------+
| a    | b    |
+------+------+
|   10 |   20 |
|   10 |   20 |
|   10 |   20 |
|   10 |   20 |
|   10 |   20 |
+------+------+

More specifically, if there exists a row whose column `a` is not NULL, and it is the first record from top to bottom whose column `a` is not NULL. Then, all rows before it with a NULL column `a` will not be updated, while all rows after it with a NULL column `a` will be updated. This means that even the execution of a single statement is not atomic.

This bug can be reproduced under Read Uncommitted and Read Committed.



 Comments   
Comment by dinary [ 2021-09-24 ]

This bug can also be reproduced in MySQL and has been verified by MySQL Verification Team.

https://bugs.mysql.com/bug.php?id=104833

Comment by Marko Mäkelä [ 2021-11-01 ]

I suspect that the root cause of this the way how InnoDB read views work. I posted some explanation and executable test cases in MDEV-26642.

Comment by dinary [ 2021-11-23 ]

Thank you for your reply. I agree with your explanation MDEV-26642.

But this issue seems different with MDEV-26642 and MDEV-26671.

Comment by Marko Mäkelä [ 2022-01-05 ]

I think that this one would be good for vlad.lesin to check. He has been spending a lot of time diagnosing locking in the past months. I wonder if MDEV-27025 would affect this.

Comment by Vladislav Lesin [ 2022-04-05 ]

I have analyzed the above case. I simplified test case a little bit:

--source include/have_innodb.inc                                                
                                                                                
create table t(a int, b int) engine = InnoDB;                                   
insert into t values(null, 1), (2, 2);                                          
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;                               
                                                                                
begin;                                                                          
update t set a = 10;                                                            
                                                                                
connect (con1,localhost,root);                                                  
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;                               
begin;                                                                          
--send update t set b = 20 where a                                              
                                                                                
--connection default                                                            
select sleep(1);                                                                
commit;                                                                         
                                                                                
--connection con1                                                               
--reap                                                                          
commit;                                                                         
                                                                                
--disconnect con1                                                               
--connection default                                                            
select * from t;                                                                
drop table t;

When the second transaction reads (10,1) row, the row is locked by the first transaction. For this case row_search_mvcc() reads previously committed (null,1) value:

row_search_mvcc(...)                                                            
{                                                                               
...                                                                             
no_gap_lock:                                                                    
...                                                                             
                err = sel_set_rec_lock(pcur,                                    
                                       rec, index, offsets,                     
                                       prebuilt->select_lock_type,              
                                       lock_type, thr, &mtr);                   
                                                                                
                switch (err) {                                                  
                ...                                                             
                case DB_LOCK_WAIT:                                              
                        ...                                                     
                        row_sel_build_committed_vers_for_mysql(                 
                                clust_index, prebuilt, rec,                     
                                &offsets, &heap, &old_vers, need_vrow ? &vrow : NULL,
                                &mtr);                                          
                        ...                                                     
                        if (old_vers == NULL) {                                 
                                /* The row was not yet committed */             
                                                                                
                                goto next_rec;                                  
                        }                                                       
                                                                                
                        did_semi_consistent_read = true;                        
                        rec = old_vers;                                         
                        break;                                                  
                        ...                                                     
                }                                                               
                                                                                
...                                                                             
}

And as (null,1) does no match "where" condition of the 2nd update, i.e. a = null, it's not updated.

This behavior is documented out for mysql https://dev.mysql.com/doc/refman/8.0/en/innodb-transaction-isolation-levels.html:

"Using READ COMMITTED has additional effects:
...
For UPDATE statements, if a row is already locked, InnoDB performs a “semi-consistent” read, returning the latest committed version to MySQL so that MySQL can determine whether the row matches the WHERE condition of the UPDATE. If the row matches (must be updated), MySQL reads the row again and this time InnoDB either locks it or waits for a lock on it. "

Unfortunately, out isolation levels documentation does not contain such description: https://mariadb.com/kb/en/mariadb-transactions-and-isolation-levels-for-sql-server-users.

Comment by Marko Mäkelä [ 2022-04-05 ]

I think that this must be addressed in our documentation.

Years ago, the semi-consistent read was originally implemented by me to fix MySQL Bug #3300. The function ha_innobase::try_semi_consistent_read() will enable the semi-consistent reads for the READ COMMITTED and READ UNCOMMITTED isolation levels. Normal consistent reads will wait for a lock on any uncommitted record version.

In the scenario given by the Description, when the second UPDATE statement visits the first row, it will be locked by the first UPDATE, which was not committed yet. For the semi-consistent read, the latest committed version (a,b)=(null,1) will be constructed, and a=null will not satisfy the condition WHERE a. Hence, the row will not be updated.

dinary, are you able to access MySQL Bug #104833, or did you receive any updates for it? It has been made private.

Comment by Marko Mäkelä [ 2024-01-16 ]

Unlike for my mtr test cases of MDEV-26642, for vlad.lesin’s test on 2022-04-05 I can’t observe any difference when testing with or without my work-in-progress fix of MDEV-32898. It is notable that for the READ UNCOMMITTED isolation level, no read view will be created. The work-in-progress fix of MDEV-32898 only affects the lock acquisition of records when the transaction has a read view where the record is not visible.

Generated at Thu Feb 08 09:46:51 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.