Details
- Type: Bug
- Status: In Review
- Priority: Critical
- Resolution: Unresolved
- Affects Version/s: 10.6, 10.11, 11.4, 11.8, 12.3, 12.2.2
- Fix Version/s: None
- Impact: Can result in unexpected behaviour
Description
innodb_snapshot_isolation fails to prevent lost updates under contention
Summary
innodb_snapshot_isolation = ON with REPEATABLE READ and START TRANSACTION WITH CONSISTENT SNAPSHOT fails to reject concurrent updates to the same row, allowing lost updates. The rate is approximately 1 in 150K commits under moderate contention (32 threads, 16 rows).
Environment
- MariaDB 12.2.2 (mariadb:12.2.2 Docker image)
- innodb_snapshot_isolation = ON
- InnoDB engine
- Transaction isolation: REPEATABLE READ
- Transactions opened with START TRANSACTION WITH CONSISTENT SNAPSHOT
- Host: Linux 6.8.0-60-generic, x86_64
Expected behavior
From the MariaDB documentation: when innodb_snapshot_isolation is enabled, an UPDATE inside a REPEATABLE READ transaction started with START TRANSACTION WITH CONSISTENT SNAPSHOT must fail with ER_CHECKREAD (error 1020, "Record has changed since last read in table") if the target row was modified by another transaction after the snapshot was taken.
This guarantees that no two concurrent snapshot transactions can both successfully commit an update to the same row based on the same pre-transaction state – i.e., lost updates are prevented.
What is a lost update
A lost update occurs when two transactions T1 and T2 both:
- Read the same row (observing the same value V).
- Compute a new value based on V.
- Write the new value to that row.
- Commit successfully.
One of the two writes is silently overwritten by the other. The effect of one committed transaction is lost as if it never happened.
Formally (Berenson et al., 1995 – "A Critique of ANSI SQL Isolation Levels"), the lost update anomaly P4 is:
1. T1 reads row r.
2. T2 reads row r.
3. T1 writes row r (based on its read). T1 commits.
4. T2 writes row r (based on its read). T2 commits.
T1's write is lost.
With innodb_snapshot_isolation, MariaDB should detect that T2's UPDATE targets a row modified after T2's snapshot and reject it with ER_CHECKREAD at step 4, preventing the anomaly.
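The expected rejection rule can be illustrated with a small model. This is a hypothetical sketch of the documented semantics, not MariaDB source: each row remembers the commit sequence number of its last writer, and an UPDATE issued from a snapshot older than that modification is rejected (the analogue of ER_CHECKREAD) instead of silently overwriting.

```rust
// Hypothetical model of the innodb_snapshot_isolation write check.
// Names (Row, Txn, try_update) are illustrative, not MariaDB internals.

#[derive(Debug, PartialEq)]
enum UpdateResult {
    Ok,
    ErrCheckread, // "Record has changed since last read in table" (1020)
}

struct Row {
    val: i64,
    last_modified: u64, // commit sequence number of the last writer
}

struct Txn {
    snapshot: u64, // highest commit sequence visible to this transaction
}

fn try_update(row: &mut Row, txn: &Txn, new_val: i64, commit_seq: u64) -> UpdateResult {
    if row.last_modified > txn.snapshot {
        // Row was modified after our snapshot: must fail, never overwrite.
        return UpdateResult::ErrCheckread;
    }
    row.val = new_val;
    row.last_modified = commit_seq;
    UpdateResult::Ok
}

fn main() {
    let mut row = Row { val: 0, last_modified: 0 };

    // T1 and T2 both take a snapshot at sequence 0 and read val = 0.
    let t1 = Txn { snapshot: 0 };
    let t2 = Txn { snapshot: 0 };

    // Step 3 of P4: T1 writes 0+1 and commits at sequence 1.
    assert_eq!(try_update(&mut row, &t1, 1, 1), UpdateResult::Ok);

    // Step 4 of P4: T2 also tries to write 0+1, but the row changed
    // after T2's snapshot, so the update is rejected.
    assert_eq!(try_update(&mut row, &t2, 1, 2), UpdateResult::ErrCheckread);

    println!("final val = {}", row.val); // 1; T1's write is preserved
}
```

Under this model the second writer is always rejected; the bug reported here is that MariaDB occasionally lets step 4 succeed.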
Test design
Schema
CREATE TABLE counter (id INT PRIMARY KEY, val INT NOT NULL) ENGINE=InnoDB;

CREATE TABLE log (
  seq BIGINT AUTO_INCREMENT PRIMARY KEY,
  counter_id INT NOT NULL,
  old_val INT NOT NULL,
  new_val INT NOT NULL
) ENGINE=InnoDB;

-- 16 independent counter rows, all starting at 0
INSERT INTO counter (id, val) VALUES (1, 0), (2, 0), ..., (16, 0);
Thread logic (32 threads, each running continuously for 20 minutes)
Each thread repeats:
1. Pick a random counter_id in [1, 16].
2. START TRANSACTION WITH CONSISTENT SNAPSHOT
3. SELECT val FROM counter WHERE id = :counter_id          -- reads V
4. UPDATE counter SET val = V+1 WHERE id = :counter_id     -- writes V+1
5. INSERT INTO log (counter_id, old_val, new_val) VALUES (:counter_id, V, V+1)
6. sleep(100 microseconds)                                 -- widen the race window
7. COMMIT
If step 4 fails with ER_CHECKREAD (expected under contention), the thread rolls back and retries from step 1. If COMMIT fails, the thread also retries. Only successful commits are counted.
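The rollback-and-retry discipline can be sketched as a small generic helper. This is an illustrative pattern, not the actual reproducer source; a closure stands in for one attempt at the whole transaction, and `Err(Rejected)` plays the role of ER_CHECKREAD or a failed COMMIT.

```rust
// Illustrative retry loop: only the final success counts as a commit;
// every rejection is tallied separately, mirroring the test's counters.

enum AttemptError {
    Rejected, // ER_CHECKREAD or COMMIT failure: roll back and retry
}

/// Runs `attempt` until it succeeds, returning the result plus the
/// number of rejections observed along the way.
fn commit_with_retry<T>(
    mut attempt: impl FnMut() -> Result<T, AttemptError>,
) -> (T, u64) {
    let mut rejected = 0;
    loop {
        match attempt() {
            Ok(v) => return (v, rejected),
            Err(AttemptError::Rejected) => rejected += 1, // retry from step 1
        }
    }
}

fn main() {
    // Simulate two rejections followed by a successful commit.
    let mut tries = 0;
    let (val, rejected) = commit_with_retry(|| {
        tries += 1;
        if tries < 3 { Err(AttemptError::Rejected) } else { Ok(tries) }
    });
    println!("committed after {} rejections, value {}", rejected, val);
    // prints: committed after 2 rejections, value 3
}
```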
Why this proves lost updates
Invariant: Each successful commit increments exactly one counter by exactly 1. Therefore:
SUM(counter.val) == number of successful commits
If any lost update occurs, two transactions both read the same value V from the same counter and both commit V+1. The counter advances by 1 instead of 2, but two commits were counted. The invariant breaks:
SUM(counter.val) < number of successful commits
Three independent measurements are cross-checked:
| Metric | Source | Meaning |
|---|---|---|
| Committed count | In-process atomic counter, incremented after each successful COMMIT | How many times the application observed COMMIT succeed |
| Audit log rows | SELECT COUNT(*) FROM log | How many transactions the database actually committed |
| Counter sum | SELECT SUM(val) FROM counter | Net effect of all increments |
If no lost updates occur, all three are equal.
If a lost update occurs:
- Committed count == audit log rows (both reflect what the DB committed).
- Counter sum < committed count (the lost update caused one increment to be a no-op on the counter value).
- The audit log contains a duplicate (counter_id, new_val) pair: two committed transactions wrote the same new_val to the same counter.
Why this cannot be a false positive:
- The UPDATE sets val = V+1 where V is a literal read from the same transaction's snapshot. It is not val = val + 1 (which would use the current row version). Two transactions can only write the same new_val if they both read the same old_val – which means the second one was not rejected despite the row being modified after its snapshot.
- The audit log INSERT is inside the same transaction as the UPDATE. If the transaction rolls back, neither the counter update nor the log entry persists. If the transaction commits, both persist atomically. Therefore audit log rows == committed count (confirmed in all runs).
- Each counter is independent. A lost update on counter 5 does not affect counter 7. The per-counter audit trail is a self-contained proof.
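The three-way cross-check above can be demonstrated on a toy in-memory audit log. This is a sketch of the detection logic only (names and types are illustrative, not the reproducer's), applied to tuples of (counter_id, old_val, new_val):

```rust
// Toy version of the cross-check: committed count vs counter sum vs
// duplicate (counter_id, new_val) pairs in the audit log.
use std::collections::HashMap;

fn check_invariant(log: &[(u32, i64, i64)], counter_sums: &HashMap<u32, i64>) -> usize {
    // Here every log row is a committed transaction by construction,
    // so committed count == audit log rows.
    let committed = log.len() as i64;
    let sum: i64 = counter_sums.values().sum();

    // Each duplicate (counter_id, new_val) pair is one lost update.
    let mut seen: HashMap<(u32, i64), u32> = HashMap::new();
    let mut duplicates = 0;
    for &(ctr, _old, new) in log {
        *seen.entry((ctr, new)).or_insert(0) += 1;
        if seen[&(ctr, new)] == 2 {
            duplicates += 1;
        }
    }

    // The invariant: the commit/sum deficit equals the duplicate count
    // (assuming every deficit is caused by a duplicate pair).
    assert_eq!(committed - sum, duplicates as i64);
    duplicates
}

fn main() {
    // Counter 1 starts at 0. Three commits, one of them a lost update:
    // two transactions both read 0 and both wrote 1.
    let log = vec![(1, 0, 1), (1, 0, 1), (1, 1, 2)];
    let mut sums: HashMap<u32, i64> = HashMap::new();
    sums.insert(1, 2); // final value: advanced by 2, not 3

    let lost = check_invariant(&log, &sums);
    println!("lost updates detected: {}", lost); // prints: lost updates detected: 1
}
```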
Results
Run summary
| Run | Threads | Counters | Delay | Duration | Commits | Rejected | Reject % | Lost updates |
|---|---|---|---|---|---|---|---|---|
| 5 | 32 | 16 | 100us | 20 min | 5,881,030 | 9,871,950 | 62.7% | 40 |
The run 5 parameters (16 counters plus the 100 µs in-transaction delay) dramatically increased the reproduction rate over earlier runs by raising throughput (4,900 commits/sec vs. ~500) and widening the race window.
Run 5 detailed results
Total committed:  5,881,030
Total rejected:   9,871,950
Rejection rate:   62.7%
Counter sum:      5,880,990 (expected: 5,881,030)
Throughput:       4,900 commits/sec
Audit log rows:   5,881,030 (matches committed count exactly)
Duplicate values: 40 (each = one lost update)
Lost updates:     40 (5,881,030 - 5,880,990)

Per-counter values:
[1]=368597   [2]=368855   [3]=367124   [4]=367492
[5]=366690   [6]=366643   [7]=366796   [8]=367650
[9]=367974   [10]=367825  [11]=368279  [12]=367794
[13]=367717  [14]=367144  [15]=366743  [16]=367667
|
40 lost updates were spread across 14 of 16 counters:
| Counter | Lost updates |
|---|---|
| 2 | 2 |
| 3 | 1 |
| 4 | 2 |
| 5 | 4 |
| 6 | 2 |
| 7 | 5 |
| 8 | 1 |
| 9 | 6 |
| 10 | 2 |
| 11 | 3 |
| 12 | 3 |
| 14 | 4 |
| 15 | 2 |
| 16 | 3 |
Audit log evidence
Every lost update follows the same pattern. Two transactions take a snapshot, both read the same old_val, both UPDATE to the same new_val, and both COMMIT successfully. The second should have been rejected with ER_CHECKREAD.
Representative examples (3 of 40):
Counter 2 – seq 382588 and 382594 (gap: 6 log entries apart):

seq     counter_id  old_val  new_val
382578  2           24057    24058    -- preceding normal commit
382588  2           24058    24059    -- T1: read 24058, wrote 24059, COMMIT OK
382594  2           24058    24059    -- T2: read 24058, wrote 24059, COMMIT OK << BUG
382606  2           24059    24060    -- following normal commit

Counter 7 – seq 5060312 and 5060315 (gap: 3 log entries apart):

seq      counter_id  old_val  new_val
5060301  7           315440   315441   -- preceding normal commit
5060312  7           315441   315442   -- T1: read 315441, wrote 315442, COMMIT OK
5060315  7           315441   315442   -- T2: read 315441, wrote 315442, COMMIT OK << BUG
5060334  7           315442   315443   -- following normal commit

Counter 10 – seq 3728491 and 3728493 (gap: 2 log entries apart):

seq      counter_id  old_val  new_val
3728476  10          232965   232966   -- preceding normal commit
3728491  10          232966   232967   -- T1: read 232966, wrote 232967, COMMIT OK
3728493  10          232966   232967   -- T2: read 232966, wrote 232967, COMMIT OK << BUG
3728499  10          232967   232968   -- following normal commit
The seq column is a global auto-increment across all counters. The small gaps (2 to 13 log entries) between each duplicate pair confirm the two transactions were near-simultaneous – exactly the tight timing window expected from a race condition.
Full evidence for all 40 lost updates across all 14 affected counters:
seq      ctr  old_val -> new_val    note
-------  ---  ------------------    -------------------------
 382588    2   24058 ->  24059      T1 OK
 382594    2   24058 ->  24059      T2 OK <- LOST UPDATE (gap 6)

5825975    2  365428 -> 365429      T1 OK
5825980    2  365428 -> 365429      T2 OK <- LOST UPDATE (gap 5)

4988851    3  311285 -> 311286      T1 OK
4988856    3  311285 -> 311286      T2 OK <- LOST UPDATE (gap 5)

2570781    4  160966 -> 160967      T1 OK
2570786    4  160966 -> 160967      T2 OK <- LOST UPDATE (gap 5)

5271518    4  329585 -> 329586      T1 OK
5271522    4  329585 -> 329586      T2 OK <- LOST UPDATE (gap 4)

 220717    5   13710 ->  13711      T1 OK
 220724    5   13710 ->  13711      T2 OK <- LOST UPDATE (gap 7)

1176470    5   73282 ->  73283      T1 OK
1176483    5   73282 ->  73283      T2 OK <- LOST UPDATE (gap 13)

4354827    5  271715 -> 271716      T1 OK
4354835    5  271715 -> 271716      T2 OK <- LOST UPDATE (gap 8)

4804190    5  299600 -> 299601      T1 OK
4804197    5  299600 -> 299601      T2 OK <- LOST UPDATE (gap 7)

3192626    6  199730 -> 199731      T1 OK
3192630    6  199730 -> 199731      T2 OK <- LOST UPDATE (gap 4)

4611160    6  287876 -> 287877      T1 OK
4611168    6  287876 -> 287877      T2 OK <- LOST UPDATE (gap 8)

 473637    7   29417 ->  29418      T1 OK
 473643    7   29417 ->  29418      T2 OK <- LOST UPDATE (gap 6)

1393565    7   87021 ->  87022      T1 OK
1393572    7   87021 ->  87022      T2 OK <- LOST UPDATE (gap 7)

5060312    7  315441 -> 315442      T1 OK
5060315    7  315441 -> 315442      T2 OK <- LOST UPDATE (gap 3)

5192584    7  323660 -> 323661      T1 OK
5192590    7  323660 -> 323661      T2 OK <- LOST UPDATE (gap 6)

5621036    7  350360 -> 350361      T1 OK
5621046    7  350360 -> 350361      T2 OK <- LOST UPDATE (gap 10)

1919034    8  119698 -> 119699      T1 OK
1919041    8  119698 -> 119699      T2 OK <- LOST UPDATE (gap 7)

 327932    9   20478 ->  20479      T1 OK
 327935    9   20478 ->  20479      T2 OK <- LOST UPDATE (gap 3)

 346221    9   21590 ->  21591      T1 OK
 346227    9   21590 ->  21591      T2 OK <- LOST UPDATE (gap 6)

1988638    9  123885 -> 123886      T1 OK
1988645    9  123885 -> 123886      T2 OK <- LOST UPDATE (gap 7)

2545615    9  158661 -> 158662      T1 OK
2545619    9  158661 -> 158662      T2 OK <- LOST UPDATE (gap 4)

3209522    9  200262 -> 200263      T1 OK
3209529    9  200262 -> 200263      T2 OK <- LOST UPDATE (gap 7)

4091265    9  255844 -> 255845      T1 OK
4091270    9  255844 -> 255845      T2 OK <- LOST UPDATE (gap 5)

3728491   10  232966 -> 232967      T1 OK
3728493   10  232966 -> 232967      T2 OK <- LOST UPDATE (gap 2)

4864245   10  304269 -> 304270      T1 OK
4864253   10  304269 -> 304270      T2 OK <- LOST UPDATE (gap 8)

1334823   11   83439 ->  83440      T1 OK
1334830   11   83439 ->  83440      T2 OK <- LOST UPDATE (gap 7)

1512048   11   94630 ->  94631      T1 OK
1512052   11   94630 ->  94631      T2 OK <- LOST UPDATE (gap 4)

5121677   11  320628 -> 320629      T1 OK
5121681   11  320628 -> 320629      T2 OK <- LOST UPDATE (gap 4)

1033955   12   64808 ->  64809      T1 OK
1033964   12   64808 ->  64809      T2 OK <- LOST UPDATE (gap 9)

1497871   12   93518 ->  93519      T1 OK
1497874   12   93518 ->  93519      T2 OK <- LOST UPDATE (gap 3)

5340853   12  333926 -> 333927      T1 OK
5340857   12  333926 -> 333927      T2 OK <- LOST UPDATE (gap 4)

 276258   14   17362 ->  17363      T1 OK
 276269   14   17362 ->  17363      T2 OK <- LOST UPDATE (gap 11)

2232313   14  139398 -> 139399      T1 OK
2232317   14  139398 -> 139399      T2 OK <- LOST UPDATE (gap 4)

2391597   14  149314 -> 149315      T1 OK
2391603   14  149314 -> 149315      T2 OK <- LOST UPDATE (gap 6)

5434067   14  339338 -> 339339      T1 OK
5434070   14  339338 -> 339339      T2 OK <- LOST UPDATE (gap 3)

2842710   15  177055 -> 177056      T1 OK
2842715   15  177055 -> 177056      T2 OK <- LOST UPDATE (gap 5)

5080193   15  316601 -> 316602      T1 OK
5080197   15  316601 -> 316602      T2 OK <- LOST UPDATE (gap 4)

1189409   16   74503 ->  74504      T1 OK
1189413   16   74503 ->  74504      T2 OK <- LOST UPDATE (gap 4)

4030103   16  252111 -> 252112      T1 OK
4030109   16  252111 -> 252112      T2 OK <- LOST UPDATE (gap 6)

5645450   16  352818 -> 352819      T1 OK
5645458   16  352818 -> 352819      T2 OK <- LOST UPDATE (gap 8)
Reproducer
Prerequisites
- Docker
- Rust toolchain: curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
- System packages: apt install build-essential pkg-config libssl-dev zlib1g-dev
Project structure
The reproducer is a Rust test project with two files:
Cargo.toml:
[package]
name = "isolation-tests"
version = "0.1.0"
edition = "2021"

[dependencies]
mysql = "25"
rand = "0.10.0"

[[test]]
name = "isolation"
path = "tests/isolation.rs"
tests/isolation.rs – contains the stress test (stress_lost_update_prevention) and supporting anomaly tests (G-single, G2-item). The full source is included at the end of this report.
Steps
# 1. Create the project directory
mkdir isolation-tests && cd isolation-tests

# 2. Create Cargo.toml and tests/isolation.rs as shown above
#    (or copy from the attached source files)

# 3. Start MariaDB 12.2.2
docker run -d --name mariadb-test \
  -e MARIADB_ROOT_PASSWORD=testpass \
  -e MARIADB_DATABASE=testdb \
  -p 3307:3306 mariadb:12.2.2

# 4. Wait for MariaDB to be ready
until docker exec mariadb-test mariadb -uroot -ptestpass -e "SELECT 1" 2>/dev/null; do
  sleep 1
done

# 5. Enable innodb_snapshot_isolation and increase max_connections
docker exec mariadb-test mariadb -uroot -ptestpass -e "
  SET GLOBAL innodb_snapshot_isolation = ON;
  SET GLOBAL max_connections = 200;
"

# 6. Run the stress test (20 minutes by default)
DATABASE_URL="mysql://root:testpass@127.0.0.1:3307/testdb" \
  cargo test stress_lost_update -- --nocapture

# 7. Optionally adjust duration (e.g. 10 minutes)
DATABASE_URL="mysql://root:testpass@127.0.0.1:3307/testdb" \
  STRESS_DURATION_SECS=600 \
  cargo test stress_lost_update -- --nocapture

# 8. Run all tests (G-single, G2-item, lost update basic, stress)
DATABASE_URL="mysql://root:testpass@127.0.0.1:3307/testdb" \
  cargo test -- --nocapture

# 9. After a failure, inspect the audit log for evidence
docker exec mariadb-test mariadb -uroot -ptestpass testdb -e "
  SELECT counter_id, new_val, COUNT(*) AS cnt
  FROM test_stress_log
  GROUP BY counter_id, new_val
  HAVING cnt > 1;
"

# 10. Show surrounding context for each duplicate
docker exec mariadb-test mariadb -uroot -ptestpass testdb -e "
  SELECT l.seq, l.counter_id, l.old_val, l.new_val
  FROM test_stress_log l
  JOIN (
    SELECT counter_id, new_val
    FROM test_stress_log
    GROUP BY counter_id, new_val
    HAVING COUNT(*) > 1
  ) d ON l.counter_id = d.counter_id
     AND l.new_val BETWEEN d.new_val - 1 AND d.new_val + 1
  ORDER BY l.counter_id, d.new_val, l.seq;
"

# 11. Cleanup
docker rm -f mariadb-test
The test asserts SUM(counter.val) == committed_count. If any lost update occurs, the assertion fails and the output includes the duplicate audit log entries as evidence.
What the test outputs
On success (no lost updates):

PASSED: No lost updates under stress. innodb_snapshot_isolation held.
test stress_lost_update_prevention ... ok

On failure (lost updates detected):

Stress test results (32 threads, 16 counters, 1200.1s):
  Total committed:  5881030
  Counter sum:      5880990
  Duplicate values: 40
  counter 2: new_val=24059 appeared 2 times
  ...
LOST UPDATE DETECTED: counter sum (5880990) != committed count (5881030).
40 updates were lost! Duplicates in log: 40
test stress_lost_update_prevention ... FAILED
Attachments
Issue Links
- relates to MDEV-38977 "Galera - G-single with process pauses" (In Progress)