[MDEV-39264] It is unclear that ER_CHECKREAD results in a transaction rollback - Jira

XML

Word

Printable

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Critical
Resolution: Fixed
Affects Version/s: 12.2.2
Fix Version/s: 12.3.2, 10.11.17, 11.4.11, 11.8.7
Component/s: Server
Labels:
None
Environment:
mariadb:12.2.2 Docker image
innodb_snapshot_isolation = ON
InnoDB engine
Transaction isolation: REPEATABLE READ
Transactions opened with START TRANSACTION WITH CONSISTENT SNAPSHOT
Host: Linux 6.8.0-60-generic, x86_64

Bug Category:
Can result in unexpected behaviour
Release Note Summary:
The error message ER_CHECKREAD (1020) was revised to suggest that InnoDB has rolled back the transaction: "try restarting transaction"

Description

With innodb_snapshot_isolation = ON at REPEATABLE READ, a transaction that UPDATEs an unrelated column of a row can subsequently observe changes made by a concurrent transaction to other columns of the same row. This violates Repeatable Read: a SELECT returns a different value for a column that the transaction did not modify, despite no intervening rollback.

The root cause is the same race condition reported in ~~MDEV-39263~~: innodb_snapshot_isolation intermittently fails to reject an UPDATE on a row that was modified after the transaction's snapshot. When the UPDATE slips through, InnoDB switches from snapshot-read to current-read for that row, exposing the concurrent writer's changes.

Minimal example

CREATE TABLE test_nrr (

  id     INT PRIMARY KEY,

  name   INT NOT NULL,

  gender INT NOT NULL

) ENGINE=InnoDB;

INSERT INTO test_nrr VALUES (0, 0, 0);

A concurrent writer increments name:

-- T2 (writer, autocommit):

UPDATE test_nrr SET name = name + 1 WHERE id = 0;

-- row is now (id=0, name=1, gender=0)

Meanwhile, T1 is in a REPEATABLE READ transaction:

-- T1 (reader):

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

START TRANSACTION WITH CONSISTENT SNAPSHOT;

SELECT name FROM test_nrr WHERE id = 0;

--> 0                                           -- snapshot read

-- T2 commits name=1 here

UPDATE test_nrr SET gender = 99 WHERE id = 0;

-- should fail with ER_CHECKREAD (1020) because the row was modified

-- BUG: intermittently succeeds

SELECT name FROM test_nrr WHERE id = 0;

--> 1                                           -- name changed!

COMMIT;

T1 read name twice within a REPEATABLE READ transaction. Despite only modifying gender, the value of name spontaneously changed from 0 to 1.

Expected behavior

With innodb_snapshot_isolation = ON, T1's UPDATE on gender must fail with ER_CHECKREAD (error 1020, "Record has changed since last read in table") because a concurrent transaction modified the row after T1's snapshot was taken. This prevents T1 from entering the current-read state that would expose the concurrent writer's changes.

Without innodb_snapshot_isolation, InnoDB's REPEATABLE READ allows this behavior by design: locking operations (UPDATE, DELETE, SELECT FOR UPDATE) read the current committed version of rows they touch. This is documented. But innodb_snapshot_isolation exists precisely to prevent this class of anomalies, and it fails to do so intermittently.

Why this happens

InnoDB's MVCC has two read modes per transaction:

Snapshot read (consistent read): SELECT returns data from the transaction's snapshot.
Current read (locking read): UPDATE, DELETE, SELECT ... FOR UPDATE read the latest committed version.

After a transaction performs a locking operation on a row, subsequent non-locking {{SELECT}}s of that same row also return the current version, not the snapshot version. This is because the transaction's own modifications are merged with the current row version.

When innodb_snapshot_isolation is enabled, UPDATEs on rows modified since the snapshot are rejected with ER_CHECKREAD, preventing the switch to current-read mode. But due to a race condition in the conflict detection, the UPDATE occasionally slips through, and the transaction enters current-read mode for that row, exposing concurrent changes.

Relationship to Lost Update bug

This is the same underlying race condition as the Lost Update (P4) bug. In both cases, innodb_snapshot_isolation fails to detect that a row was modified after the transaction's snapshot:

Lost Update: Two transactions read the same row, both UPDATE and COMMIT — the second UPDATE should be rejected but isn't. Result: one write is silently lost.
Non-Repeatable Read: A transaction reads a row, a concurrent writer modifies it, the transaction UPDATEs an unrelated column — the UPDATE should be rejected but isn't. Result: subsequent reads see the concurrent writer's changes, violating Repeatable Read.

Test design

Schema

CREATE TABLE test_nrr (

  id     INT PRIMARY KEY,

  name   INT NOT NULL,

  gender INT NOT NULL

) ENGINE=InnoDB;

-- 16 rows, all starting at (name=0, gender=0)

INSERT INTO test_nrr (id, name, gender) VALUES (0, 0, 0), (1, 0, 0), ..., (15, 0, 0);

Writer threads (16 threads)

Each writer continuously increments name on random rows using autocommit:

UPDATE test_nrr SET name = name + 1 WHERE id = :random_row

Reader threads (16 threads)

Each reader repeats:

1. START TRANSACTION WITH CONSISTENT SNAPSHOT

2. SELECT name FROM test_nrr WHERE id = :random_row       → name_before

3. UPDATE test_nrr SET gender = :random WHERE id = :row    → touches unrelated column

   - If ER_CHECKREAD: ROLLBACK, retry (expected, innodb_snapshot_isolation working)

   - If OK: continue

4. SELECT name FROM test_nrr WHERE id = :row               → name_after

5. COMMIT

6. Check: name_before == name_after (Repeatable Read invariant)

Why this proves non-repeatable read

The reader only modifies gender. It never writes name. Under Repeatable Read, two reads of name within the same transaction must return the same value.

If name_before != name_after, the transaction observed a change it didn't make — a non-repeatable read. This can only happen if:

A concurrent writer modified the row (changing name) after the reader's snapshot, AND
The reader's UPDATE on gender was NOT rejected by innodb_snapshot_isolation, AND
After the UPDATE, InnoDB switched to current-read for that row, exposing the writer's change.

Step 2 is the bug ~~MDEV-39263~~: innodb_snapshot_isolation should have rejected the UPDATE.

Results

60-second stress test:

Total writes:     824,640   (13,742/sec)

Total reads:      111,644   (1,861/sec — UPDATE succeeded, both SELECTs completed)

Total rejected:   282,086   (71.6% — ER_CHECKREAD on UPDATE, working correctly)

Anomalies:        11        (name changed after UPDATE on gender)

All 11 anomalies:

row=8:  name_before=138,    name_after=140    (jumped by 2)

row=3:  name_before=4021,   name_after=4023   (jumped by 2)

row=12: name_before=4943,   name_after=4944   (jumped by 1)

row=12: name_before=7392,   name_after=7395   (jumped by 3)

row=0:  name_before=9153,   name_after=9156   (jumped by 3)

row=15: name_before=9502,   name_after=9505   (jumped by 3)

row=15: name_before=13867,  name_after=13869  (jumped by 2)

row=0:  name_before=26171,  name_after=26172  (jumped by 1)

row=13: name_before=34905,  name_after=34906  (jumped by 1)

row=1:  name_before=39889,  name_after=39890  (jumped by 1)

row=6:  name_before=51166,  name_after=51167  (jumped by 1)

In every case, the reader's transaction:

Read name (snapshot value)
Successfully updated gender (should have been rejected — same race as lost update bug)
Read name again — saw a different value because InnoDB switched to current-read

The "jumped by N" values show how many concurrent writer commits occurred between the reader's snapshot and its UPDATE. The UPDATE should have detected the first one and rejected with ER_CHECKREAD, but the race condition allowed it through.

Note that innodb_snapshot_isolation correctly rejected 282,086 out of 282,097 conflicting UPDATEs (99.996%). Only 11 slipped through. The race is rare but real.

Reproducer

Prerequisites

Docker
Go 1.22+: apt install golang-go
Go MySQL driver: go get github.com/go-sql-driver/mysql

Steps

# 1. Start MariaDB 12.2.2

docker run -d --name mariadb-test \

  -e MARIADB_ROOT_PASSWORD=testpass \

  -e MARIADB_DATABASE=testdb \

  -p 3307:3306 mariadb:12.2.2

# 2. Wait for MariaDB to be ready

until docker exec mariadb-test mariadb -uroot -ptestpass -e "SELECT 1" 2>/dev/null; do

  sleep 1

done

# 3. Enable innodb_snapshot_isolation and increase max_connections

docker exec mariadb-test mariadb -uroot -ptestpass -e "

  SET GLOBAL innodb_snapshot_isolation = ON;

  SET GLOBAL max_connections = 200;

# 4. Run the test (60 minutes by default, check every 5 min)

DATABASE_URL="root:testpass@tcp(127.0.0.1:3307)/testdb" \

  go test -v -run TestStressNonRepeatableAfterUpdate -count=1 -timeout 70m

# 5. Optionally reduce duration

DATABASE_URL="root:testpass@tcp(127.0.0.1:3307)/testdb" \

  STRESS_DURATION_SECS=300 \

  go test -v -run TestStressNonRepeatableAfterUpdate -count=1 -timeout 10m

# 6. Cleanup

docker rm -f mariadb-test

What the test outputs

On failure (non-repeatable read detected):

NON-REPEATABLE READ: row=8, name_before=138, name_after=140

  (changed by 2 after own UPDATE on gender)

...

NON-REPEATABLE READ AFTER UPDATE: 11 reads saw name change after updating

  an unrelated column. innodb_snapshot_isolation failed to reject the UPDATE

  despite the row being modified.

--- FAIL: TestStressNonRepeatableAfterUpdate (60.03s)

On success (no anomalies):

PASSED: innodb_snapshot_isolation correctly rejected UPDATEs on modified rows,

  preventing non-repeatable reads.

--- PASS: TestStressNonRepeatableAfterUpdate (60.02s)

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

MDEV-39264.test
2 kB
2026-04-15 18:05
MDEV-39264.result
1 kB
2026-04-15 18:05

Issue Links

relates to

MDEV-32652 Commit/rollback must emit an error when not inside active transaction

Closed

MDEV-35124 Set innodb_snapshot_isolation=ON by default

Closed

MDEV-39263 innodb_snapshot_isolation fails to prevent lost updates under contention

Closed

MDEV-38977 Galera - G-single with process pauses

In Testing

Activity

People

Assignee:: Marko Mäkelä

Reporter:: Vadim Tkachenko

Votes:: 0 Vote for this issue

Watchers:: 7 Start watching this issue

Dates

Created:: 2026-04-05 12:08

Updated:: 2026-04-21 07:55

Resolved:: 2026-04-21 07:50

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.