MariaDB Server / MDEV-29044

Lost update anomaly found in Galera-Maria cluster

Details

    Description

      If I understand it correctly, transactions issued on different nodes in a Galera-Maria cluster should be free of lost update anomalies, as claimed in https://galeracluster.com/library/training/tutorials/supporting-transaction-isolation-levels.html. However, we have observed such an anomaly in our tests.

      Experimental Setup

      We have set up our tests with the following configuration:

      We run a cluster of two nodes with docker-compose, using MariaDB 10.7.3. We use a simple database schema with a single two-column table, with each row representing a key-value pair:

      CREATE TABLE IF NOT EXISTS dbcop.variables (var BIGINT(64) UNSIGNED NOT NULL PRIMARY KEY, val BIGINT(64) UNSIGNED NOT NULL)
      

      We have one client session on each node; both sessions execute concurrently. First, we initialize the table (using a separate session) with all values set to 0. After the initialization, both clients run a stream of transactions produced by our workload generator. The values written by the clients are guaranteed to be unique.
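      For illustration, initializing a single variable could look as follows (a sketch only; the actual initialization is performed by our test driver for every variable in the workload):

          -- executed once, in a separate session, before the
          -- concurrent client sessions start
          INSERT INTO dbcop.variables (var, val) VALUES (0, 0);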

      Query results

      The results observed by the client are in client-result.log:

      Transaction (0, 0)
      WRITE(0, 0)
       
      ...
      

      where each transaction is identified by (session id, txn id). In a session, transactions with larger txn ids are executed after those with smaller ids finish. Queries are shown as READ(var, val) or WRITE(var, val).

      When running the experiment, a failed transaction is retried until all its operations succeed and the transaction commits successfully. Only the successful attempt is logged on the client side; all earlier attempts appear as ROLLBACK in the query logs.
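      The retry semantics amount to re-running the whole transaction until it commits, roughly as in the stored-procedure sketch below (the actual retry loop lives in our test driver; the procedure name, variable, and value here are placeholders):

          -- Sketch of the per-transaction retry loop: on a deadlock (1213)
          -- or lock wait timeout (1205), roll back and start over.
          DELIMITER //
          CREATE PROCEDURE retry_write_sketch()
          BEGIN
            DECLARE done INT DEFAULT 0;
            WHILE done = 0 DO
              BEGIN
                -- the EXIT handler leaves only this inner block,
                -- so the WHILE loop re-runs the transaction
                DECLARE EXIT HANDLER FOR 1213, 1205 ROLLBACK;
                START TRANSACTION;
                UPDATE dbcop.variables SET val = 42 WHERE var = 0;
                COMMIT;
                SET done = 1;
              END;
            END WHILE;
          END//
          DELIMITER ;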

      The lost-update violation

      Both txn (1, 5) and txn (2, 13) read var=0, val=4 written by txn (1, 4), and both transactions successfully commit their writes to var=0: a lost update. See lost-update.pdf for the complete scenario.
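      Sketched as a two-session schedule (a reconstruction from client-result.log; v1 and v2 stand for the unique values actually written):

          # txn (1, 5) runs on node 1, txn (2, 13) on node 2
          txn (1, 5):  begin;
          txn (2, 13): begin;
          txn (1, 5):  READ(0, 4)      # value written by txn (1, 4)
          txn (2, 13): READ(0, 4)      # value written by txn (1, 4)
          txn (1, 5):  WRITE(0, v1)
          txn (1, 5):  commit;         # success
          txn (2, 13): WRITE(0, v2)    # presumably after txn (1, 5) was applied on node 2
          txn (2, 13): commit;         # success: the write of txn (1, 5) is lost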

      Database Logs

      The server-side logs of the nodes are attached in server-logs.zip. The query logs and error logs are stored in mariadb_general.log and mariadb_error.log. The binary logs are in mariadb_bin*.

      Reproducing the violation

      The tools to reproduce the violation are attached in tools.zip.

      1. Start the Galera cluster in Docker

      The docker-compose file is generator/docker/docker-compose.yml. Database logs are stored in /tmp/ in the containers.

      2. Record and verify a history

      First build the tools:

          # tested on ubuntu 22.04
          apt install g++ openjdk-11-jdk cmake libgmp-dev zlib1g-dev pkg-config libssl-dev cargo
          cd generator
          cargo b --release
          cd ../PolySI
          ./gradlew jar
          cd ..
          

      Then generate txns and record history:

          # Run txns and record history
          rm -rf /tmp/a/ /tmp/b
          ./generator/target/release/dbcop generate -d /tmp/a/ -e 2 -n 2 --nhist 1 -t 10 -v 1
          ./generator/target/release/dbcop run --db galera -d /tmp/a/ -o /tmp/b/ 127.0.0.1:3306 127.0.0.1:3307
          
          # requires Java 11; will print [[[ REJECT ]]] if a violation is found
          java -jar ./PolySI/build/libs/PolySI-1.0.0-SNAPSHOT.jar audit -t dbcop --no-pruning /tmp/b/hist-00000/history.bincode
          
          # print history
          java -jar ./PolySI/build/libs/PolySI-1.0.0-SNAPSHOT.jar dump -t dbcop /tmp/b/hist-00000/history.bincode
          

      Note that, since we are running black-box testing with randomized workloads, we cannot reproduce exactly the same violating histories. However, violations manifest very frequently: for example, we observed 8 violating histories out of 10 collected.

      Attachments

        1. client-result.log
          0.9 kB
        2. client-result-debug.log
          0.9 kB
        3. client-result-sync.log
          0.9 kB
        4. lost-update.pdf
          52 kB
        5. my.cnf
          1 kB
        6. my.conf-sync
          1 kB
        7. server-logs.zip
          55 kB
        8. server-logs-debug.zip
          210 kB
        9. server-log-sync.zip
          58 kB
        10. tools.zip
          3.20 MB

        Activity

          jplindst Jan Lindström (Inactive) added a comment:

          nobiplusplus czg A Galera cluster can provide only those consistency levels that are supported by the InnoDB storage engine. InnoDB REPEATABLE READ has some adaptations for MVCC; see https://dev.mysql.com/doc/refman/5.7/en/innodb-consistent-read.html. Remember that an UPDATE will also do a read; the earlier SELECT reads from the consistent view created when it ran, but since that SELECT does not take row locks, the values may have changed by the time the UPDATE reads them again under row locking. Currently, the only way to make sure that the SELECT and the UPDATE see exactly the same database state is to use SELECT ... FOR UPDATE, which sets row locks.
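          For illustration, the suggested workaround on the schema from this report would look like the following (a sketch only; the written value 42 is a placeholder):

          -- With FOR UPDATE, the read takes a row lock up front. If a
          -- conflicting replicated write from another node is applied in
          -- the meantime, this transaction is aborted with a deadlock
          -- error instead of committing over that write.
          START TRANSACTION;
          SELECT val FROM dbcop.variables WHERE var = 0 FOR UPDATE;
          UPDATE dbcop.variables SET val = 42 WHERE var = 0;
          COMMIT;  -- retry the whole transaction on ER_LOCK_DEADLOCK (1213)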

          sciascid Daniele Sciascia added a comment:

          The claim that Galera prevents "lost update" for transactions issued on different nodes appears to be wrong: https://galeracluster.com/library/training/tutorials/supporting-transaction-isolation-levels.html

          In the article above, the following example is given:

          # initially table t contains (1,0)
          # trx 1 and trx 2 execute on node 1 and node 2, respectively 
          trx 1: begin;
          trx 2: begin;
          trx 1: select * from t; # returns row (1,0)
          trx 2: select * from t; # returns row (1,0)
          trx 1: update t set j=1 where i=1;
          trx 2: update t set j=1 where i=1;
          trx 1: commit; # success
          trx 2: commit; # deadlock error
          

          The execution above happens to prevent a "lost update". However, it is sufficient to swap the order of trx 1's commit and trx 2's update to obtain an execution in which both transactions commit successfully:

          # initially table t contains (1,0)
          # trx 1 and trx 2 execute on node 1 and node 2, respectively 
          trx 1: begin;
          trx 2: begin;
          trx 1: select * from t; # returns row (1,0)
          trx 2: select * from t; # returns row (1,0)
          trx 1: update t set j=1 where i=1;
          trx 1: commit; # success
          trx 2: update t set j=1 where i=1; # make sure this happens after the commit of trx 1 is applied to node 2
          trx 2: commit; # success
          

          Notice that, for both transactions to commit, the "update" of trx 2 must happen after the commit of trx 1 is applied to node 2.

          czg Zhenge Chen added a comment (edited):

          I have confirmed Daniele's finding using our tool. So, is the example in the documentation simply incorrect, or does Galera actually not prevent lost updates BY DESIGN?


          sciascid Daniele Sciascia added a comment:

          czg The documentation is wrong. Galera does not prevent lost updates.
          Thanks for trying your tool on Galera and for bringing this up. On our part, we will remove that claim from the documentation.

          nobiplusplus Si Liu added a comment:

          Thanks, Daniele, for your confirmation and help along the way!


          People

            sciascid Daniele Sciascia
            nobiplusplus Si Liu
            Votes: 3
            Watchers: 6