Details
Bug
Status: Open
Major
Resolution: Unresolved
12.2.2
None
Debian Trixie, Galera 26.4.25-deb13
Description
With innodb_flush_log_at_trx_commit=1, write loss due to coordinated process crashes in Galera Cluster (see MDEV-38974) is significantly reduced. However, it is not eliminated! With process crashes and network partitions, Galera Cluster occasionally loses the effects of committed transactions. For example, at roughly 141 seconds into this Jepsen test run (https://s3.amazonaws.com/jepsen.io/analyses/mariadb-galera-12.1.2/20260224T175533-lost-writes-2.zip), the cluster lost approximately nineteen seconds of writes across four separate rows: 0, 285, 410, and 446. Some, like key 0, lost only a short postfix of elements. Key 410, on the other hand, lost all twenty-five elements and began afresh:
Time (s) Elements
-------- -----------------------------
141.36 17, 19, 26, ..., 91, 92, 97
152.79 175
153.21 175, 176, 177, 179
154.46 175, 176, 177, 179, 180
Note that the transactions which wrote 17, 19, and so on were successfully committed; their effects definitely should not have been lost.
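To make the anomaly in the table concrete: in an append-only list workload, any two reads of the same key should be prefix-compatible, meaning one list is a prefix of the other. The following is a minimal Python sketch of that check, not the checker Jepsen actually uses. The function names and the `KEY_410_READS` data are hypothetical; the first list holds only the six elements visible in the table, since the elided middle of that read is not known here.

```python
def is_prefix(shorter, longer):
    """Return True if `shorter` is a prefix of `longer`."""
    return longer[:len(shorter)] == shorter


def find_divergent_reads(reads):
    """Find reads of one key that contradict the history seen so far.

    `reads` is a list of (time_s, elements) pairs ordered by time.
    Returns (earlier_time, later_time) pairs where the later read is
    neither an extension nor a stale prefix of the earlier one.
    """
    divergences = []
    base_time, base = None, []
    for time_s, elements in reads:
        if is_prefix(base, elements):
            # The list grew (or stayed the same): adopt it as the baseline.
            base_time, base = time_s, elements
        elif is_prefix(elements, base):
            # A stale but compatible read: nothing was lost.
            continue
        else:
            # Previously observed elements vanished: record the divergence
            # and continue checking against the new history.
            divergences.append((base_time, time_s))
            base_time, base = time_s, elements
    return divergences


# Illustrative stand-in for the reads of key 410 shown in the table.
# ASSUMPTION: the elements elided by "..." are simply omitted.
KEY_410_READS = [
    (141.36, [17, 19, 26, 91, 92, 97]),
    (152.79, [175]),
    (153.21, [175, 176, 177, 179]),
    (154.46, [175, 176, 177, 179, 180]),
]

print(find_divergent_reads(KEY_410_READS))  # → [(141.36, 152.79)]
```

The sketch flags only the jump from the read at 141.36 s to the read at 152.79 s. The later reads extend the new list consistently, which matches the description of the key having begun afresh.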
You can reproduce this with the Jepsen MariaDB test suite, at https://github.com/jepsen-io/mysql. Try commit df8c29675809444b730a6ea5da0d80e243e7fc70, and try something like:
lein run test-all --db maria --nodes n1,n2,n3 -w append --concurrency 6n --nemesis kill,partition --time-limit 300 --test-count 500 --innodb-flush-log-at-trx-commit 1 --expected-consistency-model snapshot-isolation --isolation repeatable-read
This takes a few hours--I haven't had as much time as I'd like to put into getting a concise reproduction case. Nevertheless, this generally spits out a handful of cases of data loss each day.