Details
Bug
Status: Open
Major
Resolution: Unresolved
12.1.2, 12.2.2
None
Debian Trixie, Galera 26.4.25-deb13
Description
MariaDB's documentation recommends that users set up Galera clusters with innodb_flush_log_at_trx_commit=0, characterizing it as "a safer, recommended option with Galera Cluster". This setting is not safe: it frequently causes the loss of committed transactions when process crashes occur in rapid succession.
A test suite to reproduce this is at https://github.com/jepsen-io/mysql; use commit 3500f8c80bd0f419d7f21a7b89eaf65f8651a7af, and try something like:
lein run test-all --nodes n1,n2,n3 -w append --concurrency 6n --nemesis kill --time-limit 300 --test-count 5 --isolation repeatable-read --expected-consistency-model snapshot-isolation
For an example failing case, including config files and the error/general logs on each node, see:
https://s3.amazonaws.com/jepsen.io/analyses/mariadb-galera-12.1.2/20260105T175818-lost-writes.zip
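To illustrate what a lost write looks like in an append workload such as the one above, here is a minimal Python sketch. It is not the checker used by the Jepsen test suite, which is far more thorough; the function name lost_appends and the sample data are hypothetical and are not taken from the linked logs.

```python
def lost_appends(acknowledged, final_read):
    """Return acknowledged appends that are missing from the final read.

    acknowledged: values whose append transactions the client saw commit.
    final_read:   the list contents observed after the cluster recovered.
    """
    present = set(final_read)
    # Preserve acknowledgement order so the gap is easy to read.
    return [v for v in acknowledged if v not in present]


# Hypothetical data: the client saw commits for appends 1 through 5,
# but a read after the processes were killed and restarted returns only 1, 2, 5.
acked = [1, 2, 3, 4, 5]
after_recovery = [1, 2, 5]
print(lost_appends(acked, after_recovery))  # [3, 4]
```

Any non-empty result means a transaction that was reported as committed is no longer visible, which is the failure this report describes.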
I suggest revising MariaDB's documentation to make it clear that this option allows the loss of committed transactions.