  1. MariaDB Server
  2. MDEV-38974

Galera - Write loss with coordinated process crashes due to innodb_flush_log_at_trx_commit=0

Details

    Description

      MariaDB's documentation recommends that users configure Galera clusters with innodb_flush_log_at_trx_commit=0, characterizing it as "a safer, recommended option with Galera Cluster". This setting is not safe: with it, committed transactions are frequently lost when coordinated process crashes occur in rapid succession.

      https://mariadb.com/docs/galera-cluster/galera-management/configuration/configuring-mariadb-galera-cluster
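      For context, the setting in question lives in the server configuration on each node. The fragment below is a minimal sketch, not taken from the attached config files: it assumes the option sits under a [mysqld] section and shows the value 1, which makes InnoDB write and flush its redo log at every commit instead of roughly once per second. Whether this change alone removes the write loss seen in the test is an assumption and has not been verified here.

      ```ini
      [mysqld]
      # Documented recommendation under discussion (log flushed about once per second):
      # innodb_flush_log_at_trx_commit=0

      # Assumed alternative for illustration: flush the redo log at each commit.
      innodb_flush_log_at_trx_commit=1
      ```

      The value in effect on a running node can be checked with SHOW GLOBAL VARIABLES LIKE 'innodb_flush_log_at_trx_commit';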

      A test suite to reproduce this is at https://github.com/jepsen-io/mysql; use commit 3500f8c80bd0f419d7f21a7b89eaf65f8651a7af, and try something like:

      lein run test-all --nodes n1,n2,n3 -w append --concurrency 6n --nemesis kill --time-limit 300 --test-count 5 --isolation repeatable-read --expected-consistency-model snapshot-isolation

      For an example failing case, including config files and the error/general logs on each node, see:

      https://s3.amazonaws.com/jepsen.io/analyses/mariadb-galera-12.1.2/20260105T175818-lost-writes.zip

      I suggest revising MariaDB's documentation to make it clear that this option allows the loss of committed transactions.

          People

            seppo Seppo Jaakola
            aphyr Kyle Kingsbury