  1. MariaDB Server
  2. MDEV-40179

Upon joining, wsrep_order_and_check_continuity() can fail to commit some transactions


Details

    • Can result in hang or crash
    • Q3/2026 Replic. Development

    Description

      Discovered when testing the fix for MDEV-38147:

      On the joiner, the wsrep XID continuity check (handler.cc:wsrep_order_and_check_continuity()) recovers a contiguous run of prepared wsrep XIDs starting at the SE checkpoint and stops at the first gap, potentially leaving some prepared transactions uncommitted.
      The post-recovery dry run ha_recover(0) (mysqld.cc:init_server_components()) then finds those leftover prepared transactions and aborts.
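      The failure mode described above can be illustrated with a minimal, self-contained sketch. This is not the code in handler.cc: the function name, the ContinuityResult type, and the assumption that the contiguous run starts at checkpoint + 1 with every prepared seqno above the checkpoint are all hypothetical and exist only for illustration.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical result type for illustration; not a MariaDB structure.
struct ContinuityResult {
  std::vector<std::uint64_t> committed;  // contiguous run right after the checkpoint
  std::vector<std::uint64_t> leftover;   // prepared seqnos at or beyond the first gap
};

// Illustrative only (assumed semantics): sort the prepared wsrep seqnos and
// accept them while each one equals the previous seqno + 1, starting from
// checkpoint + 1. Everything from the first gap onward stays prepared.
ContinuityResult order_and_check_continuity(std::uint64_t checkpoint,
                                            std::vector<std::uint64_t> prepared) {
  std::sort(prepared.begin(), prepared.end());
  ContinuityResult r;
  std::uint64_t expected = checkpoint + 1;
  for (std::uint64_t seqno : prepared) {
    if (r.leftover.empty() && seqno == expected) {
      r.committed.push_back(seqno);
      ++expected;
    } else {
      // First gap reached (or already passed): nothing further is committed.
      r.leftover.push_back(seqno);
    }
  }
  return r;
}
```

      With a checkpoint of 100 and prepared seqnos {103, 101, 105, 102}, this sketch commits 101 through 103 and leaves 105 prepared. A later pass that expects no prepared transactions, as the ha_recover(0) dry run does, would then find 105 and fail.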

      The non-contiguous prepared set in the mariabackup snapshot is almost certainly the consequence of wsrep_slave_threads>0: parallel appliers prepare out of order, and the per-transaction rotation at 4096 makes the snapshot land mid-gap. It only affects the joiner.

          People

            Yurchenko Alexey Yurchenko
            Votes: 0
            Watchers: 1

              Time Tracking

                Estimated: 5d
                Remaining: 5d
                Logged: Not Specified
