MariaDB Server / MDEV-38579

MariaDB Galera IST Failure Due to Sequence Number Mismatch

Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 11.4.4, 11.4.7, 11.4.9
    • Fix Version/s: None
    • Component/s: Galera
    • Labels: None

    Description

      Summary

      A MariaDB Galera cluster experiences an IST (Incremental State Transfer) failure when a joiner node attempts to recover from a donor node. The failure is caused by a sequence number (seqno) mismatch: the position InnoDB recovers on the joiner differs from the position recorded in the saved Galera state, and the IST stream then starts at a seqno the receiver does not expect.

      Issue Title

      *IST Receiver Failure: Sequence Number Mismatch Between GCache Recovery and InnoDB Recovery*

      Severity

      *Critical* - Causes node restart loop and cluster instability

      Environment

      • *MariaDB Version*: 11.4.4; the same issue is also seen in 11.4.7 and 11.4.9
      • *Galera Version*: 4.21(rd811a57)
      • *Cluster Setup*: 3-node Galera cluster with SST via mariabackup
      • *Storage Engine*: InnoDB
      • *Deployment*: 3-node Kubernetes cluster using the Docker image from the MariaDB.org repo

      Problem Description

      The three-node cluster is running in a Kubernetes environment.

      Root Cause

      The IST failure occurs due to a sequence number mismatch between:
      1. *GCache Recovery Phase*: Recovers seqno from persistent galera.cache
      2. *SST Phase*: Donor sends writesets for a specific seqno range
      3. *InnoDB Recovery Phase*: InnoDB recovery completes with a different seqno than expected
      4. *IST Application Phase*: IST tries to apply writesets but seqno doesn't match

      Detailed Sequence of Events

      Phase 1: GCache Recovery (Initial Position)

      ```
      2026-01-16 12:53:55 0 [Note] WSREP: Found saved state: 6e350d19-f301-11f0-bb57-471d08f203bb:6966, safe_to_bootstrap: 0
      2026-01-16 12:53:55 0 [Note] WSREP: Recovering GCache ring buffer: found gapless sequence 1-6966
      ```

      • *Initial GCache seqno*: 6966
      • *Status*: Node has persistent state up to seqno 6966
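The "Found saved state" line above reflects what Galera reads from `grastate.dat` in the data directory. As a rough aid for comparing that file against the later recovery messages, here is a minimal Python sketch that parses its `key: value` layout. The sample content is illustrative only: it is reconstructed from the log values in this report, and the `version` value is an assumption, not taken from the affected node.

```python
def parse_grastate(text):
    """Parse the 'key: value' lines of a grastate.dat file into a dict of strings."""
    state = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comment
        key, _, value = line.partition(":")
        state[key.strip()] = value.strip()
    return state


# Illustrative content rebuilt from the log lines in this report.
# The version value is an assumption.
SAMPLE_GRASTATE = """\
# GALERA saved state
version: 2.1
uuid:    6e350d19-f301-11f0-bb57-471d08f203bb
seqno:   6966
safe_to_bootstrap: 0
"""

if __name__ == "__main__":
    state = parse_grastate(SAMPLE_GRASTATE)
    print(state["uuid"], state["seqno"], state["safe_to_bootstrap"])
```

Reading the real file from the joiner's data directory and comparing `seqno` with the "Recovered position from storage" value would show the same 6966 vs 6666 discrepancy described in this report.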

      Phase 2: State Transfer Request

      ```
      2026-01-16 12:53:56 2 [Note] WSREP: State transfer required:
      Group state: 6e350d19-f301-11f0-bb57-471d08f203bb:6968
      Local state: 6e350d19-f301-11f0-bb57-471d08f203bb:6966
      ```

      • *Group state*: 6968 (cluster is at seqno 6968)
      • *Local state*: 6966 (joiner is at seqno 6966)
      • *Gap*: 2 transactions (6967-6968)

      Phase 3: SST via mariabackup

      ```
      WSREP_SST: [INFO] 'xtrabackup_ist' received from donor: Running IST
      WSREP_SST: [INFO] Galera co-ords from donor: 6e350d19-f301-11f0-bb57-471d08f203bb:6966 0
      ```

      • *Donor sends*: Backup at seqno 6966
      • *Expected IST range*: 6967-6968

      Phase 4: InnoDB Recovery (THE PROBLEM)

      ```
      2026-01-16 12:53:57 0 [Note] InnoDB: log sequence number 31084789; transaction id 5168
      2026-01-16 12:53:57 0 [Note] InnoDB: Loading buffer pool(s) from /bitnami/mariadb/data/ib_buffer_pool
      2026-01-16 12:54:00 3 [Note] WSREP: Recovered position from storage: 6e350d19-f301-11f0-bb57-471d08f203bb:6666
      ```

      • *InnoDB recovered position*: 6666 (LOWER than expected 6966!)
      • *This is the critical issue*: InnoDB recovery completes with seqno 6666, not 6966

      Phase 5: IST Application Failure

      ```
      2026-01-16 12:54:00 2 [Note] WSREP: Receiving IST: 302 writesets, seqnos 6667-6968
      2026-01-16 12:54:00 0 [Note] WSREP: ####### IST applying starts with 6667
      2026-01-16 12:54:00 6 [ERROR] WSREP: Receiving IST failed, node restart required:
      IST receiver reported failure: 'IST started with wrong seqno: 6929, expected <= 6667'
      ```

      • *Expected IST start*: 6667 (after InnoDB recovery at 6666)
      • *Actual IST start*: 6929 (MISMATCH!)
      • *Result*: IST fails, node requires restart
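To make the five phases easier to compare across restarts, the seqnos can be pulled out of the error log mechanically. The following is a minimal Python sketch, assuming the log lines have exactly the wording quoted in this report; the pattern names and the `extract_seqnos` helper are hypothetical and not part of MariaDB or Galera.

```python
import re

# Patterns for the WSREP log lines quoted in this report (wording assumed stable).
PATTERNS = {
    "saved_state": re.compile(r"Found saved state: [0-9a-f-]+:(-?\d+)"),
    "group_state": re.compile(r"Group state: [0-9a-f-]+:(-?\d+)"),
    "local_state": re.compile(r"Local state: [0-9a-f-]+:(-?\d+)"),
    "recovered": re.compile(r"Recovered position from storage: [0-9a-f-]+:(-?\d+)"),
    "ist_range": re.compile(r"Receiving IST: (\d+) writesets, seqnos (\d+)-(\d+)"),
    "ist_error": re.compile(r"IST started with wrong seqno: (\d+), expected <= (\d+)"),
}


def extract_seqnos(log_text):
    """Return {name: tuple of ints} using the first match of each pattern."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(log_text)
        if match:
            found[name] = tuple(int(group) for group in match.groups())
    return found


# Log excerpt copied from Phases 1-5 of this report.
SAMPLE_LOG = """\
2026-01-16 12:53:55 0 [Note] WSREP: Found saved state: 6e350d19-f301-11f0-bb57-471d08f203bb:6966, safe_to_bootstrap: 0
2026-01-16 12:53:56 2 [Note] WSREP: State transfer required:
Group state: 6e350d19-f301-11f0-bb57-471d08f203bb:6968
Local state: 6e350d19-f301-11f0-bb57-471d08f203bb:6966
2026-01-16 12:54:00 3 [Note] WSREP: Recovered position from storage: 6e350d19-f301-11f0-bb57-471d08f203bb:6666
2026-01-16 12:54:00 2 [Note] WSREP: Receiving IST: 302 writesets, seqnos 6667-6968
2026-01-16 12:54:00 6 [ERROR] WSREP: Receiving IST failed, node restart required:
IST receiver reported failure: 'IST started with wrong seqno: 6929, expected <= 6667'
"""

if __name__ == "__main__":
    for name, values in extract_seqnos(SAMPLE_LOG).items():
        print(f"{name}: {values}")
```

Run against the excerpt, this surfaces the three values that disagree: saved state 6966, recovered position 6666, and an IST stream that starts at 6929 while the receiver expects 6667 or lower.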

      Root Cause Analysis

      The Timing Issue

      1. *GCache reports*: Seqno 6966 is safe
      2. *SST backup taken at*: Seqno 6966
      3. *InnoDB recovery completes at*: Seqno 6666 (300 transactions behind!)
      4. *IST expects*: Writesets starting from 6667 (InnoDB is at 6666), but the IST stream actually starts at 6929
      5. *Seqno adjustment happens too late*: After IST has already started with wrong expectations

      Why InnoDB Recovery Lags

      • InnoDB crash recovery may not replay all transactions from the backup
      • Buffer pool loading may cause seqno to be lower than expected
      • Galera seqno tracking and InnoDB LSN may be out of sync
      • The SST backup seqno (6966) doesn't match the actual InnoDB recovered seqno (6666)

      Error Messages

      ```
      [ERROR] WSREP: Receiving IST failed, node restart required:
      IST receiver reported failure: 'IST started with wrong seqno: 6929, expected <= 6667'
      ```

      Impact

      • *Node Restart Loop*: Node continuously restarts trying to recover
      • *Cluster Instability*: Joiner node cannot join the cluster
      • *Data Consistency Risk*: Incomplete state transfer
      • *Service Disruption*: Affected node is unavailable

      Expected Behavior

      1. GCache recovery should match InnoDB recovery seqno
      2. SST backup seqno should match actual InnoDB recovered seqno
      3. IST should start with correct seqno range
      4. No seqno mismatch between phases
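The expectations above can be read as simple arithmetic checks on the seqnos reported in the log. The sketch below is one possible formulation in Python, not Galera's actual validation logic; the function name and parameters are hypothetical, and the condition on the first received seqno is inferred from the "expected <= 6667" wording of the error message.

```python
def check_state_transfer(saved_seqno, recovered_seqno,
                         ist_count, ist_first, ist_last, ist_actual_first):
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []
    # Expectations 1 and 2: the recovered position should equal the saved state.
    if recovered_seqno != saved_seqno:
        problems.append(
            f"recovered seqno {recovered_seqno} != saved state seqno {saved_seqno}")
    # Expectation 3: the announced IST range should start right after the
    # recovered position and its writeset count should match the range.
    if ist_first != recovered_seqno + 1:
        problems.append(
            f"IST range starts at {ist_first}, expected {recovered_seqno + 1}")
    if ist_count != ist_last - ist_first + 1:
        problems.append(
            f"IST count {ist_count} does not match range {ist_first}-{ist_last}")
    # Inferred from the error text: the first received seqno must not be
    # greater than the announced start of the range.
    if ist_actual_first > ist_first:
        problems.append(
            f"IST stream starts at {ist_actual_first}, expected <= {ist_first}")
    return problems


if __name__ == "__main__":
    # Values taken from the log excerpts in this report.
    for problem in check_state_transfer(6966, 6666, 302, 6667, 6968, 6929):
        print(problem)
```

With the values from this report, the range arithmetic itself is consistent (6968 - 6667 + 1 = 302), so the two reported problems are the recovered position (6666 vs 6966) and the stream start (6929 vs 6667).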

      Short-term Workaround

      The affected node must be restarted manually.

      Steps to Reproduce

      1. Set up 3-node Galera cluster
      2. Perform transactions to advance seqno
      3. Stop one node (joiner)
      4. Continue transactions on other nodes
      5. Restart joiner node
      6. Observe IST failure with seqno mismatch

      Configuration Details

      Galera Configuration

      ```
      base_dir = /bitnami/mariadb/data/
      base_host = 10.59.47.32
      base_port = 4567
      cert.log_conflicts = no
      cert.optimistic_pa = yes
      debug = no
      evs.auto_evict = 0
      evs.delay_margin = PT1S
      evs.delayed_keep_period = PT30S
      evs.inactive_check_period = PT0.5S
      evs.inactive_timeout = PT15S
      evs.join_retrans_period = PT1S
      evs.max_install_timeouts = 3
      evs.send_window = 4
      evs.stats_report_period = PT1M
      evs.suspect_timeout = PT5S
      evs.user_send_window = 2
      evs.view_forget_timeout = PT24H
      gcache.dir = /bitnami/mariadb/data/
      gcache.keep_pages_size = 0
      gcache.mem_size = 0
      gcache.name = galera.cache
      gcache.page_size = 128M
      gcache.recover = yes
      gcache.size = 20G
      gcomm.thread_prio = (default)
      gcs.fc_debug = 0
      gcs.fc_factor = 1.0
      gcs.fc_limit = 16
      gcs.fc_master_slave = no
      gcs.fc_single_primary = no
      gcs.max_packet_size = 64500
      gcs.max_throttle = 0.25
      gcs.recv_q_hard_limit = 9223372036854775807
      gcs.recv_q_soft_limit = 0.25
      gcs.sync_donor = no
      gmcast.segment = 0
      gmcast.version = 0
      ```

      InnoDB Configuration

      ```
      innodb_buffer_pool_size = 2.000GiB
      innodb_buffer_pool_chunk_size = 32.000MiB
      innodb_log_sequence_number = 31084789
      innodb_transaction_id = 5168
      innodb_undo_tablespaces = 3 (active)
      innodb_rollback_segments = 128
      innodb_temp_file_size = 12.000MiB
      innodb_use_native_aio = yes
      innodb_use_avx512 = yes
      innodb_compression_algorithm = zlib 1.2.13
      ```

      SST (State Transfer) Configuration

      ```
      sst_method = mariabackup
      sst_role = joiner
      sst_address = 10.59.47.32
      sst_datadir = /bitnami/mariadb/data/
      sst_defaults_file = /opt/bitnami/mariadb/conf/my.cnf
      sst_parent_pid = 1
      sst_progress = 0
      sst_binlog = mysql-bin
      sst_ssl_mode = DISABLED
      sst_ssl_ca = (empty)
      sst_ssl_capath = (empty)
      sst_ssl_cert = (empty)
      sst_ssl_key = (empty)
      sst_ssl_encrypt = 0
      sst_streamer = socat
      sst_stream_port = 4444
      sst_socket_info_utility = ss
      sst_timeout = 310 seconds (with -k 310 300)
      ```

      Cluster Topology

      ```
      Cluster UUID: 6e350d19-f301-11f0-bb57-471d08f203bb
      Cluster Name: DBGalera
      Cluster Size: 3 nodes
      Cluster State: PRIMARY

      Node 0 (Index 0):
      UUID: 008c0b68-a846
      Address: tcp://10.59.47.39:4567
      Status: SYNCED
      Hostname: ttd-db-mariadb-0

      Node 1 (Index 1):
      UUID: 61658c97-8ecd
      Address: tcp://10.59.47.64:4567
      Status: SYNCED
      Hostname: ttd-db-mariadb-0

      Node 2 (Index 2) - AFFECTED NODE:
      UUID: b53c8051-951c
      Address: tcp://10.59.47.32:4567
      Status: JOINER (failed to join)
      Hostname: ttd-db-mariadb-0
      Role: Joiner
      Donor: Node 1 (Index 1)
      ```

      GCache State

      ```
      GCache Version: 2
      GCache UUID: 6e350d19-f301-11f0-bb57-471d08f203bb
      GCache Seqno Range: 1 - 6966
      GCache Offset: 1280
      GCache Synced: yes
      GCache Total Size: 21474836504 bytes (~20GB)
      GCache Unused Buffers: 55699528 bytes
      GCache Free Space: 21419137320 bytes
      GCache Locked Buffers: 2/6968
      GCache Recovery Status: Gapless sequence found (1-6966)
      ```

      Protocol Versions

      ```
      GCS Protocol: 5
      Replication Protocol: 11
      Application Protocol: 4
      Galera Capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY,
      ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED,
      PREORDERED, STREAMING, NBO
      ```

      Additional Information

      • *Galera Cache Size*: 20G
      • *GCache Page Size*: 128M
      • *Cluster UUID*: 6e350d19-f301-11f0-bb57-471d08f203bb
      • *Affected Node*: ttd-db-mariadb-0 (index 2)
      • *Donor Node*: ttd-db-mariadb-0 (index 1)
      • *Cluster State*: PRIMARY (2/3 nodes synced)
      • *InnoDB Buffer Pool*: 2GB
      • *SST Method*: mariabackup with IST optimization
      • *Network*: 10.59.47.0/24 subnet

      Related Issues

      • Galera seqno tracking inconsistency
      • InnoDB recovery vs Galera seqno synchronization
      • SST backup seqno validation
      • GCache-InnoDB state mismatch

      Attachments

      • Full MariaDB error log (provided above)
      • Galera configuration (detailed above)
      • Cluster topology information (detailed above)
      • InnoDB configuration (detailed above)

          People

            seppo Seppo Jaakola
            Sahai Har Gagan