[MDEV-29661] WSREP GTID Sync Inconsistency Version 10.5.13 | Bionic & Focal Created: 2022-09-28  Updated: 2023-03-10

Status: Open
Project: MariaDB Server
Component/s: None
Affects Version/s: 10.5.13
Fix Version/s: 10.5

Type: Bug Priority: Major
Reporter: michael Assignee: Julius Goryavsky
Resolution: Unresolved Votes: 1
Labels: gtid, wsrep
Environment:

Ubuntu Bionic and Focal


Issue Links:
Relates
relates to MDEV-10227 MariaDB Galera cluster gtid's falling... Closed

 Description   

I am running four independent MariaDB clusters on Focal and Bionic.

While running version 10.5.13 I am seeing inconsistent WSREP GTID synchronisation between nodes.
This is a problem because I have replicas replicating from one of the multi-master nodes; they switch to a new timeline and cannot recover because the `gtid_binlog_pos` value differs between nodes.

An example from two instances within a cluster:
```
node2:
Variable_name Value
gtid_binlog_pos 2-2-268
gtid_binlog_state 2-2-268
gtid_cleanup_batch_size 64
gtid_current_pos 2-2-268
gtid_domain_id 203
gtid_ignore_duplicates OFF
gtid_pos_auto_engines
gtid_slave_pos
gtid_strict_mode OFF
wsrep_gtid_domain_id 2
wsrep_gtid_mode ON
node0:
Variable_name Value
gtid_binlog_pos 2-2-310
gtid_binlog_state 2-2-310
gtid_cleanup_batch_size 64
gtid_current_pos 2-2-310
gtid_domain_id 200
gtid_ignore_duplicates OFF
gtid_pos_auto_engines
gtid_slave_pos
gtid_strict_mode OFF
wsrep_gtid_domain_id 2
wsrep_gtid_mode ON
```
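
The size of the drift between the two nodes can be read off the sequence numbers. The following is a minimal sketch, assuming each value is a single MariaDB GTID in the `domain-server-sequence` form shown above (a position spanning several domains would be a comma-separated list, which this sketch does not handle). The helper names `parse_gtid` and `seq_gap` are illustrative only.

```python
from typing import NamedTuple


class Gtid(NamedTuple):
    domain_id: int
    server_id: int
    seq_no: int


def parse_gtid(text: str) -> Gtid:
    """Parse a single GTID of the form domain-server-sequence."""
    domain, server, seq = (int(part) for part in text.strip().split("-"))
    return Gtid(domain, server, seq)


def seq_gap(a: str, b: str) -> int:
    """Return how far b's sequence number is ahead of a's in the same domain."""
    ga, gb = parse_gtid(a), parse_gtid(b)
    if ga.domain_id != gb.domain_id:
        raise ValueError("GTIDs belong to different domains")
    return gb.seq_no - ga.seq_no


# Values copied from the gtid_binlog_pos lines in the output above.
node2_pos = "2-2-268"
node0_pos = "2-2-310"
print(seq_gap(node2_pos, node0_pos))  # 42
```

With the values above, node0 is 42 sequence numbers ahead of node2 in domain 2.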

I have verified that the `log_slave_updates`, `gtid_domain_id` and `server_id` settings are correct according to the documentation.
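
The part of that verification that is visible in the output above can be expressed as a small check. This is only a sketch: the three rules it encodes (`wsrep_gtid_mode` is `ON` everywhere, `wsrep_gtid_domain_id` is identical on all nodes, `gtid_domain_id` is unique per node) are an assumption about what the documentation requires, and `log_slave_updates` and `server_id` are left out because their values are not shown in this report. The function name `check_wsrep_gtid_settings` is hypothetical.

```python
from typing import Dict, List


def check_wsrep_gtid_settings(nodes: Dict[str, Dict[str, str]]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems: List[str] = []

    for name, variables in nodes.items():
        if variables.get("wsrep_gtid_mode") != "ON":
            problems.append(f"{name}: wsrep_gtid_mode is not ON")

    # Assumed rule: the cluster-wide domain must match on every node.
    wsrep_domains = {v.get("wsrep_gtid_domain_id") for v in nodes.values()}
    if len(wsrep_domains) != 1:
        problems.append("wsrep_gtid_domain_id differs between nodes")

    # Assumed rule: the node-local domain must be different on every node.
    local_domains = [v.get("gtid_domain_id") for v in nodes.values()]
    if len(set(local_domains)) != len(local_domains):
        problems.append("gtid_domain_id is not unique per node")

    return problems


# Values copied from the two nodes shown above.
cluster = {
    "node2": {"gtid_domain_id": "203", "wsrep_gtid_domain_id": "2", "wsrep_gtid_mode": "ON"},
    "node0": {"gtid_domain_id": "200", "wsrep_gtid_domain_id": "2", "wsrep_gtid_mode": "ON"},
}
print(check_wsrep_gtid_settings(cluster))  # []
```

The two nodes shown pass these checks, so the drift does not appear to come from these three settings.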

This bug is very reminiscent of MDEV-10227.



 Comments   
Comment by michael [ 2022-09-29 ]

I created a new cluster on Focal and Bionic using version 10.5.13.

At the start, the cluster is in a consistent state:

node0:
    Variable_name	Value
    gtid_binlog_pos	2-2-370
    gtid_binlog_state	2-2-370
    gtid_cleanup_batch_size	64
    gtid_current_pos	2-2-370
    gtid_domain_id	200
    gtid_ignore_duplicates	OFF
    gtid_pos_auto_engines	
    gtid_slave_pos	
    gtid_strict_mode	OFF
    wsrep_gtid_domain_id	2
    wsrep_gtid_mode	ON
node1:
    Variable_name	Value
    gtid_binlog_pos	2-2-370
    gtid_binlog_state	2-2-370
    gtid_cleanup_batch_size	64
    gtid_current_pos	2-2-370
    gtid_domain_id	201
    gtid_ignore_duplicates	OFF
    gtid_pos_auto_engines	
    gtid_slave_pos	
    gtid_strict_mode	OFF
    wsrep_gtid_domain_id	2
    wsrep_gtid_mode	ON
node2:
    Variable_name	Value
    gtid_binlog_pos	2-2-370
    gtid_binlog_state	2-2-370
    gtid_cleanup_batch_size	64
    gtid_current_pos	2-2-370
    gtid_domain_id	202
    gtid_ignore_duplicates	OFF
    gtid_pos_auto_engines	
    gtid_slave_pos	
    gtid_strict_mode	OFF
    wsrep_gtid_domain_id	2
    wsrep_gtid_mode	ON

Running a single DELETE command:

DELETE FROM `TokenRealm` WHERE `TokenRealm`.token_id = 313

`gtid_binlog_pos` and `gtid_binlog_state` after the single DELETE on node0:

node2:
    Variable_name	Value
    gtid_binlog_pos	2-2-371
    gtid_binlog_state	2-2-371
node1:
    Variable_name	Value
    gtid_binlog_pos	2-2-371
    gtid_binlog_state	2-2-371
node0:
    Variable_name	Value
    gtid_binlog_pos	2-2-381
    gtid_binlog_state	2-2-381
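
The per-node effect of the DELETE can be quantified by comparing the sequence numbers before and after. A minimal sketch, assuming single-domain GTIDs in the `domain-server-sequence` form and using only the values reported in this comment; `seq_no` and `seq_deltas` are illustrative helper names.

```python
from typing import Dict


def seq_no(gtid: str) -> int:
    """Extract the trailing sequence number from a domain-server-sequence GTID."""
    return int(gtid.rsplit("-", 1)[1])


def seq_deltas(before: Dict[str, str], after: Dict[str, str]) -> Dict[str, int]:
    """Return, per node, how many sequence numbers were consumed between snapshots."""
    return {node: seq_no(after[node]) - seq_no(before[node]) for node in before}


# gtid_binlog_pos values copied from the two snapshots in this comment.
before = {"node0": "2-2-370", "node1": "2-2-370", "node2": "2-2-370"}
after = {"node0": "2-2-381", "node1": "2-2-371", "node2": "2-2-371"}

print(seq_deltas(before, after))  # {'node0': 11, 'node1': 1, 'node2': 1}
```

node1 and node2 each advance by 1 for the replicated write, while node0, where the statement was executed, advances by 11.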
