[MDEV-9856] wsrep_gtid_mode requires nodes to have the same log_bin path
Created: 2016-04-01
Updated: 2019-05-23
Resolved: 2019-05-23

Status: Closed
Project: MariaDB Server
Component/s: Galera, Replication
Affects Version/s: 10.1.13
Fix Version/s: N/A

Type: Bug
Priority: Major
Reporter: Geoff Montee (Inactive)
Assignee: Jan Lindström (Inactive)
Resolution: Not a Bug
Votes: 2
Labels: galera, gtid, replication

Issue Links:
Blocks
blocks MDEV-20720 Galera: Replicate MariaDB GTID to oth... Closed
Relates
relates to MDEV-9855 log_slave_updates is required for wsr... Closed
Sprint: 10.1.24

 Description   

wsrep_gtid_mode currently requires nodes to have the same log_bin path for it to work. This path can be checked at run-time by looking at log_bin_basename. Is this intentional?
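
As a rough illustration of that run-time check, the sketch below groups nodes by the log_bin_basename value each one reports. It assumes the values have already been collected by hand (for example with SHOW GLOBAL VARIABLES LIKE 'log_bin_basename' on each node); the node labels and function names are hypothetical and not part of MariaDB.

```python
from collections import defaultdict


def group_by_basename(basenames):
    """Group node labels by their reported log_bin_basename.

    `basenames` maps a (hypothetical) node label to the value shown by
    SHOW GLOBAL VARIABLES LIKE 'log_bin_basename' on that node.
    """
    groups = defaultdict(list)
    for node, basename in basenames.items():
        groups[basename].append(node)
    return {basename: sorted(nodes) for basename, nodes in groups.items()}


def basenames_consistent(basenames):
    """True when every node reports the same log_bin_basename."""
    return len(group_by_basename(basenames)) <= 1


if __name__ == "__main__":
    # The two basenames that appear in this report.
    reported = {
        "node1": "/var/lib/mysql/mariadb-bin",
        "node2": "/var/lib/mysql/mariadb-bin1",
    }
    print(basenames_consistent(reported))  # False
```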

For example, let's say we have a 2-node cluster.

Node 1 has the following:

MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE 'gtid%pos';
+------------------+------------------------+
| Variable_name    | Value                  |
+------------------+------------------------+
| gtid_binlog_pos  | 0-1-5,1-1-63334,3-1-15 |
| gtid_current_pos | 0-1-5,1-1-63334,3-1-15 |
| gtid_slave_pos   |                        |
+------------------+------------------------+
3 rows in set (0.00 sec)
 
MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE 'log_bin_basename';
+------------------+----------------------------+
| Variable_name    | Value                      |
+------------------+----------------------------+
| log_bin_basename | /var/lib/mysql/mariadb-bin |
+------------------+----------------------------+
1 row in set (0.00 sec)
 
MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE 'wsrep_gtid%';
+----------------------+-------+
| Variable_name        | Value |
+----------------------+-------+
| wsrep_gtid_domain_id | 3     |
| wsrep_gtid_mode      | ON    |
+----------------------+-------+
2 rows in set (0.00 sec)

Let's say that I start up node 2 with a configuration with these parameters:

wsrep_gtid_mode=ON
wsrep_gtid_domain_id=3
log_bin=mariadb-bin
wsrep_sst_method=rsync

When it starts up, it looks good:

MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE 'gtid%pos';
+------------------+------------------------+
| Variable_name    | Value                  |
+------------------+------------------------+
| gtid_binlog_pos  | 0-1-5,1-1-63334,3-2-14 |
| gtid_current_pos | 3-2-14                 |
| gtid_slave_pos   |                        |
+------------------+------------------------+
3 rows in set (0.01 sec)
 
MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE 'log_bin_basename';
+------------------+----------------------------+
| Variable_name    | Value                      |
+------------------+----------------------------+
| log_bin_basename | /var/lib/mysql/mariadb-bin |
+------------------+----------------------------+
1 row in set (0.00 sec)
 
MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE 'wsrep_gtid%';
+----------------------+-------+
| Variable_name        | Value |
+----------------------+-------+
| wsrep_gtid_domain_id | 3     |
| wsrep_gtid_mode      | ON    |
+----------------------+-------+
2 rows in set (0.00 sec)

But what happens if we change log_bin (here, to mariadb-bin1) and restart the server? The previous GTID positions are gone:

MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE 'gtid%pos';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| gtid_binlog_pos  | 3-1-1 |
| gtid_current_pos |       |
| gtid_slave_pos   |       |
+------------------+-------+
3 rows in set (0.01 sec)
 
MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE 'log_bin_basename';
+------------------+-----------------------------+
| Variable_name    | Value                       |
+------------------+-----------------------------+
| log_bin_basename | /var/lib/mysql/mariadb-bin1 |
+------------------+-----------------------------+
1 row in set (0.00 sec)
 
MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE 'wsrep_gtid%';
+----------------------+-------+
| Variable_name        | Value |
+----------------------+-------+
| wsrep_gtid_domain_id | 3     |
| wsrep_gtid_mode      | ON    |
+----------------------+-------+
2 rows in set (0.00 sec)
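
To make the difference between the two gtid_binlog_pos values above easier to see, here is a minimal sketch that parses a GTID position list (domain-server-sequence triples) and reports domains that disappeared or whose sequence number went backwards. The function names are hypothetical helpers, not a MariaDB API, and the comparison rule is an assumption chosen for illustration.

```python
def parse_gtid_pos(value):
    """Parse a GTID list such as '0-1-5,1-1-63334,3-1-15'.

    Returns {domain_id: (server_id, seq_no)}; an empty string gives {}.
    """
    pos = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        domain, server, seq = (int(part) for part in item.split("-"))
        pos[domain] = (server, seq)
    return pos


def lost_or_rewound_domains(before, after):
    """Return {domain_id: 'missing' | 'rewound'} for domains that regressed."""
    old, new = parse_gtid_pos(before), parse_gtid_pos(after)
    problems = {}
    for domain, (_, old_seq) in old.items():
        if domain not in new:
            problems[domain] = "missing"
        elif new[domain][1] < old_seq:
            problems[domain] = "rewound"
    return problems


if __name__ == "__main__":
    # gtid_binlog_pos on node 2 before and after changing log_bin.
    print(lost_or_rewound_domains("0-1-5,1-1-63334,3-2-14", "3-1-1"))
    # {0: 'missing', 1: 'missing', 3: 'rewound'}
```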



 Comments   
Comment by Sachin Setiya (Inactive) [ 2017-05-22 ]

Actually, this is not by design; it is a side effect of the design. Internally, the GTID is not transferred between nodes. We simply count the number of events and increment the GTID. If we change the log file, our old GTID number is lost, so we have to start again.
Solving MDEV-10715 will solve this.
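
The behaviour described in this comment can be sketched with a toy model. This is not MariaDB's implementation: it only assumes, for illustration, that a node derives its sequence number from state tied to the current binary log basename and never receives it from other nodes. The class and method names are hypothetical, and the server_id handling is simplified (the real output in the description shows 3-1-1 after the restart).

```python
class ToyNode:
    """Toy model: seq_no is counted locally per binlog basename."""

    def __init__(self, domain_id, server_id, log_bin):
        self.domain_id = domain_id
        self.server_id = server_id
        self.log_bin = log_bin
        # Assumption: the counter lives with the binlog files of this basename.
        self._seq_by_basename = {log_bin: 0}

    def apply_event(self):
        """Count one event and return the GTID assigned to it locally."""
        seq = self._seq_by_basename[self.log_bin] + 1
        self._seq_by_basename[self.log_bin] = seq
        return f"{self.domain_id}-{self.server_id}-{seq}"

    def restart(self, log_bin):
        """Restart with a (possibly different) log_bin basename."""
        self.log_bin = log_bin
        # A new basename has no history, so counting starts again from zero.
        self._seq_by_basename.setdefault(log_bin, 0)


if __name__ == "__main__":
    node = ToyNode(domain_id=3, server_id=2, log_bin="mariadb-bin")
    last = None
    for _ in range(14):
        last = node.apply_event()
    print(last)                # 3-2-14
    node.restart("mariadb-bin1")
    print(node.apply_event())  # 3-2-1
```

In this model, restarting with the original basename would continue from the old counter, which is why the problem only shows up when log_bin changes.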

Comment by Sachin Setiya (Inactive) [ 2017-12-11 ]

Actually, solving MDEV-10715 won't solve this, because a GTID generated at node 1 cannot be guaranteed to be accurate: say node 2 gets an event in the window between the GTID being sent by A and the time it takes to reach B. Although we could delete duplicate GTIDs, this would harm performance, and GTID duplication is not a check for data duplication.

Comment by Jan Lindström (Inactive) [ 2019-05-23 ]

I would say this is not a bug. However, the documentation could be improved in this area.

greenman, can you update the documentation?

Note that full GTID support is still to be announced later; see https://jira.mariadb.org/browse/MDEV-10715

Comment by Geoff Montee (Inactive) [ 2019-05-23 ]

This is already documented here:

https://mariadb.com/kb/en/library/using-mariadb-gtids-with-mariadb-galera-cluster/#enabling-wsrep-gtid-mode

Generated at Thu Feb 08 07:37:50 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.