[MDEV-29171] changing the value of wsrep_gtid_domain_id with full cluster restart fails on some nodes Created: 2022-07-26  Updated: 2023-11-08  Resolved: 2023-01-17

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.5, 10.6, 10.7, 10.8
Fix Version/s: 10.5.19, 10.6.12, 10.7.8, 10.8.7, 10.9.5, 10.10.3

Type: Bug Priority: Critical
Reporter: Hartmut Holzgraefe Assignee: Jan Lindström (Inactive)
Resolution: Fixed Votes: 4
Labels: None

Issue Links:
Relates
relates to MDEV-25115 Changes to wsrep_gtid_domain_id in my... Closed
relates to MDEV-28015 Mariabackup | GTID value is missing, ... Closed

 Description   

I know that changing wsrep_gtid_domain_id in the configuration file on all nodes requires a full cluster stop and restart to pick up the change as only the first node will actually use the config file value, while all further nodes receive the value to use from their donor during IST/SST.

But when e.g. changing the relevant configuration settings from

wsrep-gtid-mode=ON
wsrep_gtid_domain_id=100
gtid_domain_id=...different value per node....

to

wsrep_gtid_domain_id=200

then stopping all nodes, starting node-1 with galera_new_cluster, and node-2 with systemctl start mariadb, all is fine so far: both nodes show wsrep_gtid_domain_id=200 in SHOW VARIABLES LIKE 'wsrep_gtid_domain_id'. But when starting a 3rd or further nodes, they usually all still show 100 instead of 200.

Looking at the error logs I see node-3 usually use node-2 as donor, not node-1.

When I force the node started with galera_new_cluster as the default donor with wsrep_sst_donor=node-1 I get the correct new 200 value on all nodes though.

So somehow nodes seem to remember the previous value and pass that on to joiners instead of the value they received from their own donor, or read from their configuration file.



 Comments   
Comment by Ramesh Sivaraman [ 2022-08-04 ]

Reproduced the issue. If we set the secondary node as the donor node, the previous wsrep_gtid_domain_id is selected when restarting the joiner node.
Node1

MariaDB [(none)]> select @@wsrep_gtid_domain_id,@@wsrep_node_name;
+------------------------+-------------------+
| @@wsrep_gtid_domain_id | @@wsrep_node_name |
+------------------------+-------------------+
|                    200 | galera-node1      |
+------------------------+-------------------+
1 row in set (0.000 sec)
 
MariaDB [(none)]> 

Node2

MariaDB [(none)]> select @@wsrep_gtid_domain_id,@@wsrep_node_name;
+------------------------+-------------------+
| @@wsrep_gtid_domain_id | @@wsrep_node_name |
+------------------------+-------------------+
|                    200 | galera-node2      |
+------------------------+-------------------+
1 row in set (0.001 sec)
 
MariaDB [(none)]> 

Node3

MariaDB [(none)]> select @@wsrep_gtid_domain_id,@@wsrep_node_name;
+------------------------+-------------------+
| @@wsrep_gtid_domain_id | @@wsrep_node_name |
+------------------------+-------------------+
|                      7 | galera-node3      |
+------------------------+-------------------+
1 row in set (0.000 sec)
 
MariaDB [(none)]>
MariaDB [(none)]> select variable_name, global_value, global_value_origin, global_value_path from information_schema.system_variables where variable_name='WSREP_GTID_DOMAIN_ID';
+----------------------+--------------+---------------------+-------------------+
| variable_name        | global_value | global_value_origin | global_value_path |
+----------------------+--------------+---------------------+-------------------+
| WSREP_GTID_DOMAIN_ID | 7            | CONFIG              | /etc/mysql/my.cnf |
+----------------------+--------------+---------------------+-------------------+
1 row in set (0.002 sec)
 
MariaDB [(none)]>  \q
Bye
$ sudo grep wsrep_gtid_domain_id /etc/mysql/my.cnf 
wsrep_gtid_domain_id = 200
$ 
$ sudo grep wsrep_sst_donor /etc/mysql/my.cnf 
wsrep_sst_donor=192.168.100.20
$
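The mismatch in the transcript above (200 in the option file, 7 at runtime) can be checked with a small helper. This is only a sketch: the function name configured_wsrep_gtid_domain_id is made up for this example, and it reads a single option file without following !include directives or distinguishing option groups.

```shell
# Print the last wsrep_gtid_domain_id value set in the option file given as $1.
# Accepts both the underscore and the dash spelling of the option name.
# Hypothetical helper: it does not follow !include / !includedir directives.
configured_wsrep_gtid_domain_id() {
    sed -n 's/^[[:space:]]*wsrep[-_]gtid[-_]domain[-_]id[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1
}
```

The result can then be compared with the running value, for example the output of mysql -N -B -e "select @@wsrep_gtid_domain_id" on the same node.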

If we force SST after setting the secondary node as the donor node, restarting node 3 fails.

2022-08-04  6:31:32 1 [Note] WSREP: Server status change connected -> joiner
2022-08-04  6:31:32 1 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2022-08-04  6:31:32 0 [Note] WSREP: Joiner monitor thread started to monitor
2022-08-04  6:31:32 0 [Note] WSREP: Running: 'wsrep_sst_mariabackup --role 'joiner' --address '192.168.100.30' --datadir '/var/lib/mysql/' --parent '706110' --binlog '/var/lib/mysql/master-bin' --binlog-index '/var/lib/mysql/master-bin' --mysqld-args --wsrep_start_position=adf2999d-0c9a-11ed-aeea-0349b3a815aa:156,7-8-12'
WSREP_SST: [INFO] mariabackup SST started on joiner (20220804 06:31:32.077)
WSREP_SST: [INFO] SSL configuration: CA='', CAPATH='', CERT='', KEY='', MODE='DISABLED', encrypt='0' (20220804 06:31:32.119)
WSREP_SST: [INFO] Streaming with mbstream (20220804 06:31:32.222)
WSREP_SST: [INFO] Using socat as streamer (20220804 06:31:32.224)
WSREP_SST: [INFO] Evaluating timeout -k 310 300 socat -u TCP-LISTEN:4444,reuseaddr stdio | '/usr//bin/mbstream' -x; RC=( ${PIPESTATUS[@]} ) (20220804 06:31:32.256)
2022-08-04  6:31:32 1 [Note] WSREP: ####### IST uuid:00000000-0000-0000-0000-000000000000 f: 0, l: 158, STRv: 3
2022-08-04  6:31:32 1 [Note] WSREP: IST receiver addr using tcp://192.168.100.30:4568
2022-08-04  6:31:32 1 [Note] WSREP: Prepared IST receiver for 0-158, listening at: tcp://192.168.100.30:4568
2022-08-04  6:31:32 0 [Warning] WSREP: Member 0.0 (galera-node3) requested state transfer from '192.168.100.20', but it is impossible to select State Transfer donor: No route to host
2022-08-04  6:31:32 1 [ERROR] WSREP: Requesting state transfer failed: -113(No route to host)
2022-08-04  6:31:32 1 [ERROR] WSREP: State transfer request failed unrecoverably: 113 (No route to host). Most likely it is due to inability to communicate with the cluster primary component. Restart required.
2022-08-04  6:31:32 1 [Note] WSREP: ReplicatorSMM::abort()

Comment by Ramesh Sivaraman [ 2022-08-04 ]

jplindst The SST failure is not due to the wsrep_gtid_domain_id issue. wsrep_sst_donor must be a node name to trigger SST, not the IP address of a donor node. The wsrep_gtid_domain_id change works fine if the restart triggers SST.

Comment by Jan Lindström (Inactive) [ 2022-08-05 ]

A change to wsrep_gtid_domain_id is not reflected on a node if the node uses IST when joining the cluster.

Comment by Hartmut Holzgraefe [ 2022-08-05 ]

But this was a fresh cluster due to galera_new_cluster, and the domain id also changed fine when forcing all nodes to use the very first one as donor. Only when using a different one as donor, e.g. node 3 using the one started 2nd, not 1st, did it fail to pick up the correct domain id.

Comment by Jan Lindström (Inactive) [ 2022-08-05 ]

hholzgra Was the datadir on node3 empty? If not, just delete the gcache file and force SST.

Comment by Jan Lindström (Inactive) [ 2022-10-24 ]

ramesh Based on Hartmut's comments, could you please try to reproduce?

Comment by Hartmut Holzgraefe [ 2022-10-24 ]

I still have my original VM test setup for this somewhere, "just" need to figure out where ...

Comment by Ramesh Sivaraman [ 2022-10-25 ]

hholzgra The wsrep_gtid_domain_id is only changed by SST when the secondary node is used as a donor node. Can you confirm whether node3 used SST or IST to join the cluster?

Comment by Hartmut Holzgraefe [ 2022-11-24 ]

How to reproduce:

Start a cluster of at least three nodes with

wsrep_gtid_domain_id=100

Shut down all nodes, node 1 last. Change configuration to

wsrep_gtid_domain_id=200

Start node 1 with galera_new_cluster, then start up the remaining nodes with systemctl start mariadb

Run

mysql -e "show variables like 'wsrep_gtid_domain_id'"

on all nodes, see that first and second node show the correct value 200, but later nodes that join using a different donor than node 1 show the old value 100

When enforcing an SST on one of the later nodes (node 3 and later), it will show the correct value 200 regardless of the actual donor node picked. But when the node is restarted after SST startup has completed and a node other than the first one is picked as donor (e.g. by enforcing that with wsrep_sst_donor=node-2), it flips back to the old value 100 again.

When enforcing an SST on all but the first node by purging the data directory the correct value 200 is picked up by all nodes, and persists over later restarts, too.

So we should either clearly document the procedure necessary to change wsrep_gtid_domain_id, or – preferred – figure out and fix the behavior so that changing the domain id is possible without enforced SST on all but the first node.
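The final check in the steps above (comparing the value reported by every node with the newly configured one) can be sketched in shell. This is only an illustration: the function name stale_domain_nodes and the node=value argument format are made up for this example, and collecting the values from each node is left to the caller.

```shell
# Print every node whose reported wsrep_gtid_domain_id differs from the expected one.
# Usage: stale_domain_nodes EXPECTED node=value [node=value ...]
# Hypothetical helper; the node=value pairs must be collected by the caller.
stale_domain_nodes() {
    expected=$1
    shift
    for pair in "$@"; do
        node=${pair%%=*}     # text before the first '='
        value=${pair#*=}     # text after the first '='
        [ "$value" = "$expected" ] || printf '%s\n' "$node"
    done
}
```

With the values from this report, stale_domain_nodes 200 node-1=200 node-2=200 node-3=100 prints node-3. The per-node value could be obtained with mysql -N -B -e "select @@wsrep_gtid_domain_id" on each node.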

Comment by Ramesh Sivaraman [ 2022-11-26 ]

julien.fritsch jplindst The problem has been reproduced. Even after SST, the wsrep code always chooses the old wsrep_gtid_domain_id if the donor is a non-bootstrap node when the joiner is restarted.
Please check the --gtid-domain-id value in the IST logs below.
xtrabackup IST info
When Node1 becomes the donor (--gtid-domain-id is 200)

2022-11-26  5:27:30 1 [Note] WSREP: Server status change synced -> donor
2022-11-26  5:27:30 1 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2022-11-26  5:27:30 0 [Note] WSREP: Donor monitor thread started to monitor
2022-11-26  5:27:30 0 [Note] WSREP: Running: 'wsrep_sst_mariabackup --role 'donor' --address '192.168.100.30:4444/xtrabackup_sst//1' --local-port 3306 --socket '/run/mysqld/mysqld.sock' --progress 0 --datadir '/var/lib/mysql/' --gtid 'ee9f290d-6d41-11ed-a940-ffb1d812fa61:61' --gtid-domain-id 200 --bypass --mysqld-args --wsrep-new-cluster --wsrep_start_position=ee9f290d-6d41-11ed-a940-ffb1d812fa61:8,100-101-3'
2022-11-26  5:27:30 1 [Note] WSREP: sst_donor_thread signaled with 0
2022-11-26  5:27:30 0 [Note] WSREP: async IST sender starting to serve tcp://192.168.100.30:4568 sending 62-63, preload starts from 63
2022-11-26  5:27:30 0 [Note] WSREP: IST sender 62 -> 63

When Node2 becomes the donor (--gtid-domain-id is 100)

2022-11-26  5:04:02 1 [Note] WSREP: Server status change synced -> donor
2022-11-26  5:04:02 1 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2022-11-26  5:04:02 0 [Note] WSREP: Donor monitor thread started to monitor
2022-11-26  5:04:02 0 [Note] WSREP: Running: 'wsrep_sst_mariabackup --role 'donor' --address '192.168.100.30:4444/xtrabackup_sst//1' --local-port 3306 --socket '/run/mysqld/mysqld.sock' --progress 0 --datadir '/var/lib/mysql/' --gtid 'ee9f290d-6d41-11ed-a940-ffb1d812fa61:55' --gtid-domain-id 100 --bypass --mysqld-args --wsrep_start_position=ee9f290d-6d41-11ed-a940-ffb1d812fa61:7,100-101-3'
2022-11-26  5:04:02 1 [Note] WSREP: sst_donor_thread signaled with 0
2022-11-26  5:04:02 0 [Note] WSREP: async IST sender starting to serve tcp://192.168.100.30:4568 sending 56-57, preload starts from 57
2022-11-26  5:04:02 0 [Note] WSREP: IST sender 56 -> 57

rsync IST info

When Node1 becomes the donor

2022-11-26  5:36:23 0 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'donor' --address '192.168.100.30:4444/rsync_sst' --local-port 3306 --socket '/run/mysqld/mysqld.sock' --progress 1 --datadir '/var/lib/mysql/' --gtid 'ee9f290d-6d41-11ed-a940-ffb1d812fa61:69' --gtid-domain-id 200 --bypass --mysqld-args --wsrep-new-cluster --wsrep_start_position=ee9f290d-6d41-11ed-a940-ffb1d812fa61:66,200-101-0'
2022-11-26  5:36:23 1 [Note] WSREP: sst_donor_thread signaled with 0
2022-11-26  5:36:23 0 [Note] WSREP: async IST sender starting to serve tcp://192.168.100.30:4568 sending 70-71, preload starts from 71
2022-11-26  5:36:23 0 [Note] WSREP: IST sender 70 -> 71

When Node2 becomes the donor.

2022-11-26  5:35:18 1 [Note] WSREP: Server status change synced -> donor
2022-11-26  5:35:18 1 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2022-11-26  5:35:18 0 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'donor' --address '192.168.100.30:4444/rsync_sst' --local-port 3306 --socket '/run/mysqld/mysqld.sock' --progress 1 --datadir '/var/lib/mysql/' --gtid 'ee9f290d-6d41-11ed-a940-ffb1d812fa61:63' --gtid-domain-id 100 --bypass --mysqld-args --wsrep_start_position=ee9f290d-6d41-11ed-a940-ffb1d812fa61:64,100-101-3'
2022-11-26  5:35:18 0 [Note] WSREP: Donor monitor thread started to monitor
2022-11-26  5:35:18 1 [Note] WSREP: sst_donor_thread signaled with 0
2022-11-26  5:35:18 0 [Note] WSREP: async IST sender starting to serve tcp://192.168.100.30:4568 sending 64-69, preload starts from 69
2022-11-26  5:35:18 0 [Note] WSREP: IST sender 64 -> 69
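To compare donors quickly, the --gtid-domain-id argument can be pulled out of the "WSREP: Running: 'wsrep_sst_... --role 'donor'" lines shown above. This is a minimal sketch that assumes the log line layout seen in this report; the function name donor_gtid_domain_ids is made up for this example.

```shell
# Read error log lines on stdin and print the value passed to --gtid-domain-id
# for every donor-side wsrep_sst_* invocation, one value per matching line.
# Assumes the "Running: 'wsrep_sst_<method> --role 'donor' ..." layout seen above.
donor_gtid_domain_ids() {
    sed -n "s/.*wsrep_sst_[a-z]* --role 'donor'.* --gtid-domain-id \([0-9][0-9]*\) .*/\1/p"
}
```

Feeding it the node1 donor lines above yields 200 and the node2 donor lines yield 100. The input could come from the error log file or from journalctl -u mariadb, depending on how the server logs on the system in question.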

This issue causes the GTID sequence to be generated incorrectly on the cluster nodes. If we generate any transaction on node3, then gtid_binlog_pos on node3 and node1 will use the new wsrep_gtid_domain_id, but the replicated event on node2 will use the old wsrep_gtid_domain_id.

Node3

MariaDB [(none)]> select @@gtid_binlog_pos,@@gtid_current_pos,@@wsrep_gtid_domain_id;
+-------------------+--------------------+------------------------+
| @@gtid_binlog_pos | @@gtid_current_pos | @@wsrep_gtid_domain_id |
+-------------------+--------------------+------------------------+
| 200-101-2         | 200-101-2          |                    100 |
+-------------------+--------------------+------------------------+
1 row in set (0.000 sec)
 
MariaDB [(none)]> 

Node1

MariaDB [(none)]> select @@gtid_binlog_pos,@@gtid_current_pos,@@wsrep_gtid_domain_id;
+-------------------+--------------------+------------------------+
| @@gtid_binlog_pos | @@gtid_current_pos | @@wsrep_gtid_domain_id |
+-------------------+--------------------+------------------------+
| 200-101-2         | 200-101-2          |                    200 |
+-------------------+--------------------+------------------------+
1 row in set (0.000 sec)
 
MariaDB [(none)]> 

Node2

MariaDB [(none)]> select @@gtid_binlog_pos,@@gtid_current_pos,@@wsrep_gtid_domain_id;
+-------------------+--------------------+------------------------+
| @@gtid_binlog_pos | @@gtid_current_pos | @@wsrep_gtid_domain_id |
+-------------------+--------------------+------------------------+
| 100-101-5         | 100-101-5          |                    200 |
+-------------------+--------------------+------------------------+
1 row in set (0.001 sec)
 
MariaDB [(none)]> 

Similarly, if we create any transaction on node2, then gtid_binlog_pos on node3 and node2 will use the old wsrep_gtid_domain_id, but the replicated transaction on node1 will use the new wsrep_gtid_domain_id.

Node2

MariaDB [(none)]> create database db1;
Query OK, 1 row affected (0.006 sec)
 
MariaDB [(none)]> select @@gtid_binlog_pos,@@gtid_current_pos,@@wsrep_gtid_domain_id;
+-------------------+--------------------+------------------------+
| @@gtid_binlog_pos | @@gtid_current_pos | @@wsrep_gtid_domain_id |
+-------------------+--------------------+------------------------+
| 100-101-3         | 100-101-3          |                    200 |
+-------------------+--------------------+------------------------+
1 row in set (0.000 sec)
 
MariaDB [(none)]> 

Node3

MariaDB [(none)]> select @@gtid_binlog_pos,@@gtid_current_pos,@@wsrep_gtid_domain_id;
+-------------------+--------------------+------------------------+
| @@gtid_binlog_pos | @@gtid_current_pos | @@wsrep_gtid_domain_id |
+-------------------+--------------------+------------------------+
| 100-101-3         | 100-101-3          |                    100 |
+-------------------+--------------------+------------------------+
1 row in set (0.000 sec)
 
MariaDB [(none)]> 

Node1

MariaDB [(none)]> select @@gtid_binlog_pos,@@gtid_current_pos,@@wsrep_gtid_domain_id;
+-------------------+--------------------+------------------------+
| @@gtid_binlog_pos | @@gtid_current_pos | @@wsrep_gtid_domain_id |
+-------------------+--------------------+------------------------+
| 200-101-1         | 200-101-1          |                    200 |
+-------------------+--------------------+------------------------+
1 row in set (0.000 sec)
 
MariaDB [(none)]> 
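The inconsistency in the outputs above is that the domain part of @@gtid_binlog_pos does not always equal @@wsrep_gtid_domain_id on the same node. A minimal shell sketch of that comparison follows. The function names are made up for this example, and it handles a single GTID of the form domain-server_id-sequence only, not a comma-separated list.

```shell
# Print the domain part of a single MariaDB GTID (domain-server_id-sequence).
gtid_domain() {
    printf '%s\n' "${1%%-*}"
}

# Succeed only when the GTID in $1 carries the domain id given in $2.
# Hypothetical helper for illustration; does not handle GTID lists.
gtid_matches_domain() {
    [ "$(gtid_domain "$1")" = "$2" ]
}
```

For example, gtid_matches_domain 200-101-2 200 succeeds, while gtid_matches_domain 100-101-5 200 fails, which corresponds to the Node2 output earlier in this comment.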

Generated at Thu Feb 08 10:06:27 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.