Details
-
Bug
-
Status: Closed
-
Critical
-
Resolution: Incomplete
-
10.5.19, 10.6.12, 10.6.13, 10.6.14, 10.6.15
-
None
Description
Hi,
Recently, I tried to migrate and upgrade MariaDB in a Galera Cluster.
For testing, I focused on the target version 10.6.12, with the Galera Cluster plug-in bundled in the MariaDB package (Galera 4 26.4.14).
My Galera Cluster settings are as follows:
server_id = 1
gtid_strict_mode NOT SET (Default: OFF)
gtid_domain_id NOT SET (Default: 0)
log_slave_updates = ON
wsrep_sst_method = mariabackup
wsrep_gtid_domain_id = 1
wsrep_gtid_mode = ON
First, I initialized the datadir on Node1 and started it with --wsrep-new-cluster to bootstrap a new Galera Cluster.
At this point, gtid_binlog_pos and gtid_current_pos were both 0-1-41.
After I created other user accounts, databases, and tables,
gtid_binlog_pos and gtid_current_pos became 0-1-41,1-1-31.
I think that is correct: while the datadir was being initialized, the instance was not yet in a Galera Cluster, so it used gtid_domain_id as the domain ID in the GTID,
and after the instance started as a new Galera Cluster, every change used wsrep_gtid_domain_id as the domain ID in the GTID.
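To make the two components of that position easier to read, here is a minimal Python sketch that splits a GTID position into its per-domain parts. It assumes the usual MariaDB `domain_id-server_id-seq_no` layout; the helper name `parse_gtid_pos` is hypothetical and only for illustration.

```python
def parse_gtid_pos(pos):
    """Parse a GTID position such as '0-1-41,1-1-31' into
    {domain_id: (server_id, seq_no)}."""
    result = {}
    for gtid in pos.split(","):
        # Each GTID is assumed to be 'domain_id-server_id-seq_no'.
        domain_id, server_id, seq_no = (int(part) for part in gtid.strip().split("-"))
        result[domain_id] = (server_id, seq_no)
    return result


# Position reported on Node1 after creating users, databases, and tables.
pos = parse_gtid_pos("0-1-41,1-1-31")
for domain_id, (server_id, seq_no) in sorted(pos.items()):
    print(f"domain={domain_id} server_id={server_id} seq_no={seq_no}")
```

Read this way, domain 0 (the default gtid_domain_id) holds the events written before the cluster was bootstrapped, and domain 1 (the configured wsrep_gtid_domain_id) holds the events written afterwards.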
Then I started Node2 and Node3; both joined the cluster via SST.
At this point, gtid_binlog_pos and gtid_current_pos on every node were both 0-1-41,1-1-31.
From here on, the GTIDs diverged between the nodes.
I committed a transaction on Node1, after which gtid_binlog_pos and gtid_current_pos on each node were:
Node1: 0-1-41,1-1-32
Node2: 0-1-32,1-1-31
Node3: 0-1-32,1-1-31
Then I committed another transaction on Node2, after which gtid_binlog_pos and gtid_current_pos on each node were:
Node1: 0-1-41,1-1-33
Node2: 0-1-33,1-1-31
Node3: 0-1-33,1-1-31
Finally, I committed a transaction on Node3, after which gtid_binlog_pos and gtid_current_pos on each node were:
Node1: 0-1-41,1-1-34
Node2: 0-1-34,1-1-31
Node3: 0-1-34,1-1-31
Evidently, only Node1 used wsrep_gtid_domain_id as the domain ID in the GTID; the other nodes did not use the wsrep_gtid_domain_id I had set and instead used its default value (0).
I ran SHOW GLOBAL VARIABLES LIKE '%GTID%'; on each node, and every node reported wsrep_gtid_domain_id = 1.
If GTIDs differ between instances, I think that makes them useless, because they are no longer "global".
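As an illustration of that divergence, the following Python sketch groups the nodes by the position they reported after the commit on Node3. The positions are copied from the report; the helper name `group_by_position` is hypothetical, and a real check would read the values from each node rather than from a hard-coded dictionary.

```python
from collections import defaultdict

# gtid_binlog_pos / gtid_current_pos reported after the commit on Node3.
positions = {
    "Node1": "0-1-41,1-1-34",
    "Node2": "0-1-34,1-1-31",
    "Node3": "0-1-34,1-1-31",
}


def group_by_position(positions):
    """Group node names by their normalized GTID position."""
    groups = defaultdict(list)
    for node, pos in positions.items():
        # Normalize whitespace and GTID ordering so equal positions compare equal.
        key = ",".join(sorted(g.strip() for g in pos.split(",")))
        groups[key].append(node)
    return dict(groups)


groups = group_by_position(positions)
consistent = len(groups) == 1  # True only if every node reports the same position
for pos, nodes in groups.items():
    print(pos, "->", ", ".join(nodes))
print("consistent:", consistent)
```

With the values above, Node1 ends up in one group and Node2 and Node3 in another, so the cluster does not share a single GTID position.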
As a further check, I shut down Node1 and then started it again so that it rejoined the cluster.
I committed another transaction on Node3, and something else unexpected happened.
gtid_binlog_pos and gtid_current_pos on each node were:
Node1: 0-1-35,1-1-34
Node2: 0-1-35,1-1-31
Node3: 0-1-35,1-1-31
So Node1 did not use the wsrep_gtid_domain_id I had set either, and this time, when I ran SHOW GLOBAL VARIABLES LIKE '%GTID%';,
Node1 reported wsrep_gtid_domain_id as 0 rather than the value I had set (1).
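The change in Node1's behaviour can be seen by comparing two snapshots of its position and listing which domain's sequence number changed. This is only a sketch: the helper names `parse` and `changed_domains` are hypothetical, and the positions are the ones quoted in this report.

```python
def parse(pos):
    """Map domain_id -> seq_no for a position like '0-1-41,1-1-34'."""
    return {
        int(domain): int(seq)
        for domain, _server, seq in (g.strip().split("-") for g in pos.split(","))
    }


def changed_domains(before, after):
    """Return {domain_id: (old_seq, new_seq)} for domains whose seq_no differs."""
    b, a = parse(before), parse(after)
    return {
        d: (b.get(d), a.get(d))
        for d in sorted(set(b) | set(a))
        if b.get(d) != a.get(d)
    }


# Node1 across its own first commit: only domain 1 changes.
first_commit = changed_domains("0-1-41,1-1-31", "0-1-41,1-1-32")  # {1: (31, 32)}

# Node1 before its restart vs. after rejoining and a commit on Node3:
# only domain 0 changes.
after_restart = changed_domains("0-1-41,1-1-34", "0-1-35,1-1-34")  # {0: (41, 35)}

print(first_commit, after_restart)
```

Before the restart Node1 recorded changes in domain 1, and after the restart it recorded them in domain 0, which matches the wsrep_gtid_domain_id value of 0 that it reported.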
I tried other versions (10.6.13, 10.6.14, 10.6.15), and all of them had the same issue.
I noticed that all of them are bundled with Galera 4 26.4.14, so I tried 10.6.11, which is bundled with Galera 4 26.4.13;
this version did not have the issue.
I thought it was an issue in the plug-in, so I replaced the Galera Cluster library files in MariaDB 10.6.12 (MariaDB 10.6.12 + Galera 4 26.4.13),
but the issue could still be reproduced. I then tried 10.5.19, and the same issue existed there.
Based on the testing described above, I think this is a bug in MariaDB.
Please help to confirm. Thank you.