Details
Bug
Status: Closed
Major
Resolution: Fixed
6.4.1
None
MXS-SPRINT-165
Description
In the upcoming Xpand release 6.1 (code name: Isolation Peak), we are making a change to node state. The objective of this feature is to reduce the number of group changes when a node is added to the cluster or an existing node rejoins the cluster after recovering from downtime.
When a new node is added to the cluster (or when it rejoins), in most cases (but not all) there will not be any group change. A newly added node will join the cluster and will be marked as 'Late' until the next group change (i.e. a hard group change). On the next group change, this node will move from the 'Late' to the 'Normal' sub-state. While in the 'Late' state, it is available for all incoming transactions. Late nodes do not participate in some aspects of the system (e.g. lock manager, sequence manager); however, that should not affect MaxScale.
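The lifecycle described above can be illustrated with a small model. This is only a sketch of the common case (a node joins without triggering a group change); the `Cluster` class and its method names are hypothetical and do not reflect how Xpand implements this internally.

```python
from dataclasses import dataclass, field
from enum import Enum


class SubState(Enum):
    LATE = "late"
    NORMAL = "normal"


@dataclass
class Cluster:
    # Hypothetical in-memory model: node id -> sub-state.
    nodes: dict = field(default_factory=dict)
    group_changes: int = 0

    def add_node(self, nid: int) -> None:
        """A joining (or rejoining) node is marked Late; no group change happens."""
        self.nodes[nid] = SubState.LATE

    def hard_group_change(self) -> None:
        """On the next group change, every Late node moves to Normal."""
        self.group_changes += 1
        for nid, state in self.nodes.items():
            if state is SubState.LATE:
                self.nodes[nid] = SubState.NORMAL

    def accepts_transactions(self, nid: int) -> bool:
        """Late and Normal nodes are both available for incoming transactions."""
        return nid in self.nodes


cluster = Cluster(nodes={1: SubState.NORMAL, 2: SubState.NORMAL, 3: SubState.NORMAL})
cluster.add_node(4)          # node 4 joins as Late, group_changes stays at 0
cluster.hard_group_change()  # node 4 is promoted to Normal
```

In this model the only event that changes a node's sub-state is the group change itself, which is the behaviour a monitor would observe from the outside.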
This new sub-state is unknown to MaxScale, which cannot make sense of it and therefore logs a warning.
I see the following in the logs when I add a new node to an already existing cluster. The new node appears fine and accepts workload via MaxScale without any issues.
2022-08-25 19:50:01 warning: [xpandmon] 'late' is an unknown sub-state for a Xpand node.
2022-08-25 19:50:01 notice : Created server '@@Backend-Monitor:node-4' at 10.2.13.99:3306
2022-08-25 19:50:01 info : [xpandmon] Created volatile server found after 2 lookup attempts and a total sleep time of 2 milliseconds.
2022-08-25 19:50:01 info : [xpandmon] Updated Xpand node in bookkeeping: 4, '10.2.13.99', 3306, 3581.
2022-08-25 19:50:11 warning: [xpandmon] 'late' is an unknown sub-state for a Xpand node.
On Xpand:
[root@karma060 ~]# /opt/clustrix/bin/clx stat
Cluster Name: clca72aceead5c1f26
Cluster Version: Xpand-mainline1-17821
Cluster Status: OK
Cluster Size: 4 nodes - 16 CPUs per Node
Current Node: karma060 - nid 1

nid | Hostname  | Status | Substate | IP Address   | TPS | Used          | Total
----+-----------+--------+----------+--------------+-----+---------------+--------
  1 | karma060  | OK     | normal   | 10.2.15.149  |  35 | 1.1G (0.14%)  | 761.9G
  2 | karma090  | OK     | normal   | 10.2.14.208  |   0 | 1.2G (0.16%)  | 761.9G
  3 | karma065  | OK     | normal   | 10.2.14.119  |   0 | 1.2G (0.15%)  | 761.9G
  4 | karma199  | OK     | late     | 10.2.13.99   |   0 | 1.3G (0.17%)  | 761.9G
----+-----------+--------+----------+--------------+-----+---------------+--------
                                                      35 | 4.7G (0.15%)  | 3.0T
On MaxScale:
[root@karma075 ~]# maxctrl list servers
┌──────────────────────────┬─────────────────────────────┬──────┬─────────────┬─────────────────┬──────┐
│ Server                   │ Address                     │ Port │ Connections │ State           │ GTID │
├──────────────────────────┼─────────────────────────────┼──────┼─────────────┼─────────────────┼──────┤
│ xpand1                   │ karma060.colo.sproutsys.com │ 3306 │ 0           │ Master, Running │      │
├──────────────────────────┼─────────────────────────────┼──────┼─────────────┼─────────────────┼──────┤
│ xpand2                   │ karma065.colo.sproutsys.com │ 3306 │ 0           │ Master, Running │      │
├──────────────────────────┼─────────────────────────────┼──────┼─────────────┼─────────────────┼──────┤
│ @@Backend-Monitor:node-1 │ 10.2.15.149                 │ 3306 │ 4           │ Master, Running │      │
├──────────────────────────┼─────────────────────────────┼──────┼─────────────┼─────────────────┼──────┤
│ @@Backend-Monitor:node-2 │ 10.2.14.208                 │ 3306 │ 7           │ Master, Running │      │
├──────────────────────────┼─────────────────────────────┼──────┼─────────────┼─────────────────┼──────┤
│ @@Backend-Monitor:node-3 │ 10.2.14.119                 │ 3306 │ 3           │ Master, Running │      │
├──────────────────────────┼─────────────────────────────┼──────┼─────────────┼─────────────────┼──────┤
│ @@Backend-Monitor:node-4 │ 10.2.13.99                  │ 3306 │ 3           │ Master, Running │      │
└──────────────────────────┴─────────────────────────────┴──────┴─────────────┴─────────────────┴──────┘
I am not sure how MaxScale uses the sub-state from Xpand, but I did not see any impact of this on MaxScale. In my opinion, we can safely ignore this sub-state: instead of a warning, we should log it at info level or simply not show it.
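To make the suggestion concrete, here is a minimal sketch of the proposed behaviour: 'late' is treated as a known sub-state and reported at info level, while genuinely unknown sub-states still produce a warning. This is not the xpandmon code (the function names `substate_log_level` and `report_substate` are hypothetical), and the real fix in MaxScale may look different.

```python
import logging

# Sub-states the monitor recognises; "late" is added per the proposal.
KNOWN_SUBSTATES = {"normal", "late"}


def substate_log_level(substate: str) -> int:
    """Pick a log level for a sub-state reported by an Xpand node."""
    name = substate.strip().lower()
    if name in KNOWN_SUBSTATES:
        # "late" is worth an info line; "normal" needs no attention.
        return logging.INFO if name == "late" else logging.DEBUG
    return logging.WARNING


def report_substate(logger: logging.Logger, substate: str) -> None:
    """Log the sub-state at the level chosen above."""
    level = substate_log_level(substate)
    if level == logging.WARNING:
        logger.log(level, "'%s' is an unknown sub-state for a Xpand node.", substate)
    else:
        logger.log(level, "Xpand node is in sub-state '%s'.", substate)


logging.basicConfig(level=logging.INFO)
report_substate(logging.getLogger("xpandmon_sketch"), "late")
```

Keeping the warning for sub-states outside the known set means a future, truly unexpected value would still be visible in the logs.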