Details
Bug
Status: Closed
Major
Resolution: Fixed
2.5.6
None
Sky-GCP
MaxScale version: MariaDB MaxScale 2.5.6 (Commit: fddc0526ee79ac9a87f7a7170f3204263240ab57)
MXS-SPRINT-123
Description
In SkySQL, when a node misbehaves for any reason, K8s kills it and starts a new instance with the same hostname but a different IP. When Xpandmon is configured to send traffic directly to Xpand nodes, it seems the replacement node is not identified correctly because it reappears with a different IP. As a result, MaxScale stops connecting new sessions and returns errors on existing sessions.
Here is an example that demonstrates this behavior. We have a 3-node cluster running in Sky-GCP, configured with Xpand and a MariaDB server running in the same pod (1:1 config). MaxScale is configured for both the frontend (MariaDB nodes) and the backend (Xpand nodes).
dev-jump:~/xpand-new $ kubectl exec -it t1-mdb-mxs-78cf79765-28fvz -- bash
Defaulting container name to maxscale.
Use 'kubectl describe pod/t1-mdb-mxs-78cf79765-28fvz -n db00007507' to see all of the containers in this pod.
[root@t1-mdb-mxs-78cf79765-28fvz /]# maxctrl -u $(cat /etc/maxscale-cfg/maxscale-api-username) -p\'$(cat /etc/maxscale-cfg/maxscale-api-password)\' list servers
┌────────────────────────┬─────────────────┬──────┬─────────────┬──────────────────────────────┬────────────────────────────┐
│ Server                 │ Address         │ Port │ Connections │ State                        │ GTID                       │
├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
│ @@Xpand-Monitor:node-3 │ 10.32.3.10      │ 3308 │ 0           │ Master, Running              │                            │
├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
│ @@Xpand-Monitor:node-2 │ 10.32.2.232     │ 3308 │ 0           │ Master, Running              │                            │
├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
│ 10.32.2.232            │ 10.32.2.232     │ 3306 │ 0           │ Relay Master, Slave, Running │ 50-50-7126,51-51-1,52-52-1 │
├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
│ 10.32.3.10             │ 10.32.3.10      │ 3306 │ 0           │ Relay Master, Slave, Running │ 50-50-7126,51-51-1,52-52-1 │
├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
│ @@Xpand-Monitor:node-1 │ 10.32.2.197     │ 3308 │ 0           │ Master, Running              │                            │
├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
│ 10.32.2.197            │ 10.32.2.197     │ 3306 │ 0           │ Master, Running              │ 50-50-7126,51-51-1,52-52-1 │
├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
│ Xpand-Bootstrap        │ t1-mxp-0.t1-mxp │ 3308 │ 0           │ Down                         │                            │
└────────────────────────┴─────────────────┴──────┴─────────────┴──────────────────────────────┴────────────────────────────┘
[root@t1-mdb-mxs-78cf79765-28fvz /]# exit
exit
We then kill one of the pods to mimic what K8s does when a node misbehaves:
dev-jump:~/xpand-new $ kubectl delete pod t1-mxp-2 && sleep 120
pod "t1-mxp-2" deleted
When the new pod appears, its IP has changed to 10.32.3.11, but the Xpand entry in MaxScale (@@Xpand-Monitor:node-3) still points to the old IP, 10.32.3.10.
dev-jump:~/xpand-new $ kubectl exec -it t1-mdb-mxs-78cf79765-28fvz -- bash
Defaulting container name to maxscale.
Use 'kubectl describe pod/t1-mdb-mxs-78cf79765-28fvz -n db00007507' to see all of the containers in this pod.
[root@t1-mdb-mxs-78cf79765-28fvz /]# maxctrl -u $(cat /etc/maxscale-cfg/maxscale-api-username) -p\'$(cat /etc/maxscale-cfg/maxscale-api-password)\' list servers
┌────────────────────────┬─────────────────┬──────┬─────────────┬──────────────────────────────┬────────────────────────────┐
│ Server                 │ Address         │ Port │ Connections │ State                        │ GTID                       │
├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
│ 10.32.3.11             │ 10.32.3.11      │ 3306 │ 0           │ Relay Master, Slave, Running │ 50-50-7126,51-51-1,52-52-1 │
├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
│ @@Xpand-Monitor:node-3 │ 10.32.3.10      │ 3308 │ 0           │ Master, Running              │                            │
├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
│ @@Xpand-Monitor:node-2 │ 10.32.2.232     │ 3308 │ 0           │ Master, Running              │                            │
├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
│ 10.32.2.232            │ 10.32.2.232     │ 3306 │ 0           │ Relay Master, Slave, Running │ 50-50-7126,51-51-1,52-52-1 │
├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
│ @@Xpand-Monitor:node-1 │ 10.32.2.197     │ 3308 │ 0           │ Master, Running              │                            │
├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
│ 10.32.2.197            │ 10.32.2.197     │ 3306 │ 0           │ Master, Running              │ 50-50-7126,51-51-1,52-52-1 │
├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
│ Xpand-Bootstrap        │ t1-mxp-0.t1-mxp │ 3308 │ 0           │ Down                         │                            │
└────────────────────────┴─────────────────┴──────┴─────────────┴──────────────────────────────┴────────────────────────────┘
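The mismatch in the listing above can also be spotted mechanically. In the 1:1 configuration described earlier, each pod IP should appear twice among the running servers: once on port 3306 (MariaDB) and once on port 3308 (Xpand). The following is only a minimal sketch of such a check in Python; `parse_servers` and `unpaired_addresses` are hypothetical helper names, not part of MaxScale or maxctrl, and the sample rows are copied from the output above.

```python
from collections import defaultdict
from typing import Dict, List

# Rows copied from the `maxctrl list servers` output taken after the pod restart.
SAMPLE = """
│ Server │ Address │ Port │ Connections │ State │ GTID │
│ 10.32.3.11 │ 10.32.3.11 │ 3306 │ 0 │ Relay Master, Slave, Running │ 50-50-7126,51-51-1,52-52-1 │
│ @@Xpand-Monitor:node-3 │ 10.32.3.10 │ 3308 │ 0 │ Master, Running │ │
│ @@Xpand-Monitor:node-2 │ 10.32.2.232 │ 3308 │ 0 │ Master, Running │ │
│ 10.32.2.232 │ 10.32.2.232 │ 3306 │ 0 │ Relay Master, Slave, Running │ 50-50-7126,51-51-1,52-52-1 │
│ @@Xpand-Monitor:node-1 │ 10.32.2.197 │ 3308 │ 0 │ Master, Running │ │
│ 10.32.2.197 │ 10.32.2.197 │ 3306 │ 0 │ Master, Running │ 50-50-7126,51-51-1,52-52-1 │
│ Xpand-Bootstrap │ t1-mxp-0.t1-mxp │ 3308 │ 0 │ Down │ │
"""


def parse_servers(table: str) -> List[Dict[str, str]]:
    """Turn the box-drawn table into one dict per server row (hypothetical helper)."""
    header = None
    rows = []
    for line in table.splitlines():
        line = line.strip()
        if not line.startswith("│"):
            continue  # skip borders, prompts, and blank lines
        cells = [cell.strip() for cell in line.strip("│").split("│")]
        if header is None:
            header = cells  # the first data line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def unpaired_addresses(rows: List[Dict[str, str]],
                       sql_port: str = "3306",
                       xpand_port: str = "3308") -> Dict[str, List[str]]:
    """Return running addresses that are not present on both ports.

    Assumes the 1:1 pod layout from the report, where every pod IP
    serves MariaDB on 3306 and Xpand on 3308.
    """
    ports = defaultdict(set)
    for row in rows:
        if "Running" in row["State"]:
            ports[row["Address"]].add(row["Port"])
    expected = {sql_port, xpand_port}
    return {addr: sorted(seen) for addr, seen in ports.items() if seen != expected}


if __name__ == "__main__":
    for addr, seen in unpaired_addresses(parse_servers(SAMPLE)).items():
        print(addr, "seen only on port(s)", ", ".join(seen))
```

With the sample rows, the check reports 10.32.3.11 only on port 3306 and 10.32.3.10 only on port 3308, which is the stale entry described in this report.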
However, the Xpand cluster itself identifies the new node correctly and comes up fine:
dev-jump:~ $ kubectl exec -it t1-mxp-0 -- bash
Defaulting container name to clustrix.
Use 'kubectl describe pod/t1-mxp-0 -n db00007507' to see all of the containers in this pod.
[root@t1-mxp-0 /]# /opt/clustrix/bin/clx status
Cluster Name: cl6299ac0c0fc1da53
Cluster Version: 5.3.13
Cluster Status: OK
Cluster Size: 3 nodes - 8 CPUs per Node
Current Node: t1-mxp-0 - nid 1
nid | Hostname  | Status | IP Address   | TPS | Used           | Total
----+-----------+--------+--------------+-----+----------------+--------
  1 | t1-mxp-0  | OK     | 10.32.2.197  |   0 | 9.5M (0.00%)   | 223.9G
  2 | t1-mxp-1  | OK     | 10.32.2.232  |   0 | 9.3M (0.00%)   | 223.9G
  3 | t1-mxp-2  | OK     | 10.32.3.11   |   0 | 9.7M (0.00%)   | 223.9G
----+-----------+--------+--------------+-----+----------------+--------
                                            0 | 28.6M (0.00%)  | 671.6G
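To make the disagreement between the two views explicit, the node table from `clx status` can be compared with the addresses MaxScale lists for its `@@Xpand-Monitor:node-N` servers. The sketch below is illustrative only: `clx_node_ips` and `stale_monitor_nodes` are hypothetical helper names, and it assumes that `node-N` in the MaxScale server name corresponds to Xpand nid N, which is consistent with the data in this report but is not confirmed by it.

```python
import re
from typing import Dict, Tuple

# Node rows copied from the `clx status` output above.
CLX_STATUS = """
nid | Hostname | Status | IP Address | TPS | Used | Total
1 | t1-mxp-0 | OK | 10.32.2.197 | 0 | 9.5M (0.00%) | 223.9G
2 | t1-mxp-1 | OK | 10.32.2.232 | 0 | 9.3M (0.00%) | 223.9G
3 | t1-mxp-2 | OK | 10.32.3.11 | 0 | 9.7M (0.00%) | 223.9G
0 | 28.6M (0.00%) | 671.6G
"""

# Addresses MaxScale reported for the monitor-discovered servers after the restart.
MONITOR_NODES = {
    "@@Xpand-Monitor:node-1": "10.32.2.197",
    "@@Xpand-Monitor:node-2": "10.32.2.232",
    "@@Xpand-Monitor:node-3": "10.32.3.10",
}

IPV4 = re.compile(r"\d+(\.\d+){3}")


def clx_node_ips(status: str) -> Dict[int, str]:
    """Map nid -> IP address using the per-node rows of `clx status`."""
    ips = {}
    for line in status.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # Node rows have a numeric nid and an IPv4 address in the fourth column;
        # the header, separator, and totals rows do not match and are skipped.
        if len(cells) >= 4 and cells[0].isdigit() and IPV4.fullmatch(cells[3]):
            ips[int(cells[0])] = cells[3]
    return ips


def stale_monitor_nodes(monitor: Dict[str, str],
                        cluster: Dict[int, str]) -> Dict[str, Tuple[str, str]]:
    """Return {server name: (MaxScale address, cluster address)} for mismatches.

    Assumption: the trailing number in the server name is the Xpand nid.
    """
    stale = {}
    for name, address in monitor.items():
        match = re.search(r"node-(\d+)$", name)
        if not match:
            continue
        nid = int(match.group(1))
        if nid in cluster and cluster[nid] != address:
            stale[name] = (address, cluster[nid])
    return stale


if __name__ == "__main__":
    for name, (old, new) in stale_monitor_nodes(MONITOR_NODES, clx_node_ips(CLX_STATUS)).items():
        print(f"{name}: MaxScale has {old}, cluster reports {new}")
```

For the data in this report, only `@@Xpand-Monitor:node-3` is flagged: MaxScale still lists 10.32.3.10 while the cluster reports 10.32.3.11 for nid 3.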
List of pods in this setup:
dev-jump:~/xpand-new $ kubectl get pods -o wide
NAME                           READY   STATUS    RESTARTS   AGE     IP            NODE                                     NOMINATED NODE   READINESS GATES
t1-mdb-mxs-78cf79765-28fvz     3/3     Running   0          11m     10.32.3.37    gke-xpand-proj-user-n1s4-0145145e-1cp3   <none>           <none>
t1-mdb-state-f4d489fd5-27rxz   1/1     Running   0          11m     10.32.1.15    gke-xpand-proj-user-n1s1-f50de098-mqqj   <none>           <none>
t1-mxp-0                       4/4     Running   0          11m     10.32.2.197   gke-xpand-proj-user-n1s8-a767dc3a-3d0f   <none>           <none>
t1-mxp-1                       4/4     Running   0          8m58s   10.32.2.232   gke-xpand-proj-user-n1s8-30eafa84-lqwh   <none>           <none>
t1-mxp-2                       4/4     Running   0          2m26s   10.32.3.11    gke-xpand-proj-user-n1s8-767aa0a1-nx6x   <none>           <none>
dev-jump:~/xpand-new $ kubectl logs t1-mdb-mxs-78cf79765-28fvz maxscale | grep "MariaDB MaxScale"
2021-01-14 16:27:19 notice : MariaDB MaxScale 2.5.6 started (Commit: fddc0526ee79ac9a87f7a7170f3204263240ab57)
dev-jump:~/xpand-new $