MariaDB MaxScale / MXS-3374

MaxScale fails to update the IP of an existing node that reappears with an IP change

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.5.6
    • Fix Version/s: 2.5.7
    • Component/s: xpandmon
    • Labels:
      None
    • Environment:
      Sky-GCP
      MaxScale version: MariaDB MaxScale 2.5.6 (Commit: fddc0526ee79ac9a87f7a7170f3204263240ab57)
    • Sprint:
      MXS-SPRINT-123

      Description

      In SkySQL, when a node misbehaves for any reason, K8s kills the node and starts a new instance with the same hostname but a different IP. When Xpandmon is configured to send traffic directly to Xpand nodes, the replacement node does not seem to be identified correctly, since it reappears with a different IP. As a result, MaxScale stops connecting new sessions and returns errors on existing sessions.

      Here is an example that demonstrates this behavior. We have a 3-node cluster running in Sky-GCP. In this configuration, Xpand runs with a MariaDB server in the same pod (1:1 config). MaxScale is configured for both the frontend (MariaDB nodes) and the backend (Xpand nodes).

      dev-jump:~/xpand-new $ kubectl exec -it t1-mdb-mxs-78cf79765-28fvz -- bash
      Defaulting container name to maxscale.
      Use 'kubectl describe pod/t1-mdb-mxs-78cf79765-28fvz -n db00007507' to see all of the containers in this pod.
      [root@t1-mdb-mxs-78cf79765-28fvz /]# maxctrl -u $(cat /etc/maxscale-cfg/maxscale-api-username) -p\'$(cat /etc/maxscale-cfg/maxscale-api-password)\' list servers
      ┌────────────────────────┬─────────────────┬──────┬─────────────┬──────────────────────────────┬────────────────────────────┐
      │ Server                 │ Address         │ Port │ Connections │ State                        │ GTID                       │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ @@Xpand-Monitor:node-3 │ 10.32.3.10      │ 3308 │ 0           │ Master, Running              │                            │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ @@Xpand-Monitor:node-2 │ 10.32.2.232     │ 3308 │ 0           │ Master, Running              │                            │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ 10.32.2.232            │ 10.32.2.232     │ 3306 │ 0           │ Relay Master, Slave, Running │ 50-50-7126,51-51-1,52-52-1 │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ 10.32.3.10             │ 10.32.3.10      │ 3306 │ 0           │ Relay Master, Slave, Running │ 50-50-7126,51-51-1,52-52-1 │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ @@Xpand-Monitor:node-1 │ 10.32.2.197     │ 3308 │ 0           │ Master, Running              │                            │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ 10.32.2.197            │ 10.32.2.197     │ 3306 │ 0           │ Master, Running              │ 50-50-7126,51-51-1,52-52-1 │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ Xpand-Bootstrap        │ t1-mxp-0.t1-mxp │ 3308 │ 0           │ Down                         │                            │
      └────────────────────────┴─────────────────┴──────┴─────────────┴──────────────────────────────┴────────────────────────────┘
      [root@t1-mdb-mxs-78cf79765-28fvz /]# exit
      exit
      

      We then kill one of the pods to mimic what K8s does when a node misbehaves:

      dev-jump:~/xpand-new $ kubectl delete pod t1-mxp-2 && sleep 120
      pod "t1-mxp-2" deleted
      

      When the new pod appears, its IP has changed to 10.32.3.11; however, the Xpand monitor entry in MaxScale (@@Xpand-Monitor:node-3) still points to the old IP, 10.32.3.10.

      dev-jump:~/xpand-new $ kubectl exec -it t1-mdb-mxs-78cf79765-28fvz -- bash
      Defaulting container name to maxscale.
      Use 'kubectl describe pod/t1-mdb-mxs-78cf79765-28fvz -n db00007507' to see all of the containers in this pod.
      [root@t1-mdb-mxs-78cf79765-28fvz /]# maxctrl -u $(cat /etc/maxscale-cfg/maxscale-api-username) -p\'$(cat /etc/maxscale-cfg/maxscale-api-password)\' list servers
      ┌────────────────────────┬─────────────────┬──────┬─────────────┬──────────────────────────────┬────────────────────────────┐
      │ Server                 │ Address         │ Port │ Connections │ State                        │ GTID                       │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ 10.32.3.11             │ 10.32.3.11      │ 3306 │ 0           │ Relay Master, Slave, Running │ 50-50-7126,51-51-1,52-52-1 │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ @@Xpand-Monitor:node-3 │ 10.32.3.10      │ 3308 │ 0           │ Master, Running              │                            │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ @@Xpand-Monitor:node-2 │ 10.32.2.232     │ 3308 │ 0           │ Master, Running              │                            │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ 10.32.2.232            │ 10.32.2.232     │ 3306 │ 0           │ Relay Master, Slave, Running │ 50-50-7126,51-51-1,52-52-1 │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ @@Xpand-Monitor:node-1 │ 10.32.2.197     │ 3308 │ 0           │ Master, Running              │                            │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ 10.32.2.197            │ 10.32.2.197     │ 3306 │ 0           │ Master, Running              │ 50-50-7126,51-51-1,52-52-1 │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ Xpand-Bootstrap        │ t1-mxp-0.t1-mxp │ 3308 │ 0           │ Down                         │                            │
      └────────────────────────┴─────────────────┴──────┴─────────────┴──────────────────────────────┴────────────────────────────┘
      

      However, the Xpand cluster itself identifies the new address correctly and comes up fine:

      dev-jump:~ $ kubectl exec -it t1-mxp-0 -- bash
      Defaulting container name to clustrix.
      Use 'kubectl describe pod/t1-mxp-0 -n db00007507' to see all of the containers in this pod.
      [root@t1-mxp-0 /]# /opt/clustrix/bin/clx status
      Cluster Name:    cl6299ac0c0fc1da53
      Cluster Version: 5.3.13
      Cluster Status:   OK 
      Cluster Size:    3 nodes - 8 CPUs per Node
      Current Node:    t1-mxp-0 - nid 1
      nid |  Hostname | Status |  IP Address  | TPS |      Used      |  Total 
      ----+-----------+--------+--------------+-----+----------------+--------
        1 |  t1-mxp-0 |    OK  |  10.32.2.197 |   0 |   9.5M (0.00%) |  223.9G
        2 |  t1-mxp-1 |    OK  |  10.32.2.232 |   0 |   9.3M (0.00%) |  223.9G
        3 |  t1-mxp-2 |    OK  |   10.32.3.11 |   0 |   9.7M (0.00%) |  223.9G
      ----+-----------+--------+--------------+-----+----------------+--------
                                                  0 |  28.6M (0.00%) |  671.6G
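
      The mismatch between the MaxScale listing and the clx status output can also be checked mechanically. The following is a minimal sketch, not part of MaxScale or Xpand: the addresses are copied by hand from the two listings in this report, the function name stale_entries is hypothetical, and the code assumes that the numeric suffix of a "@@Xpand-Monitor:node-N" entry is the Xpand node id (nid).

```python
# Addresses copied by hand from the listings in this report.
# MaxScale's view: Xpand monitor entries from `maxctrl list servers` (port 3308).
maxscale_view = {
    "@@Xpand-Monitor:node-1": "10.32.2.197",
    "@@Xpand-Monitor:node-2": "10.32.2.232",
    "@@Xpand-Monitor:node-3": "10.32.3.10",
}

# Xpand's view: nid -> IP address from `clx status`.
xpand_view = {1: "10.32.2.197", 2: "10.32.2.232", 3: "10.32.3.11"}


def stale_entries(maxscale_view, xpand_view):
    """Return (server, maxscale_ip, xpand_ip) for each entry whose IP differs."""
    stale = []
    for server, ms_ip in sorted(maxscale_view.items()):
        # Assumption: the suffix of "@@Xpand-Monitor:node-N" is the Xpand nid.
        nid = int(server.rsplit("-", 1)[1])
        xp_ip = xpand_view.get(nid)
        if xp_ip is not None and xp_ip != ms_ip:
            stale.append((server, ms_ip, xp_ip))
    return stale


if __name__ == "__main__":
    for server, old, new in stale_entries(maxscale_view, xpand_view):
        print(f"{server}: MaxScale has {old}, Xpand reports {new}")
```

      With the data above, the only entry reported is @@Xpand-Monitor:node-3, where MaxScale still has 10.32.3.10 while Xpand reports 10.32.3.11.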
      

      List of pods in this setup:

      dev-jump:~/xpand-new $ kubectl get pods -o wide
      NAME                           READY   STATUS    RESTARTS   AGE     IP            NODE                                     NOMINATED NODE   READINESS GATES
      t1-mdb-mxs-78cf79765-28fvz     3/3     Running   0          11m     10.32.3.37    gke-xpand-proj-user-n1s4-0145145e-1cp3   <none>           <none>
      t1-mdb-state-f4d489fd5-27rxz   1/1     Running   0          11m     10.32.1.15    gke-xpand-proj-user-n1s1-f50de098-mqqj   <none>           <none>
      t1-mxp-0                       4/4     Running   0          11m     10.32.2.197   gke-xpand-proj-user-n1s8-a767dc3a-3d0f   <none>           <none>
      t1-mxp-1                       4/4     Running   0          8m58s   10.32.2.232   gke-xpand-proj-user-n1s8-30eafa84-lqwh   <none>           <none>
      t1-mxp-2                       4/4     Running   0          2m26s   10.32.3.11    gke-xpand-proj-user-n1s8-767aa0a1-nx6x   <none>           <none>
      dev-jump:~/xpand-new $ kubectl logs t1-mdb-mxs-78cf79765-28fvz maxscale | grep "MariaDB MaxScale"
      2021-01-14 16:27:19   notice : MariaDB MaxScale 2.5.6 started (Commit: fddc0526ee79ac9a87f7a7170f3204263240ab57)
      dev-jump:~/xpand-new $
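
      To make the expected behavior concrete: the replacement pod keeps its hostname and node id but gets a new IP, so a monitor would need to treat it as the same node with an updated address. The sketch below only illustrates that idea and does not reflect how xpandmon is implemented or how the issue was fixed; the Node record and the refresh function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical record for one monitored node."""
    node_id: int
    ip: str


def refresh(known, reported):
    """Update `known` in place from `reported` ({node_id: ip}).

    Returns the ids of already-known nodes whose IP changed.
    """
    changed = []
    for node_id, ip in reported.items():
        node = known.get(node_id)
        if node is None:
            known[node_id] = Node(node_id, ip)  # first time this node is seen
        elif node.ip != ip:
            node.ip = ip  # same node id, new address: replace the stale IP
            changed.append(node_id)
    return changed


if __name__ == "__main__":
    known = {3: Node(3, "10.32.3.10")}
    print(refresh(known, {3: "10.32.3.11"}))
    print(known[3].ip)
```

      Keying the record by node id rather than by address is the design choice that matters here: an IP change then updates an existing entry instead of leaving a stale one behind.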
      

      People

      Assignee: Johan Wikman (johan.wikman)
      Reporter: Manjinder Nijjar (msnijjar)