MariaDB MaxScale / MXS-3374

MaxScale fails to update the IP of an existing node that reappears with an IP change

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.5.6
    • Fix Version/s: 2.5.7
    • Component/s: xpandmon
    • Labels: None
    • Environment: Sky-GCP
      MaxScale version: MariaDB MaxScale 2.5.6 (Commit: fddc0526ee79ac9a87f7a7170f3204263240ab57)
    • Sprint: MXS-SPRINT-123

    Description

      In SkySQL, when a node misbehaves for any reason, K8s kills it and starts a new instance with the same hostname but a different IP. When Xpandmon is configured to send traffic directly to the Xpand nodes, the replacement node does not seem to be identified correctly, because it reappears with a different IP. As a result, MaxScale stops connecting new sessions and returns errors on existing sessions.
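
As a rough illustration of the behavior one would expect here (this is not MaxScale's actual code), the sketch below tracks nodes by node id and refreshes the stored address when an already-known node is reported with a new IP. `NodeEntry` and `refresh_nodes` are hypothetical names chosen for this example; the addresses are the ones that appear in this report.

```python
from dataclasses import dataclass


@dataclass
class NodeEntry:
    """Hypothetical record of one cluster node as tracked by a monitor."""
    node_id: int
    address: str


def refresh_nodes(known, membership):
    """Reconcile tracked nodes with the latest membership (node id -> IP).

    New node ids are added; known ids whose IP changed get their address
    updated. Returns the list of node ids whose address was updated.
    """
    changed = []
    for node_id, address in membership.items():
        entry = known.get(node_id)
        if entry is None:
            known[node_id] = NodeEntry(node_id, address)
        elif entry.address != address:
            entry.address = address  # same node, new IP: update in place
            changed.append(node_id)
    return changed


# State before the pod restart (addresses taken from this report).
known = {
    1: NodeEntry(1, "10.32.2.197"),
    2: NodeEntry(2, "10.32.2.232"),
    3: NodeEntry(3, "10.32.3.10"),
}
# Membership after node 3 came back with a different IP.
membership = {1: "10.32.2.197", 2: "10.32.2.232", 3: "10.32.3.11"}

changed = refresh_nodes(known, membership)
```

Keying on the node id rather than the IP is the design choice that matters in this sketch: the node's identity survives the restart, only its address changes.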

      The following example demonstrates the problem. We have a 3-node cluster running in Sky-GCP. In this configuration, Xpand runs with the MariaDB server in the same pod (1:1 config). MaxScale is configured for both the frontend (MariaDB nodes) and the backend (Xpand nodes).

      dev-jump:~/xpand-new $ kubectl exec -it t1-mdb-mxs-78cf79765-28fvz -- bash
      Defaulting container name to maxscale.
      Use 'kubectl describe pod/t1-mdb-mxs-78cf79765-28fvz -n db00007507' to see all of the containers in this pod.
      [root@t1-mdb-mxs-78cf79765-28fvz /]# maxctrl -u $(cat /etc/maxscale-cfg/maxscale-api-username) -p\'$(cat /etc/maxscale-cfg/maxscale-api-password)\' list servers
      ┌────────────────────────┬─────────────────┬──────┬─────────────┬──────────────────────────────┬────────────────────────────┐
      │ Server                 │ Address         │ Port │ Connections │ State                        │ GTID                       │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ @@Xpand-Monitor:node-3 │ 10.32.3.10      │ 3308 │ 0           │ Master, Running              │                            │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ @@Xpand-Monitor:node-2 │ 10.32.2.232     │ 3308 │ 0           │ Master, Running              │                            │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ 10.32.2.232            │ 10.32.2.232     │ 3306 │ 0           │ Relay Master, Slave, Running │ 50-50-7126,51-51-1,52-52-1 │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ 10.32.3.10             │ 10.32.3.10      │ 3306 │ 0           │ Relay Master, Slave, Running │ 50-50-7126,51-51-1,52-52-1 │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ @@Xpand-Monitor:node-1 │ 10.32.2.197     │ 3308 │ 0           │ Master, Running              │                            │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ 10.32.2.197            │ 10.32.2.197     │ 3306 │ 0           │ Master, Running              │ 50-50-7126,51-51-1,52-52-1 │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ Xpand-Bootstrap        │ t1-mxp-0.t1-mxp │ 3308 │ 0           │ Down                         │                            │
      └────────────────────────┴─────────────────┴──────┴─────────────┴──────────────────────────────┴────────────────────────────┘
      [root@t1-mdb-mxs-78cf79765-28fvz /]# exit
      exit
      

      Next, we kill one of the pods to mimic what K8s does when a node misbehaves:

      dev-jump:~/xpand-new $ kubectl delete pod t1-mxp-2 && sleep 120
      pod "t1-mxp-2" deleted
      

      When the new pod appears, its IP has changed to 10.32.3.11; however, the Xpand monitor entry in MaxScale (@@Xpand-Monitor:node-3) still points to the old IP, 10.32.3.10.

      dev-jump:~/xpand-new $ kubectl exec -it t1-mdb-mxs-78cf79765-28fvz -- bash
      Defaulting container name to maxscale.
      Use 'kubectl describe pod/t1-mdb-mxs-78cf79765-28fvz -n db00007507' to see all of the containers in this pod.
      [root@t1-mdb-mxs-78cf79765-28fvz /]# maxctrl -u $(cat /etc/maxscale-cfg/maxscale-api-username) -p\'$(cat /etc/maxscale-cfg/maxscale-api-password)\' list servers
      ┌────────────────────────┬─────────────────┬──────┬─────────────┬──────────────────────────────┬────────────────────────────┐
      │ Server                 │ Address         │ Port │ Connections │ State                        │ GTID                       │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ 10.32.3.11             │ 10.32.3.11      │ 3306 │ 0           │ Relay Master, Slave, Running │ 50-50-7126,51-51-1,52-52-1 │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ @@Xpand-Monitor:node-3 │ 10.32.3.10      │ 3308 │ 0           │ Master, Running              │                            │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ @@Xpand-Monitor:node-2 │ 10.32.2.232     │ 3308 │ 0           │ Master, Running              │                            │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ 10.32.2.232            │ 10.32.2.232     │ 3306 │ 0           │ Relay Master, Slave, Running │ 50-50-7126,51-51-1,52-52-1 │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ @@Xpand-Monitor:node-1 │ 10.32.2.197     │ 3308 │ 0           │ Master, Running              │                            │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ 10.32.2.197            │ 10.32.2.197     │ 3306 │ 0           │ Master, Running              │ 50-50-7126,51-51-1,52-52-1 │
      ├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
      │ Xpand-Bootstrap        │ t1-mxp-0.t1-mxp │ 3308 │ 0           │ Down                         │                            │
      └────────────────────────┴─────────────────┴──────┴─────────────┴──────────────────────────────┴────────────────────────────┘
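
A stale entry like this can also be spotted mechanically. The following is a minimal sketch, assuming the `maxctrl list servers` output has been captured as text and the pod IPs are already known (here they are copied from the `kubectl get pods -o wide` listing in this report). `parse_maxctrl_table` and `stale_servers` are hypothetical helper names, not part of MaxScale or MaxCtrl.

```python
import ipaddress

# Table copied from this report, trimmed to four rows.
MAXCTRL_OUTPUT = """\
┌────────────────────────┬─────────────────┬──────┬─────────────┬──────────────────────────────┬────────────────────────────┐
│ Server                 │ Address         │ Port │ Connections │ State                        │ GTID                       │
├────────────────────────┼─────────────────┼──────┼─────────────┼──────────────────────────────┼────────────────────────────┤
│ 10.32.3.11             │ 10.32.3.11      │ 3306 │ 0           │ Relay Master, Slave, Running │ 50-50-7126,51-51-1,52-52-1 │
│ @@Xpand-Monitor:node-3 │ 10.32.3.10      │ 3308 │ 0           │ Master, Running              │                            │
│ @@Xpand-Monitor:node-2 │ 10.32.2.232     │ 3308 │ 0           │ Master, Running              │                            │
│ Xpand-Bootstrap        │ t1-mxp-0.t1-mxp │ 3308 │ 0           │ Down                         │                            │
└────────────────────────┴─────────────────┴──────┴─────────────┴──────────────────────────────┴────────────────────────────┘
"""

# Pod IPs as reported by `kubectl get pods -o wide` in this report.
POD_IPS = {"10.32.3.37", "10.32.1.15", "10.32.2.197", "10.32.2.232", "10.32.3.11"}


def parse_maxctrl_table(text):
    """Turn the box-drawing table into a list of dicts keyed by column name."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("│"):
            continue  # border lines carry no data
        cells = [cell.strip() for cell in line[1:-1].split("│")]
        if header is None:
            header = cells  # the first data line is the header row
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def stale_servers(servers, pod_ips):
    """Return names of servers whose Address is an IP that no running pod owns."""
    stale = []
    for server in servers:
        try:
            ipaddress.ip_address(server["Address"])
        except ValueError:
            continue  # hostname entries (the bootstrap server) are skipped
        if server["Address"] not in pod_ips:
            stale.append(server["Server"])
    return stale


stale = stale_servers(parse_maxctrl_table(MAXCTRL_OUTPUT), POD_IPS)
print(stale)  # ['@@Xpand-Monitor:node-3']
```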
      

      However, the Xpand cluster itself picks up the new IP correctly and comes up fine:

      dev-jump:~ $ kubectl exec -it t1-mxp-0 -- bash
      Defaulting container name to clustrix.
      Use 'kubectl describe pod/t1-mxp-0 -n db00007507' to see all of the containers in this pod.
      [root@t1-mxp-0 /]# /opt/clustrix/bin/clx status
      Cluster Name:    cl6299ac0c0fc1da53
      Cluster Version: 5.3.13
      Cluster Status:   OK 
      Cluster Size:    3 nodes - 8 CPUs per Node
      Current Node:    t1-mxp-0 - nid 1
      nid |  Hostname | Status |  IP Address  | TPS |      Used      |  Total 
      ----+-----------+--------+--------------+-----+----------------+--------
        1 |  t1-mxp-0 |    OK  |  10.32.2.197 |   0 |   9.5M (0.00%) |  223.9G
        2 |  t1-mxp-1 |    OK  |  10.32.2.232 |   0 |   9.3M (0.00%) |  223.9G
        3 |  t1-mxp-2 |    OK  |   10.32.3.11 |   0 |   9.7M (0.00%) |  223.9G
      ----+-----------+--------+--------------+-----+----------------+--------
                                                  0 |  28.6M (0.00%) |  671.6G
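
To make the discrepancy between the two views explicit, the sketch below compares the node table from `clx status` with the addresses MaxScale reports for its `@@Xpand-Monitor:node-N` servers. It assumes that the `N` in the server name corresponds to the Xpand `nid`; `parse_clx_nodes` and `address_mismatches` are hypothetical helpers, and the data is copied from this report.

```python
import re

# Node table copied from the `clx status` output in this report.
CLX_STATUS = """\
nid |  Hostname | Status |  IP Address  | TPS |      Used      |  Total
----+-----------+--------+--------------+-----+----------------+--------
  1 |  t1-mxp-0 |    OK  |  10.32.2.197 |   0 |   9.5M (0.00%) |  223.9G
  2 |  t1-mxp-1 |    OK  |  10.32.2.232 |   0 |   9.3M (0.00%) |  223.9G
  3 |  t1-mxp-2 |    OK  |   10.32.3.11 |   0 |   9.7M (0.00%) |  223.9G
----+-----------+--------+--------------+-----+----------------+--------
                                            0 |  28.6M (0.00%) |  671.6G
"""

# Addresses MaxScale lists for the Xpand monitor's servers after the restart.
MAXSCALE_XPAND_SERVERS = {
    "@@Xpand-Monitor:node-1": "10.32.2.197",
    "@@Xpand-Monitor:node-2": "10.32.2.232",
    "@@Xpand-Monitor:node-3": "10.32.3.10",
}


def parse_clx_nodes(text):
    """Map nid -> IP address from the node rows of `clx status` output."""
    nodes = {}
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # Node rows have seven cells and a numeric nid in the first one;
        # the header, separator and totals lines do not match.
        if len(cells) == 7 and cells[0].isdigit():
            nodes[int(cells[0])] = cells[3]
    return nodes


def address_mismatches(maxscale_servers, clx_nodes):
    """Return {nid: (maxscale_address, xpand_address)} where the two differ."""
    result = {}
    for name, address in maxscale_servers.items():
        match = re.search(r":node-(\d+)$", name)
        if match is None:
            continue  # not a dynamically named monitor server
        nid = int(match.group(1))  # assumption: N in node-N is the Xpand nid
        actual = clx_nodes.get(nid)
        if actual is not None and actual != address:
            result[nid] = (address, actual)
    return result


mismatches = address_mismatches(MAXSCALE_XPAND_SERVERS, parse_clx_nodes(CLX_STATUS))
print(mismatches)  # {3: ('10.32.3.10', '10.32.3.11')}
```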
      

      List of pods in this setup:

      dev-jump:~/xpand-new $ kubectl get pods -o wide
      NAME                           READY   STATUS    RESTARTS   AGE     IP            NODE                                     NOMINATED NODE   READINESS GATES
      t1-mdb-mxs-78cf79765-28fvz     3/3     Running   0          11m     10.32.3.37    gke-xpand-proj-user-n1s4-0145145e-1cp3   <none>           <none>
      t1-mdb-state-f4d489fd5-27rxz   1/1     Running   0          11m     10.32.1.15    gke-xpand-proj-user-n1s1-f50de098-mqqj   <none>           <none>
      t1-mxp-0                       4/4     Running   0          11m     10.32.2.197   gke-xpand-proj-user-n1s8-a767dc3a-3d0f   <none>           <none>
      t1-mxp-1                       4/4     Running   0          8m58s   10.32.2.232   gke-xpand-proj-user-n1s8-30eafa84-lqwh   <none>           <none>
      t1-mxp-2                       4/4     Running   0          2m26s   10.32.3.11    gke-xpand-proj-user-n1s8-767aa0a1-nx6x   <none>           <none>
      dev-jump:~/xpand-new $ kubectl logs t1-mdb-mxs-78cf79765-28fvz maxscale | grep "MariaDB MaxScale"
      2021-01-14 16:27:19   notice : MariaDB MaxScale 2.5.6 started (Commit: fddc0526ee79ac9a87f7a7170f3204263240ab57)
      dev-jump:~/xpand-new $
      

      Attachments

        1. maxscale.11a874d96b69829d967676047af18badc9ba884f.log
          31 kB
          Jens Röwekamp
        2. maxscale.cnf.rtf
          2 kB
          Manjinder Nijjar
        3. maxscale-after.log
          11 kB
          Jens Röwekamp
        4. maxscale-before.log
          8 kB
          Jens Röwekamp

          People

            Assignee: johan.wikman Johan Wikman
            Reporter: msnijjar Manjinder Nijjar (Inactive)