MariaDB MaxScale / MXS-2010

MaxScale Failover Not Working as Expected

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.13
    • Fix Version/s: 2.3.0
    • Component/s: mariadbmon
    • Labels:
      None
    • Sprint:
      MXS-SPRINT-64, MXS-SPRINT-65

      Description

      While testing failover scenarios with MaxScale 2.2.13, we noticed that one scenario fails.

      We repeated the test several times and observed the same result each time.

      Scenario:

      node1 ==> Master
      node2 ==> Slave
      node3 ==> Slave

      success
      ===================================

      node1 ==> down , rejoined as slave
      node2 ==> Master, Promoted as Master
      node3 ==> Slave , No Change

      Success

      ===================================

      When node1 and node2 are brought down at the same time, node3 is successfully promoted to Master.
      node1 ==> down , Down At a Time
      node2 ==> down , Down At a Time
      node3 ==> Master , Promoted as Master

      Success

      ====================================
      ====================================
      When both nodes are brought back up at the same time (node2 followed by node1):

      node1 ==> Running
      node2 ==> Running
      node3 ==> Already Master

      Output:

      node1 ==> Slave Running
      node2 ==> Master Running
      node3 ==> Running (Out of Cluster) Including Data Loss

      Failure

      In the above scenario we started both nodes at the same time, node2 followed by node1.
      The current Master (node3) lost its Master status and dropped to the plain Running state.

      [maxscale@x18tcldgpapp06 ~]$ maxctrl list servers
      ┌────────┬──────────────┬──────┬─────────────┬─────────────────┬────────────┐
      │ Server │ Address      │ Port │ Connections │ State           │ GTID       │
      ├────────┼──────────────┼──────┼─────────────┼─────────────────┼────────────┤
      │ node1  │ 10.1.1.96    │ 6603 │ 0           │ Down            │ 1-3-341679 │
      ├────────┼──────────────┼──────┼─────────────┼─────────────────┼────────────┤
      │ node2  │ 10.1.1.81    │ 6603 │ 0           │ Down            │ 1-3-341679 │
      ├────────┼──────────────┼──────┼─────────────┼─────────────────┼────────────┤
      │ node3  │ 10.1.1.82    │ 6603 │ 0           │ Master, Running │ 1-3-341679 │
      └────────┴──────────────┴──────┴─────────────┴─────────────────┴────────────┘
      [maxscale@x18tcldgpapp06 ~]$ maxctrl list servers
      ┌────────┬──────────────┬──────┬─────────────┬─────────────────┬────────────┐
      │ Server │ Address      │ Port │ Connections │ State           │ GTID       │
      ├────────┼──────────────┼──────┼─────────────┼─────────────────┼────────────┤
      │ node1  │ 10.1.1.96    │ 6603 │ 0           │ Slave, Running  │ 1-3-341679 │
      ├────────┼──────────────┼──────┼─────────────┼─────────────────┼────────────┤
      │ node2  │ 10.1.1.81    │ 6603 │ 0           │ Master, Running │ 1-3-341679 │
      ├────────┼──────────────┼──────┼─────────────┼─────────────────┼────────────┤
      │ node3  │ 10.1.1.82    │ 6603 │ 0           │ Running         │ 1-1-342807 │
      └────────┴──────────────┴──────┴─────────────┴─────────────────┴────────────┘
      [maxscale@x18tcldgpapp06 ~]$ maxctrl list servers
      ┌────────┬──────────────┬──────┬─────────────┬─────────────────┬────────────┐
      │ Server │ Address      │ Port │ Connections │ State           │ GTID       │
      ├────────┼──────────────┼──────┼─────────────┼─────────────────┼────────────┤
      │ node1  │ 10.1.1.96    │ 6603 │ 0           │ Slave, Running  │ 1-3-341679 │
      ├────────┼──────────────┼──────┼─────────────┼─────────────────┼────────────┤
      │ node2  │ 10.1.1.81    │ 6603 │ 0           │ Master, Running │ 1-3-341679 │
      ├────────┼──────────────┼──────┼─────────────┼─────────────────┼────────────┤
      │ node3  │ 10.1.1.82    │ 6603 │ 0           │ Running         │ 1-1-343390 │
      └────────┴──────────────┴──────┴─────────────┴─────────────────┴────────────┘
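      The divergence visible in the tables above (node3 reporting a higher GTID sequence than the server labelled Master) can be checked mechanically. The following is only a rough sketch, not part of MaxScale or maxctrl: the function names `parse_servers`, `parse_gtid` and `diverged_servers` are hypothetical, the sample rows are copied from the second listing in this report, and the comparison assumes a single replication domain with GTIDs of the form domain-server_id-sequence.

```python
# Sample rows copied from the second `maxctrl list servers` listing above.
SAMPLE = """\
│ Server │ Address      │ Port │ Connections │ State           │ GTID       │
│ node1  │ 10.1.1.96    │ 6603 │ 0           │ Slave, Running  │ 1-3-341679 │
│ node2  │ 10.1.1.81    │ 6603 │ 0           │ Master, Running │ 1-3-341679 │
│ node3  │ 10.1.1.82    │ 6603 │ 0           │ Running         │ 1-1-342807 │
"""


def parse_servers(text):
    """Return one dict per data row of a box-drawn server table."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("│"):
            continue  # skip borders, shell prompts and blank lines
        cells = [c.strip() for c in line.strip("│").split("│")]
        if len(cells) != 6 or cells[0] == "Server":
            continue  # skip the header row and anything malformed
        name, _address, _port, _conns, state, gtid = cells
        rows.append({"name": name, "state": state, "gtid": gtid})
    return rows


def parse_gtid(gtid):
    """Split a single-domain GTID 'domain-server_id-sequence' into integers."""
    domain, server_id, sequence = (int(part) for part in gtid.split("-"))
    return domain, server_id, sequence


def diverged_servers(rows):
    """Name servers whose sequence is ahead of the server labelled Master.

    Assumption: one replication domain. A non-master that is ahead of the
    master holds transactions the master never received.
    """
    master = next((r for r in rows if "Master" in r["state"]), None)
    if master is None:
        return []
    m_domain, _m_server, m_seq = parse_gtid(master["gtid"])
    ahead = []
    for row in rows:
        if row is master:
            continue
        domain, _server, seq = parse_gtid(row["gtid"])
        if domain == m_domain and seq > m_seq:
            ahead.append(row["name"])
    return ahead


if __name__ == "__main__":
    print(diverged_servers(parse_servers(SAMPLE)))
```

      Run against the first listing, where all three servers report 1-3-341679, the same check returns an empty list. It flags node3 only after node2 has been labelled Master.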
      

      ====================================
      ====================================

      When the nodes are failed over one at a time, promotion works correctly.

      When both nodes are taken down and brought back at the same time, it fails in the same way every time.

      The maxctrl output from the failover test is attached for reference.

            People

            Assignee:
            esa.korhonen Esa Korhonen
            Reporter:
            ccalender Chris Calender