Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 2.2.13
Sprint: MXS-SPRINT-64, MXS-SPRINT-65
Description
We are testing failover scenarios with MaxScale 2.2.13 and have noticed that one scenario fails. We have repeated the test enough times to confirm that the same behaviour is observed every time.
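For context, automatic failover and rejoin are handled by MaxScale's MariaDB Monitor (mariadbmon). The monitor configuration is along the lines of the sketch below; the section names, credentials and timing values are placeholders, not the exact values from our setup.

[MariaDB-Monitor]
type=monitor
module=mariadbmon
servers=node1,node2,node3
user=maxuser
password=maxpwd
monitor_interval=2000
failcount=5
auto_failover=true
auto_rejoin=true
replication_user=repl
replication_password=replpwd

[node1]
type=server
address=10.1.1.96
port=6603
protocol=MariaDBBackend
# node2 (10.1.1.81) and node3 (10.1.1.82) are defined the same way.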
Scenario:
node1 ==> Master
node2 ==> Slave
node3 ==> Slave
Success
===================================
node1 ==> Down, rejoined as Slave
node2 ==> Master, promoted as Master
node3 ==> Slave, no change
Success
===================================
When node1 and node2 are brought down at the same time, node3 is promoted as Master successfully:
node1 ==> Down (brought down at the same time)
node2 ==> Down (brought down at the same time)
node3 ==> Master, promoted as Master
Success
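The simultaneous shutdown was done roughly as follows (a sketch assuming the MariaDB service is managed by systemd; the host aliases are placeholders):

# Stop MariaDB on node1 and node2 at (roughly) the same time
ssh node1 'sudo systemctl stop mariadb' &
ssh node2 'sudo systemctl stop mariadb' &
wait

# Once the monitor has seen the master down for failcount intervals,
# node3 is promoted
maxctrl list servers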
====================================
====================================
When bringing up both nodes at the same time (node2 followed by node1):
node1 ==> Running
node2 ==> Running
node3 ==> Already Master
Output:
node1 ==> Slave, Running
node2 ==> Master, Running
node3 ==> Running (out of cluster), including data loss
Failure
In the above scenario we started both nodes at the same time (node2 followed by node1), and the current Master (node3) dropped to the plain Running state.
[maxscale@x18tcldgpapp06 ~]$ maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────────────┬────────────┐
│ Server │ Address      │ Port │ Connections │ State           │ GTID       │
├────────┼──────────────┼──────┼─────────────┼─────────────────┼────────────┤
│ node1  │ 10.1.1.96    │ 6603 │ 0           │ Down            │ 1-3-341679 │
├────────┼──────────────┼──────┼─────────────┼─────────────────┼────────────┤
│ node2  │ 10.1.1.81    │ 6603 │ 0           │ Down            │ 1-3-341679 │
├────────┼──────────────┼──────┼─────────────┼─────────────────┼────────────┤
│ node3  │ 10.1.1.82    │ 6603 │ 0           │ Master, Running │ 1-3-341679 │
└────────┴──────────────┴──────┴─────────────┴─────────────────┴────────────┘

[maxscale@x18tcldgpapp06 ~]$ maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────────────┬────────────┐
│ Server │ Address      │ Port │ Connections │ State           │ GTID       │
├────────┼──────────────┼──────┼─────────────┼─────────────────┼────────────┤
│ node1  │ 10.1.1.96    │ 6603 │ 0           │ Slave, Running  │ 1-3-341679 │
├────────┼──────────────┼──────┼─────────────┼─────────────────┼────────────┤
│ node2  │ 10.1.1.81    │ 6603 │ 0           │ Master, Running │ 1-3-341679 │
├────────┼──────────────┼──────┼─────────────┼─────────────────┼────────────┤
│ node3  │ 10.1.1.82    │ 6603 │ 0           │ Running         │ 1-1-342807 │
└────────┴──────────────┴──────┴─────────────┴─────────────────┴────────────┘

[maxscale@x18tcldgpapp06 ~]$ maxctrl list servers
┌────────┬──────────────┬──────┬─────────────┬─────────────────┬────────────┐
│ Server │ Address      │ Port │ Connections │ State           │ GTID       │
├────────┼──────────────┼──────┼─────────────┼─────────────────┼────────────┤
│ node1  │ 10.1.1.96    │ 6603 │ 0           │ Slave, Running  │ 1-3-341679 │
├────────┼──────────────┼──────┼─────────────┼─────────────────┼────────────┤
│ node2  │ 10.1.1.81    │ 6603 │ 0           │ Master, Running │ 1-3-341679 │
├────────┼──────────────┼──────┼─────────────┼─────────────────┼────────────┤
│ node3  │ 10.1.1.82    │ 6603 │ 0           │ Running         │ 1-1-343390 │
└────────┴──────────────┴──────┴─────────────┴─────────────────┴────────────┘
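Note that in the last two outputs node3's GTID position (1-1-342807, then 1-1-343390) has diverged from node1 and node2 (both at 1-3-341679), which matches the data loss mentioned above. The positions can be compared directly on the servers, for example (user and password are placeholders):

# Compare GTID positions on each node (shown here for node3)
mysql -h 10.1.1.82 -P 6603 -u monitoruser -p \
  -e "SELECT @@gtid_current_pos, @@gtid_binlog_pos, @@gtid_slave_pos;"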
====================================
====================================
When failing over nodes individually, promotion works perfectly, but when both nodes are brought up at the same time it fails in the same way every time. The maxctrl output from the failover has been uploaded for your reference.
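If it helps with reproducing or comparing, node3 can also be pointed back at the current master manually through the monitor's rejoin module command (a sketch; the monitor name MariaDB-Monitor is an assumption, and the command may refuse if node3's GTID has diverged):

# Ask the monitor to rejoin node3 to the current master
maxctrl call command mariadbmon rejoin MariaDB-Monitor node3

# Verify the result
maxctrl list servers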