MariaDB MaxScale / MXS-3268

MaxScale should auto-detect a master if there is none in the cluster

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.6.0
    • Component/s: mariadbmon
    • Labels:
      None
    • Sprint:
      MXS-SPRINT-127, MXS-SPRINT-128

      Description

      Hi Team,

      Recently, a customer observed a scenario in which the master went down and MaxScale logged "Master has failed. If master status does not change in 4 monitor passes, failover begins." Before the failover happened, however, the master came back up and MaxScale set it to "Slave, Running".

      2020-10-17 01:14:19 error : Monitor was unable to connect to server node2[10.232.86.133:6603] : ''
      2020-10-17 01:14:19 notice : Server changed state: node2[10.232.86.133:6603]: master_down. [Master, Running] -> [Down]
      2020-10-17 01:14:19 warning: [mariadbmon] Master has failed. If master status does not change in 4 monitor passes, failover begins.
      2020-10-17 01:14:44 warning: [mariadbmon] The current master server 'node2' is no longer valid because it is in read-only mode, but there is no valid alternative to swap to.
      2020-10-17 01:14:44 error : [mariadbmon] No Master can be determined. Last known was 10.232.86.133:6603
      2020-10-17 01:14:44 notice : Server changed state: node2[10.232.86.133:6603]: slave_up. [Down] -> [Slave, Running]
      

      At that point they had three nodes in the "Slave, Running" state, and MaxScale did not promote any of them to master. In the end, they had to restart the node2 server, and only then did failover happen.

      2020-10-17 01:54:01 error : Monitor was unable to connect to server node2[10.232.86.133:6603] : ''
      2020-10-17 01:54:01 error : [mariadbmon] No Master can be determined. Last known was 10.232.86.133:6603
      2020-10-17 01:54:01 notice : Server changed state: node2[10.232.86.133:6603]: slave_down. [Slave, Running] -> [Down]
      2020-10-17 01:54:01 warning: [mariadbmon] Master has failed. If master status does not change in 4 monitor passes, failover begins.
      ...
      2020-10-17 01:54:21 notice : [mariadbmon] Selecting a server to promote and replace 'node2'. Candidates are: 'node1', 'node3'.
      2020-10-17 01:54:21 notice : [mariadbmon] Selected 'node1'.
      2020-10-17 01:54:21 notice : [mariadbmon] Performing automatic failover to replace failed master 'node2'.
      2020-10-17 01:54:21 notice : [mariadbmon] Redirecting 'node3' to replicate from 'node1' instead of 'node2'.
      2020-10-17 01:54:21 notice : [mariadbmon] All redirects successful.
      2020-10-17 01:54:22 notice : [mariadbmon] All redirected slaves successfully started replication from 'node1'.
      2020-10-17 01:54:22 notice : [mariadbmon] Failover 'node2' -> 'node1' performed.
      

      Can we add functionality to MaxScale that checks regularly whether the cluster still has a master?
      If there is no master, it could use GTIDs to determine which server has the latest position, promote that server to master, and make the other nodes its slaves.
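
      The requested behaviour could look roughly like the sketch below. It is only an illustration of the idea, not MaxScale code: the `Server` class, `gtid_sequence` and `pick_promotion_candidate` are hypothetical names, the GTID values are made up, and it assumes a single replication domain (0) with GTIDs in MariaDB's `domain-server_id-sequence` form.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Server:
    """Hypothetical view of a monitored server (not a MaxScale type)."""
    name: str
    running: bool
    is_master: bool
    gtid_current_pos: str  # e.g. "0-2-1500" (domain-server_id-sequence)


def gtid_sequence(gtid: str, domain: int = 0) -> int:
    """Return the sequence number recorded for `domain`, or -1 if absent."""
    for triplet in gtid.split(","):
        triplet = triplet.strip()
        if not triplet:
            continue
        dom, _server_id, seq = (int(part) for part in triplet.split("-"))
        if dom == domain:
            return seq
    return -1


def pick_promotion_candidate(servers: List[Server]) -> Optional[Server]:
    """If no running server is a master, return the running server with
    the highest GTID sequence; otherwise return None."""
    running = [s for s in servers if s.running]
    if not running or any(s.is_master for s in running):
        return None
    # On a tie, max() keeps the first server in list order.
    return max(running, key=lambda s: gtid_sequence(s.gtid_current_pos))


# Illustrative state: all three nodes are "Slave, Running", none is master.
cluster = [
    Server("node1", True, False, "0-2-1500"),
    Server("node2", True, False, "0-2-1500"),
    Server("node3", True, False, "0-2-1498"),
]
candidate = pick_promotion_candidate(cluster)
```

      A real implementation would also have to handle multiple GTID domains, replication lag, and servers that are deliberately read-only, none of which this sketch attempts.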

            People

            Assignee:
            esa.korhonen Esa Korhonen
            Reporter:
            niljoshi Nilnandan Joshi
            Votes:
            0
            Watchers:
            6