MariaDB MaxScale / MXS-2036

A slave with sql thread stopped causes wrong master after failover

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.13
    • Fix Version/s: 2.2.14
    • Component/s: mariadbmon
    • Labels: None
    • Sprint: MXS-SPRINT-65

    Description

      As described in the support ticket: a slave whose IO thread is running but whose SQL thread is stopped is left in limbo, and causes the wrong master to be selected after a failover unless the new master has other slaves.

      This is again an effect of the way the 2.2 monitor works. A slave that is still connected or trying to connect to the master (IO thread is on or connecting) but is not actually replicating (SQL thread is off) is counted as a slave of that node, even if the master node is down. During switchover/failover, however, servers with a broken slave SQL thread are not redirected, since they are not real slaves and cannot replicate from the new master anyway. This mismatch produces the odd result where the old master is still treated as the master after failover. In 2.3 this does not happen because the monitor works differently.
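
      The mismatch can be illustrated with a toy model. This is only a sketch of the behaviour described above, not the actual mariadbmon code: the Replica class and all function names are hypothetical, and the two thread fields stand in for Slave_IO_Running and Slave_SQL_Running from SHOW SLAVE STATUS.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    master: str       # server this replica currently points at
    io_running: str   # Slave_IO_Running: "Yes", "Connecting" or "No"
    sql_running: str  # Slave_SQL_Running: "Yes" or "No"


def counted_as_slave(r):
    # Topology view: an IO thread that is on or connecting is enough.
    return r.io_running in ("Yes", "Connecting")


def redirected_on_failover(r):
    # Failover view: only servers that are really replicating are redirected.
    return counted_as_slave(r) and r.sql_running == "Yes"


def slave_counts_after_failover(replicas, new_master):
    """Return how many slaves each server appears to have after failover."""
    counts = {}
    for r in replicas:
        if r.name == new_master:
            continue  # the promoted server is no longer a replica
        master = new_master if redirected_on_failover(r) else r.master
        if counted_as_slave(r):
            counts[master] = counts.get(master, 0) + 1
    return counts


# Old master A is down; B is healthy; C has its SQL thread stopped.
replicas = [
    Replica("B", "A", "Yes", "Yes"),
    Replica("C", "A", "Connecting", "No"),
]
print(slave_counts_after_failover(replicas, "B"))  # {'A': 1}
```

      In this model the promoted server B ends up with no counted slaves, while the old master A still has one (C), which is the situation in which the old master keeps looking like the master. Adding a healthy third replica gives B a counted slave of its own, matching the "unless the new master has other slaves" condition in the description.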

      Fixing this in 2.2 means changing either the master selection code or the failover/switchover code. I will try the latter, since changing the former could affect various other places as well.

          People

            Assignee: Esa Korhonen (esa.korhonen)
            Reporter: Esa Korhonen (esa.korhonen)
