  1. MariaDB MaxScale
  2. MXS-6539

MariaDB Monitor emergency promotion logic can violate safe failover

Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: mariadbmon
    • Labels: None

    Description

      Primary server invalidation rule 2 states:

      It has been down for more than failcount monitor passes and has no running replicas. Running replicas behind a downed relay count. A replica in this context is any server with at least a partially running replication connection (either io or sql thread is running). The replicas must also be down for more than failcount monitor passes to allow new master selection.
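
      One possible reading of the quoted rule, written as a small predicate. This is only a sketch of the rule's wording, not the monitor's actual code: the Server class, the down_passes counter and the function names are hypothetical, and the handling of replicas that are themselves down is an interpretation of the last sentence of the rule.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Server:
    """Hypothetical model of a monitored server (not a MaxScale type)."""
    name: str
    down_passes: int = 0       # consecutive monitor passes spent down; 0 means up
    io_running: bool = False   # replication IO thread state
    sql_running: bool = False  # replication SQL thread state
    replicas: List["Server"] = field(default_factory=list)

    @property
    def is_up(self) -> bool:
        return self.down_passes == 0

    @property
    def replicating(self) -> bool:
        # "at least a partially running replication connection"
        return self.io_running or self.sql_running


def blocks_invalidation(replica: Server, failcount: int) -> bool:
    """True if this replica, or anything replicating through it, keeps the primary valid."""
    if replica.is_up:
        return replica.replicating
    if replica.down_passes <= failcount:
        # The replica has not yet been down for more than failcount passes.
        return True
    # Downed relay: running replicas behind it still count.
    return any(blocks_invalidation(r, failcount) for r in replica.replicas)


def rule2_invalidates(primary: Server, failcount: int) -> bool:
    """Assumed reading of invalidation rule 2 for a given primary."""
    if primary.down_passes <= failcount:
        return False
    return not any(blocks_invalidation(r, failcount) for r in primary.replicas)


# A standalone primary has no replicas, so only its own downtime matters.
standalone = Server("primary", down_passes=6)
print(rule2_invalidates(standalone, failcount=5))
```

      Under this reading, nothing in the rule looks at what data the remaining servers hold; the decision depends only on downtime and replication thread state.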

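      This rule causes problems when a standalone primary (one with no running replicas) goes down: once the primary has been down for more than failcount monitor passes, the rule is satisfied and another server in the cluster ends up promoted. The new master may not have all of the old primary's data, so the promotion can violate the GTID rules for safe failover.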
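
      To illustrate what "may not have all data" means in GTID terms, here is a minimal sketch that compares two MariaDB GTID positions (comma-separated domain-server_id-sequence triplets). The function names are hypothetical and the per-domain sequence comparison is a simplification; it is not claimed to be the check the monitor performs.

```python
from typing import Dict


def parse_gtid_list(text: str) -> Dict[int, int]:
    """Map each replication domain to its highest sequence number."""
    pos: Dict[int, int] = {}
    for triplet in text.split(","):
        triplet = triplet.strip()
        if not triplet:
            continue
        domain, _server_id, seq = (int(part) for part in triplet.split("-"))
        pos[domain] = max(seq, pos.get(domain, -1))
    return pos


def has_all_events(candidate_pos: str, primary_pos: str) -> bool:
    """True if the candidate has reached the primary's position in every domain."""
    cand = parse_gtid_list(candidate_pos)
    prim = parse_gtid_list(primary_pos)
    return all(cand.get(domain, -1) >= seq for domain, seq in prim.items())


# Hypothetical positions: the candidate stopped replicating at sequence 100,
# while the primary went on to 120 before it went down.
print(has_all_events("0-1-100", "0-1-120"))
```

      A server promoted while this comparison fails would be missing the primary's later transactions, which is the situation safe failover is meant to rule out.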

      The promotion rule could be disabled, but then the monitor would get stuck on a downed primary, unable to select another server. Resolving that would require a manual command, or perhaps some smarter automated logic.

          People

            Assignee: Unassigned
            Reporter: Esa Korhonen (esa.korhonen)
            Votes: 1
            Watchers: 2

