MariaDB Server / MDEV-29950

Hardware failure on one Galera node caused the other 2 nodes to go into split brain

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 10.6.7
    • Fix Version/s: None
    • Component/s: Galera
    • Labels: None
    • Environment: Red Hat x86-64 on VMware

    Description

      Our Galera cluster is a 3-node configuration (2 DB nodes + 1 arbitrator). Two days ago, one DB node went down due to a hardware issue. The remaining DB node and the arbitrator then went into split brain, and the DB service went down.

      According to the logs, the remaining nodes logged no messages about each other until the dead node was confirmed down, a gap of about 10 s. We don't know why the healthy nodes did not declare each other stable during those 10 s.

      Kindly advise on the direction to take in troubleshooting the problem.

      Only 2 Galera timeouts are set explicitly; all other timeout settings are still at their default values.

      gmcast.peer_timeout=PT10S;
      evs.suspect_timeout=PT12S;
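
To compare these two values with the roughly 10 s gap seen in the logs, the ISO 8601 durations can be converted to plain seconds. The snippet below is only a sketch: the helper names `parse_provider_options` and `duration_seconds` are hypothetical, the option string is just the two settings quoted above (not the full attached configuration), and only simple `PT<n>H<n>M<n>S` forms are handled.

```python
import re

# Matches simple ISO 8601 time durations such as PT10S, PT1M30S, PT0.5S.
# Assumption: date parts (days, months, ...) are not needed here.
_DURATION = re.compile(
    r"^PT(?:(\d+(?:\.\d+)?)H)?(?:(\d+(?:\.\d+)?)M)?(?:(\d+(?:\.\d+)?)S)?$"
)


def duration_seconds(value):
    """Convert a duration like 'PT12S' to seconds as a float."""
    match = _DURATION.match(value.strip())
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {value!r}")
    hours, minutes, seconds = (float(g) if g else 0.0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def parse_provider_options(text):
    """Split a 'key=value;key=value;' string into a dict."""
    options = {}
    for item in text.split(";"):
        item = item.strip()
        if not item:
            continue  # skip the empty piece after a trailing ';'
        key, _, val = item.partition("=")
        options[key.strip()] = val.strip()
    return options


# Only the two settings quoted in this report.
options = parse_provider_options(
    "gmcast.peer_timeout=PT10S; evs.suspect_timeout=PT12S;"
)
timeouts = {name: duration_seconds(val) for name, val in options.items()}
for name, secs in timeouts.items():
    print(f"{name}: {secs} s")
```

With the values in seconds, each timeout can be lined up against the timestamps in the attached error logs.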

      The DB configuration file and the error log of each node are attached.

          People

            Assignee: Unassigned
            Reporter: William Wong (frelist)