MariaDB Server / MDEV-35823

Galera cluster down with error in one DB evs_proto.cpp:handle_install_timer():728

Details

    Description

      Hi Support Team,

      We have a 3-node Galera cluster: 2 DB nodes (service IP addresses 172.17.153.82 and 172.21.153.82) and 1 Galera witness (IP address 172.18.16.82). The application connects to the DB through an HA proxy located at the same site as the DB; the proxy hunts between the 2 DB nodes and connects to the healthy one.

      All 3 nodes (2 DBs + 1 witness) are at different sites. Network maintenance took place at the 172.17.x.x site. We expected the DB at that site to become inaccessible while the other 2 nodes continued to form a cluster, so that applications could still write to the DB node at 172.21.x.x. However, the application failed to connect to the remaining DB node (Lost connection to server at 'handshake: reading initial communication packet', system error: 11).
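
      For reference, our expectation rests on the majority-quorum rule: a group of nodes stays primary only if it holds more than half of the previous cluster's total weight. Below is a minimal sketch of that arithmetic, assuming every node (including the witness) has a weight of 1 and that the 172.17.x.x node dropped out ungracefully. The function name has_quorum is purely illustrative and is not part of Galera.

```python
def has_quorum(reachable_weights, previous_weights):
    """Return True if the reachable nodes hold a strict majority of the
    total weight of the previous primary component."""
    # Compare 2 * reachable against the total to avoid fractional division.
    return 2 * sum(reachable_weights) > sum(previous_weights)


# Assumption: three members (2 DB nodes + 1 witness), weight 1 each.
previous = [1, 1, 1]

# 172.17.x.x is cut off; the 172.21.x.x DB node and the witness still see each other.
print(has_quorum([1, 1], previous))

# The isolated 172.17.x.x node on its own.
print(has_quorum([1], previous))
```

      Under these assumptions the two remaining nodes hold 2 of 3 votes and should keep quorum, while the isolated node should not, which is why the failed connection to 172.21.x.x was unexpected.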

      We had to bootstrap the cluster after the network maintenance was over.

      On node 1, the following error was logged:
      exception from gcomm, backend must be restarted: evs::proto(fac95f38-8d6b, GATHER, view_id(REG,15b6ecac-8b9b,58)) failed to form singleton view after exceeding max_install_timeouts 3, giving up (FATAL) at /home/buildbot/buildbot/build/gcomm/src/evs_proto.cpp:handle_install_timer():728

      By contrast, the log files on the other nodes looked as expected. We would like to know:

      1. Why was bootstrapping needed to resume service?
      2. Is the above error message normal, and what does it mean? A similar ticket, MDEV-32110, was raised by someone else but has received no feedback so far.

      Please advise.

      Thanks and best regards,

      Lawrence


          People

            janlindstrom Jan Lindström
            LawrenceMan Lawrence Man