[MXS-3096] Semi automation of Galera recovery through GaleraMon

Details

Type: New Feature
Status: Closed (View Workflow)
Priority: Minor
Resolution: Won't Do
Affects Version/s: None
Fix Version/s: N/A
Component/s: galeramon, maxctrl
Labels: None

Description

Recovering a Galera cluster is not always an easy task to automate, but here are two ways we could do it, up to a certain degree.

1. Suggestion for the recovery algorithm

get wsrep_last_committed from each node through the MySQL interface;
if that is not possible, ssh to the remaining nodes and run galera_recovery;
select the running node with the highest seqno and execute on it: set global wsrep_provider_options="pc.bootstrap=1";
if no nodes are running, ssh to one of the nodes and execute galera_new_cluster.
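The decision part of these steps can be sketched in Python as follows. This is only an illustration under stated assumptions, not galeramon code: `NodeState`, `Plan` and `plan_recovery` are hypothetical names, the seqno values are assumed to have been collected already (from wsrep_last_committed or from galera_recovery), and picking the stopped node with the highest seqno for galera_new_cluster is an assumption that goes slightly beyond "one of the nodes".

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class NodeState:
    """Hypothetical snapshot of one Galera node as seen by the monitor."""
    name: str
    reachable: bool        # could the monitor contact the host at all?
    running: bool          # does the server answer on the MySQL interface?
    seqno: Optional[int]   # last committed / recovered seqno, None if unknown


@dataclass
class Plan:
    """Hypothetical result: which action to run on which node."""
    action: str                      # "pc_bootstrap", "new_cluster" or "none"
    node: Optional[str]
    unreachable: List[str] = field(default_factory=list)


def plan_recovery(nodes: Sequence[NodeState]) -> Plan:
    # Unreachable nodes are only recorded so they can be displayed later.
    unreachable = [n.name for n in nodes if not n.reachable]

    # Only nodes with a known, non-negative seqno can be compared.
    known = [n for n in nodes
             if n.reachable and n.seqno is not None and n.seqno >= 0]

    # Prefer a running node: it can be promoted with pc.bootstrap=1.
    running = [n for n in known if n.running]
    if running:
        best = max(running, key=lambda n: n.seqno)
        return Plan("pc_bootstrap", best.name, unreachable)

    # No running node: start a new cluster from the most advanced one
    # (assumption: highest seqno wins).
    if known:
        best = max(known, key=lambda n: n.seqno)
        return Plan("new_cluster", best.name, unreachable)

    # Nothing comparable was found, so take no automatic action.
    return Plan("none", None, unreachable)
```

Keeping the decision separate from the ssh and SQL calls would make the choice of node easy to test without a live cluster.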

For the algorithm described above, we should always check which node is the most advanced in replication, so that set global wsrep_provider_options="pc.bootstrap=1" is executed on the right node and we avoid bootstrapping the wrong one. We see several ways of doing this check and would like to discuss the best way of identifying the most advanced node, or even of bootstrapping the latest master when the seqno is -1 or when there is no grastate.dat on disk.
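To make the seqno -1 and missing grastate.dat cases concrete, here is a minimal Python sketch that reads the `key: value` lines of a grastate.dat file and reports whether it holds a usable seqno. The function names `parse_grastate` and `usable_seqno` are hypothetical, and treating "no file" the same way as "seqno -1" is an assumption made for illustration.

```python
from typing import Dict, Optional


def parse_grastate(text: str) -> Dict[str, str]:
    """Parse the 'key: value' lines of a grastate.dat file into a dict."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comment
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def usable_seqno(text: Optional[str]) -> Optional[int]:
    """Return the saved seqno, or None when it cannot be trusted.

    None is returned when there is no file content (text is None), when
    the seqno field is missing or malformed, or when it is negative
    (-1 means the position was not saved on shutdown).
    """
    if text is None:
        return None
    raw = parse_grastate(text).get("seqno")
    if raw is None:
        return None
    try:
        seqno = int(raw)
    except ValueError:
        return None
    return seqno if seqno >= 0 else None
```

A node for which `usable_seqno` returns None would need another source for its position, such as galera_recovery, before it can be compared with the others.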

In case some of the nodes are not reachable, just store the information for display in the galeramon monitor.

Introduce an auto_failover parameter for galeramon that accepts the values true, false and force:

false: disables failover for galeramon (default)
true: enables the recovery algorithm described above
force: same as true, but ignores unreachable nodes and bootstraps the cluster using only the reachable nodes
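The difference between the three values can be sketched in Python as below. This is an interpretation rather than a specification: `AutoFailover` and `may_bootstrap` are hypothetical names, and the sketch assumes that with true the monitor only reports unreachable nodes and does not bootstrap until every node can be inspected.

```python
from enum import Enum


class AutoFailover(Enum):
    """The three proposed values of the auto_failover parameter."""
    FALSE = "false"
    TRUE = "true"
    FORCE = "force"


def may_bootstrap(mode: AutoFailover, total_nodes: int,
                  reachable_nodes: int) -> bool:
    """Decide whether the recovery algorithm is allowed to bootstrap."""
    if mode is AutoFailover.FALSE:
        return False                 # failover disabled (the default)
    if reachable_nodes == 0:
        return False                 # nothing to bootstrap from
    if mode is AutoFailover.FORCE:
        return True                  # ignore unreachable nodes
    # Assumption for "true": bootstrap only when every node was reachable.
    return reachable_nodes == total_nodes
```

For example, with three configured nodes of which two are reachable, `may_bootstrap(AutoFailover("force"), 3, 2)` allows the bootstrap while `may_bootstrap(AutoFailover("true"), 3, 2)` does not.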

2. Suggestion for the recovery command run by MaxScale:

Introduce a new command for galeramon:

maxctrl call command galeramon failover cluster-monitor --force

The command above triggers a failover for galeramon, launching the bootstrap operation (see the algorithm in suggestion 1) while ignoring unreachable nodes.

People

Assignee: Todd Stoffel (Inactive)

Reporter: Sylvain ARBAUDIE

Votes: 6

Watchers: 4

Dates

Created: 2020-07-30 10:47

Updated: 2022-09-08 12:26

Resolved: 2022-09-08 12:26
