[MXS-3096] Semi automation of Galera recovery through GaleraMon Created: 2020-07-30 Updated: 2022-09-08 Resolved: 2022-09-08 |
|
| Status: | Closed |
| Project: | MariaDB MaxScale |
| Component/s: | galeramon, maxctrl |
| Affects Version/s: | None |
| Fix Version/s: | N/A |
| Type: | New Feature | Priority: | Minor |
| Reporter: | Sylvain ARBAUDIE | Assignee: | Todd Stoffel (Inactive) |
| Resolution: | Won't Do | Votes: | 6 |
| Labels: | None | ||
| Description |
|
Recovering a Galera cluster is not always an easy task to automate, but here are two ways we could do it, up to a certain degree.
1. Suggestion for the recovery algorithm
For the algorithm idea described above, we should always check whether one node is the most advanced in replication, so that we can execute set global wsrep_provider_options="pc.bootstrap=1" on the right node and avoid bootstrapping the wrong one. We see several ways of doing this check and would like to discuss the best way of identifying the most advanced node, or even of bootstrapping the latest master when seqno is -1 or when there is no grastate.dat on disk. If some of the nodes are not reachable, just store that information for display in the galeramon monitor. Introduce an auto_failover parameter for galeramon with the following parameters:
2. Suggestion for the recovery command run by MaxScale: Introduce a new command for galeramon:
The above command triggers a failover for galeramon, launching the bootstrap operation (cf. the previous algorithm) while ignoring unreachable nodes. |
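The "most advanced node" check from suggestion 1 could be sketched roughly as follows. This is only an illustration under stated assumptions, not how galeramon works: it assumes the contents of each node's grastate.dat have already been collected somehow, and the names NodeState, parse_grastate and pick_bootstrap_candidate are hypothetical. Nodes whose seqno is -1 or whose grastate.dat is missing are treated as unknown, and no candidate is returned if every node is in that state, which is the case where the ticket suggests falling back to the latest master.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class NodeState:
    """Hypothetical per-node summary of grastate.dat."""
    name: str
    uuid: Optional[str]
    seqno: Optional[int]  # None when grastate.dat is missing or unreadable
    safe_to_bootstrap: bool = False


def parse_grastate(name: str, text: Optional[str]) -> NodeState:
    """Parse the 'key: value' lines of a grastate.dat file.

    `text` is None when the file does not exist on that node.
    """
    if text is None:
        return NodeState(name, None, None)
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key.strip()] = value.strip()
    try:
        seqno = int(fields["seqno"])
    except (KeyError, ValueError):
        seqno = None
    return NodeState(
        name,
        fields.get("uuid"),
        seqno,
        fields.get("safe_to_bootstrap") == "1",
    )


def pick_bootstrap_candidate(states: Iterable[NodeState]) -> Optional[NodeState]:
    """Return the node with the highest known seqno, or None.

    Assumes all nodes belong to the same cluster (same uuid).
    A seqno of -1 means the position is unknown, so such nodes are skipped.
    Ties are broken by node name to keep the choice deterministic.
    """
    known = [s for s in states if s.seqno is not None and s.seqno >= 0]
    if not known:
        return None  # needs another strategy, e.g. the latest master
    best = max(s.seqno for s in known)
    return min((s for s in known if s.seqno == best), key=lambda s: s.name)
```

A monitor following this idea would only run the pc.bootstrap statement on the returned node, and would report the skipped or unreachable nodes rather than guess at their state.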
| Comments |
| Comment by Massimo [ 2020-07-31 ] |
|
From what I see in the feature request, you propose to be able to run the command at any time: maxctrl call command mariadbmon failover cluster-monitor --force. So I wonder how you would get wsrep_last_committed in case max_connections has been reached, for instance. It would be good to consider stopping all connections once the command is triggered and only then going ahead with the check of wsrep_last_committed. Keep in mind that all queries have to finish committing first, otherwise wsrep_last_committed could still change. |