[MXS-3185] Move any external replication settings to the new master in case of a failover with galeramon Created: 2020-09-15  Updated: 2022-09-08  Resolved: 2022-09-08

Status: Closed
Project: MariaDB MaxScale
Component/s: galeramon
Affects Version/s: 2.5
Fix Version/s: N/A

Type: New Feature Priority: Major
Reporter: Petko Vasilev (Inactive) Assignee: Todd Stoffel (Inactive)
Resolution: Won't Do Votes: 0
Labels: None

Issue Links:
Relates
relates to MXS-3922 Redirection of replication slaves to ... Closed

 Description   

Set up:

  • a Galera cluster with a MaxScale instance in front using galeramon
  • the "master" is configured to replicate from an external server

Action:

  • shut down the local master

Desired result:

  • the external replication settings are transferred to the new local master and replication continues (the same as with a primary/replica cluster using mariadbmon)

Actual result:

  • the external replication settings are not transferred and are lost; external replication stops

Implementing the desired result would make external replication considerably more robust.
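For comparison, the requested behavior already exists for primary/replica clusters monitored by mariadbmon. A minimal configuration sketch of that case (the section name, server names, and credentials below are placeholders, not values from this ticket):

```ini
# Hypothetical MaxScale monitor section; server names and
# credentials are placeholders. auto_failover/auto_rejoin are the
# mariadbmon options that move replication to a new primary.
[Replication-Monitor]
type=monitor
module=mariadbmon
servers=server1,server2,server3
user=maxuser
password=maxpwd
auto_failover=true
auto_rejoin=true
```

The ticket asks for the analogous behavior when the monitor module is galeramon instead.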



 Comments   
Comment by Johan Wikman [ 2020-09-16 ]

petko.vasilev By default, the Galera monitor assigns the master role to the server with the smallest wsrep_local_index.
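The index the monitor compares can be inspected directly on each node, for example:

```sql
-- Run on each Galera node; by default galeramon treats the node
-- reporting the smallest value as the master.
SHOW STATUS LIKE 'wsrep_local_index';
```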

Whose responsibility is it to ensure that the "master" is configured to replicate from an external server?

Since MaxScale should be able to configure a server to replicate from an external server when the master role moves, it should also be possible for MaxScale, at startup, to reconfigure a Galera cluster so that all replication from external servers is performed by the server holding the master role.
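Reconfiguring a node to take over the external replication would amount to replaying the old master's settings on the new one. A hedged sketch using standard MariaDB statements (the host, credentials, and binlog position below are placeholders that would come from the settings MaxScale had saved):

```sql
-- Hypothetical values: in practice the external server's address,
-- credentials, and binlog position would be taken from the stored
-- replication settings of the failed master.
CHANGE MASTER TO
    MASTER_HOST='external.example.com',
    MASTER_PORT=3306,
    MASTER_USER='repl',
    MASTER_PASSWORD='replpwd',
    MASTER_LOG_FILE='mysql-bin.000042',
    MASTER_LOG_POS=4;
START SLAVE;
```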

Comment by Petko Vasilev (Inactive) [ 2020-09-16 ]

Initially, the user would configure the replication on the correct server.
After that, MaxScale should be the one moving it around as needed.

Comment by Johan Wikman [ 2020-09-16 ]

What if a Galera node that is not the master is replicating from an external server? At first nothing would happen, but if that node ever became the master, the replication settings would thereafter follow the master, which seems a bit unintuitive. It would make sense nonetheless.

So I think the feature should simply be that if the Galera node currently treated as master goes down, any external replication settings it had are configured on the new node chosen as master (which implies that the Galera monitor probes and stores all replication settings).
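Probing a node's current replication settings can be done with a standard MariaDB statement, for example:

```sql
-- Shows, among other fields, Master_Host, Master_User,
-- Master_Log_File and Read_Master_Log_Pos, which the monitor
-- would need to store in order to replay them on another node.
SHOW SLAVE STATUS\G
```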

What should happen when the old master is rejoined?

Comment by Petko Vasilev (Inactive) [ 2020-09-16 ]

In our case, we will have only one Galera node replicating at any given time. I don't know what the correct behavior would be if there are multiple.
If the master goes back up, it is no longer the master, so it should not be replicating.
Behavior as similar as possible to a primary/replica setup with mariadbmon would be ideal.

Comment by Johan Wikman [ 2020-09-17 ]

I think it should work fine (in principle at least) if MaxScale configured the new master with whatever replication settings the old master had. I don't know whether there can be conflicts in the settings if the new master is already replicating from somewhere.

When the old master comes back up and rejoins

  • its replication settings must be removed by the administrator, or
  • MaxScale must remember that the node used to be the master, that its replication settings have been moved to the new master, and that they must now be removed.

The latter is more tricky and requires that MaxScale persistently remembers the cluster state.
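In the first option, removing the stale settings on the rejoined old master is a manual step along the lines of:

```sql
-- Run by the administrator on the rejoined old master so that it
-- does not also replicate from the external server. These are
-- standard MariaDB statements; no ticket-specific values involved.
STOP SLAVE;
RESET SLAVE ALL;
```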

A thought: what if the replication on all nodes were configured in an identical fashion, and MaxScale's role would then be to start the replication on the master node and stop it on all other nodes?
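Under that scheme, every node would hold identical CHANGE MASTER TO settings, and MaxScale's job would reduce to toggling replication:

```sql
-- On the node currently holding the master role:
START SLAVE;

-- On every other node:
STOP SLAVE;
```

This avoids the need to persistently store and replay replication settings, at the cost of keeping the identical settings in sync across all nodes.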

Comment by Assen Totin (Inactive) [ 2020-10-12 ]

I'd say this relates to MXS-2885 (although it suggests putting this in the binlog router), which also has Galera in mind but may actually work with any backend that needs to fail over replica nodes.

Generated at Thu Feb 08 04:19:34 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.