[MXS-4692] Consistent failover and routing for multiple clusters Created: 2023-08-01  Updated: 2023-12-15

Status: Open
Project: MariaDB MaxScale
Component/s: None
Affects Version/s: None
Fix Version/s: Icebox

Type: New Feature Priority: Major
Reporter: Kathryn Sizemore Assignee: Joe Cotellese
Resolution: Unresolved Votes: 0
Labels: None

Attachments: PNG File Screenshot 2023-08-01 at 12.21.40 PM.png    
Issue Links:
Relates
relates to MXS-3206 Monitoring Multiple MaxScale through ... Closed

 Description   

Implement a mechanism that provides region awareness in MaxScale and fulfills the following needs:

  • A primary cluster should be able to fail over to a secondary DR cluster without causing split-brain scenarios.
  • The failover should have the option of being sticky: after failing over to the DR cluster, do not fall back automatically.
  • The traffic from all MaxScales must flow into the correct cluster at all times.
  • All of this must be done in a way that prevents diverging histories.
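
To make the sticky-failover requirement concrete, here is a minimal sketch in Python. It is not MaxScale code: `FailoverState`, `observe`, `manual_failback`, `adopt` and the epoch counter are hypothetical names chosen for illustration. It assumes each MaxScale receives a health verdict for both clusters and that peers can exchange their state. The epoch only lets MaxScales converge on the most recent decision; a real implementation would still need fencing or a consensus mechanism to rule out split-brain.

```python
from dataclasses import dataclass

PRIMARY = "primary"
DR = "dr"


@dataclass
class FailoverState:
    """Hypothetical per-MaxScale view of which cluster receives traffic."""
    active: str = PRIMARY   # cluster currently receiving traffic
    epoch: int = 0          # incremented on every switch
    sticky: bool = True     # if True, never fall back automatically

    def observe(self, primary_ok: bool, dr_ok: bool) -> str:
        """Update the active cluster from one round of health checks."""
        if self.active == PRIMARY and not primary_ok and dr_ok:
            self._switch(DR)
        elif self.active == DR and not self.sticky and primary_ok:
            # Automatic fallback is only allowed when stickiness is off.
            self._switch(PRIMARY)
        return self.active

    def manual_failback(self) -> None:
        """Operator-triggered return to the primary cluster."""
        if self.active == DR:
            self._switch(PRIMARY)

    def adopt(self, other: "FailoverState") -> None:
        """Adopt a peer's decision if it is newer than ours."""
        if other.epoch > self.epoch:
            self.active, self.epoch = other.active, other.epoch

    def _switch(self, target: str) -> None:
        self.active = target
        self.epoch += 1
```

With `sticky=True`, a recovered primary cluster is ignored until `manual_failback()` is called; a peer that calls `adopt()` on a newer state routes to the same cluster as the MaxScale that made the decision.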

Original title: Support Cascading MaxScale Nodes

Original description:
Create a new MaxScaleMonitor, similar to MariaDBMonitor, that allows for monitoring and failover of cascading MaxScale nodes. Top-level MaxScale node(s) should monitor the health of downstream MaxScale nodes and control traffic to them. Downstream MaxScale nodes should handle failover and traffic to database nodes.

Requirements:

  • MaxScale should be able to monitor the health of downstream MaxScale nodes and fail over accordingly.
  • Failover should have the option of being sticky, meaning that if you failed over to a DR cluster, it does not fall back automatically.
  • It would be nice to have "region awareness", that is, to prefer routing to a MaxScale in the same region as the current primary/master database node.
  • All of this should be viewable and configurable in the GUI.
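
The "region awareness" requirement above could be sketched roughly as follows. This is an illustration only, not an existing MaxScale API: `Downstream` and `pick_downstream` are hypothetical names, and the sketch assumes the top-level MaxScale already knows each downstream node's region and health as well as the region of the current primary database node.

```python
from typing import NamedTuple, Optional, Sequence


class Downstream(NamedTuple):
    """Hypothetical description of a downstream MaxScale node."""
    name: str
    region: str
    healthy: bool


def pick_downstream(nodes: Sequence[Downstream],
                    primary_region: str) -> Optional[Downstream]:
    """Prefer a healthy node in the primary's region, else any healthy node."""
    healthy = [n for n in nodes if n.healthy]
    same_region = [n for n in healthy if n.region == primary_region]
    candidates = same_region or healthy
    return candidates[0] if candidates else None
```

Falling back to a healthy node in another region keeps traffic flowing at the cost of cross-region latency; a stricter policy could return `None` instead.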


 Comments   
Comment by Johan Wikman [ 2023-08-02 ]

I think this would fit quite nicely with the functionality of MaxScale. It does not come without a latency cost, though. You could even have transaction replay at the top level, which would hide from the client the switch from one downstream MaxScale to another. However, you probably want to minimize the functionality used in the top-level MaxScale to reduce the risk of it going down.
