[MXS-4637] bootstrap process for Xpand should be region-aware Created: 2023-06-09  Updated: 2023-08-07  Resolved: 2023-08-07

Status: Closed
Project: MariaDB MaxScale
Component/s: xpandmon
Affects Version/s: None
Fix Version/s: 23.08.0

Type: New Feature Priority: Critical
Reporter: Christine Lieu (Inactive) Assignee: Johan Wikman
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
Sprint: MXS-SPRINT-185, MXS-SPRINT-186, MXS-SPRINT-187

 Description   

As part of XPT-518 (Transylvania), Xpand will now support Multi-Region High Availability Clusters that:

  • Allow Xpand to be deployed across multiple regions and zones
  • Provide full, zero-RPO high availability in the face of a regional failure

If regions are configured, there will be 3 regions (Primary, Secondary, and Observer) and it is expected that the customer will have MaxScale deployed in each region (perhaps optional for the Observer).

For the first iteration, Xpand supports sending traffic to the Primary region only. The secondary is a standby, in case of complete failure of the Primary region (and we expect that the user will have to handle failover, e.g. using Route 53 or the like).

The bootstrap process needs to be updated so that it is region-aware, and only sends traffic to the relevant region (e.g. MaxScale in the Primary should spread connections across nodes in the primary region only).



 Comments   
Comment by Christine Lieu (Inactive) [ 2023-06-09 ]

Query that can be used to view Region info:
SELECT r.name region_name,
r.type,
z.zoneid,
z.name zone_name,
n.nodeid,
n.hostname
FROM system.zones z,
system.regions r,
system.nodeinfo n
WHERE z.region = r.region
AND n.zone = z.zoneid
ORDER BY r.type;

Comment by Johan Wikman [ 2023-07-24 ]

clieu I don't have a running Transylvania setup yet, so I have not tried things out, but apparently all nodes from all regions are visible in system.nodeinfo. Is that correct?

In order to ignore the nodes that are not in the same region it itself is in, MaxScale needs to know the region it is in. Can that be figured out at runtime or does it have to be a configuration setting of MaxScale?

The region changes mean that MaxScale needs to alter its behaviour depending on the version of Xpand. How can MaxScale detect that it is Transylvania it is talking to? If regions are not used, will that query return the relevant data or does MaxScale need to handle that case separately?

Comment by Christine Lieu (Inactive) [ 2023-07-24 ]

> all nodes from all regions are visible in system.nodeinfo. Is that correct?
Yes

> In order to ignore the nodes that are not in the same region it itself is in, MaxScale needs
> to know the region it is in. Can that be figured out at runtime or does it have to be a
> configuration setting of MaxScale?
I'm not sure. Would it make sense for only the Primary to use the bootstrap process, and MaxScale in the secondary region should be manually configured? What would runtime configuration look like?

> The region changes mean that MaxScale needs to alter its behaviour depending on the
> version of Xpand. How can MaxScale detect that it is Transylvania it is talking to? If
> regions are not used, will that query return the relevant data or does MaxScale need to
> handle that case separately?

The Transylvania release number (afaik) will be 23.08.xx. You could key off that (or just check if the version is Xpand 5.3, 6.x)
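One alternative to keying off the version string, not confirmed anywhere in this thread, would be to probe for the region table and fall back to the old behaviour when it is missing or empty. The sketch below illustrates the idea in Python against an in-memory SQLite stand-in. The function name `regions_available`, the unqualified table name `regions` (instead of `system.regions`) and the sample rows are assumptions made for illustration only.

```python
import sqlite3


def regions_available(conn: sqlite3.Connection) -> bool:
    """Return True if a `regions` table can be queried and holds at least one row."""
    try:
        row = conn.execute("SELECT COUNT(*) FROM regions").fetchone()
    except sqlite3.OperationalError:
        # Table missing: treat the server as one without region support.
        return False
    return row[0] > 0


# Stand-in for a server without region support: no `regions` table at all.
old = sqlite3.connect(":memory:")

# Stand-in for a region-enabled server, with three illustrative rows.
new = sqlite3.connect(":memory:")
new.execute("CREATE TABLE regions (region INTEGER PRIMARY KEY, name TEXT, type TEXT)")
new.executemany(
    "INSERT INTO regions VALUES (?, ?, ?)",
    [(1, "us-east-2", "PRIMARY"), (2, "us-east-1", "SECONDARY"), (3, "us-west-2", "OBSERVER")],
)

print(regions_available(old))  # False
print(regions_available(new))  # True
```

A probe like this would also cover the case where the release supports regions but none are configured, since an empty table is treated the same as a missing one.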

Comment by Johan Wikman [ 2023-07-25 ]

> Would it make sense for only the Primary to use the bootstrap process, and MaxScale in the
> secondary region should be manually configured?

If the primary goes down and Xpand promotes the secondary to primary, what happens when the old primary comes back up again? Does it become the secondary or is the current primary demoted back to being secondary so that the old primary again becomes the current primary?

If the secondary only temporarily becomes the primary, that could support MaxScale being configured differently in the primary and secondary cases. If the primary/secondary roles can switch permanently (or until the next region failure) then I think MaxScale should be configured in the same manner everywhere.

> What would runtime configuration look like?

Without being able to figure out the region autonomously, there would have to be a specific region setting.

[Monitor]
type=monitor
module=xpandmon
...
region=us-east-1

The downside of that is that the configuration of each MaxScale instance would have to be different (different region entry). Although I suppose they have to differ anyway, as you probably want to use nodes in your own region as bootstrap nodes.

If the region can reliably be figured out from within an instance, then something like this would suffice.

[Monitor]
type=monitor
module=xpandmon
...
use_regions=true

However, the way you figure out the region from within an instance seems a bit shady - https://stackoverflow.com/questions/4249488/find-region-from-within-an-ec2-instance - so being explicit about it seems like a better initial choice. At a later stage region=auto could mean that the region is figured out by MaxScale itself.
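The intended semantics of the proposed setting can be sketched as follows: an explicit name selects a region, a missing entry means no region filtering, and `auto` is reserved for later. This is a Python illustration only. MaxScale does not parse its configuration with `configparser`, the `region` parameter is just the proposal from this comment, and the helper name `configured_region` is hypothetical.

```python
import configparser
from typing import Optional

# Illustrative configuration text using the proposed (not yet existing) `region` entry.
EXAMPLE = """
[Monitor]
type=monitor
module=xpandmon
region=us-east-1
"""


def configured_region(text: str, section: str = "Monitor") -> Optional[str]:
    """Return the explicit region name, or None when no region filtering is requested."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    region = parser.get(section, "region", fallback=None)
    if region is None or region.strip() == "":
        return None
    region = region.strip()
    if region.lower() == "auto":
        # Autodetection is only floated as a possible later stage.
        raise NotImplementedError("region=auto is not supported in this sketch")
    return region


print(configured_region(EXAMPLE))  # us-east-1
```

Rejecting `auto` explicitly, rather than treating it as a region name, keeps the value free for the later autodetection stage.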

Comment by Johan Wikman [ 2023-07-27 ]

clieu It seems to me that MaxScale can ignore the zones completely and treat all nodes in a region alike, irrespective of the zone they are in. Is that correct?

What is the layout of the system.regions table? I'll create that manually in my docker xpand-single setup and then I can do most of the work with that.

Currently when MaxScale refreshes its view of the cluster it executes

            SELECT ni.nodeid, ni.iface_ip, ni.mysql_port, ni.healthmon_port, sn.nodeid 
            FROM system.nodeinfo AS ni 
            LEFT JOIN system.softfailed_nodes AS sn ON ni.nodeid = sn.nodeid;

Conceptually I want to add WHERE region = 'this_region' to that query.

Comment by Christine Lieu (Inactive) [ 2023-07-28 ]

Sorry, I'm not great at JIRA formatting ;p

[root@alpo002 ~]# mysql
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MySQL connection id is 31746
Server version: 5.0.45-Xpand-transylvania-18705 
 
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
 
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
 
MySQL [(none)]> select * from system.regions;
+---------------------+-----------+-----------+
| region              | name      | type      |
+---------------------+-----------+-----------+
| 7255261506333243396 | us-east-2 | PRIMARY   |
| 7255261506427736068 | us-east-1 | SECONDARY |
| 7255261506522226692 | us-west-2 | OBSERVER  |
+---------------------+-----------+-----------+
3 rows in set (0.00 sec)
 
MySQL [(none)]> show create table system.regions;
+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table   | Create Table                                                                                                                                                                                                                                                                                                                        |
+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| regions | CREATE TABLE `regions` (
  `region` oid not null,
  `name` varchar(256) CHARACTER SET utf8 not null,
  `type` varchar(65535) CHARACTER SET utf8 not null,
  PRIMARY KEY (`region`) /*$ PAYLOAD (`name`,`type`) */,
  UNIQUE KEY `_regions_ind_1` (`name`)
) CHARACTER SET utf8 COLLATE utf8_general_ci /*$ AUTO_STATISTICS=NONE */
 |
+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

Comment by Johan Wikman [ 2023-07-31 ]

Thanks clieu! No problem with the formatting, I can tune it later.

Could you do the same with system.zones and also system.nodeinfo?

Comment by Johan Wikman [ 2023-08-03 ]

Ok, so without regions, when MaxScale refreshes its view of the cluster it executes:

SELECT ni.nodeid, ni.iface_ip, ni.mysql_port, ni.healthmon_port, sn.nodeid 
FROM system.nodeinfo AS ni 
LEFT JOIN system.softfailed_nodes AS sn ON ni.nodeid = sn.nodeid;

With regions, when MaxScale should only consider the nodes that are in its region, I think the equivalent query is:

SELECT nir.nodeid, nir.iface_ip, nir.mysql_port, nir.healthmon_port, sn.nodeid
FROM (SELECT ni.nodeid, ni.iface_ip, ni.mysql_port, ni.healthmon_port, z.region
      FROM system.nodeinfo AS ni LEFT JOIN system.zones AS z ON ni.zone = z.zoneid) AS nir
LEFT JOIN system.softfailed_nodes AS sn ON nir.nodeid = sn.nodeid
WHERE nir.region = the_region;

clieu could you verify that the query does the right thing? I am still not able to run Transylvania myself.
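Until the query can be run against a real Transylvania cluster, its filtering logic can at least be exercised on mock data. The sketch below does that in Python with an in-memory SQLite database. The table layouts for `zones` and `nodeinfo` are guesses based only on the columns referenced in this ticket, the `system.` prefix is dropped, and all rows are invented. Two details are additions rather than part of the query above: the region is looked up by name through `regions` (since the proposed setting holds a name while `zones.region` holds an id), and an `ORDER BY` makes the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Mock schema: only the columns this ticket mentions; real layouts may differ.
conn.executescript("""
CREATE TABLE regions (region INTEGER PRIMARY KEY, name TEXT, type TEXT);
CREATE TABLE zones (zoneid INTEGER PRIMARY KEY, name TEXT, region INTEGER);
CREATE TABLE nodeinfo (nodeid INTEGER PRIMARY KEY, iface_ip TEXT,
                       mysql_port INTEGER, healthmon_port INTEGER, zone INTEGER);
CREATE TABLE softfailed_nodes (nodeid INTEGER);

INSERT INTO regions VALUES (10, 'us-east-2', 'PRIMARY'), (20, 'us-east-1', 'SECONDARY');
INSERT INTO zones VALUES (1, 'zone-a', 10), (2, 'zone-b', 10), (3, 'zone-c', 20);
INSERT INTO nodeinfo VALUES
    (1, '10.0.0.1', 3306, 3581, 1),
    (2, '10.0.0.2', 3306, 3581, 2),
    (3, '10.0.1.1', 3306, 3581, 3);
INSERT INTO softfailed_nodes VALUES (2);
""")

# Same shape as the region-aware query, with the region resolved by name.
REGION_QUERY = """
SELECT nir.nodeid, nir.iface_ip, nir.mysql_port, nir.healthmon_port, sn.nodeid
FROM (SELECT ni.nodeid, ni.iface_ip, ni.mysql_port, ni.healthmon_port, z.region
      FROM nodeinfo AS ni LEFT JOIN zones AS z ON ni.zone = z.zoneid) AS nir
LEFT JOIN softfailed_nodes AS sn ON nir.nodeid = sn.nodeid
WHERE nir.region = (SELECT region FROM regions WHERE name = ?)
ORDER BY nir.nodeid
"""


def nodes_in_region(db: sqlite3.Connection, region_name: str) -> list:
    """Return the node rows whose zone belongs to the named region."""
    return db.execute(REGION_QUERY, (region_name,)).fetchall()


for row in nodes_in_region(conn, "us-east-2"):
    print(row)
# (1, '10.0.0.1', 3306, 3581, None)
# (2, '10.0.0.2', 3306, 3581, 2)
```

On this mock data the node in the other region is excluded, the soft-failed node still carries its id in the last column, and an unknown region name yields no rows at all. None of this says anything about how Xpand itself evaluates the query.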
