Details
Type: New Feature
Status: Closed
Priority: Major
Resolution: Fixed
Sprint: MXS-SPRINT-132
Description
Overview
Synchronize configuration changes made on one MaxScale to the other MaxScale instances in the group. If a new MaxScale joins the group, it synchronizes itself with the rest of the cluster.
Group Membership
The group of MaxScale instances is implicitly defined by the database cluster they use. This retains the property that MaxScale instances are not aware of each other, which makes provisioning of new nodes much easier.
Data Storage
The configuration information is stored in a table in the database. Each row contains the cluster name (taken from the MaxScale configuration), the configuration version number (incremented for each change) and the actual configuration data as JSON. The cluster name is derived from the name of the monitor that monitors the cluster. The version number starts at 0 and the first dynamic change sets it to 1; the first update stored in the database therefore has version number 1 and all subsequent configuration updates increment it.
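A minimal sketch of what such a table could look like; the table name, column names and connection details are illustrative assumptions, not MaxScale's actual schema (the error column anticipates the error propagation described further below):

```python
import pymysql

# Illustrative schema only; MaxScale's real table layout may differ.
DDL = """
CREATE TABLE IF NOT EXISTS maxscale_config (
    cluster VARCHAR(256) PRIMARY KEY,  -- name of the monitor that monitors the cluster
    version BIGINT NOT NULL,           -- starts at 0, incremented on every change
    config  JSON NOT NULL,             -- the configuration itself, stored as JSON
    error   TEXT                       -- last error reported by a node, if any
)
"""

conn = pymysql.connect(host="db.example.com", user="maxscale",
                       password="secret", database="maxscale_sync")
with conn.cursor() as cur:
    cur.execute(DDL)
conn.commit()
```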
Updating the Configuration
Before a configuration change is applied in MaxScale, the current version is selected from the database inside a transaction using SELECT ... FOR UPDATE. If the version is the same as the version stored in MaxScale, the update can proceed. If it is not, an error is returned to the client. Once the modifications to the internal state of MaxScale have been applied, an attempt is made to update the row. Because SELECT ... FOR UPDATE was used to read the version, the update of the row either succeeds or fails with a deadlock conflict in the transaction. This guarantees that only one MaxScale succeeds in updating the configuration. A MaxScale that fails to perform the update returns an error to the client and reads the new configuration from the database.
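A sketch of this update protocol against the illustrative maxscale_config table from above; the function shape and pymysql usage are assumptions, not MaxScale's implementation:

```python
import json
import pymysql

def try_update_config(conn, cluster, local_version, new_config):
    """Attempt to commit a configuration change. Returns True if this node
    won the update, False if another MaxScale changed the version first."""
    try:
        conn.begin()
        with conn.cursor() as cur:
            # Lock the row so concurrent updates conflict instead of interleaving.
            cur.execute("SELECT version FROM maxscale_config WHERE cluster = %s FOR UPDATE",
                        (cluster,))
            row = cur.fetchone()
            if row is not None and row[0] != local_version:
                conn.rollback()
                return False  # another node already moved the version forward
            cur.execute("UPDATE maxscale_config SET version = %s, config = %s WHERE cluster = %s",
                        (local_version + 1, json.dumps(new_config), cluster))
        conn.commit()
        return True
    except pymysql.err.OperationalError:
        # e.g. a deadlock conflict: a concurrent update won the race
        conn.rollback()
        return False
```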
Locally Cached Configuration
Whenever the configuration changes, it is stored as JSON on disk. If this cached configuration is available, it is used instead of the static configuration files. This allows configuration changes to persist through a crash as well as through temporary outages in the cluster. If the transition from the current configuration to the configuration stored in the cluster fails, the cached configuration is discarded. This guarantees that only one attempt is made to start with a cached configuration and that all subsequent startups use the static configuration files.
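A minimal sketch of this caching behaviour; the cache path and function names are hypothetical:

```python
import json
import os

CACHE_FILE = "/var/lib/maxscale/maxscale-config.json"  # hypothetical path

def save_cached_config(config):
    # Persist the latest configuration so it survives a crash or a cluster outage.
    with open(CACHE_FILE, "w") as f:
        json.dump(config, f)

def load_cached_config():
    # Prefer the cached configuration over the static files if it exists.
    if os.path.exists(CACHE_FILE):
        with open(CACHE_FILE) as f:
            return json.load(f)
    return None

def discard_cached_config():
    # Called when applying the cached configuration fails, so that the
    # next startup falls back to the static configuration files.
    if os.path.exists(CACHE_FILE):
        os.remove(CACHE_FILE)
```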
Configuration Synchronization
The propagation of configuration updates is done by periodically polling the database for the latest configuration version. If the version in the database is newer than the one in MaxScale, the diff between the new configuration and the current configuration is applied. If this succeeds, the MaxScale is now in sync with the cluster. If applying the new configuration from the database fails, an attempt is made to roll back any partial changes. If the rollback succeeds, this version of the configuration is ignored and MaxScale attempts to synchronize again with the next update. If both the application of the configuration and the rollback fail, the MaxScale configuration is in an indeterminate state. In this case, MaxScale discards any cached configurations and restarts the process using the static configuration files. This means there are three possible outcomes for an update from configuration version 3 to version 4 (see the sketch after the list):
- Update is successful -> version 4, configuration from version 4
- Update fails but rollback is successful -> version 4, configuration from version 3
- Both update and rollback fail -> version 0, process is restarted, configuration from static files
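A sketch of the polling loop that produces these three outcomes; apply_diff, rollback and restart_with_static_config stand in for MaxScale's internal logic and are placeholders, not real functions (discard_cached_config is the helper sketched above):

```python
import json
import time

POLL_INTERVAL = 10  # seconds; illustrative value

def sync_loop(conn, cluster, state):
    """Poll the database and apply newer configuration versions.
    `state` holds this node's current version and configuration."""
    while True:
        with conn.cursor() as cur:
            cur.execute("SELECT version, config FROM maxscale_config WHERE cluster = %s",
                        (cluster,))
            row = cur.fetchone()
        if row is not None and row[0] > state["version"]:
            new_config = json.loads(row[1])
            if apply_diff(state["config"], new_config):
                # Outcome 1: in sync with the cluster
                state.update(version=row[0], config=new_config)
            elif rollback(state["config"]):
                # Outcome 2: keep the old configuration but skip this version
                state["version"] = row[0]
            else:
                # Outcome 3: indeterminate state, start over from static files
                discard_cached_config()
                restart_with_static_config()
        time.sleep(POLL_INTERVAL)
```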
The benefit of gracefully rolling back failed changes and ignoring the update is that it allows local failures to be tolerated without causing a severe outage. For example, this can happen if the TLS certificates are updated but the files do not exist on all nodes of the cluster.
If the application of the latest configuration found in the cluster fails when starting with static configuration files, the process is not restarted to prevent an infinite restart loop. At this point it is assumed that it is better to leave MaxScale running with the static configuration than to keep restarting it in the hopes that the configuration is eventually applied successfully. In this case, new configuration updates are attempted similarly to how they are attempted in the case where the rollback is successful.
Propagation of Errors
Whenever a node in the cluster fails to apply the configuration, it stores the error message in a field in the corresponding row. This makes failures to apply the configuration visible to the other MaxScale instances, which can then display them in the GUI. The same field can also be used to signal that a MaxScale instance has read the configuration from the database.
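A small sketch of how a node might record such an error in the shared row, again using the illustrative error column from the schema above:

```python
def report_sync_error(conn, cluster, message):
    # Record why this node failed to apply the latest configuration so that
    # other MaxScale instances (and the GUI) can see it; an empty message
    # could likewise signal that the configuration was read successfully.
    with conn.cursor() as cur:
        cur.execute("UPDATE maxscale_config SET error = %s WHERE cluster = %s",
                    (message, cluster))
    conn.commit()
```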
Original description:
In certain cases (like telco infra), having 1+1 MaxScale instances (active and standby via keepalived and friends) is deemed insufficient and both instances are required to be running and used by clients. While it is possible to use two MaxScale instances independently, the question of keeping their configuration in sync remains tricky and needs some external state machine like Puppet + cleverly crafted config files to provide unique values where needed.
This request is to provide a way to sync the config of one MaxScale to another in a more automated way than just export->scp->import. We see several intriguing options:
- Configure active config replication - especially useful when changes are made via the API. This still leaves open the question of how a MaxScale instance that was down for some time gets the config updates when it comes back up. (The benefit of this is that it may open the door to implementing connection hand-over between MaxScale instances when using traditional failover with a "jumping" IP address.)
- Use external config storage like etcd, which is used for a similar purpose in complex and proven projects like Red Hat OpenShift. Personally, this is my favourite, at least when it comes to configuring replicated instances of a service.