Details
New Feature
Status: Closed
Major
Resolution: Won't Do
None
None
Description
Hello folks,
We would like MaxScale to be able to control when replicas are available or placed in a "standby" mode, whenever either of
2 specific thresholds is reached:
1. replication lag greater than X seconds
Queries sent to replicas lagging too far behind their primary server can return stale or wrong data. To prevent wrong
information from being sent back to the client, we would like to prevent new queries from hitting a replica whose replication
lag is greater than a given threshold.
2. number of active queries greater than Y queries
Clients would like the ability to prevent queries from hitting a server after a given number of active queries has been reached.
This can be for a variety of reasons, e.g. application design, ongoing backups causing locks, etc.
We would like MaxScale to prevent new query requests from being sent to a replica whenever either of the 2 thresholds above
is exceeded. A new state within MaxScale would show the affected servers as "standby (throttled)" (or something else you
deem more appropriate), along with a new column showing the lag. Example below:
┌───────────┬────────────────┬──────┬─────┬─────────────┬───────────────────────────┬─────────────────┐
│ Server    │ Address        │ Port │ Lag │ Connections │ State                     │ GTID            │
├───────────┼────────────────┼──────┼─────┼─────────────┼───────────────────────────┼─────────────────┤
│ dbServer1 │ 192.168.88.101 │ 3306 │ 0   │ 20          │ Master, Running           │ 0-8180-15692671 │
├───────────┼────────────────┼──────┼─────┼─────────────┼───────────────────────────┼─────────────────┤
│ dbServer2 │ 192.168.88.102 │ 3306 │ 0   │ 40          │ Slave, Running            │ 0-8180-15692671 │
├───────────┼────────────────┼──────┼─────┼─────────────┼───────────────────────────┼─────────────────┤
│ dbServer3 │ 192.168.88.103 │ 3306 │ 500 │ 40          │ Slave, Standby(throttled) │ 0-8180-15690132 │
└───────────┴────────────────┴──────┴─────┴─────────────┴───────────────────────────┴─────────────────┘
Both thresholds should be independent of each other, and these settings should be dynamic, with no restart required.
As a failsafe, if only 1 replica is available, these 2 thresholds would be ignored.
Once a threshold is cleared (e.g. lag falls below it), replicas are automatically made available again and the state is updated.
Existing queries are not affected.
Scenario:
1. replication lag greater than X seconds, or
2. number of active queries greater than Y queries
If #1 or #2 is met, new queries are not sent to the replicas exceeding that threshold.
Once the value in #1 or #2 falls back below the configured threshold, new queries can be routed to the replica again.
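
To make the requested behaviour concrete, here is a rough Python sketch of the routing decision described in this request. It is only an illustration of the proposed logic, not MaxScale code: the names `Replica`, `Thresholds`, `is_throttled`, `eligible_replicas` and `state_label` are hypothetical, the values 60 and 100 merely stand in for X and Y, and the failsafe is assumed to mean "only one replica is configured".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    name: str
    lag_seconds: int      # replication lag behind the primary
    active_queries: int   # queries currently executing on this replica


@dataclass
class Thresholds:
    # None disables a threshold, which keeps the two limits independent.
    # Fields can be changed at runtime; the next check uses the new values.
    max_lag_seconds: Optional[int] = None      # "X" in this request
    max_active_queries: Optional[int] = None   # "Y" in this request


def is_throttled(replica: Replica, limits: Thresholds) -> bool:
    """True if the replica exceeds either configured threshold."""
    too_laggy = (limits.max_lag_seconds is not None
                 and replica.lag_seconds > limits.max_lag_seconds)
    too_busy = (limits.max_active_queries is not None
                and replica.active_queries > limits.max_active_queries)
    return too_laggy or too_busy


def eligible_replicas(replicas: List[Replica], limits: Thresholds) -> List[Replica]:
    """Replicas that may receive NEW queries; running queries are untouched."""
    # Assumed failsafe: with a single replica the thresholds are ignored.
    if len(replicas) <= 1:
        return list(replicas)
    return [r for r in replicas if not is_throttled(r, limits)]


def state_label(replica: Replica, replicas: List[Replica], limits: Thresholds) -> str:
    """State string as it might appear in the proposed State column."""
    throttled = len(replicas) > 1 and is_throttled(replica, limits)
    return "Slave, Standby(throttled)" if throttled else "Slave, Running"


if __name__ == "__main__":
    replicas = [Replica("dbServer2", 0, 40), Replica("dbServer3", 500, 40)]
    limits = Thresholds(max_lag_seconds=60, max_active_queries=100)

    # dbServer3 lags by 500 s, so only dbServer2 receives new queries.
    print([r.name for r in eligible_replicas(replicas, limits)])  # ['dbServer2']
    print(state_label(replicas[1], replicas, limits))  # Slave, Standby(throttled)

    # Once the lag falls below the threshold, the replica is available again.
    replicas[1].lag_seconds = 5
    print([r.name for r in eligible_replicas(replicas, limits)])  # ['dbServer2', 'dbServer3']
```

Because the check is evaluated per new query, nothing here interrupts queries already running on a throttled replica, and a replica returns to rotation as soon as its values drop back under the limits.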