Details
-
New Feature
-
Status: Open
-
Major
-
Resolution: Unresolved
Description
Markus put this well in MXS-4149:
"An alternative way to deal with these sorts of situations would be to have graceful shutdowns of MaxScale nodes. This would allow open connections to be migrated to a replacement node once they're done with their active transactions. This wouldn't save transactions that are lost due to unexpected outages, but the use case for 'needing to restart' would be served quite well with this."
In short, many customers have reported getting into states where it becomes necessary to restart MaxScale to avoid a crash or other issue. One example is rising or runaway memory usage. In many cases, the causes leading up to this are detectable via monitoring, for example by tracking a server's remaining free memory or storage space. This means customers can proactively trigger the restart rather than waiting for a crash or a true emergency.
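As an illustration of the proactive trigger described above, here is a minimal Python sketch of the decision a monitoring job might make. Everything in it is an assumption made for illustration: `MemorySample`, `should_restart`, the 10% threshold and the three-sample window are hypothetical and not part of MaxScale.

```python
from dataclasses import dataclass


@dataclass
class MemorySample:
    """One monitoring sample for a MaxScale host (hypothetical structure)."""
    free_bytes: int
    total_bytes: int  # assumed to be greater than zero


def should_restart(samples, min_free_ratio=0.10, consecutive=3):
    """Return True when the last `consecutive` samples are all below the threshold.

    Requiring several consecutive low samples avoids restarting on a brief spike.
    """
    if consecutive <= 0 or len(samples) < consecutive:
        return False
    recent = samples[-consecutive:]
    return all(s.free_bytes / s.total_bytes < min_free_ratio for s in recent)
```

A monitoring job could call `should_restart` on each new sample and, once it returns True, start the restart while the node is still healthy enough to hand off its connections.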
These customers already use techniques such as cooperative monitoring to obtain HA from multiple MaxScale nodes. So why is a regular restart not good enough? Because a regular restart terminates the connections currently open on the MaxScale node being restarted, so clients are bounced and have to reconnect. To applications and clients, this makes MaxScale's HA setup appear and behave unreliably.
It should instead be possible for MaxScale to be aware that it has "sister" nodes to which it could migrate connections or transactions in these cases. A "graceful restart" mechanism that has MaxScale drain its active and future connections to a "sister" node before restarting would resolve this concern and give customers and their operations teams a valuable tool for helping themselves.
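The drain behaviour requested above can be sketched as a small in-memory model. This is only a toy illustration under stated assumptions: `Node`, `Connection`, `accept` and `drain_step` are hypothetical names, not MaxScale APIs, and a real hand-off would presumably depend on the connection redirection work listed under Issue Links.

```python
from dataclasses import dataclass, field


@dataclass
class Connection:
    conn_id: int
    in_transaction: bool = False


@dataclass
class Node:
    name: str
    draining: bool = False
    connections: list = field(default_factory=list)

    def accept(self, conn, sister):
        """Route a new connection: a draining node hands it to its sister."""
        target = sister if self.draining else self
        target.connections.append(conn)
        return target.name


def drain_step(node, sister):
    """Move every connection that is not mid-transaction to the sister node.

    Returns True once the node holds no connections and is safe to restart.
    """
    node.draining = True
    remaining = []
    for conn in node.connections:
        if conn.in_transaction:
            remaining.append(conn)  # wait for the transaction to finish
        else:
            sister.connections.append(conn)
    node.connections = remaining
    return not node.connections
```

In this model an operator would call `drain_step` repeatedly until it returns True and only then restart the node. Connections in the middle of a transaction stay where they are until the transaction completes, which matches the behaviour described in the quote from MXS-4149.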
Beyond the initial scope: once MXS-3822 is implemented, there will be plenty of room in future MaxScale versions to enhance this feature, for example by enabling automatic graceful restarts.
MXS-4149 is related to this issue because it is the preferred, desired future state. However, the graceful restart functionality requested here is expected to be easier and quicker to implement, and it should give customers a manual solution they can benefit from ASAP and build around as necessary. MXS-4149 and other further improvements would then be ways for MaxScale to add more value.
Issue Links
- blocks
  - MXS-4149 Cooperative Transaction Replay (Open)
- is blocked by
  - CONC-599 Add support for connection redirection on the Connector/C (Open)
  - CONCPP-101 Add support for connection redirection on the Connector/C++ (Open)
  - CONJ-981 Add support for connection redirection (Closed)
  - CONJS-207 Add support for connection redirection (Closed)
  - CONPY-207 Add support for connection redirection on the Connector/Python (Open)
  - MDEV-15935 Connection Redirection Mechanism in MariaDB Client/Server Protocol (Closed)
  - ODBC-364 Add support for connection redirection on the Connector/ODBC (Open)
  - R2DBC-66 Add support for connection redirection (Closed)
- is duplicated by
  - MXS-4737 Graceful shutdown of MaxScale when using more than one MaxScale instance (Closed)
- relates to
  - MDEV-32053 New features requested by customer on 2023-08-28 (Open)
  - MXS-4635 Provide load balancing metadata to connectors (Closed)
  - MDEV-33926 Graceful shutdown proxy hint feature - MariaDB version of pxc_maint_mode (Open)