Long-running DDLs are known to cause problems on Galera clusters, hence we promote the concept of RSU (Rolling Schema Upgrade) in MariaDB server as a replacement. However, traditional RSU requires external management of the process, and while this is doable via the MaxScale API, a native ability in MaxScale for this would be much appreciated.
It may go along these lines (or anything similar as long as it is easy to use and can be triggered by the client within its session without the need to have access to the MaxScale API):
- An RSU mode is enabled on the MaxScale session via some SQL statement like SET... . The mode is per-session.
- A DDL is sent to MaxScale in the same session.
- MaxScale puts one Galera node in maintenance mode, sets the RSU variable on the node, runs the DDL, then unsets the RSU variable and takes the node out of maintenance mode.
- MaxScale repeats the previous step on all remaining Galera nodes, one by one.
- The client is kept "on hold" during the whole execution flow, so the connection stays active and blocked.
- Once the last node completes the DDL and gets back online, MaxScale releases the client.
- The client unsets the RSU mode variable on MaxScale.
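To make the proposed flow concrete, here is a minimal sketch of the per-node loop in Python. It is only an illustration of the ordering described above, not a proposal for how MaxScale should implement it: `rolling_rsu`, the `execute` callback and the action names (`maintenance_on`, `rsu_on`, ...) are all hypothetical. On a real node, "rsu_on" would correspond to switching Galera's `wsrep_OSU_method` to RSU. The sketch also assumes that a failed DDL should still restore the node and stop the rollout, which the request itself does not specify.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, str]  # (node name, action)


def rolling_rsu(nodes: List[str], ddl: str,
                execute: Callable[[str, str], None]) -> List[Step]:
    """Run `ddl` on each node in turn, one node at a time.

    `execute(node, action)` is a hypothetical hook that performs one
    action on one node. The call returns only after the last node is
    back online, which models the client being kept "on hold".
    """
    trace: List[Step] = []

    def step(node: str, action: str) -> None:
        execute(node, action)
        trace.append((node, action))

    for node in nodes:
        step(node, "maintenance_on")       # take the node out of rotation
        try:
            step(node, "rsu_on")           # e.g. wsrep_OSU_method = RSU
            try:
                step(node, ddl)            # the node runs the DDL its own way
            finally:
                step(node, "rsu_off")      # always restore the default method
        finally:
            step(node, "maintenance_off")  # always put the node back
        # Assumption: an error propagates here, so later nodes are untouched.
    return trace
```

With two nodes the hook is called ten times, five per node, and the second node is not touched until the first one has left maintenance mode.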
The actual way the DDL is executed on the Galera node is left to the node itself, so we don't want to implement more complex stuff like the shadow DDL of pt-osc.
Also, blocking the client is OK, as a non-blocking DDL on the client side would probably be more complex and confusing. Keeping the connection intact is the client's responsibility; likely we only need to specify what happens if the connection is dropped before the process is completed, but I guess it would be OK to continue the RSU until completed (as our DDL is non-transactional).