[MXS-4149] Cooperative Transaction Replay Created: 2022-05-31  Updated: 2023-12-15

Status: Open
Project: MariaDB MaxScale
Component/s: None
Affects Version/s: None
Fix Version/s: Icebox

Type: New Feature Priority: Major
Reporter: Juan Assignee: Joe Cotellese
Resolution: Unresolved Votes: 1
Labels: None

Issue Links:
Blocks
is blocked by MXS-4153 Graceful Restart Open

 Description   

Many customers run MaxScale in container environments where unique memory leaks and other stability issues can affect individual instances and require them to be restarted.

Although we currently have good HA architecture patterns, client-side Java connector failover capability, and cooperative monitoring that lets multiple MaxScale instances work together effectively in managing back-end topologies, destroying an instance because it is leaking is still an incident with adverse consequences: whatever transactions are in flight on connections handled by that MaxScale instance are lost.

Transaction_replay manages this problem beautifully in the event of a lost back-end server. A similar mechanism that caches the client connection, recognizes it from a second MaxScale in an HA configuration, and delivers back to that client whatever results did not successfully make it from the destroyed MaxScale would close this gap.
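For context, the existing back-end-server replay mentioned above is enabled through the readwritesplit router. A minimal configuration sketch (service and server names are placeholders):

```
# Sketch only: service/server names and credentials are placeholders.
[Read-Write-Service]
type=service
router=readwritesplit
servers=server1,server2
user=maxuser
password=maxpwd
# Replay the open transaction on another server if the current one fails
transaction_replay=true
# Upper bound on the size of a cached transaction
transaction_replay_max_size=1Mi
```

The feature request here is essentially to extend this per-instance replay cache so that a second MaxScale could take over it.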

Although replicating the transaction replay queue on every MaxScale is one way this could be accomplished, some MaxScale state information is already stored server-side — specifically, which MaxScale is the controlling instance in a cooperative group — so replay and session information could also be stored on the database servers themselves.



 Comments   
Comment by markus makela [ 2022-06-01 ]

Unless the connectors support some form of transaction replay, MaxScale alone cannot do this: if the MaxScale where the transaction is active goes down and the client connects to the second MaxScale, it needs to somehow indicate which session it had on the old MaxScale instance for the new MaxScale instance to know which transaction to replay. Without connector support, this cannot be done.

An alternative way to deal with this sort of situation would be to have graceful shutdowns of MaxScale nodes. This would allow open connections to be migrated to a replacement node once they are done with their active transactions. This wouldn't save transactions that are lost due to unexpected outages, but the "needing to restart" use-case would be served quite well by it.

A mechanism similar to what is described in MDEV-15935 would allow MaxScale to signal that it is about to go down and new connections would be able to redirect themselves to a different MaxScale instance. This could also work for ongoing sessions and even open transactions but that would require significant cooperation from the connectors. A minimal implementation would at least allow transparent connection migrations from one node to another without it affecting applications. This would be sort of how the Drain state for servers in MaxScale works but for MaxScale itself.

Comment by markus makela [ 2022-06-01 ]

A small step towards implementing something for this would be for MaxScale to generate a fake "session state change" by injecting a custom variable (e.g. the redirect_url from MDEV-15935) into the OK packet that is sent after authentication has been accepted. As connectors already expose this information (at least the C connector does), it could be used by client applications to manually redirect to a different server. This would allow MaxScale to implement it now and let the connectors catch up with their implementations when they can.
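On the application side, reacting to such a signal could be very simple. A sketch of the decision logic, assuming the connector surfaces a session-tracked redirect value (the "redirect_url" variable from MDEV-15935 is not implemented anywhere yet, so both the variable and the helper below are assumptions):

```java
// Sketch only: MDEV-15935 is not implemented, so the "redirect_url"
// session-tracked value consumed here is an assumption, not a real API.
public final class RedirectPolicy {

    /**
     * Decide which host a client should (re)connect to, given the
     * session-tracked redirect value the connector surfaced. A null or
     * empty value means no drain is in progress.
     */
    public static String chooseHost(String currentHost, String redirectUrl) {
        if (redirectUrl == null || redirectUrl.isEmpty()) {
            return currentHost; // no drain signalled, stay put
        }
        return redirectUrl; // MaxScale asked us to move
    }

    public static void main(String[] args) {
        // No drain in progress: keep the current connection target.
        System.out.println(chooseHost("maxscale1:3306", null));
        // Drain signalled: follow the redirect on the next (re)connect.
        System.out.println(chooseHost("maxscale1:3306", "maxscale2:3306"));
    }
}
```

The point of the sketch is that the client-side change is trivial once the connector exposes the value; the hard part is on the MaxScale and connector side.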

Something as simple as drain=<URL> as a listener parameter would allow this to be done on the listener level and could be wrapped into a maxctrl drain maxscale <URL> command to do it for all listeners. The only thing that would change in MaxScale is that it would start signaling new (and possibly existing) connections with the extra information stating that they should avoid this MaxScale and redirect to the given URL. This would start out as a manual process but, combined with something like the configuration synchronization (i.e. config_sync_cluster), it could be automated so that all but the current MaxScale are included in the URL.
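Sketched as configuration, the proposal above might look like this (purely hypothetical: the drain parameter does not exist in MaxScale today, and the listener/service names are placeholders):

```
# Hypothetical: the drain parameter proposed above is not implemented.
[RW-Listener]
type=listener
service=Read-Write-Service
port=4006
# Signal new (and possibly existing) connections to move to this URL
drain=maxscale2.example.com:4006
```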

Comment by Johan Wikman [ 2022-09-12 ]

A pre-requisite for this is that connectors support MDEV-15935. Currently there apparently is no activity on that front.

However, cooperative transaction replay would be very complex to arrange, as it would mean that the transaction recording made by one MaxScale would have to be stored in a manner that allows another MaxScale to later replay it. Furthermore, when a connection is moved to another MaxScale transparently to the application (once MDEV-15935 is implemented), the connection identity as seen by the first MaxScale would have to be transferred to the other MaxScale, so that it would know which transaction to replay.

Recently, most MaxScale leaks have been caused by MaxScale being misconfigured. In such cases, implementing very complex functionality for moving a transaction from one MaxScale to another would not bring any actual benefit. MXS-4161 will attempt to address that, i.e. detect whether the resource consumption implied by the configuration is in conflict with the available resources.

Moving the fix-version to 23.08 for now.

Comment by markus makela [ 2022-09-16 ]

The MariaDB JDBC driver already has a built-in transaction replay feature. This is one way of solving the problem and could end up being a simpler solution than anything that is built into MaxScale.
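A sketch of what using that connector-side feature looks like: MariaDB Connector/J accepts a transactionReplay connection option, and its sequential HA mode fails over between the listed hosts in order. Host names and the database below are placeholders, and actually observing a replayed transaction requires a running cluster, so only the URL construction is shown:

```java
// Sketch: build a MariaDB Connector/J URL with transaction replay enabled.
// Host names are placeholders; observing failover needs a live cluster.
public final class ReplayUrl {

    public static String buildUrl(String primary, String secondary, String db) {
        // sequential mode tries the listed hosts in order on failure;
        // transactionReplay caches and replays the open transaction.
        return "jdbc:mariadb:sequential://" + primary + "," + secondary
                + "/" + db + "?transactionReplay=true";
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("maxscale1:4006", "maxscale2:4006", "test"));
    }
}
```

Pointing the two host entries at two cooperating MaxScale instances would let the driver, rather than MaxScale, carry the transaction across a destroyed instance — which is why this may end up simpler than a MaxScale-internal solution.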

Generated at Thu Feb 08 04:26:33 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.