Details
Type: New Feature
Status: Closed
Priority: Major
Resolution: Fixed
Description
The driver already has a failover implementation, but some of its limitations can be solved with a redo transaction implementation. This approach has several benefits.
Current state
When a failover occurs on a slave connection, the driver reconnects to another slave if possible and re-executes the query on that new slave, or on the master connection if no slave connection can be reestablished. The application sees no interruption.
The problem is failover on a master connection: the only case handled transparently is a query that was not in a transaction and was a SELECT command.
Nothing more is possible in the other cases because, when this failover occurs, the driver has no way to know whether the interruption happened before or after the server received and handled the command. The connection is therefore reestablished, and the error is changed from SQLNonTransientConnectionException to SQLTransientConnectionException.
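To illustrate what the current behavior means for calling code, here is a minimal sketch of an application-side retry around SQLTransientConnectionException. The class TransientRetry, the interface SqlWork and the method withRetry are hypothetical names, not part of the driver. The sketch assumes the caller has already decided that re-running the work is safe, since the driver cannot tell whether the server executed the interrupted command.

```java
import java.sql.SQLException;
import java.sql.SQLTransientConnectionException;

// Hypothetical helper, not part of the driver.
final class TransientRetry {

    /** A unit of JDBC work that may fail with an SQLException. */
    @FunctionalInterface
    interface SqlWork<T> {
        T run() throws SQLException;
    }

    /**
     * Runs the work and re-runs it when the driver reports a transient
     * connection error (the connection was reestablished after failover).
     * Any other SQLException is propagated immediately.
     */
    static <T> T withRetry(int maxAttempts, SqlWork<T> work) throws SQLException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        SQLTransientConnectionException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.run();
            } catch (SQLTransientConnectionException e) {
                last = e; // connection is usable again, so try once more
            }
        }
        throw last; // every attempt hit a transient connection error
    }
}
```

The work passed to withRetry should contain the whole transaction, from the first statement to the commit, so that a retry never resumes a half-finished transaction.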
Proposed implementation
JDBC connections default to auto-commit enabled. When a failover occurs on a primary connection with auto-commit enabled, the driver still cannot know whether the command was executed, so it just reconnects and throws an exception, as in the current implementation. (An exception can be made for the PING command.)
The redo transaction approach is to save the commands sent within a transaction (COM_QUERY / COM_STMT_EXECUTE, COM_STMT_SEND_LONG_DATA). When a failover occurs during a transaction and the failing command is not a COMMIT/ROLLBACK command, the connector can automatically reconnect and replay the transaction, making the failover completely transparent.
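The decision described in the two paragraphs above can be sketched as a small function. This is only an illustration of the rules as stated in this ticket, not the driver's code: the class FailoverDecision and the enums Command and Action are hypothetical, and the existing transparent handling of SELECT outside a transaction is left out for brevity.

```java
// Hypothetical sketch of the proposed decision on primary failover.
final class FailoverDecision {

    /** Simplified kind of the command that was in flight when failover occurred. */
    enum Command { STATEMENT, COMMIT, ROLLBACK, PING }

    /** What the driver does after reconnecting. */
    enum Action { REPLAY_TRANSACTION, RECONNECT_SILENTLY, RECONNECT_AND_THROW }

    static Action onPrimaryFailover(boolean inTransaction, Command failing) {
        // Inside a transaction, nothing is committed yet, so the saved
        // commands can be replayed, unless the failing command is the one
        // that ends the transaction (its outcome on the server is unknown).
        if (inTransaction && failing != Command.COMMIT && failing != Command.ROLLBACK) {
            return Action.REPLAY_TRANSACTION;
        }
        // Outside a transaction, PING changes no data and can be hidden.
        if (!inTransaction && failing == Command.PING) {
            return Action.RECONNECT_SILENTLY;
        }
        // Otherwise the driver cannot know whether the server applied the command.
        return Action.RECONNECT_AND_THROW;
    }
}
```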
The auto-commit and transaction states are already available in the protocol through the SERVER_STATUS_IN_TRANS and SERVER_STATUS_AUTOCOMMIT flags of the server status field, in all server versions.
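As a small illustration, the two flags can be read from the 2-byte server status value with bit masks. The class and method names below are hypothetical; the flag values (0x0001 and 0x0002) are the ones defined by the client/server protocol.

```java
// Hypothetical helper that decodes the server status field.
final class ServerStatus {

    /** Set while a transaction is active on the connection. */
    static final int SERVER_STATUS_IN_TRANS = 0x0001;

    /** Set while auto-commit is enabled on the connection. */
    static final int SERVER_STATUS_AUTOCOMMIT = 0x0002;

    static boolean inTransaction(int statusFlags) {
        return (statusFlags & SERVER_STATUS_IN_TRANS) != 0;
    }

    static boolean autoCommit(int statusFlags) {
        return (statusFlags & SERVER_STATUS_AUTOCOMMIT) != 0;
    }
}
```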
The drawback of the redo transaction implementation is that the transaction must be kept in a buffer until it completes. This cost can be bounded by setting a maximum buffer length: if the transaction grows too big, the saved buffer is cleared and a failover then throws an exception.
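A minimal sketch of such a bounded buffer follows, assuming each command is saved as the raw bytes of its packet. The class RedoBuffer and its methods are hypothetical names used for illustration and do not describe the driver's actual implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical bounded buffer holding the commands of the current transaction.
final class RedoBuffer {
    private final long maxBytes;
    private final List<byte[]> packets = new ArrayList<>();
    private long size;
    private boolean overflowed;

    RedoBuffer(long maxBytes) {
        if (maxBytes < 0) {
            throw new IllegalArgumentException("maxBytes must be >= 0");
        }
        this.maxBytes = maxBytes;
    }

    /** Saves a command sent inside the current transaction. */
    void record(byte[] packet) {
        if (overflowed) {
            return; // transaction already too big: nothing more is saved
        }
        if (size + packet.length > maxBytes) {
            // Give up on replay for this transaction and free the memory.
            packets.clear();
            size = 0;
            overflowed = true;
            return;
        }
        packets.add(packet.clone());
        size += packet.length;
    }

    /** False when the transaction exceeded the limit, so failover must throw. */
    boolean canReplay() {
        return !overflowed;
    }

    /** Commands to resend, in order, after reconnecting. */
    List<byte[]> commandsToReplay() {
        if (overflowed) {
            throw new IllegalStateException("transaction too large to replay");
        }
        return Collections.unmodifiableList(new ArrayList<>(packets));
    }

    /** Called after COMMIT or ROLLBACK: the next transaction starts empty. */
    void transactionEnded() {
        packets.clear();
        size = 0;
        overflowed = false;
    }
}
```

Clearing the buffer on overflow, instead of keeping a partial history, reflects the point above: a partially saved transaction cannot be replayed, so holding on to it would only cost memory.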
Most of the time, queries run inside a transaction (ORMs, for example, typically do not permit auto-commit), so the redo transaction implementation will handle most failover cases transparently from the user's point of view.
Topology considerations
This solution is particularly well suited to:
- Galera: the driver can reconnect to another master, making failover more transparent.
- MaxScale: MaxScale already implements redo transaction, but only for connections between MaxScale and the servers. This feature completes the MariaDB solution by adding a failover layer for connection failures between the client and MaxScale (for example, reconnecting transparently to another MaxScale).