[CONJ-803] Failover re-execution using "redo transaction"

Details

Type: New Feature
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 3.0.0
Component/s: Failover
Labels: None

Description

The driver already implements failover, but it has limitations that a redo transaction implementation can solve. This approach has several benefits.

Current state

When a failover occurs on a slave connection, the driver reconnects to another slave if possible and re-executes the query on that new slave, or on the master connection if no slave connection can be re-established. The application sees no interruption.

The problem arises when a failover occurs on a master connection: the only case handled transparently is when the query was not in a transaction and was a SELECT command.
No other case can be handled, because when the failover occurs the driver has no way to know whether the interruption happened before or after the server received and handled the command. The connection is then re-established and the error is changed from SQLNonTransientConnectionException to SQLTransientConnectionException.
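For illustration, here is a minimal sketch of how an application might react to that SQLTransientConnectionException today. The names `SqlWork` and `TransientRetry` are hypothetical and not part of the driver. The sketch assumes the application knows its unit of work is safe to run again from the start, which the driver cannot determine on its own.

```java
import java.sql.SQLException;
import java.sql.SQLTransientConnectionException;

/** Hypothetical unit of work supplied by the application (not a driver type). */
interface SqlWork<T> {
    T run() throws SQLException;
}

final class TransientRetry {
    private TransientRetry() {}

    /** Runs the work, retrying while the failure is reported as transient. */
    static <T> T withRetry(SqlWork<T> work, int maxAttempts) throws SQLException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        SQLTransientConnectionException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.run();
            } catch (SQLTransientConnectionException e) {
                // The driver has reconnected; deciding that a re-run is safe
                // is the application's responsibility, not the driver's.
                last = e;
            }
        }
        throw last; // non-null here because maxAttempts >= 1
    }
}
```

Non-transient errors are not caught and reach the caller on the first attempt.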

Proposed implementation

JDBC connections default to auto-commit enabled. When a failover occurs on a primary connection with auto-commit enabled, the driver still cannot know whether the command was applied, so it will just reconnect and throw an exception, as in the current implementation. (An exception can be made for the PING command.)

The redo transaction approach is to save the commands (COM_QUERY / COM_STMT_EXECUTE, COM_STMT_SEND_LONG_DATA) issued in a transaction. When a failover occurs during a transaction and the failing command is not a COMMIT/ROLLBACK command, the connector can automatically reconnect and replay the transaction, making the failover completely transparent.
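The idea can be sketched as follows. This is a simplified illustration, not the driver's actual code: `CommandSender` and `TransactionRecorder` are hypothetical names, commands are modelled as SQL strings rather than protocol packets, and the caller is assumed to invoke `replay` only when the lost connection was inside a transaction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical abstraction of "send one command to the server". */
interface CommandSender {
    void send(String command) throws Exception;
}

/** Records the commands of the open transaction so they can be replayed. */
final class TransactionRecorder {
    private final List<String> commands = new ArrayList<>();

    /** Call after each successful command; inTransaction comes from the server status. */
    void onCommandSucceeded(String command, boolean inTransaction) {
        if (inTransaction) {
            commands.add(command);
        } else {
            commands.clear(); // transaction ended, nothing left to replay
        }
    }

    /** A failed COMMIT/ROLLBACK is never replayed: its outcome is unknown. */
    static boolean canReplay(String failedCommand) {
        String c = failedCommand.trim().toUpperCase(Locale.ROOT);
        // Simplification: any statement starting with ROLLBACK is refused.
        return !(c.startsWith("COMMIT") || c.startsWith("ROLLBACK"));
    }

    /** Re-sends the recorded commands, then the failed one, on a fresh connection. */
    void replay(CommandSender freshConnection, String failedCommand) throws Exception {
        if (!canReplay(failedCommand)) {
            throw new IllegalStateException("outcome of " + failedCommand + " is unknown");
        }
        for (String command : commands) {
            freshConnection.send(command);
        }
        freshConnection.send(failedCommand);
    }

    int size() {
        return commands.size();
    }
}
```

Replaying is safe in this model because the server discards an uncommitted transaction when its connection is lost, so none of the recorded commands can have been applied permanently.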

Auto-commit and transaction state are already available in the protocol through the SERVER_STATUS_IN_TRANS and SERVER_STATUS_AUTOCOMMIT bits of the server status flags, in all server versions.
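As a small sketch, the two bits can be tested with a bitmask on the status value the server returns with each response. The class name `ServerStatus` is hypothetical; the bit values are the ones defined by the MySQL/MariaDB client-server protocol.

```java
/** Decodes the two server status bits that the replay decision relies on. */
final class ServerStatus {
    // Bit values as defined by the MySQL/MariaDB client-server protocol.
    static final int SERVER_STATUS_IN_TRANS = 0x0001;
    static final int SERVER_STATUS_AUTOCOMMIT = 0x0002;

    private ServerStatus() {}

    /** True when the server reports an open transaction on this connection. */
    static boolean inTransaction(int statusFlags) {
        return (statusFlags & SERVER_STATUS_IN_TRANS) != 0;
    }

    /** True when the server reports auto-commit mode as enabled. */
    static boolean autoCommit(int statusFlags) {
        return (statusFlags & SERVER_STATUS_AUTOCOMMIT) != 0;
    }
}
```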

The drawback of a redo transaction implementation is that the transaction must be kept in a buffer until it completes. This can be mitigated by setting a maximum buffer length: if the transaction grows too big, the saved buffer is cleared and an exception is thrown on failover.
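A bounded buffer along these lines might look like the sketch below. `BoundedRedoBuffer` and its `maxBytes` limit are hypothetical names chosen for illustration; the ticket does not specify how the limit is configured.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Transaction buffer that stops saving once a byte limit would be exceeded. */
final class BoundedRedoBuffer {
    private final int maxBytes; // hypothetical limit, e.g. taken from a driver option
    private final Deque<byte[]> packets = new ArrayDeque<>();
    private int usedBytes;
    private boolean overflowed;

    BoundedRedoBuffer(int maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Saves a command packet, or drops the whole buffer if the limit would be exceeded. */
    void save(byte[] packet) {
        if (overflowed) {
            return; // transaction is already too big to replay
        }
        if ((long) usedBytes + packet.length > maxBytes) {
            packets.clear();
            usedBytes = 0;
            overflowed = true;
            return;
        }
        packets.addLast(packet.clone());
        usedBytes += packet.length;
    }

    /** Resets the buffer when the transaction ends (COMMIT or ROLLBACK). */
    void transactionEnded() {
        packets.clear();
        usedBytes = 0;
        overflowed = false;
    }

    /** True only if every command of the current transaction is still buffered. */
    boolean canReplay() {
        return !overflowed;
    }

    int usedBytes() {
        return usedBytes;
    }
}
```

On failover, a driver built this way would replay only when `canReplay()` returns true and otherwise fall back to throwing an exception.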

Most of the time, queries run inside a transaction (ORMs, for example, generally do not allow auto-commit), so a redo transaction implementation will handle most failover cases transparently from the user's point of view.

Topology considerations

This solution is particularly well suited to:

Galera: the driver can reconnect to another master, making failover more transparent.
MaxScale: MaxScale already implements redo transaction, but only for connections between MaxScale and the servers. This feature completes the MariaDB solution by adding a failover layer for connection failures between the client and MaxScale (for example, transparently reconnecting to another MaxScale instance).

People

Assignee: Diego Dupin

Reporter: Diego Dupin

Votes: 0

Watchers: 2

Dates

Created: 2020-06-29 11:40

Updated: 2021-05-07 09:37

Resolved: 2021-04-22 11:54