[MDEV-20996] Maxscale auto-failover with semi-sync replication is not providing a true HA solution - Jira

Details

Type: Task
Status: Closed
Priority: Major
Resolution: Duplicate
Fix Version/s: N/A
Component/s: Replication
Labels: None

Description

We have been using MaxScale 2.3 with the mariadbmon monitor and auto-failover as our HA solution with 3 database nodes (Master/Slave/Slave). Under real traffic you MUST use semi-sync replication to make this viable; otherwise, nearly 100% of the time a failed Master will not come back as a Slave without a 1236 error, because transactions were committed to the storage engine but had not yet been replicated to any Slave.

Therefore, we use semi-sync replication with wait_point AFTER_SYNC. Even with this setting, see https://mariadb.com/kb/en/library/semisynchronous-replication/#configuring-the-master-wait-point. There are known issues with semi-sync replication after a master failure/crash that result in the same problem: the Master does not come back as a Slave because a prepared transaction is committed by automatic crash recovery. We tried working around this by performing an automatic "manual heuristic recovery rollback", but that did not prevent the transaction from going through after the failed master came back, and we still got the 1236 replication error.
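The failure mode described above can be illustrated with a toy Python model. This is only a sketch, not MaxScale or MariaDB code: it assumes a single GTID domain, reduces each transaction to its sequence number, and the function names `diverged_transactions` and `can_rejoin` are hypothetical.

```python
# Toy model (assumption: one GTID domain, transactions identified by sequence number).

def diverged_transactions(old_master_binlog, new_master_binlog):
    """Return transactions the crashed master has that the promoted slave lacks."""
    return sorted(set(old_master_binlog) - set(new_master_binlog))


def can_rejoin(old_master_binlog, new_master_binlog):
    """The old master can replicate from the new one only if it has nothing extra."""
    return not diverged_transactions(old_master_binlog, new_master_binlog)


# The master wrote transactions 1..5 to its binlog, but crashed before any
# slave received transaction 5. Crash recovery then commits 5 on the old master.
old_master = [1, 2, 3, 4, 5]
promoted_slave = [1, 2, 3, 4]

print(diverged_transactions(old_master, promoted_slave))
print(can_rejoin(old_master, promoted_slave))
```

In this model the old master holds a transaction that the promoted slave never received, which corresponds to the situation where the rejoin fails with the 1236 error.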

I am aware of MENT-203 (resulting from MDEV-19733), but it is in the queue as a feature request. That may have been fine before MaxScale started supporting auto-failover as an HA solution. However, now that MaxScale supports an HA solution, this is a bug, and it prevents MaxScale with auto-failover from truly being a robust HA solution.

Maybe a short-term solution would be to allow the user to disable automatic crash recovery? I am not sure this would be viable as a long-term solution, but we are also looking for a way to make this more reliable until a true solution is provided.

Issue Links

causes: MXS-2775 Document that a crashed master can break auto_rejoin with semisynchronous replication (Closed)

duplicates: MDEV-21117 refine the server binlog-based recovery for semisync (Closed)

relates to: MXS-2542 Add rebuild server to MariaDB Monitor (Closed)

Activity

Marko Mäkelä added a comment - 2019-11-20 21:56

Elkin, we’d have to discuss this some time next week. Before MariaDB Server 10.3 implemented ~~MDEV-15158~~, InnoDB stored the latest binlog position in the TRX_SYS page in the system tablespace. Since then, it is being written to the rollback segment header page on transaction commit.

I don’t know if it is feasible to store the acknowledgements inside InnoDB rollback segment or undo pages. At the very least, the semantics should be clarified. And in any case, this should be tested with innodb_undo_log_truncate=ON.

Geoff Montee (Inactive) added a comment - 2019-11-20 22:34

If ~~MXS-2542~~ were implemented, then this could also be fixed in MaxScale by automatically rebuilding the crashed master.

Andrei Elkin added a comment - 2019-11-21 11:33 - edited

The suggestion in https://jira.mariadb.org/browse/MDEV-20996?focusedCommentId=138318&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-138318 is, in other words, to make --tc-heuristic-recover=rollback replication-safe. Its current behaviour is not safe, as a rolled-back transaction may remain in the binlog.
See ~~MDEV-21117~~ for more.

Max Mether added a comment - 2020-04-28 15:58 - edited

For a real lossless HA solution you need MDEV-19140.
The current async or semi-sync replication solutions cannot provide this.

Ralf Gebhardt added a comment - 2021-04-08 13:39

This issue is addressed as part of ~~MDEV-21117~~.

People

Assignee: Unassigned

Reporter: Richard Lane

Votes: 2

Watchers: 13

Dates

Created: 2019-11-06 15:42

Updated: 2024-07-07 22:33

Resolved: 2021-04-08 13:39
