MariaDB Server / MDEV-20996

Maxscale auto-failover with semi-sync replication is not providing a true HA solution

Details

    • Task
    • Status: Closed
    • Major
    • Resolution: Duplicate
    • N/A
    • Replication
    • None

    Description

      We have been using MaxScale 2.3 with the mariadbmon monitor and auto-failover as our HA solution with 3 database nodes (Master/Slave/Slave). Under real traffic you MUST use semi-sync replication to make this viable; otherwise, nearly 100% of the time a failed Master will not come back as a Slave without a 1236 error, because transactions were committed to the storage engine before they had been replicated to any slave.

      Therefore, we use semi-sync replication with the wait point set to AFTER_SYNC (see https://mariadb.com/kb/en/library/semisynchronous-replication/#configuring-the-master-wait-point). Even so, there are known issues with semi-sync replication after a master failure/crash that lead to the same problem: the Master does not come back as a Slave, because a prepared transaction is committed by automatic crash recovery. We tried working around this by performing an automatic "manual heuristic recovery rollback", but that did not prevent the transaction from going through after the failed master came back, and we still got the 1236 replication error.
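
The failure sequence described above can be illustrated with a toy Python model. This is only a sketch under simplifying assumptions, not MariaDB or MaxScale code: the `Node` class, the function names and the "binlog must be a prefix" rejoin rule are all hypothetical stand-ins for the real server behaviour (binlog write, slave acknowledgement, crash recovery, and the check that ends in error 1236).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy server: an ordered binlog plus engine-level transaction state."""
    name: str
    binlog: list = field(default_factory=list)
    committed: set = field(default_factory=set)
    prepared: set = field(default_factory=set)


def commit_after_sync(master, slave, trx, crash_before_send=False):
    """AFTER_SYNC order: prepare, write binlog, wait for slave ack, engine commit."""
    master.prepared.add(trx)
    master.binlog.append(trx)
    if crash_before_send:
        # Master dies here: trx is in its binlog but never reached the slave.
        return False
    slave.binlog.append(trx)       # slave receives the event and acknowledges
    slave.committed.add(trx)
    master.prepared.discard(trx)
    master.committed.add(trx)
    return True


def crash_recovery(node):
    """Automatic recovery: a prepared trx that is found in the binlog is committed."""
    for trx in list(node.prepared):
        if trx in node.binlog:
            node.prepared.discard(trx)
            node.committed.add(trx)


def can_rejoin(old_master, new_master):
    """Assumed rule: the old master's binlog must be a prefix of the new master's."""
    return new_master.binlog[:len(old_master.binlog)] == old_master.binlog


master, slave = Node("A"), Node("B")
commit_after_sync(master, slave, "trx-1")
commit_after_sync(master, slave, "trx-2", crash_before_send=True)
crash_recovery(master)             # trx-2 ends up committed on the old master
slave.binlog.append("trx-3")       # B was promoted and accepted a new write
slave.committed.add("trx-3")
print(can_rejoin(master, slave))   # False: A holds trx-2, which B never saw
```

In this model the old master diverges as soon as recovery commits the unacknowledged transaction, which corresponds to the situation in which the rejoin attempt fails.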

      I am aware of MENT-203 (resulting from MDEV-19733), but it is queued as a feature request. That may have been fine before MaxScale started supporting auto-failover as an HA solution. Now that MaxScale offers auto-failover for HA, however, this is a bug, and it prevents MaxScale with auto-failover from truly being a robust HA solution.

      Maybe a short-term solution would be to allow the user to disable automatic crash recovery? I am not sure whether this would be viable in the long term, but we are also looking for a way to make this more reliable until a true solution is provided.

          Activity

            marko Marko Mäkelä added a comment -

            Elkin, we’d have to discuss this some time next week. Before MariaDB Server 10.3 implemented MDEV-15158, InnoDB stored the latest binlog position in the TRX_SYS page in the system tablespace. Since then, it is being written to the rollback segment header page on transaction commit.

            I don’t know if it is feasible to store the acknowledgements inside InnoDB rollback segment or undo pages. At the very least, the semantics should be clarified. And in any case, this should be tested with innodb_undo_log_truncate=ON.

            GeoffMontee Geoff Montee (Inactive) added a comment -

            If MXS-2542 were implemented, then this could also be fixed in MaxScale by automatically rebuilding the crashed master.

            Elkin Andrei Elkin added a comment - - edited

            The suggestion in https://jira.mariadb.org/browse/MDEV-20996?focusedCommentId=138318&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-138318 is, in other words, to make --tc-heuristic-recover=rollback replication-safe. Its current behaviour is not safe, because a rolled-back transaction may remain in the binlog.
            See MDEV-21117 for more.
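
A minimal Python sketch of why a heuristic rollback alone may not be replication-safe, assuming a toy model in which the storage engine and the binlog are separate structures. The function and variable names are hypothetical and do not reflect the server's internals.

```python
def heuristic_rollback(committed, prepared, binlog):
    """Toy rollback: prepared transactions are dropped from the engine only.

    The binlog is returned unchanged, which is the problem being illustrated.
    """
    return set(committed), set(), list(binlog)


binlog = ["trx-1", "trx-2"]        # trx-2 was written before the crash
committed = {"trx-1"}
prepared = {"trx-2"}               # trx-2 never got its engine commit

committed, prepared, binlog = heuristic_rollback(committed, prepared, binlog)

# Transactions still present in the binlog although the engine rolled them back.
orphans = [trx for trx in binlog if trx not in committed]
print(orphans)                     # ['trx-2']
```

In this model the rolled-back transaction is still part of the binlog, so the recovered server's binlog no longer matches its own data.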

            maxmether Max Mether added a comment - - edited

            For a real lossless HA solution you need MDEV-19140.
            The current async or semi-sync replication solutions cannot provide this.

            ralf.gebhardt Ralf Gebhardt added a comment -

            This issue is addressed as part of MDEV-21117.


            People

              Assignee: Unassigned
              Reporter: Richard Lane (rvlane)
              Votes: 2
              Watchers: 13