  1. MariaDB MaxScale
  2. MXS-4153

Graceful Restart

Details

    • New Feature
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • LongTerm
    • None
    • None

    Description

      Markus put this well in MXS-4149:

      An alternative way to deal with these sort of situations would be to have graceful shutdowns of MaxScale nodes. This would allow open connections to be migrated to a replacement node once they're done with their active transactions. This wouldn't save transactions that are lost due to unexpected outages but the use-case for "needing to restart" would be served quite well with this.

      In short, many customers have reported reaching states where MaxScale must be restarted to avoid a crash or other issue. One example is runaway memory usage. In many cases the causes are detectable through monitoring, for example by tracking a server's remaining free memory or storage space. This means customers can proactively trigger the restart rather than waiting for a crash or a true emergency.
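The proactive trigger described above could look roughly like the following sketch. The threshold values and the `needs_restart` and `disk_numbers` helpers are assumptions made for illustration; none of them are part of MaxScale, and the free-memory figures are expected to come from whatever monitoring is already in place.

```python
import shutil

# Hypothetical thresholds; tune them for the actual deployment.
MIN_FREE_MEMORY_FRACTION = 0.10
MIN_FREE_DISK_FRACTION = 0.05


def needs_restart(free_memory: int, total_memory: int,
                  free_disk: int, total_disk: int) -> bool:
    """Return True when free memory or free storage drops below its threshold."""
    memory_low = free_memory / total_memory < MIN_FREE_MEMORY_FRACTION
    disk_low = free_disk / total_disk < MIN_FREE_DISK_FRACTION
    return memory_low or disk_low


def disk_numbers(path: str = "/"):
    """Read (free, total) bytes for the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return usage.free, usage.total
```

A monitoring job could call `needs_restart` periodically and, when it returns `True`, request the restart during a quiet period instead of waiting for the node to fail.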

      These customers already use techniques such as cooperative monitoring to obtain HA from multiple MaxScale nodes. So why is a regular restart not good enough? Because a regular restart terminates the connections currently open on the MaxScale node being restarted. To applications and clients, this makes MaxScale's HA setup appear and behave unreliably.

      MaxScale should instead be aware that it has "sister" nodes to which it can migrate connections or transactions in these cases. A "graceful restart" mechanism that has MaxScale drain its active and future connections to a "sister" node before restarting would resolve this concern and give customers and their operations teams a valuable tool for helping themselves.
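To make the requested behavior concrete, here is a small in-memory simulation of draining a node to a "sister" node. It is only a sketch of the idea: `Session`, `Node`, and `drain_step` are hypothetical names, no real connections are involved, and it does not reflect how MaxScale is or would be implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    session_id: int
    in_transaction: bool = False


@dataclass
class Node:
    name: str
    accepting: bool = True
    sessions: list = field(default_factory=list)

    def connect(self, session: Session, sister: "Node") -> "Node":
        """Route a new connection; a draining node forwards it to its sister."""
        target = self if self.accepting else sister
        target.sessions.append(session)
        return target


def drain_step(node: Node, sister: Node) -> int:
    """Move every session that is not mid-transaction to the sister node.

    Returns the number of sessions still waiting on an open transaction.
    """
    node.accepting = False  # future connections go to the sister from now on
    remaining = []
    for session in node.sessions:
        if session.in_transaction:
            remaining.append(session)  # let the active transaction finish first
        else:
            sister.sessions.append(session)
    node.sessions = remaining
    return len(remaining)
```

In this model, an operator would call `drain_step` repeatedly until it returns 0 and only then restart the node, so no session is cut off in the middle of a transaction.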

      Beyond the initial scope: once MXS-3822 is implemented, future MaxScale versions will have plenty of room to enhance this feature, for example by enabling automatic graceful restarts.

      MXS-4149 is related to this issue because it is the preferred future state. However, the graceful restart functionality requested here is expected to be easier and quicker to implement, and should provide a manual solution that customers can benefit from ASAP and build around as necessary. MXS-4149 and other further improvements would then be ways for MaxScale to add value.

            People

              Unassigned
              rob.schwyzer@mariadb.com Rob Schwyzer
              Votes: 1
              Watchers: 9
