  1. MariaDB MaxScale
  2. MXS-4390

Deferred execution of session commands

Details

    • New Feature
    • Status: Open
    • Minor
    • Resolution: Unresolved
    • None
    • Icebox
    • readwritesplit
    • None

    Description

      Overview

      Because of how session commands are currently handled, every server that a session is connected to ends up preparing every prepared statement. This is especially wasteful if the majority of the work is done on one server (e.g. the currently selected master server) and the others are idle. The lazy_connect feature partially solves this problem, but it still ends up doing unnecessary work once all the connections have been established.

      A solution to this problem that doesn't require partitioning the session command history is to delay the execution of the history on a backend until that backend connection is actually used. This means that the history would lag behind on backends that aren't currently needed. Once a backend is needed, the part of the history that it has not yet executed is executed right before the real command. This should avoid the problem of preparing a prepared statement on all servers when it actually ends up being used only once and only on a single backend.
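
      The lagging-history idea can be sketched as follows. This is a minimal illustration, not MaxScale code: the Backend and Session classes, the applied counter and the route() method are all hypothetical names, and the sketch ignores how the client receives the response to a session command that has not been executed anywhere yet.

```python
from dataclasses import dataclass, field


@dataclass
class Backend:
    """Hypothetical stand-in for one backend connection."""
    name: str
    applied: int = 0  # number of history entries already executed here
    executed: list = field(default_factory=list)  # everything sent to this backend

    def execute(self, sql: str) -> None:
        self.executed.append(sql)


class Session:
    """Keeps one shared session command history and replays it lazily."""

    def __init__(self, backends):
        self.backends = {b.name: b for b in backends}
        self.history = []

    def session_command(self, sql: str) -> None:
        # Only record the command; no backend is touched yet.
        self.history.append(sql)

    def route(self, backend_name: str, sql: str) -> None:
        backend = self.backends[backend_name]
        # Catch up on the part of the history this backend has not seen.
        for cmd in self.history[backend.applied:]:
            backend.execute(cmd)
        backend.applied = len(self.history)
        # Only now run the command that actually needed this backend.
        backend.execute(sql)


master, replica = Backend("master"), Backend("replica")
session = Session([master, replica])
session.session_command("SET NAMES utf8mb4")
session.session_command("PREPARE ps1 FROM 'SELECT 1'")
session.route("master", "EXECUTE ps1")
```

      In this sketch the replica has executed nothing after the last call: it only replays the history if a later query is routed to it.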

      A further improvement to this when used with lazy_connect is to detect queries whose results are known to always be the same. Queries like SET NAMES, SET autocommit=1 and anything else that is expected to always return the same response can be a target of this optimization. One method of doing this is to observe the results of executing the commands and, once a certain confidence level is reached, cache the response so that it can be returned by MaxScale instead of the command being executed on the database. This eliminates half of the network latency for these types of queries.

      This also allows lazy_connect to defer the connection creation to the point where the type of the first query that is not a session command is known. This will then determine the type of the workload (read-only or read-write) that is to be executed which in turn eliminates the creation of connections to servers that aren't needed.
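
      A rough sketch of that deferral is shown below. It only covers the first routing decision. The classify() helper is a deliberately naive, hypothetical stand-in for a real query classifier, and the "master" and "replica" target names are assumptions made for the example.

```python
def classify(sql: str) -> str:
    """Naive, hypothetical classifier based on the first keyword only."""
    words = sql.split()
    head = words[0].upper() if words else ""
    if head in ("SET", "USE", "PREPARE"):
        return "session"
    if head in ("SELECT", "SHOW"):
        return "read"
    return "write"


def choose_first_target(queries):
    """Hold back session commands until the workload type is known.

    Returns (target, deferred_session_commands). The target is None if
    only session commands have been seen so far, in which case no
    connection needs to be created yet.
    """
    pending = []
    for sql in queries:
        kind = classify(sql)
        if kind == "session":
            pending.append(sql)  # nothing is connected yet
            continue
        # The first non-session command decides which server is needed.
        target = "replica" if kind == "read" else "master"
        return target, pending
    return None, pending


target, deferred = choose_first_target(
    ["SET NAMES utf8mb4", "SET autocommit=1", "SELECT 1"]
)
```

      Here the deferred list holds the two SET commands, which would be replayed on the chosen server before the SELECT is sent.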

      Even for prepared statements there's a possibility that the result is known in advance. Since MaxScale internally assigns the prepared statement IDs, by the time the response gets back into the routing/filtering components, the result will always be consistent regardless of how many times the statement is prepared. This means that if the set of prepared statements is always executed in the same order and the results are always the same, the results can be returned by MaxScale without having to wait for the statements to be executed on the backend. This could help avoid the problems described in MXS-4639 by immediately returning the result to the client protocol.

      Preliminary Design

      For every executed query, store the result in memory. If a result is returned for a stored query that is different from the first one, mark the query as unpredictable. If, after a set number of observations have been made, there has only been one result, the query is fully predictable. For all fully predictable queries, start returning the result from memory. If there are any open connections, the query must still be sent to them but the answer from it must be discarded. The normal behavior of session command execution then follows for all the ignored results: any connections that produce a different result are discarded.
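
      The bookkeeping described above could look roughly like this. It is a sketch under stated assumptions: the ResponseCache class, its method names and the threshold of three observations are all made up for illustration, and responses are plain strings rather than protocol packets.

```python
class ResponseCache:
    """Tracks whether a query always produces the same response."""

    def __init__(self, threshold=3):
        self.threshold = threshold   # observations needed (assumed value)
        self.first = {}              # query -> first observed response
        self.count = {}              # query -> number of matching observations
        self.unpredictable = set()   # queries that produced differing responses

    def observe(self, query, response):
        if query in self.unpredictable:
            return
        if query not in self.first:
            self.first[query] = response
            self.count[query] = 1
        elif self.first[query] == response:
            self.count[query] += 1
        else:
            # A differing response permanently disqualifies the query.
            self.unpredictable.add(query)
            del self.first[query]
            del self.count[query]

    def lookup(self, query):
        """Return the stored response if the query is fully predictable."""
        if self.count.get(query, 0) >= self.threshold:
            return self.first[query]
        return None

    def backend_agrees(self, query, response):
        """Check a late backend response against the one already returned.

        A connection for which this returns False would be discarded.
        """
        return self.first.get(query) == response


cache = ResponseCache(threshold=3)
for _ in range(3):
    cache.observe("SET NAMES utf8mb4", "OK")
cache.observe("SELECT NOW()", "12:00:00")
cache.observe("SELECT NOW()", "12:00:01")
```

      After these calls, lookup() returns the stored response for SET NAMES utf8mb4 and None for SELECT NOW(), which has been marked unpredictable.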

            People

              JoeCotellese Joe Cotellese
              markus makela markus makela
              Votes: 0
              Watchers: 3

              Dates

                Created:
                Updated: