Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-22429

Request: Inter query caching of subqueries.

    XMLWordPrintable

Details

    Description

      Context:

      I'm using MariaDB 10.3.22 in perhaps less typical use case to manipulate and analyse a large dataset. There are a few GB of tables which are almost static (changes propagated perhaps every few weeks) access by one or two clients running long and complex queries to summaries and compare between tables. Currently Query cache has TTL of 12h so that every data everything is calculated from scratch one a day but within that day query re-runs happen from cache.

      Request

      In this use case the ability for MariaDB to cache subqueries between queries executions (prehaps for multiple days) would result in a significant performance increase. Example:

      • If when faced with a statement with multiple Union (ALL) commands with Select statements between, could MariaDB be forced to cache the result of each '(SELECT ... FROM ...)' rather than caching the query as a whole. The penalty of having to run the Union every time would far be outweighs be hit rate increase from caching of sub-queries.
      • Or when executing a query which select from a view rather than a table, the view is cached separately (and in addition) to the query result so that future queries using that view don't require view to be re-executed.

      Reasoning

      The database in question has multiple view that hierarchically create even more complex summaries of datasets. This tree might be 3 levels deep with each level dependent of two other views, and the bulk of the queries being run on the top level of views.

      There are also some summarisation queries and views with produces statistics across most of the tables in the database and take the longest to run (~5mins). This uses Select statements to produce a single row summarising either a table or a subset of rows within that table. These select statements are then Union'ed together to produce a 15/20 row long table which summaries lots of different datasets. This understandable takes a while. Although there is no reason why each of these select statements needs to be run very often as the data sets they draw from rarely change, these summary table get twerked often. It is very tedious that if you want to insert another row into summery or edit how or two rows is formulated, this requires a 5 minute wait while the all the select statements are needlessly executed again.

      In both these cases (Views & Union) there is lots of scope to use caching to speed up execution times as the majority of the processing time is spend duplicated recently done work. However as far as I can work out from the documentation, there is no scope with MariaDB currently for inter query caching of subqueries. The MaxScale filtering appears to allow a lot of fine control over whether or not to cache a query, but as far as I can ascertain, this is only about whether of not whole query result is cached. On the other hand the subquery cache caches exactly the parts of queries which would be useful, but there is no options for allowing these to persist beyond the query execution.

      Attachments

        Activity

          People

            Unassigned Unassigned
            simsp001 P Sims
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.