Uploaded image for project: 'MariaDB MaxScale'
  1. MariaDB MaxScale
  2. MXS-6155

Heterogenous data handling for Exasol Router

    XMLWordPrintable

Details

    • New Feature
    • Status: Open (View Workflow)
    • Major
    • Resolution: Unresolved
    • None
    • None
    • None
    • None

    Description

      Goal:
      The purpose of this version of MariaDB Exa is to allow customers to have heterogenous data on some tables. This means that historical data can be removed from the InnoDB tables on the MariaDB side while it still remains intact on the Exasol side. In order to deal and properly manage this discrepancy in data MaxScale as the central has to be aware of this so that the queries can be executed on the right side.

      Proposal: Use Partitioning to manage historical data
      MariaDB already has a pretty advanced set of partitioning features. This makes it somewhat easy to remove historical data by just deleting whole partitions (ALTER TABLE … DROP PARTITION). Partitioning further has the benefit of being part of the table definition and thus easily viewable by MaxScale or other tools. Furthermore our current CDC solution ignores DDL changes to PARITIONING (all DDL changes at the moment) and changes to the CDC would thus not be required.

      Details on the MariaDB Server side:
      1. The base table is partitioned by a DATETIME column in some set of chunks (could be months or days depending on table size, insertion rate etc).
      2. At a regular interval (say once / month or once / day) these partitions are re-organized so that the oldest partition is removed (with ALTER TABLE … DROP PARTITION) and newer partitions are created (with ALTER TABLE … REORGANIZE PARTITIONS) etc.
      3. If LIST partitioning is used this is fine by itself, if RANGE partitioning is used there should also be a constraint that refuses writes in the range of the deleted partitions.

      2. Should be handled by a script (or stored procedure) that is launched on a regular basis.

      Details on the MaxScale side:
      Each time the script is changing the partitioning scheme of the table MaxScale should be made aware of these changes so that SELECT queries potentially touching ranges outside the partitioning RANGE of the table should always only go to Exasol. This can be done by:
      1) The script/procedure directly
      2) MaxScale looking at the table definition
      3) MaxScale looking at the INFORMATION_SCHEMA tables

      Is MaxScale able to catch such queries ?

      MaxScale can regularly check information_schema to become aware of partitions, but probably better to provide a way for tri

      Attachments

        Activity

          People

            allen.herrera Allen Herrera
            maxmether Max Mether
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.