  1. MariaDB MaxScale
  2. MXS-2165

CDC Data Adapter Integration - Design

Details

    • Task
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • None
    • 2.4.0
    • Documentation
    • None

    Description

      High-level Design

      The system has three distinct types of processes:

      • Replicating events from the master
      • Processing events generated in ROW format and sending them to ColumnStore
      • Processing events generated in STATEMENT format and sending them to ColumnStore

      The process that replicates the events must drive the other two by delegating work to them and performing the necessary synchronization when DDL statements are processed. This driving process, henceforth the IO process, has one or more "outputs": the process that handles STATEMENT-based events, henceforth the SQL process, and the processes that handle ROW-based events, henceforth the Table processes.
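
      The routing performed by the IO process can be sketched as follows. This is a minimal illustration, not the adapter's actual implementation: the names Event, EventKind and IOProcess are hypothetical, and plain lists stand in for the SQL and Table processes.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class EventKind(Enum):
    STATEMENT = auto()  # plain-text SQL (all DDL, some DML under MIXED)
    ROW = auto()        # binary ROW format event for a single table


@dataclass
class Event:
    kind: EventKind
    table: str = ""     # only meaningful for ROW events
    payload: str = ""   # SQL text, or a stand-in for the raw row data


@dataclass
class IOProcess:
    """Hypothetical dispatcher: one SQL output, one Table output per table."""
    sql_queue: list = field(default_factory=list)
    table_queues: dict = field(default_factory=dict)

    def dispatch(self, event: Event) -> str:
        if event.kind is EventKind.STATEMENT:
            self.sql_queue.append(event)
            return "sql"
        # A Table output (here just a list) is created when a table is first seen.
        self.table_queues.setdefault(event.table, []).append(event)
        return f"table:{event.table}"


io = IOProcess()
routes = [
    io.dispatch(Event(EventKind.STATEMENT, payload="CREATE TABLE t1 (a INT)")),
    io.dispatch(Event(EventKind.ROW, table="test.t1", payload="<row data>")),
    io.dispatch(Event(EventKind.ROW, table="test.t2", payload="<row data>")),
]
print(routes)
```

      Each distinct table gets its own output, while every statement event shares the single SQL output.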

      The SQL process receives all events that are replicated as plain-text SQL statements. With MIXED format replication, most of the data will be in this format: all DDL statements, as well as DML statements that the master did not deem necessary to replicate as ROW format events, are sent as plain-text SQL. With ROW format replication, only DDL statements are sent as plain-text SQL and all other events are replicated as binary ROW format events.

      A Table process exists for each opened table in the replication stream. This process will handle the conversion of raw binary format data into a suitable native format that can be exported to ColumnStore via the ColumnStore Bulk Load API. As the ColumnStore API only supports INSERT operations, UPDATE and DELETE operations replicated as ROW events must be translated back into SQL statements and executed on the database directly.
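
      The translation of UPDATE and DELETE row events back into SQL might look like the sketch below. It assumes each row event has already been decoded into a "before" and an "after" image represented as a dict of column name to value; the helper names and the literal quoting are illustrative only and are not part of any ColumnStore or MaxScale API.

```python
def quote(value):
    """Render a Python value as a SQL literal (minimal, illustrative only)."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def where_clause(row):
    # NULL must be matched with IS NULL; "= NULL" never matches a row.
    parts = []
    for col, val in row.items():
        if val is None:
            parts.append(f"`{col}` IS NULL")
        else:
            parts.append(f"`{col}` = {quote(val)}")
    return " AND ".join(parts)


def delete_to_sql(table, before):
    """Build a DELETE that matches the row's before image."""
    return f"DELETE FROM {table} WHERE {where_clause(before)}"


def update_to_sql(table, before, after):
    """Build an UPDATE that sets only the columns that changed."""
    changed = {c: v for c, v in after.items() if before.get(c) != v}
    if not changed:
        return None  # nothing to update
    assignments = ", ".join(f"`{c}` = {quote(v)}" for c, v in changed.items())
    return f"UPDATE {table} SET {assignments} WHERE {where_clause(before)}"


update_sql = update_to_sql("test.t1", {"id": 1, "name": "a"}, {"id": 1, "name": "b"})
delete_sql = delete_to_sql("test.t1", {"id": 1, "name": None})
print(update_sql)
print(delete_sql)
```

      Matching on the full before image is used here because the sketch does not assume that a primary key is known for the table.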

      The SQL and Table processes are mutually exclusive: while the SQL process is running, the Table processes must be synchronized and stopped, and vice versa. This guarantees that all ROW events are processed before any schema change occurs and that all SQL statements are processed before ROW events with a new table layout are processed.
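
      The ordering guarantee can be illustrated with a single-threaded sketch. The Coordinator class and its method names are hypothetical; a real implementation would synchronize separate workers, whereas here buffered rows are simply flushed before any SQL statement runs.

```python
class Coordinator:
    """Hypothetical stand-in for the IO process's synchronization logic."""

    def __init__(self):
        self.pending_rows = {}  # table name -> buffered row events
        self.log = []           # order in which work reached ColumnStore

    def on_row_event(self, table, row):
        # Row events are buffered per table until the next flush.
        self.pending_rows.setdefault(table, []).append(row)

    def flush_tables(self):
        # Synchronize and stop the Table side: push out everything buffered.
        for table, rows in self.pending_rows.items():
            if rows:
                self.log.append(("bulk_insert", table, len(rows)))
        self.pending_rows.clear()

    def on_sql_event(self, sql):
        # All ROW events are processed before the statement is executed,
        # and the statement completes before any later ROW event is accepted.
        self.flush_tables()
        self.log.append(("sql", sql))


c = Coordinator()
c.on_row_event("test.t1", {"a": 1})
c.on_row_event("test.t1", {"a": 2})
c.on_sql_event("ALTER TABLE test.t1 ADD COLUMN b INT")
c.on_row_event("test.t1", {"a": 3, "b": 4})
c.flush_tables()
print(c.log)
```

      The two rows written with the old layout reach ColumnStore before the ALTER TABLE, and the row with the new layout only after it.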

      Limitations

      • Performance is expected to be poor even at moderate rates of UPDATE and DELETE queries, because direct PM updates are not supported and DML queries executed through the SQL layer are generally slow.
      • The record of the last transaction applied to ColumnStore is stored inside the data adapter. If a failure occurs after a transaction has been applied but before that record is updated, the same transaction may be applied twice.
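
      The duplicate-application window in the second limitation can be shown with a small simulation. It assumes, purely for illustration, that the adapter applies a transaction first and records its position afterwards; the Adapter class and the crash_before_save_at parameter are hypothetical.

```python
class Adapter:
    """Hypothetical adapter that tracks its own replication position."""

    def __init__(self):
        self.columnstore = []    # stand-in for data applied to ColumnStore
        self.saved_position = 0  # index of the next transaction to apply

    def run(self, transactions, crash_before_save_at=None):
        for i in range(self.saved_position, len(transactions)):
            self.columnstore.append(transactions[i])  # apply to ColumnStore
            if i == crash_before_save_at:
                return  # simulated failure: applied, but position not saved
            self.saved_position = i + 1


transactions = ["trx1", "trx2", "trx3"]
adapter = Adapter()
adapter.run(transactions, crash_before_save_at=1)  # fails after applying trx2
adapter.run(transactions)                          # restart from saved position
print(adapter.columnstore)
```

      After the restart, trx2 appears twice in the applied data because the stored position still pointed at it.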

          People

            markus makela markus makela
            johan.wikman Johan Wikman