MariaDB MaxScale / MXS-701

Add binlog filtering to MaxScale

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.3.0
    • Component/s: binlogrouter
    • Labels: None
    • Sprint: 2017-45, 2017-46, MXS-SPRINT-67

    Description

      There are many use cases where a user wants to replicate only a subset of a master's objects. In some cases, only a single table should be replicated. In others, a single database should be replicated, or a single table should be excluded.

      Today, this can be done using the replicate-* options on the slave. However, those options affect only the slave SQL thread. The slave IO thread still fetches the entire binlog from the master and writes the whole thing to disk in the form of relay logs. The SQL thread then skips over potentially many gigabytes of log entries to apply the limited subset that matches the configured replicate-* entries. This means the master is tied up transferring enormous amounts of unnecessary data over the network, all the while waiting for the slave IO thread to write that data to disk.
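
      The cost described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the Event class, the table names and the byte sizes are all hypothetical, and nothing here reflects MariaDB internals. It shows that with slave-side filtering the volume transferred is the whole binlog, while only the matching subset is applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for one binlog event."""
    schema: str
    table: str
    size: int  # event size in bytes (made-up numbers)


def replicate_with_slave_side_filter(binlog, wanted_tables):
    """Toy model of replicate-* filtering on the slave.

    The IO thread fetches every event, so the transfer cost is the
    size of the whole binlog. Filtering happens only afterwards,
    when the SQL thread decides which events to apply.
    """
    transferred = sum(e.size for e in binlog)
    applied = [e for e in binlog if (e.schema, e.table) in wanted_tables]
    return transferred, applied


binlog = [
    Event("shop", "orders", 400),
    Event("shop", "audit_log", 9000),
    Event("crm", "contacts", 600),
]

transferred, applied = replicate_with_slave_side_filter(
    binlog, {("shop", "orders")}
)
applied_bytes = sum(e.size for e in applied)
print(f"transferred={transferred} bytes, applied={applied_bytes} bytes")
```

      In this made-up example the slave needs 400 bytes of events but the master still ships all 10000.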

      An alternative is to use binlog-* options on the master to prevent it from writing certain log entries to disk. This is unacceptable. It means that the binlog is no longer the authoritative record, with the result that the binlog cannot be used for roll-forward recovery. It also precludes having separate sets of slaves, each with different requirements.

      Another solution could use the MaxScale binlog router. If MaxScale were located close to the master, to minimize network bottlenecks, it could write binlogs locally and provide filtering to clients. The binlog on the master would still be authoritative, MaxScale would serve as an async (or semi-sync) DR copy of the binlog, and the CPU load of processing the per-slave filters would be handled on the MaxScale node rather than on the master.

      To facilitate this use case, there are several different implementation possibilities:

      1) Separate listeners, each with its own filters. Slaves that need only a specific subset of log entries would connect to the listener that was configured to serve only that subset.
      2) Configuration based on slave server-id. The binlog router would be configured to serve a specific subset of objects to particular server-ids.
      3) MariaDB Server could be modified to accept the same replicate-* rules that exist today, but to communicate those rules to the master when it connects. When such a slave connected to a MaxScale binlog router acting as its master, the binlog router would do the filtering locally. (A future enhancement could add filtering support of this kind to a MariaDB Server master, but that would not be required upfront for this slave behavior to be valuable!)
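
      As a rough illustration of option 2, the sketch below keys filter rules by slave server-id and applies them before events are sent. It is not MaxScale code or configuration: Rule, RULES_BY_SERVER_ID, events_for, the server-ids and the table names are all hypothetical, and using regular expressions over "schema.table" is just one possible way to express the rules.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for a binlog event touching one table."""
    schema: str
    table: str


@dataclass(frozen=True)
class Rule:
    """Hypothetical filter rule over 'schema.table' names."""
    match: Optional[str] = None    # if set, only matching names pass
    exclude: Optional[str] = None  # if set, matching names are dropped

    def allows(self, event: Event) -> bool:
        name = f"{event.schema}.{event.table}"
        if self.match is not None and not re.search(self.match, name):
            return False
        if self.exclude is not None and re.search(self.exclude, name):
            return False
        return True


# Assumed mapping from slave server-id to the subset it should receive.
RULES_BY_SERVER_ID = {
    101: Rule(match=r"^shop\.orders$"),        # a single table
    102: Rule(match=r"^shop\."),               # a single database
    103: Rule(exclude=r"^shop\.audit_log$"),   # everything but one table
}


def events_for(server_id: int, binlog: list) -> list:
    """Return the events that would be sent to the given slave.

    Slaves without a configured rule receive the full stream.
    """
    rule = RULES_BY_SERVER_ID.get(server_id, Rule())
    return [e for e in binlog if rule.allows(e)]


binlog = [
    Event("shop", "orders"),
    Event("shop", "audit_log"),
    Event("crm", "contacts"),
]

for sid in (101, 102, 103, 999):
    names = [f"{e.schema}.{e.table}" for e in events_for(sid, binlog)]
    print(sid, names)
```

      Option 1 would look much the same with the mapping keyed by listener instead of server-id. In both cases the filtering runs before any data leaves the router, so the stored binlog stays complete.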

      This feature would add enormous value to MaxScale and to the MaxScale binlog router.

          People

            Assignee: markus makela
            Reporter: Kolbe Kegel (Inactive)
            Votes: 3
            Watchers: 7
