[MXS-701] Add binlog filtering to MaxScale

Details

Type: New Feature
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 2.3.0
Component/s: binlogrouter
Labels: None
Sprint: 2017-45, 2017-46, MXS-SPRINT-67

Description

There are many use cases where a user wants to replicate only a subset of a master's objects. In some cases, only a single table should be replicated. In others, a single database should be replicated. In still others, a single table should be excluded.

Today, this can be done using the replicate-* options on the slave. However, those options affect only the slave SQL thread. The slave IO thread still fetches the entire binlog from the master and writes the whole thing to disk in the form of relay logs. The SQL thread then skips over potentially many gigabytes of log entries to apply the limited subset that matches the configured replicate-* entries. This means the master is tied up transferring enormous amounts of unnecessary data over the network, all the while waiting for the slave IO thread to write that data to disk.

An alternative is to use binlog-* options on the master to prevent it from writing certain log entries to disk. This is unacceptable. It means that the binlog is no longer the authoritative record, with the result that the binlog cannot be used for roll-forward recovery. It also precludes separate sets of slaves, each with different requirements.

Another solution could use the MaxScale binlog router. If MaxScale were located close to the master, to minimize network bottlenecks, it could write binlogs locally and provide filtering to clients. This would have three benefits: the binlog on the master would still be authoritative, MaxScale would serve as an async (or semi-sync) DR copy of the binlog, and the CPU load of processing the per-slave filters would be handled on the MaxScale node rather than on the master.

To facilitate this use case, there are several different implementation possibilities:

1) Separate listeners, each with its own filters. Slaves that need only a specific subset of log entries would connect to the listener that was configured to serve only that subset.
2) Configuration based on slave server-id. The binlog router would be configured to serve a specific subset of objects to particular server-ids.
3) MariaDB Server could be modified to accept the same replicate-* rules that exist today, but to communicate those rules to the master when it connects. When MariaDB connected to a MaxScale binlog router as master, the binlog router would do the filtering locally. (A future enhancement could add filtering support of this kind to a MariaDB Server master, but that would not be required upfront for this slave behavior to be valuable!)
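
As an illustration of option 2, here is a minimal Python sketch of per-server-id filtering. It is not how the binlog router is implemented. The `FilterRule` class, the `filter_for_slave` function, the `(database, table, payload)` event tuple and the glob-style patterns are all assumptions made for this example. A real filter would also have to deal with transaction boundaries, statement-based events and DDL, which are ignored here.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical, heavily simplified event: (database, table, payload).
Event = Tuple[str, str, bytes]


@dataclass
class FilterRule:
    """Hypothetical per-slave rule set, loosely modelled on replicate-* options.

    Patterns are shell-style globs matched against "database.table".
    """
    include: List[str] = field(default_factory=list)
    ignore: List[str] = field(default_factory=list)

    def allows(self, database: str, table: str) -> bool:
        name = f"{database}.{table}"
        # Ignore rules win over include rules.
        if any(fnmatchcase(name, pattern) for pattern in self.ignore):
            return False
        # An empty include list means "everything that is not ignored".
        return not self.include or any(
            fnmatchcase(name, pattern) for pattern in self.include
        )


def filter_for_slave(
    events: Iterable[Event],
    server_id: int,
    rules: Dict[int, FilterRule],
) -> Iterator[Event]:
    """Yield only the events that the slave with this server-id should receive.

    Slaves without a configured rule receive the full, unfiltered stream.
    """
    rule = rules.get(server_id)
    for event in events:
        database, table, _payload = event
        if rule is None or rule.allows(database, table):
            yield event


if __name__ == "__main__":
    rules = {
        101: FilterRule(include=["shop.orders"]),   # a single table only
        102: FilterRule(ignore=["shop.audit_*"]),   # everything but audit tables
    }
    stream = [
        ("shop", "orders", b"..."),
        ("shop", "audit_log", b"..."),
        ("hr", "staff", b"..."),
    ]
    for sid in (101, 102, 999):
        tables = [f"{db}.{tbl}" for db, tbl, _ in filter_for_slave(stream, sid, rules)]
        print(sid, tables)
```

Option 1 could reuse the same rule structure by keying the lookup on a listener name instead of a server-id.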

This feature would add enormous value to MaxScale and to the MaxScale binlog router.

People

Assignee: markus makela

Reporter: Kolbe Kegel (Inactive)

Votes: 3

Watchers: 7

Dates

Created: 2016-04-28 17:56

Updated: 2018-09-26 08:26

Resolved: 2018-09-26 08:26
