[MDEV-9345] Replication to enable filtering on master Created: 2015-12-30  Updated: 2023-10-05

Status: Open
Project: MariaDB Server
Component/s: Replication
Fix Version/s: None

Type: Task Priority: Major
Reporter: VAROQUI Stephane Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: None

Issue Links:
PartOf
includes MDEV-6593 domain_id based replication filters Closed

 Description   

This task would enable a replication slave to fetch from the master only the binlog events specific to a set of tables, without any modification of the application code, and to replicate part of a schema without being forced to copy all the binlogs.

Today, MDEV-6593 forces the application to set a domain ID to enable binlog filtering on the master. This scenario permanently changes the event order, detaching some events into an out-of-order domain for all the slaves. That is not desirable in many cases.
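For context, the MDEV-6593 mechanism works roughly like this (a sketch; the domain ID and table names are illustrative, and `IGNORE_DOMAIN_IDS` requires the slave to replicate in GTID mode):

```sql
-- On the master, the application must tag its transactions with a domain ID:
SET SESSION gtid_domain_id = 2;
INSERT INTO app2.events (id, payload) VALUES (1, 'x');

-- On each slave, whole domains can then be filtered out of the stream:
STOP SLAVE;
CHANGE MASTER TO MASTER_USE_GTID = slave_pos, IGNORE_DOMAIN_IDS = (2);
START SLAVE;
```

This is precisely the cost the description objects to: the application must be changed to set the domain ID, and the tagged events move into a separately ordered domain for every slave.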

It would be preferable for filters to be loaded into the master's dump thread, so that filtering takes place inside the master.



 Comments   
Comment by VAROQUI Stephane [ 2017-05-29 ]

I don't agree with the priority of this MDEV.

This requirement comes up everywhere. Replication filtering is very widely used, and sending all the binlogs just to replicate a single table, or to skip some big log tables, is something filtering on the master must address: such events should not transit the network.
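For contrast, this is roughly what is possible today: table-level filters exist, but only on the slave side, so the full binlog still crosses the network before being discarded (the option names are the standard replication filter options; the table names are illustrative):

```ini
# my.cnf on the replica -- filtering happens only after the events
# have already been sent over the network into the relay log
[mysqld]
replicate-do-table     = shop.orders
replicate-ignore-table = shop.big_log_table
```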

Do you have something in mind that could handle a case like one we saw at the biggest web shop in France: moving a 100 TB database that stopped replicating whenever a new slave was provisioned to change a SAN?

We used multi-source replication, one source per independent domain, fetching each in a separate replication stream. This allowed catching up after provisioning without blocking what was already replicated, but such a solution is limited by the number of sources, since the network traffic is duplicated for each source.
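The multi-source workaround described above might look like this in MariaDB (hostnames and credentials are illustrative); note that each named connection pulls its own full copy of the binlog stream, which is the duplicated-traffic limitation mentioned:

```sql
-- One named replication connection per independent stream
CHANGE MASTER 'app1' TO
  MASTER_HOST = 'master.example.com', MASTER_USER = 'repl',
  MASTER_PASSWORD = 'secret', MASTER_USE_GTID = slave_pos;
CHANGE MASTER 'app2' TO
  MASTER_HOST = 'master.example.com', MASTER_USER = 'repl',
  MASTER_PASSWORD = 'secret', MASTER_USE_GTID = slave_pos;
START ALL SLAVES;
```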

Comment by Oli Sennhauser [ 2023-10-05 ]

If I understand Stephane correctly, he wants to have binlog filtering in the binlog dump thread instead of in the binary log writer thread.

This request has some significant advantages:

  • Nothing is lost in the binary log (for PiTR). This is the most important requirement!
  • Sensitive data does not leave the Master (a Slave could be a target for attacks in a less secure network zone), so we can avoid an intermediate slave used just for filtering. Second most important requirement.
  • Reduces network traffic.
  • Reduces the relay log footprint on the Slaves.
  • We can build different Slave groups for filtering (for example some app1 Slaves, some app2 Slaves), and the configuration lives in only one place (the Master).
Generated at Thu Feb 08 07:33:58 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.