[MDEV-9353] feature request: binlog ignore table - Jira

Details

Type: Task
Status: Open (View Workflow)
Priority: Minor
Resolution: Unresolved
Fix Version/s: None
Component/s: Replication
Labels:
None

Description

Hello,
I couldn't find if that was ever discussed before...
I'd like to propose a feature of ignoring a certain table(s) activity being written to binlog.

This can save on traffic + ease on slaves that need to replicate just few tables but have to download gigabytes of binary logs and slow down because replaying it takes time.

So maybe instead of ignoring certain tables on slave it's possible to let the master avoid creating binlog with it?

At this moment I have to manipulate with rewrite-db to let something replicate to a different database and avoid that db from appearing in binlogs. Probably this won't work with statement log format, but row replication should do the thing imo. Anyway this can be left to dba/sysadmin to decide

Hope you find it interesting and possible to implement
Thanks!
Alex

p.s. alternatively it could be a feature for maxscale binlog proxy to let certain slaves download only relevant parts..or something like that

Attachments

Activity

Ascending order - Click to sort in descending order

Michael Widenius added a comment - 2016-04-05 18:51 - edited

Do you mean that in addition to the current --binlog-ignore-db=database_name option, you would also want to see an option --binlog-ignore-table=database.table" ?

Or would you like to have an create option that a table is not logged ?

This is something that can safely be done only with row level replication.

Michael Widenius added a comment - 2016-04-05 18:51 - edited Do you mean that in addition to the current --binlog-ignore-db=database_name option, you would also want to see an option --binlog-ignore-table=database.table " ? Or would you like to have an create option that a table is not logged ? This is something that can safely be done only with row level replication.

Alex added a comment - 2016-04-05 19:25 - edited

Hi Michael,
thanks for getting back to me!
I personally would like to see both binlog-ignore-table or binlog-do-table, same as done with replication. This will introduce an easy way to manipulate with replication.
My guess it can be done for both statement and row replication types. Somewhere I have to use statement and somewhere a row replication type.
I am not sure I understand what you mean by "Or would you like to have an create option that a table is not logged ?"

As I explained - I solved it by different manipulations using rewrite-do-db and splitting databases.

Maybe it's an offtopic for this very JIRA issue but that also comes as a result from inability to manipulate with tables in binary logs.
There is one more thing I'd like to propose, but 1st let me explain why and how I came to it:

I have now 2 databases (because of the aforementioned reasons). 1st database has around 500gbytes/day and no binary logs.
The 2nd database (which creates smaller binary logs now) works with some tables that are frequently truncated and re-filled (from 1st database data).
In order to not let these temp tables to replicate (the data is a dozen of gigabytes every hour) I am using hardlinks for same tables to be accessible from both databases.
So tables are filled in 1st database, then 'flush table <table_name>' command sent to 2nd database, and only after doing that the application gets access to the data in those tables.
The problems with it: a. hard links to be monitored and not forgotten. b. flushing tables clears all memory caches/reopens tables and thus working with those tables takes more time
but as a result my binary logs are finally small, I'd say - even tiny, because only results of different calculations are replicated now to a central server.

The problem I faced here is inability to use 'set sql_bin_log=0' on app level (this to fill those tables in 2nd database). It can be set only by user with SUPER privileges. Nobody will set this privilege to an app user that should (usually) perform only select/insert/delete/update. Flush privileges (and using hardlinks) gave me the desired solution, but I don't like it much.

So I wanted to ask if you ever reconsidered introducing additional permission or just letting a regular user set sql_bin_log. I know it's kinda a dangerous thing for production, but there are many other dangerous things for production (wrong permission , port 3306 open worldwide and so on). I wouldn't say it's more dangerous than any other thing that's missed. So why to keep giving this special attention?

Thanks! (And sorry again if that's offtopic)

Alex added a comment - 2016-04-05 19:25 - edited Hi Michael, thanks for getting back to me! I personally would like to see both binlog-ignore-table or binlog-do-table, same as done with replication. This will introduce an easy way to manipulate with replication. My guess it can be done for both statement and row replication types. Somewhere I have to use statement and somewhere a row replication type. I am not sure I understand what you mean by "Or would you like to have an create option that a table is not logged ?" As I explained - I solved it by different manipulations using rewrite-do-db and splitting databases. Maybe it's an offtopic for this very JIRA issue but that also comes as a result from inability to manipulate with tables in binary logs. There is one more thing I'd like to propose, but 1st let me explain why and how I came to it: I have now 2 databases (because of the aforementioned reasons). 1st database has around 500gbytes/day and no binary logs. The 2nd database (which creates smaller binary logs now) works with some tables that are frequently truncated and re-filled (from 1st database data). In order to not let these temp tables to replicate (the data is a dozen of gigabytes every hour) I am using hardlinks for same tables to be accessible from both databases. So tables are filled in 1st database, then 'flush table <table_name>' command sent to 2nd database, and only after doing that the application gets access to the data in those tables. The problems with it: a. hard links to be monitored and not forgotten. b. flushing tables clears all memory caches/reopens tables and thus working with those tables takes more time but as a result my binary logs are finally small, I'd say - even tiny, because only results of different calculations are replicated now to a central server. The problem I faced here is inability to use 'set sql_bin_log=0' on app level (this to fill those tables in 2nd database). It can be set only by user with SUPER privileges. Nobody will set this privilege to an app user that should (usually) perform only select/insert/delete/update. Flush privileges (and using hardlinks) gave me the desired solution, but I don't like it much. So I wanted to ask if you ever reconsidered introducing additional permission or just letting a regular user set sql_bin_log. I know it's kinda a dangerous thing for production, but there are many other dangerous things for production (wrong permission , port 3306 open worldwide and so on). I wouldn't say it's more dangerous than any other thing that's missed. So why to keep giving this special attention? Thanks! (And sorry again if that's offtopic)

Alex added a comment - 2016-04-05 19:29

Now I think I understand what you meant by "Or would you like to have an create option that a table is not logged ?"
I guess you meant something to be included as part of table's schema creation , some special option? If yes, I still think binlog-do-table and binlog-ignore-table are much easier to manipulate with (since it looks like replicate-do-table and replicate-ignore-table) and people will get used to it in no time. Using 'wild' for both options would be amazing

thanks again!

Alex added a comment - 2016-04-05 19:29 Now I think I understand what you meant by "Or would you like to have an create option that a table is not logged ?" I guess you meant something to be included as part of table's schema creation , some special option? If yes, I still think binlog-do-table and binlog-ignore-table are much easier to manipulate with (since it looks like replicate-do-table and replicate-ignore-table) and people will get used to it in no time. Using 'wild' for both options would be amazing thanks again!

MariaDB Server

feature request: binlog ignore table

Details

Description

Attachments

Activity

People

Dates

Git Integration