[MDEV-30421] SAMU-64 Allow administrators to enable or disable parallel replication on a per-table basis Created: 2023-01-17  Updated: 2024-02-05

Status: Stalled
Project: MariaDB Server
Component/s: Replication
Fix Version/s: 11.5

Type: New Feature Priority: Major
Reporter: Aleksey Midenkov Assignee: Aleksey Midenkov
Resolution: Unresolved Votes: 0
Labels: None


 Description   

SAMU-64 Allow administrators to enable or disable parallel replication on a per-table basis

This patch adds a per-domain dedicated thread for processing ordered
transactions. The thread is reserved from the total number of domain
threads (controlled by slave_parallel_threads and
slave_domain_parallel_threads). Whether an event goes to the ordered
thread depends on the FL_ALLOW_PARALLEL flag as well as several other
conditions. FL_ALLOW_PARALLEL is passed from the master and is set for
the event depending on the master's configuration directives. The
dedicated thread must be enabled explicitly on the slave server with
this configuration directive:
 
  set global slave_ordered_thread= 1;
 
Originally it was controlled by the skip_parallel_replication session
variable, which can be changed per statement. This patch adds several
more directives to control it at the per-schema and per-table levels:
 
  parallel_do_db
  parallel_do_table
  parallel_ignore_db
  parallel_ignore_table
  parallel_wild_do_table
  parallel_wild_ignore_table
 
Each directive is a comma-separated list of fully qualified table
names (database names for the "db" directives). Spaces after a comma
are ignored (but spaces before it are not).
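
The splitting rule can be illustrated with a minimal Python sketch. The helper name parse_directive is hypothetical and this is not the server's parser; it only assumes that spaces following a comma are dropped while spaces preceding a comma stay part of the name.

```python
import re

def parse_directive(value):
    """Split a directive value into names (hypothetical helper, not server code)."""
    if value == "":
        return []
    # Drop spaces that follow a comma; spaces before a comma stay in the name.
    return re.split(r", *", value)

print(parse_directive("db_serial.t3,  db_serial.t1"))  # ['db_serial.t3', 'db_serial.t1']
print(parse_directive("db_serial.t3 ,db_serial.t1"))   # ['db_serial.t3 ', 'db_serial.t1']
```

In the second call the first name keeps its trailing space, so under this reading it would not match a table called db_serial.t3.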
 
"Table" directives take precedence over "db" directives. "Do"
directives take precedence over "ignore" directives. "Wild" directives
are checked if "do" and "ignore" directives did not match.
 
If none of the above directives is present, everything is considered
parallel. If any of the above directives is present and the table does
not match anything in the lists, it is considered ordered.
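
One possible reading of these rules, as a minimal Python sketch. The names ParallelFilters, like_match and is_parallel are hypothetical. The check order (exact table, then wild table, then db, with "do" before "ignore" at each level) is an assumption inferred from the text. Wildcards are assumed to follow SQL LIKE syntax (% and _), as in the existing replicate_wild_* options. This is not the patch's code.

```python
import re
from dataclasses import dataclass, field

@dataclass
class ParallelFilters:
    """Hypothetical container for the six directives (lists of names or patterns)."""
    do_db: list = field(default_factory=list)
    ignore_db: list = field(default_factory=list)
    do_table: list = field(default_factory=list)
    ignore_table: list = field(default_factory=list)
    wild_do_table: list = field(default_factory=list)
    wild_ignore_table: list = field(default_factory=list)

def like_match(pattern, name):
    # Assumed LIKE semantics: % matches any run of characters, _ matches one.
    regex = "".join(
        ".*" if c == "%" else "." if c == "_" else re.escape(c) for c in pattern
    )
    return re.fullmatch(regex, name) is not None

def is_parallel(f, db, table):
    """Return True if db.table is treated as parallel, False if ordered."""
    full = f"{db}.{table}"
    if not any([f.do_db, f.ignore_db, f.do_table, f.ignore_table,
                f.wild_do_table, f.wild_ignore_table]):
        return True   # no directives set: everything is parallel
    if full in f.do_table:
        return True   # exact "do" table wins first
    if full in f.ignore_table:
        return False
    if any(like_match(p, full) for p in f.wild_do_table):
        return True   # wild directives only after exact table lists
    if any(like_match(p, full) for p in f.wild_ignore_table):
        return False
    if db in f.do_db:
        return True   # db directives are consulted last
    if db in f.ignore_db:
        return False
    return False      # directives set but nothing matched: ordered

# Values mirror the examples given in this description.
filters = ParallelFilters(
    do_db=["db_parallel"],
    ignore_db=["db_serial"],
    do_table=["db_serial.t3", "db_serial.t1"],
    wild_ignore_table=["db_parallel.non_parallel_%"],
)
print(is_parallel(filters, "db_parallel", "t1"))              # True
print(is_parallel(filters, "db_parallel", "non_parallel_a"))  # False
print(is_parallel(filters, "db_serial", "t1"))                # True
print(is_parallel(filters, "db_serial", "t2"))                # False
```

Under this reading a table in an unlisted schema is ordered as soon as any directive is set, which matches the last rule above.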
 
Examples:
 
  set @@global.parallel_do_db= "db_parallel";
  set @@global.parallel_ignore_db= "db_serial";
  set global parallel_do_table= "db_serial.t3,  db_serial.t1";
  set global parallel_wild_ignore_table= "db_parallel.non_parallel_%";
 
The normal behaviour of an ordered transaction is to wait, before it
starts, for any prior transactions to commit, so they end up in
different commit groups. But since all the ordered transactions
(within one domain) go to a single thread, that restriction can be
lifted with this directive on the slave:
 
  set global slave_ordered_dont_wait= 1;
 
When it is set, events without an explicit FL_WAITED flag that go to
the ordered thread nonetheless accept optimistic speculation. That is,
they can get into the same commit group as parallel events: an ordered
event is executed in parallel with parallel events.
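
The two slave-side switches can be summarised as a small truth-table sketch in Python. It is a simplification under stated assumptions: the function names are hypothetical, the "several other conditions" that influence routing are ignored, and only the flags and variables named in this description are modelled.

```python
def goes_to_ordered_thread(allow_parallel, slave_ordered_thread):
    # Assumption: an event is routed to the dedicated thread only when the
    # feature is enabled and the master did not set FL_ALLOW_PARALLEL.
    return slave_ordered_thread and not allow_parallel

def ordered_event_waits(waited, slave_ordered_dont_wait):
    # Normal behaviour: wait for prior transactions to commit before starting.
    # Assumption: with slave_ordered_dont_wait=1 only events that carry
    # FL_WAITED keep waiting; the others accept optimistic speculation.
    return waited or not slave_ordered_dont_wait

# Print the wait decision for every combination of the two inputs.
for waited in (False, True):
    for dont_wait in (False, True):
        print(f"FL_WAITED={waited} slave_ordered_dont_wait={dont_wait} "
              f"waits={ordered_event_waits(waited, dont_wait)}")
```

Only the combination FL_WAITED unset and slave_ordered_dont_wait=1 lets an ordered event start without waiting.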



 Comments   
Comment by Aleksey Midenkov [ 2023-03-07 ]

Please review test refactorings https://github.com/MariaDB/server/commits/bb-10.3-midenok2

Comment by Andrei Elkin [ 2023-03-07 ]

Thanks Aleksey! Hopefully involved tests analysis/maintenance will be easier after your improvements.

Comment by Aleksey Midenkov [ 2023-03-21 ]

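# Print the lines of the binlog_encryption rpl_*.test files that reference an rpl/include/*.inc file
f() { find /home/midenok/src/mariadb/10.3/src/mysql-test/suite/binlog_encryption -name 'rpl_*.test' -print -exec cat '{}' ';'|grep 'rpl/include.*\.inc'; }
# Rewrite those references in each binlog_encryption test to point at rpl/t/*.test
f |while read b a; do b=${a%%.inc}.test; b=${b/include/t}; c=${b/rpl\/t/binlog_encryption}; sed -i -re "s|include|t|; s|\.inc$|.test|" $c; done
# Remove the old rpl/t/*.test and move rpl/include/*.inc into its place
f |while read b a; do b=${a%%.inc}.test; b=${b/include/t}; c=${b/rpl\/t/binlog_encryption}; git rm $b; git mv $a $b; done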

Comment by Aleksey Midenkov [ 2023-03-26 ]

Please review bb-11.0-midenok-MDEV-30421

Comment by Andrei Elkin [ 2023-06-05 ]

As a part of the proposed design review:

  • discussed an alternative approach to handle data dependent transactions;
  • designed and poc-ed data dependency handling by optimistic parallel slave.

Comment by Aleksey Midenkov [ 2023-07-03 ]

Please review bb-10.4-midenok-MDEV-30421 based on dependency PoC

Comment by Kristian Nielsen [ 2023-08-10 ]

If I understand this correctly, it adds a number of configuration variables --parallel-do-db etc., which affect the flags in the binlogged GTID event based on which tables are participating in the binlogged event group.

I think this should instead use CREATE TABLE options. Instead of adding a table to --parallel-ignore-table, one would ALTER TABLE t SKIP_PARALLEL=1 (probably need some thought on the best name for the table option).

Table option is more appropriate for a couple of reasons.

This affects the transaction when it is committed (binlogged), after this it cannot be changed. Thus, it is a property of the table at the time of commit, not a property of the configuration of the server when the event is sent to the slave or replicated on the slave. Therefore, it does not add any flexibility to be able to control this with configuration options, unlike for example --replicate-ignore-table. All such properties of a table should for consistency be a part of the table definition.

There may be many other similar properties of a table in the future that affect replication. One example that could be useful is to mark a table unsafe for statement-based replication, so that all queries against it will use ROW mode in --binlog-format=mixed. And there will probably be others. It is much better to add CREATE TABLE options for each of these than to add separate set of new configuration variables for each one.

Being able to specify the option on a per-table basis is more flexible for the DBA, and more efficient for the server to check (just a lookup in the table share, as compared to a match of a potential complex set of filters against each table).

Comment by Ralf Gebhardt [ 2023-08-10 ]

Elkin, I like the idea to use table options instead of environment variables to control table related behaviors for the replication. This can also help for cases like MDEV-22327, where we could have a table option to skip replication added by default for federated engines. What do you think?

Comment by Andrei Elkin [ 2023-11-15 ]

ralf.gebhardt, I lean towards knielsen's refinement. Yet, AFAIR (midenok may correct me), the primary use case was to execute on the parallel slave some transactions that conflict with each other in a 'dedicated' thread. I can't help but perceive it as a domain to be controlled (logged with) e.g. via Kristian's way. On the slave side, this domain would be recognized to be automatically configured with one worker thread.

Generated at Thu Feb 08 10:16:07 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.