[MXS-4137] Add native support for conditional group commit tuning for parallel replication Created: 2022-05-17  Updated: 2023-12-15

Status: Open
Project: MariaDB MaxScale
Component/s: mariadbmon
Affects Version/s: None
Fix Version/s: 24.02

Type: New Feature Priority: Major
Reporter: Rob Schwyzer Assignee: Joe Cotellese
Resolution: Unresolved Votes: 2
Labels: None


 Description   

Note this is most applicable when parallel replication is configured to use conservative mode, but can be helpful for optimistic as well.

https://mariadb.com/kb/en/parallel-replication/?msclkid=e5e356a3d15711ec8f4e03f95e015af9#conservative-mode-of-in-order-parallel-replication

In such setups, we want to tune the primary MariaDB server with binlog_commit_wait_count and binlog_commit_wait_usec. This trades some maximum write throughput on the primary to have the primary group and organize commits to the binary log in such a way that a replica can confidently identify groups of transactions it can safely apply in parallel.

The problem this creates- which is how this feature relates to MaxScale's MariaDB Monitor- is that there is no reason or desire to impose these requirements or limitations on the replicas. We don't want or need the replicas to sacrifice write performance to try to reorganize their binary logs like this. Instead, we want this to be setup on the primary but disabled on the replicas.

The below GitHub link provides further clarification and context for this, including instructions on how to achieve this result with promotion and demotion scripts-

https://github.com/mariadb-faisalsaeed/documentation/blob/master/parallel-replicaiton.md#understanding-parallel-replication

So again, it is nice that we can achieve this with promotion and demotion scripts... but this is such a core piece of functionality for conservative parallel replication that it would be much better to not leave this as something we are asking customers to manage on their own. If instead we could embed this logic in MariaDB Monitor and allow customers to configure it there, that would be much more ideal. It would also improve customer adoption and adherence to best-practice for this. As incorrect configuration here will cause replication lag or other problems, improving customer adherence to best-practice for this will directly improve overall customer experience with MariaDB and MaxScale asynchronous replication.

Larger opportunity...

The minimum ask of this feature is detailed above this header. However, this feature could also be more generally resolved via a more general request like this-

Enhance MariaDB Monitor to enable configuration of MariaDB configurations where users can provide one value for the primary and one value for replica(s).

Ex, rather than adding an option like, enforce_group_commit_tuning=true, there could be a more general setting like-

[MariaDB-Monitor]
...
primary_replica_variables=(binlog_commit_wait_count,10,0), (binlog_commit_wait_usec,1000,0), (sync_binlog,1,0), (innodb_flush_log_at_trx_commit,1,0)

The above setup is-

primary_replica_variables=(VARIABLE, VALUE FOR PRIMARY, VALUE FOR REPLICAS), (VARIABLE, VALUE FOR PRIMARY, VALUE FOR REPLICAS)

This is a little messy, but there are likely viable alternatives for how to handle this (no shortage of examples with optimizer_switch and wsrep_provider_options). Could even go with just having customers provide config files, ex-

[MariaDB-Monitor]
...
primary_configs=/var/lib/maxscale/primary.cnf
replica_configs=/var/lib/maxscale/replicas.cnf

Where /var/lib/maxscale/primary.cnf-

binlog_commit_wait_count=10
binlog_commit_wait_usec=1000
sync_binlog=1
innodb_flush_log_at_trx_commit=1

And where /var/lib/maxscale/replicas.cnf-

binlog_commit_wait_count=0
binlog_commit_wait_usec=0
sync_binlog=0
innodb_flush_log_at_trx_commit=0

In an ideal world, MaxGUI will provide an interface to configure all of this as well, and also in an ideal world all of this can be configured dynamically.



 Comments   
Comment by markus makela [ 2023-04-30 ]

The problem this creates- which is how this feature relates to MaxScale's MariaDB Monitor- is that there is no reason or desire to impose these requirements or limitations on the replicas.

rob.schwyzer@mariadb.com does this only affect replicas where non-replication traffic generates writes? In other words, if the replicas are read-only, would this feature have any effect on them? Write performance on replicas seems like something that's not at all relevant for asynchronous replication that's managed by MaxScale: a replica never receives writes and thus one would assume that any group commit ordering and/or delays wouldn't affect it. If this is not the case and you're requesting something else, can you give a clear example where the problem occurs?

Elkin Would you be able to confirm whether binlog_commit_wait_count and binlog_commit_wait_usec have an effect on transactions commited by replication? If these configuration variables affect the replicas that are used in this read-only behavior, I'd say that this is something that should be fixed in the server.

If this behavior does not affect replication in any significant way, like you mentioned, this seems like something that's already implemented in the form of promotion and demotion scripts and this feature would offer a more limited set of actions to take (i.e. no dynamic SQL). Perhaps a better way to deal with this issue would be to make the SQL scripts more easy to edit from the GUI? Having to invent a new mechanism for doing something that's already doable in a more versatile and generally applicable way doesn't seem very useful at this point. I do see the value in "best practices" but this does not require this kind of a feature to be implemented and could be done with something as simple as a default SQL script for promotion and demotion.

Comment by Rob Schwyzer [ 2023-05-01 ]

Andrei Elkin Would you be able to confirm whether binlog_commit_wait_count and binlog_commit_wait_usec have an effect on transactions commited by replication? If these configuration variables affect the replicas that are used in this read-only behavior, I'd say that this is something that should be fixed in the server.

This would be the critical bit of knowledge, yeah. The concern would be the impact to local binlogging on replicas. I'll message Faisal to recheck this though to ensure there's not another use-case from his customers.

Comment by Andrei Elkin [ 2023-05-03 ]

markus makela, the variables may be relevant the parallel (read-only) slave, not for the serial one. I am not yet fully confident in their effect 'cos on slave there's a separate grouping to commit in the master's binlog order. The two groupings overlap so the user has to careful at playing around with the vars.

Generated at Thu Feb 08 04:26:28 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.