[MXS-4359] Give a way to use slave_selection_criteria = LEAST_BEHIND_MASTER with Galera Created: 2022-10-21  Updated: 2023-04-04  Resolved: 2023-04-04

Status: Closed
Project: MariaDB MaxScale
Component/s: galeramon, readwritesplit
Affects Version/s: None
Fix Version/s: N/A

Type: New Feature Priority: Major
Reporter: Valerii Kravchuk Assignee: Todd Stoffel (Inactive)
Resolution: Won't Do Votes: 0
Labels: None


 Description   

It looks like the readwritesplit router assumes that all Galera nodes are always in sync. In reality they may have write sets pending to be applied, and wsrep_local_recv_queue could be used as a measure of the current "lag" of a Galera node.

With this we may be able to apply slave_selection_criteria = LEAST_BEHIND_MASTER and route to the Galera node that is closest to the "master" or the least loaded.
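The selection rule proposed here could be sketched roughly as follows. This is only an illustration in Python, not MaxScale code: the function name `pick_least_behind`, the `max_queue` cut-off and the node names are hypothetical, and the queue lengths are assumed to have been sampled already (for example from `SHOW GLOBAL STATUS LIKE 'wsrep_local_recv_queue'` on each node).

```python
from typing import Mapping, Optional


def pick_least_behind(recv_queue: Mapping[str, int],
                      max_queue: Optional[int] = None) -> Optional[str]:
    """Return the node with the shortest sampled wsrep_local_recv_queue.

    recv_queue maps a (hypothetical) node name to its last sampled queue
    length. max_queue is an assumed optional cut-off: nodes with a longer
    queue are not considered at all. Returns None if no node qualifies.
    """
    candidates = {
        name: queue
        for name, queue in recv_queue.items()
        if max_queue is None or queue <= max_queue
    }
    if not candidates:
        return None
    # Break ties by node name so that the choice is deterministic.
    return min(candidates, key=lambda name: (candidates[name], name))


if __name__ == "__main__":
    sampled = {"galera1": 12, "galera2": 0, "galera3": 3}
    print(pick_least_behind(sampled))
```

The result is only as fresh as the last sample of the queue lengths.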



 Comments   
Comment by markus makela [ 2022-10-22 ]

Would wsrep-causal-reads be an alternative to this?

On average, how large is the lag between the reading of the writeset from the cluster and the application of it? The main problem with using replication lag in the routing logic is that it's only updated by the monitor and thus is a relatively coarse measurement of lag. In the case of traditional async replication, the lag might be minutes in the worst case, and readwritesplit uses it to rule out severely lagging servers. For lag that's less than that, you're probably better off enabling causal_reads in MaxScale to eliminate replication lag from the user's point of view.

In the case of Galera, if the "replication" lag is significantly less than the value of monitor_interval, it might make more sense to force wsrep-causal-reads to be used instead of using wsrep_local_recv_queue as a measurement of lag, as it avoids the same problem that causal_reads=fast suffers from: if the servers are lagging too much, almost all of the traffic gets routed to a single node. The same limitations also apply to causal_reads=fast_global, where it is only useful for very low write throughput and for workloads that are largely read-only.
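The traffic concentration problem can be illustrated with a small simulation. This is a simplified Python sketch under stated assumptions, not how readwritesplit is implemented: the function `route_reads`, the `max_lag` threshold, the round-robin distribution and the node names are all hypothetical, and the lag values stand in for whatever the monitor last sampled.

```python
from collections import Counter
from typing import Dict


def route_reads(lag: Dict[str, int], max_lag: int,
                primary: str, n_reads: int) -> Counter:
    """Distribute n_reads over the nodes whose sampled lag is acceptable.

    Nodes other than the primary are eligible while lag <= max_lag.
    If no node is eligible, every read falls back to the primary.
    """
    eligible = sorted(
        name for name, value in lag.items()
        if name != primary and value <= max_lag
    )
    targets = eligible or [primary]
    counts: Counter = Counter()
    for i in range(n_reads):
        # Plain round-robin over whatever targets are left.
        counts[targets[i % len(targets)]] += 1
    return counts


if __name__ == "__main__":
    healthy = {"node1": 0, "node2": 1, "node3": 2}
    lagging = {"node1": 0, "node2": 40, "node3": 60}
    print(route_reads(healthy, max_lag=5, primary="node1", n_reads=100))
    print(route_reads(lagging, max_lag=5, primary="node1", n_reads=100))
```

With the healthy values the reads are split between node2 and node3. With the lagging values all 100 reads land on node1, which is the concentration effect described above.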

Comment by markus makela [ 2022-10-24 ]

This looks like a new feature and not a generic task. I converted the type into New Feature.

Generated at Thu Feb 08 04:28:04 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.