Would wsrep-causal-reads be an alternative to this?
On average, how large is the lag between a node receiving a writeset from the cluster and applying it? The main problem with using replication lag in the routing logic is that it is only updated by the monitor, so it is a relatively coarse measurement. With traditional async replication the lag can be minutes in the worst case, and readwritesplit uses the measurement to rule out severely lagging servers. For lag smaller than that, you're probably better off enabling causal_reads in MaxScale to hide replication lag from the user's point of view.
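To illustrate why a monitor-sampled value is coarse, here is a toy model in Python. It is not MaxScale code: `observed_lag`, `true_lag`, the 2000 ms interval and the lag numbers are all made up for the example. It only assumes that the router sees whatever value the monitor recorded at its last tick.

```python
def observed_lag(true_lag_at, monitor_interval_ms, now_ms):
    """Lag the router would see at now_ms if the value is only refreshed
    at multiples of monitor_interval_ms (hypothetical model)."""
    last_sample_ms = (now_ms // monitor_interval_ms) * monitor_interval_ms
    return true_lag_at(last_sample_ms)


def true_lag(t_ms):
    # Hypothetical replica: caught up, except for a burst of 300 ms lag
    # between t=500 ms and t=1500 ms.
    return 300 if 500 <= t_ms < 1500 else 0


if __name__ == "__main__":
    for now in (0, 1000, 2500):
        print(now, true_lag(now), observed_lag(true_lag, 2000, now))
```

In this model the burst falls entirely between two monitor ticks, so the router never sees it. Any lag that is short compared to the sampling interval is invisible to routing decisions based on the monitored value.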
In the case of Galera, if the "replication" lag is significantly less than the value of monitor_interval, it might make more sense to force wsrep-causal-reads to be used instead of using wsrep_local_recv_queue as a measurement of lag. Doing so avoids the problem that causal_reads=fast suffers from: if the servers are lagging too much, almost all of the traffic gets routed to a single node. The same limitation applies to causal_reads=fast_global, which is only useful for very low write throughput and for workloads that are largely read-only.
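The pile-up on a single node can be sketched with another toy model in Python. This is not how readwritesplit is implemented; `pick_targets`, the node names and the threshold are hypothetical. It only assumes that reads go to nodes that are fresh enough, with a fallback to one designated node when none qualify.

```python
def pick_targets(node_lag_ms, max_lag_ms, fallback="primary"):
    """Return the nodes a read may be routed to (hypothetical model).

    Nodes within max_lag_ms are eligible; if none are, every read
    falls back to the single fallback node.
    """
    fresh = [name for name, lag in node_lag_ms.items() if lag <= max_lag_ms]
    return fresh if fresh else [fallback]


if __name__ == "__main__":
    # One node is fresh enough: reads can still be spread over it.
    print(pick_targets({"node2": 5, "node3": 500}, max_lag_ms=100))
    # Every node is too far behind: all reads end up on one node.
    print(pick_targets({"node2": 500, "node3": 800}, max_lag_ms=100))
```

Once every other node exceeds the threshold, the read load is no longer balanced at all, which is why this style of routing only pays off when writes are infrequent.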