[MXS-3024] Readconnroute load balancing is not atomic Created: 2020-06-05  Updated: 2020-10-16  Resolved: 2020-09-11

Status: Closed
Project: MariaDB MaxScale
Component/s: readconnroute
Affects Version/s: 2.4.9
Fix Version/s: N/A

Type: Bug
Priority: Minor
Reporter: Anthony
Assignee: markus makela
Resolution: Incomplete
Votes: 0
Labels: None
Environment:

RHEL7
3 nodes Percona XtraDB Cluster & MaxScale



 Description   

Two threads can both pick the same server when one of them should have picked a different one. This happens because server selection is not repeated if another thread manages to increment the connection counter between one thread's selection and its own increment. The result is small, transient imbalances in the connection counts between servers.
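One way to close this window is to claim the chosen server with a compare-and-swap and to repeat the selection when the claim fails. The following is a minimal sketch of that idea, not MaxScale's actual code: `Server`, `pick_server` and `simulate` are hypothetical names introduced only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a backend server; not MaxScale's actual type.
struct Server
{
    std::atomic<int> connections{0};
};

// Pick the server with the fewest connections and claim one slot on it.
// If another thread changed the chosen counter between the scan and the
// claim, the compare-exchange fails and the whole selection is repeated.
std::size_t pick_server(std::vector<Server>& servers)
{
    assert(!servers.empty());

    for (;;)
    {
        std::size_t best = 0;
        int best_count = servers[0].connections.load();

        for (std::size_t i = 1; i < servers.size(); ++i)
        {
            int count = servers[i].connections.load();
            if (count < best_count)
            {
                best = i;
                best_count = count;
            }
        }

        // Succeeds only if the counter still holds the value seen in the scan.
        int expected = best_count;
        if (servers[best].connections.compare_exchange_strong(expected, best_count + 1))
        {
            return best;
        }
    }
}

// Opens n_clients connections from separate threads and returns the
// final per-server connection counts.
std::vector<int> simulate(std::size_t n_servers, int n_clients)
{
    std::vector<Server> servers(n_servers);
    std::vector<std::thread> clients;

    for (int i = 0; i < n_clients; ++i)
    {
        clients.emplace_back([&servers] { pick_server(servers); });
    }
    for (auto& client : clients)
    {
        client.join();
    }

    std::vector<int> counts;
    for (auto& server : servers)
    {
        counts.push_back(server.connections.load());
    }
    return counts;
}
```

Calling `simulate(3, 15)` mirrors the 15-thread sysbench case from the original report. In this sketch the counters only ever grow, so every successful claim lands on a server that is least loaded at that moment. If connections also close concurrently, small transient imbalances remain possible.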

Original description:


Hi,

I have found some strange behavior with version 2.4 in synced mode.

If I run sysbench with, for example, 15 threads, the connections should normally be split evenly across the nodes (so 5 connections per node in this case).

See the configuration:

[root@router ~]# cat /etc/maxscale.cnf
[maxscale]
threads=auto
 
[dbserv1]
type=server
address=192.168.116.111
port=3306
protocol=MariaDBBackend
 
[dbserv2]
type=server
address=192.168.116.112
port=3306
protocol=MariaDBBackend
 
[dbserv3]
type=server
address=192.168.116.113
port=3306
protocol=MariaDBBackend
 
[Galera-Monitor]
type=monitor
module=galeramon
servers=dbserv1, dbserv2, dbserv3
user=monitor_user
passwd=my_password
monitor_interval=2000
 
[Galera-Service]
type=service
router=readconnroute
router_options=synced
servers=dbserv1, dbserv2, dbserv3
user=maxscale
passwd=maxscale_pw
 
[Galera-Listener]
type=listener
service=Galera-Service
protocol=MariaDBClient
port=4306

When I run maxctrl list servers, I get:

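┌─────────┬─────────┬──────┬─────────────┬─────────────────────────┬──────┐
│ Server  │ Address │ Port │ Connections │ State                   │ GTID │
├─────────┼─────────┼──────┼─────────────┼─────────────────────────┼──────┤
│ dbserv1 │ dbserv1 │ 3306 │ 7           │ Slave, Synced, Running  │      │
├─────────┼─────────┼──────┼─────────────┼─────────────────────────┼──────┤
│ dbserv2 │ dbserv2 │ 3306 │ 4           │ Master, Synced, Running │      │
├─────────┼─────────┼──────┼─────────────┼─────────────────────────┼──────┤
│ dbserv3 │ dbserv3 │ 3306 │ 4           │ Slave, Synced, Running  │      │
└─────────┴─────────┴──────┴─────────────┴─────────────────────────┴──────┘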

Why is the traffic not split evenly across the 3 nodes?

Best regards,



 Comments   
Comment by Anthony [ 2020-07-11 ]

This can be closed; we now use readwritesplit in master/slave mode, with slave selection based on response time.

Comment by markus makela [ 2020-07-13 ]

Were you able to reproduce this behavior consistently with the provided config?

There's most likely a small window during which two connections in MaxScale both pick the same server with the lowest number of connections and increment the connection counter at the same time. This results in a temporary imbalance in the connections that would only be visible when using something like sysbench that creates all the connections immediately.
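The window described above can be replayed deterministically, without threads, by letting several connections select from the same stale snapshot of the counters before any of them increments. This is only an illustrative sketch: `least_loaded`, `racy_batch` and `serialized_batch` are hypothetical helpers, and the ticket does not confirm that this exact interleaving produced the reported 7/4/4 split.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Index of the first server with the lowest connection count.
std::size_t least_loaded(const std::vector<int>& counts)
{
    assert(!counts.empty());
    return static_cast<std::size_t>(
        std::min_element(counts.begin(), counts.end()) - counts.begin());
}

// Racy interleaving: n connections all select from the same snapshot,
// and only afterwards are the counters incremented.
std::vector<int> racy_batch(std::vector<int> counts, int n)
{
    const std::vector<int> snapshot = counts;
    for (int i = 0; i < n; ++i)
    {
        ++counts[least_loaded(snapshot)];
    }
    return counts;
}

// Serialized interleaving: each connection sees the increments made by
// the connections before it.
std::vector<int> serialized_batch(std::vector<int> counts, int n)
{
    for (int i = 0; i < n; ++i)
    {
        ++counts[least_loaded(counts)];
    }
    return counts;
}
```

Starting from 4/4/4, `racy_batch({4, 4, 4}, 3)` sends all three new connections to the first server and yields 7/4/4, whereas `serialized_batch({0, 0, 0}, 15)` yields the expected 5/5/5.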

Comment by Anthony [ 2020-09-10 ]

Hi,

You can close it; we have not been able to reproduce this issue since the ticket was created.

Best regards,

Generated at Thu Feb 08 04:18:24 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.