[MXS-3855] incorrect routing of selects with disable_master_failback - Jira

XML

Word

Printable

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Not a Bug
Affects Version/s: 6.1.4
Fix Version/s: N/A
Component/s: galeramon, readwritesplit
Labels:
None
Environment:
centos7, kernel 3.10.0-1127.el7.x86_64
maxscale 6.1.4
mariadb 10.4.21
galera cluster

Description

Description:

With disable_master_failback=true set with a readwritesplit router, the selects are not distributed correctly after the master went down and up again.

How to repeat:

galera cluster config:

[mysqld]

server-id=1

log-error=/var/log/mysql/error.log

log_bin=mysql-bin

log_slave_updates=ON

expire_logs_days=3

general_log=ON

[galera]

wsrep_on=ON

wsrep_provider='/usr/lib64/galera-4/libgalera_smm.so'

wsrep_cluster_address='gcomm://192.168.0.201,192.168.0.202,192.168.0.203'

binlog_format=row

default_storage_engine=InnoDB

innodb_autoinc_lock_mode=2

wsrep_node_address='192.168.0.201'

bind-address=0.0.0.0

maxscale config:

[maxscale]

threads=auto

[server1]

type=server

address=192.168.0.201

port=3306

protocol=MariaDBBackend

[server2]

type=server

address=192.168.0.202

port=3306

protocol=MariaDBBackend

[server3]

type=server

address=192.168.0.203

port=3306

protocol=MariaDBBackend

[Galera-Monitor]

type=monitor

module=galeramon

servers=server1,server2,server3

user=monitor_user

password=my_password

monitor_interval=2000ms

disable_master_failback=true

[Read-Write-Service]

type=service

router=readwritesplit

servers=server1,server2,server3

user=maxscale

password=maxscale_pw

[Read-Write-Listener]

type=listener

service=Read-Write-Service

protocol=MariaDBClient

port=3307

address=192.168.0.200

1. Do a `tail -f general.log` on all servers.
2. Repeatably do a `select whatever from a_table` via `mysql -h=192.168.0.200 --port=3307`. You can observe that the selects are evenly distributed on the slaves in a round robin fashion. So far so good.
3. `systemctl stop mariadb` on the master. Now you can see that the selects are only received by the one remaining slave. The master does not execute the selects. So far so good.
4. `systemctl start mariadb` on the host that used to be the master. Since disable_master_failback is set, it does not become the master again. `maxctrl list servers` says that the state is "Slave, Synced, Running". The selects are only received/executed by one slave from now on.

Expected behaviour:

The select queries should be evenly distributed on all slaves in a round-robin fashion again, after the host comes back up.

Solution:

Restarting the maxscale service on the current master fixes the problem. Restarting the service on the slaves does not.

Attachments

Activity

People

Assignee:: markus makela

Reporter:: Thomas Benkert

Votes:: 0 Vote for this issue

Watchers:: 2 Start watching this issue

Dates

Created:: 2021-11-05 15:35

Updated:: 2021-11-17 15:48

Resolved:: 2021-11-17 15:48

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.