[MXS-1476] priority value ignored when a Galera node rejoins with a lower wsrep_local_index than current master - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Cannot Reproduce
Affects Version/s: 2.1.9
Fix Version/s: N/A
Component/s: galeramon
Labels:
- galera
Environment:
Ubuntu 16.04, MariaDB 10.1.28, 2-node Galera cluster with Galera Arbitrator, MaxScale 2.1.9, mysql client version 15.1 Distrib. 10.1.28-MariaDB for debian-linux-gnu (x86_64) using readline 5.2

Sprint:
2017-46, 2017-47

Description

Hi,

I'm testing MaxScale, set up to connect only to the master node, on Ubuntu 16.04 fronting a Galera cluster. The cluster comprises two MariaDB 10.1 instances on different servers, with a Galera Arbitrator instance running on the MaxScale server. I'm testing this using the MariaDB client (mysql) from a fourth machine.

My test scenario is to see what the client experiences if I stop a MariaDB node part-way through a transaction, commit the transaction, then restart the MariaDB node. I start with the "slave" node, to give me a baseline for comparison, before doing it with the master node. However, the baseline case has given me inconsistent results, which I first thought were due to TLS, but which may actually be caused by something else, as I've now reproduced them on a non-TLS connection.

If the slave MariaDB node comes back online with a lower wsrep_local_index value than the master, then the next time the client sends anything to the database, MaxScale returns error 2003 to the client, sends a QUIT to the master node, and terminates both connections immediately.

If the slave comes back with a higher wsrep_local_index than the master, this doesn't seem to happen.

(I can't see a pattern as to how the wsrep_local_index value is assigned to Galera nodes rejoining the cluster, other than preferring to keep their previous value if any.)
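
To make the expectation concrete, here is a minimal sketch of the selection rule I would expect, not MaxScale's actual implementation. It assumes that with use_priority=true the synced node with the lowest positive priority value becomes master, and that otherwise the synced node with the lowest wsrep_local_index does. The names Node and pick_master are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class Node(NamedTuple):
    name: str
    priority: int            # value of `priority=` in the server section
    wsrep_local_index: int   # index Galera assigned to the node
    synced: bool = True      # whether the node is joined and Synced


def pick_master(nodes: Sequence[Node], use_priority: bool) -> Optional[str]:
    """Return the name of the node expected to be labelled Master, or None."""
    candidates = [n for n in nodes if n.synced]
    if use_priority:
        # Assumption: only positive priorities are eligible, lowest value wins.
        candidates = [n for n in candidates if n.priority > 0]
        key = lambda n: n.priority
    else:
        # Assumption: without priorities, the lowest wsrep_local_index wins.
        key = lambda n: n.wsrep_local_index
    if not candidates:
        return None
    return min(candidates, key=key).name


# The failing case: dbnode2 rejoins with a lower index than dbnode1.
nodes = [Node("dbnode1", priority=1, wsrep_local_index=1),
         Node("dbnode2", priority=2, wsrep_local_index=0)]
expected_master = pick_master(nodes, use_priority=True)
```

Under these assumptions dbnode1 should stay master in the failing case, so its connections should not be dropped when dbnode2 rejoins.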

When it disconnects, I see the following lines logged in /var/log/syslog:

Oct 16 11:45:04 maxscale01 maxscale[10675]: [galeramon] There are no cluster members

Oct 16 11:45:04 maxscale01 maxscale[10675]: Server changed state: dbnode1[172.100.1.22:3306]: lost_master. [Master, Synced, Running] -> [Running]

Oct 16 11:45:05 maxscale01 maxscale[10675]: [galeramon] Found cluster members

Oct 16 11:45:05 maxscale01 maxscale[10675]: Server changed state: dbnode1[172.100.1.22:3306]: new_master. [Running] -> [Master, Synced, Running]

Oct 16 11:45:05 maxscale01 maxscale[10675]: Server changed state: dbnode2[172.100.1.23:3306]: slave_up. [Down] -> [Slave, Synced, Running]
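
For anyone comparing several test runs, the "Server changed state" lines can be pulled apart with a small script. This is only a sketch based on the line format shown above; the regular expression and the names StateChange and parse_state_change are my own and not part of MaxScale.

```python
import re
from typing import List, NamedTuple, Optional

# Matches e.g. "Server changed state: dbnode1[host:port]: lost_master. [A, B] -> [C]"
STATE_CHANGE = re.compile(
    r"Server changed state: (?P<server>[^\[\s]+)\[(?P<address>[^\]]+)\]: "
    r"(?P<event>\w+)\. \[(?P<old>[^\]]*)\] -> \[(?P<new>[^\]]*)\]"
)


class StateChange(NamedTuple):
    server: str
    address: str
    event: str
    old: List[str]
    new: List[str]


def _split_states(text: str) -> List[str]:
    """Turn 'Master, Synced, Running' into ['Master', 'Synced', 'Running']."""
    return [part.strip() for part in text.split(",") if part.strip()]


def parse_state_change(line: str) -> Optional[StateChange]:
    """Parse one syslog line; return None if it is not a state-change line."""
    match = STATE_CHANGE.search(line)
    if match is None:
        return None
    return StateChange(
        server=match.group("server"),
        address=match.group("address"),
        event=match.group("event"),
        old=_split_states(match.group("old")),
        new=_split_states(match.group("new")),
    )
```

Feeding it the second log line above yields the server dbnode1, the event lost_master, and the old and new state lists, which makes it easy to spot runs where the master label is dropped and re-assigned within one monitor interval.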

MaxScale is configured as follows (the commented-out configuration is uncommented when connecting via TLS):

[dbnode1]

type=server

address=172.16.1.22

port=3306

protocol=MySQLBackend

priority=1

[dbnode2]

type=server

address=172.16.1.23

port=3306

protocol=MySQLBackend

priority=2

[Galera Monitor]

type=monitor

module=galeramon

servers=dbnode1,dbnode2

user=galeramon

passwd=galeramon

monitor_interval=1000

available_when_donor=true

use_priority=true

[Galera Service]

type=service

router=readconnroute

router_options=master

servers=dbnode1,dbnode2

user=galeramon

passwd=galeramon

[MaxAdmin Service]

type=service

router=cli

[Galera Listener]

type=listener

service=Galera Service

protocol=MySQLClient

port=3306

#ssl=required

#ssl_version=TLSv12

#ssl_cert=/etc/mysql/ssl/server-cert.pem

#ssl_key=/etc/mysql/ssl/server-key.pem

#ssl_ca_cert=/etc/mysql/ssl/ca-cert.pem

#ssl_cert_verify_depth=1

[MaxAdmin Listener]

type=listener

service=MaxAdmin Service

protocol=maxscaled

socket=default
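
As a sanity check on a configuration like the one above, the following sketch lists monitored servers that lack a positive priority when use_priority is enabled. It assumes the file parses as plain INI with Python's configparser and that only positive priority values count. The function name servers_missing_priority is hypothetical and this is not a MaxScale tool.

```python
import configparser
from typing import List


def servers_missing_priority(config_text: str) -> List[str]:
    """List servers under a galeramon monitor with use_priority=true
    that have no section or no positive `priority` value."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    missing: List[str] = []
    for section in parser.sections():
        if parser.get(section, "type", fallback="") != "monitor":
            continue
        if parser.get(section, "module", fallback="") != "galeramon":
            continue
        if not parser.getboolean(section, "use_priority", fallback=False):
            continue
        for name in parser.get(section, "servers", fallback="").split(","):
            name = name.strip()
            if not name:
                continue
            # Assumption: a missing or non-positive priority is a problem.
            if (not parser.has_section(name)
                    or parser.getint(name, "priority", fallback=0) <= 0):
                missing.append(name)
    return missing
```

For the configuration in this report the result would be an empty list, since both dbnode1 and dbnode2 carry a positive priority.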

People

Assignee: markus makela

Reporter: Pak Chan

Dates

Created: 2017-10-16 16:01

Updated: 2017-12-14 17:04

Resolved: 2017-12-14 17:04
