Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Cannot Reproduce
Affects Version: 2.1.9
Environment: Ubuntu 16.04, MariaDB 10.1.28, 2-node Galera cluster with Galera Arbitrator, MaxScale 2.1.9, mysql client version 15.1 Distrib. 10.1.28-MariaDB for debian-linux-gnu (x86_64) using readline 5.2
Sprint: 2017-46, 2017-47
Description
Hi,
I'm testing MaxScale, set up to connect only to the master node, on Ubuntu 16.04 fronting a Galera cluster. The cluster comprises two MariaDB 10.1 instances on different servers, with a Galera Arbitrator instance running on the MaxScale server. I'm testing this using the MariaDB client (mysql) from a fourth machine.
My test scenario is to see what the client experiences if I stop a MariaDB node part-way through a transaction, commit the transaction, then restart the MariaDB node. I start with the "slave" node, to give me a baseline for comparison, before doing it with the master node. However, the baseline case has given me inconsistent results, which I first thought was due to TLS, but may actually be something else as I've now reproduced it on a non-TLS connection.
If the slave MariaDB node comes back online with a lower wsrep_local_index value than the master, then the next time the client sends anything to the database, MaxScale responds to the client with error 2003, sends a QUIT to the master node, and terminates both connections immediately.
If the slave comes back with a higher wsrep_local_index than the master, this doesn't seem to happen.
(I can't see a pattern as to how the wsrep_local_index value is assigned to Galera nodes rejoining the cluster, other than preferring to keep their previous value if any.)
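To make the two cases easier to compare, here is a minimal Python sketch of how the master pick could differ between index-based and priority-based selection. It assumes the rules described in the galeramon documentation (lowest wsrep_local_index by default, lowest positive priority when use_priority is enabled). The Node type, the pick_master function, and the index values are hypothetical illustrations, not MaxScale code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    wsrep_local_index: int  # index reported by the Galera node
    priority: int           # "priority" from the MaxScale server section
    synced: bool = True     # False while the node is down or joining


def pick_master(nodes: List[Node], use_priority: bool) -> Optional[Node]:
    """Simplified model of master selection; not the real galeramon logic."""
    candidates = [n for n in nodes if n.synced]
    if not candidates:
        return None
    if use_priority:
        # Assumed rule: lowest positive priority value wins.
        ranked = [n for n in candidates if n.priority > 0]
        if ranked:
            return min(ranked, key=lambda n: n.priority)
    # Assumed default rule: lowest wsrep_local_index wins.
    return min(candidates, key=lambda n: n.wsrep_local_index)


# Hypothetical state after dbnode2 rejoins with the lower index.
nodes = [
    Node("dbnode1", wsrep_local_index=1, priority=1),
    Node("dbnode2", wsrep_local_index=0, priority=2),
]
by_index = pick_master(nodes, use_priority=False)
by_priority = pick_master(nodes, use_priority=True)
print(by_index.name, by_priority.name)  # dbnode2 dbnode1
```

In this model the priority-based pick stays on dbnode1 whichever index dbnode2 receives, so the model on its own does not account for the disconnect.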
When the connection is dropped, I see the following lines logged in /var/log/syslog:
Oct 16 11:45:04 maxscale01 maxscale[10675]: [galeramon] There are no cluster members
Oct 16 11:45:04 maxscale01 maxscale[10675]: Server changed state: dbnode1[172.100.1.22:3306]: lost_master. [Master, Synced, Running] -> [Running]
Oct 16 11:45:05 maxscale01 maxscale[10675]: [galeramon] Found cluster members
Oct 16 11:45:05 maxscale01 maxscale[10675]: Server changed state: dbnode1[172.100.1.22:3306]: new_master. [Running] -> [Master, Synced, Running]
Oct 16 11:45:05 maxscale01 maxscale[10675]: Server changed state: dbnode2[172.100.1.23:3306]: slave_up. [Down] -> [Slave, Synced, Running]
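For anyone trying to spot the same pattern in their own logs, the following is a rough Python sketch that pulls the "Server changed state" events out of syslog lines and reports a master that was lost and regained within a few seconds. The regular expression is based only on the line format shown above; the function names and the 5-second window are my own assumptions.

```python
import re

# Matches the "Server changed state" lines in the format quoted above.
STATE_RE = re.compile(
    r"^\w{3}\s+\d+\s+(?P<time>\d\d:\d\d:\d\d)\s+\S+\s+maxscale\[\d+\]:\s+"
    r"Server changed state:\s+(?P<server>\w+)\[[^\]]+\]:\s+"
    r"(?P<event>\w+)\.\s+\[(?P<old>[^\]]*)\]\s+->\s+\[(?P<new>[^\]]*)\]"
)


def parse_events(lines):
    """Return (seconds_of_day, server, event) for each state-change line."""
    events = []
    for line in lines:
        m = STATE_RE.match(line.strip())
        if m:
            h, mi, s = (int(x) for x in m.group("time").split(":"))
            events.append((h * 3600 + mi * 60 + s, m.group("server"), m.group("event")))
    return events


def master_flaps(events, window=5):
    """Find lost_master followed by new_master on the same server within `window` seconds."""
    flaps = []
    lost = {}
    for t, server, event in events:
        if event == "lost_master":
            lost[server] = t
        elif event == "new_master" and server in lost:
            gap = t - lost.pop(server)
            if 0 <= gap <= window:
                flaps.append((server, gap))
    return flaps


LOG = """\
Oct 16 11:45:04 maxscale01 maxscale[10675]: [galeramon] There are no cluster members
Oct 16 11:45:04 maxscale01 maxscale[10675]: Server changed state: dbnode1[172.100.1.22:3306]: lost_master. [Master, Synced, Running] -> [Running]
Oct 16 11:45:05 maxscale01 maxscale[10675]: [galeramon] Found cluster members
Oct 16 11:45:05 maxscale01 maxscale[10675]: Server changed state: dbnode1[172.100.1.22:3306]: new_master. [Running] -> [Master, Synced, Running]
Oct 16 11:45:05 maxscale01 maxscale[10675]: Server changed state: dbnode2[172.100.1.23:3306]: slave_up. [Down] -> [Slave, Synced, Running]
"""

flaps = master_flaps(parse_events(LOG.splitlines()))
print(flaps)  # [('dbnode1', 1)]
```

The sketch ignores dates and midnight rollover, so it is only good for short captures like this one.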
MaxScale is configured as follows (the commented-out configuration is uncommented when connecting via TLS):
[dbnode1]
type=server
address=172.16.1.22
port=3306
protocol=MySQLBackend
priority=1

[dbnode2]
type=server
address=172.16.1.23
port=3306
protocol=MySQLBackend
priority=2

[Galera Monitor]
type=monitor
module=galeramon
servers=dbnode1,dbnode2
user=galeramon
passwd=galeramon
monitor_interval=1000
available_when_donor=true
use_priority=true

[Galera Service]
type=service
router=readconnroute
router_options=master
servers=dbnode1,dbnode2
user=galeramon
passwd=galeramon

[MaxAdmin Service]
type=service
router=cli

[Galera Listener]
type=listener
service=Galera Service
protocol=MySQLClient
port=3306
#ssl=required
#ssl_version=TLSv12
#ssl_cert=/etc/mysql/ssl/server-cert.pem
#ssl_key=/etc/mysql/ssl/server-key.pem
#ssl_ca_cert=/etc/mysql/ssl/ca-cert.pem
#ssl_cert_verify_depth=1

[MaxAdmin Listener]
type=listener
service=MaxAdmin Service
protocol=maxscaled
socket=default
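As a quick sanity check on a configuration like the one above, here is a small Python sketch that verifies the cross-references between sections: every name in a servers= list should be a type=server section, and every listener should point at a type=service section. It uses the standard configparser module on a copy of the file trimmed to the keys the check reads. The check_references function is a hypothetical helper; it does not validate module names or anything else specific to MaxScale.

```python
import configparser

# Trimmed copy of the configuration: only the keys the check reads.
SAMPLE = """\
[dbnode1]
type=server

[dbnode2]
type=server

[Galera Monitor]
type=monitor
servers=dbnode1,dbnode2

[Galera Service]
type=service
servers=dbnode1,dbnode2

[Galera Listener]
type=listener
service=Galera Service
"""


def check_references(text):
    """Return a list of dangling server/service references in the config text."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.read_string(text)
    servers = {s for s in cfg.sections() if cfg[s].get("type") == "server"}
    services = {s for s in cfg.sections() if cfg[s].get("type") == "service"}
    problems = []
    for name in cfg.sections():
        sec = cfg[name]
        kind = sec.get("type")
        if kind in ("monitor", "service"):
            refs = [r.strip() for r in sec.get("servers", "").split(",")]
            for ref in filter(None, refs):
                if ref not in servers:
                    problems.append(f"[{name}] references unknown server '{ref}'")
        elif kind == "listener":
            target = sec.get("service")
            if target not in services:
                problems.append(f"[{name}] references unknown service '{target}'")
    return problems


problems = check_references(SAMPLE)
print(problems or "all references resolve")
```

Run against the trimmed copy, the check reports no dangling references, which at least rules out a simple naming mismatch between sections.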