Status: Closed (View Workflow)
Resolution: Cannot Reproduce
Affects Version/s: 2.1.9
Fix Version/s: N/A
Environment: Ubuntu 16.04, MariaDB 10.1.28, 2-node Galera cluster with Galera Arbitrator, MaxScale 2.1.9, mysql client version 15.1 Distrib 10.1.28-MariaDB for debian-linux-gnu (x86_64) using readline 5.2
I'm testing MaxScale, set up to connect only to the master node, on Ubuntu 16.04 fronting a Galera cluster. The cluster consists of 2 MariaDB 10.1 instances on different servers, with a Galera Arbitrator instance running on the MaxScale server. I'm testing this using the MariaDB client (mysql) from a fourth machine.
My test scenario is to see what the client experiences if I stop a MariaDB node part-way through a transaction, commit the transaction, and then restart the node. I start with the "slave" node to establish a baseline before repeating the test with the master node. However, the baseline case has given me inconsistent results, which I first attributed to TLS, but it may actually be something else, as I've now reproduced it on a non-TLS connection.
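For reference, the baseline test run looks roughly like this from the client side. This is an illustrative sketch only: the hostnames, port, credentials, and table are placeholders, not taken from the actual setup.

```shell
# Placeholder hostname/port/credentials -- adjust to the real deployment.
# From the client machine, connect through MaxScale:
mysql -h maxscale-host -P 4006 -u testuser -p testdb

# In the client session:
#   START TRANSACTION;
#   INSERT INTO t1 VALUES (1);

# On the slave Galera node, stop MariaDB mid-transaction:
sudo systemctl stop mariadb

# Back in the client session:
#   COMMIT;

# Restart the slave node:
sudo systemctl start mariadb

# The next statement sent from the client is where the behaviour
# described below (a 2003 error and dropped connections) may appear.
```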
If the slave MariaDB node comes back online with a lower wsrep_local_index value than the master, then the next time the client sends anything to the database, MaxScale sends a 2003 response to the client, sends a QUIT to the master node, and immediately terminates both connections.
If the slave comes back with a higher wsrep_local_index than the master, this doesn't seem to happen.
(I can't see a pattern in how the wsrep_local_index value is assigned to Galera nodes rejoining the cluster, other than that rejoining nodes prefer to keep their previous value, if any.)
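The index values can be compared by querying each node directly; wsrep_local_index is a standard Galera status variable. The example output below is illustrative, not captured from this cluster:

```sql
-- Run on each node directly (not through MaxScale) to see its
-- position index in the current cluster view:
SHOW STATUS LIKE 'wsrep_local_index';
-- Example output (the value differs per node and can change on rejoin):
-- +-------------------+-------+
-- | Variable_name     | Value |
-- +-------------------+-------+
-- | wsrep_local_index | 0     |
-- +-------------------+-------+
```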
When it disconnects, I see the following lines logged in /var/log/syslog:
MaxScale is configured as follows (the commented-out configuration is uncommented when connecting via TLS):