Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
2.1.0, 2.2.0
-
None
-
Environment: 3-node Galera cluster, MariaDB 10.1.22, Linux CentOS 6.7
Description
Running a query whilst all nodes are 'locked' (e.g. running some DDL) for long enough that the query times out triggers the failover logic.
The primaryFail method of MastersFailoverListener.java calls currentProtocol.isValid(0) as the final part of the check to see if the connection has been re-established. Ultimately, the ping() method is invoked which sends a ping packet to the node.
When the node against which the query was run recovers, that node sends the query results to the client. The node THEN sends the reply to the ping packet to the client.
Unfortunately, when ping() reads the input buffer, it sees the result of the query and NOT the reply for the ping packet.
At this point the connection appears 'broken' to the client, since every subsequent query is handed the result of the previous request.
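To illustrate the off-by-one effect described above, here is a minimal, self-contained sketch. It does not use the driver's code: `FakeConnection`, `ping` and `demo` are hypothetical names invented for this example, and the replies are queued immediately rather than arriving once the node recovers. It only assumes that the server answers requests in the order it received them and that the client reads the next available packet.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class PingDesyncSketch {

    /** Hypothetical stand-in for one server connection: replies arrive in request order. */
    static final class FakeConnection {
        private final Deque<String> inbound = new ArrayDeque<>();

        void send(String request) {
            // The server eventually answers every request it received, in order.
            inbound.addLast("reply-to:" + request);
        }

        String readNextPacket() {
            return inbound.pollFirst();
        }
    }

    /** Sends a ping and consumes the next packet, assuming it is the ping reply. */
    static String ping(FakeConnection c) {
        c.send("PING");
        return c.readNextPacket();
    }

    static String demo() {
        FakeConnection c = new FakeConnection();

        // The query times out on the client side, so its reply is never read.
        c.send("SELECT 1");

        // The failover check pings the node and consumes the stale query reply.
        String consumedByPing = ping(c);

        // The next query now receives the ping reply: every read is one behind.
        c.send("SELECT 2");
        String consumedByQuery = c.readNextPacket();

        return consumedByPing + "|" + consumedByQuery;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the ping consumes `reply-to:SELECT 1` and the following query consumes `reply-to:PING`, which matches the symptom reported here. One possible way to avoid this, offered only as a suggestion, would be to discard or close a connection that still has an unread reply pending before reusing it for the validity check.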
I have reproduced this behaviour against v2.2.0 and v2.1.0.
Whilst analysing this problem, I noticed that the isValid() method did not run the Galera-specific code, even though I believe I am using an appropriate URL (jdbc:mariadb:sequential//node-1:3306,node-2:3306/mydb). Lastly, a few days ago I left a comment on CONJ-400. It would be great if someone could respond to my questions there.