[MXS-1109] Routing many prepared statements gives "#HY000Lost connection to backend server" Created: 2017-01-27  Updated: 2017-06-07  Resolved: 2017-03-14

Status: Closed
Project: MariaDB MaxScale
Component/s: mariadbbackend
Affects Version/s: 2.0.3
Fix Version/s: 2.1.1

Type: Bug Priority: Major
Reporter: Johan Nilsson Assignee: markus makela
Resolution: Fixed Votes: 2
Labels: None
Environment:

Linux CentOS 6.8, MySQL 5.6.19, PHP 5.3.3


Attachments: File maxscale.cnf     File maxscale2.log.gz     File tcpstream30-31.pcap    
Issue Links:
Relates
relates to MXS-619 creating many short sessions in paral... Closed
Sprint: 2017-27, 2017-28, 2017-29

 Description   

We run one MaxScale instance in production, routing and connection-pooling queries between a web portal and a MySQL server.

When the web portal runs multiple prepared statements in parallel, MaxScale closes the client connection with "#HY000Lost connection to backend server".
A tcpdump of the connections shows that the MySQL server replies correctly, but MaxScale still closes the client connection with "lost connection".
This causes the web-client session to fail.



 Comments   
Comment by markus makela [ 2017-01-30 ]

Can you upload the maxscale.cnf and the error logs, both with all the sensitive information removed?

Also, if possible, please give an example set of SQL commands that causes this problem.

Comment by Johan Nilsson [ 2017-01-31 ]

Uploaded maxscale.cnf and the MaxScale log file with info and debug logging enabled.
Please let me know if you need any other logfiles.

I also uploaded a tcpdump file containing two TCP streams captured at the same time: one for webserver <> MaxScale and one for MaxScale <> MySQL server. It clearly shows the client getting the "#HY000" error even though the MySQL server responds correctly.

The statements where we see the problem are prepared-statement calls, such as:
"SELECT id FROM phpsessions WHERE userid = ? AND session_expires >= ? LIMIT 1"

Comment by markus makela [ 2017-02-06 ]

I've been looking at the TCP dump and it does seem like the client on port 27313 is executing a prepared statement while MaxScale is still streaming a resultset back to the client. That in itself is not in any way wrong or unexpected but it might be a clue as to why it might not work.

What sort of a connector are you using?

Comment by Johan Nilsson [ 2017-02-06 ]

The setup when the trace was taken:
web-portal (using PHP 5.3.3 without connection pooling) <- unix socket -> MaxScale 2.0.3
<- tcp -> MaxScale 2.0.3 <- tcp -> MySQL 5.6.19

So, the MaxScale installed on the web server pools connections towards the second MaxScale, which lets us easily switch the primary database backend server.

We have tried excluding either of the MaxScale instances, but the frequency of session disconnects did not drop.

Comment by markus makela [ 2017-02-08 ]

One thing that would be good to rule out is the use of persistent connections. If you remove both the persistpoolmax and persistmaxtime parameters from all servers, do the disconnections still happen?
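
For illustration, a server section with the two pooling parameters disabled might look like the sketch below. The section name, address, port, and parameter values are hypothetical placeholders and are not taken from the attached maxscale.cnf; only persistpoolmax and persistmaxtime are the parameters in question.

```ini
# Hypothetical server section; name, address and port are placeholders.
[server1]
type=server
address=192.0.2.10
port=3306
protocol=MySQLBackend
# Persistent connection pooling disabled for the test by commenting out
# both parameters (the values shown are made-up examples):
# persistpoolmax=10
# persistmaxtime=3600
```

With both lines commented out or removed, MaxScale should open a fresh backend connection for each client session instead of reusing a pooled one. A restart is assumed to be needed for the change to take effect.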

Comment by Johan Nilsson [ 2017-02-08 ]

Removed persistpoolmax and persistmaxtime from both instances, and the problem disappeared.
Tried to enable again on one of the instances, and the problem reappeared almost instantly.
So it seems like this is caused by the connection pooling...

Comment by markus makela [ 2017-02-14 ]

I can confirm that this is caused by the persistent connections. It only happens with the 2.0 version of MaxScale; the upcoming 2.1 version has fixed it. I'll investigate why it fails with 2.0 and whether a fix for the 2.0 version can be created.

Comment by markus makela [ 2017-02-14 ]

I've managed to pinpoint the problem to the use of prepared statements with persistent connections. With normal queries, the persistent connections work. When the text protocol query is switched to a binary protocol prepared statement, the errors again appear.

Comment by markus makela [ 2017-02-16 ]

Can you check whether the 2.1 beta release of MaxScale fixes this problem?

Comment by Johan Nilsson [ 2017-02-17 ]

We installed the 2.1 beta release, but were unable to test it.
Logging on through the socket worked fine in the mysql command-line client, but PHP just got the error:
Can't create link: Connect failed: Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (111)

A tcpdump showed traffic from MaxScale to MySQL.

So it almost seems like there's some incompatibility with php in 2.1.0-beta...

Comment by markus makela [ 2017-02-17 ]

Does the MaxScale error log provide any information as to why it fails?

Comment by Johan Nilsson [ 2017-02-17 ]

Unfortunately not. It only records when a connection is opened or closed.

Comment by markus makela [ 2017-02-27 ]

Is /var/lib/mysql/mysql.sock configured in maxscale.cnf, and is the file created when MaxScale starts?

Stopping MaxScale, removing any stale sockets and restarting MaxScale should help solve any problems caused by old socket files.

Comment by Johan Nilsson [ 2017-03-01 ]

Yes, the socket is configured in maxscale.cnf and is created when MaxScale starts.

Comment by markus makela [ 2017-03-02 ]

In the configuration you posted in the issue, the socket is defined with a different name:

[MyHotelProxyListener]
type=listener
service=MyHotelProxy
protocol=MySQLClient
port=13306
socket=/var/lib/mysql/myhotel.sock

Is this still configured the same? If so, please change it to /var/lib/mysql/mysql.sock.

Comment by Johan Nilsson [ 2017-03-02 ]

Well, the socket is only used on the web frontends, and I believe that I posted the configuration for the central server (which isn't receiving any traffic on the socket).
The configuration is correct, since traffic flows to the database. The only issue is that the number of prepared statements we have doesn't work...

Comment by markus makela [ 2017-03-14 ]

This seems to have been fixed by the 2.1 version of MaxScale.
