[MXS-1224] Connections in CLOSE_WAIT state Created: 2017-04-19  Updated: 2017-04-28  Resolved: 2017-04-28

Status: Closed
Project: MariaDB MaxScale
Component/s: maxinfo
Affects Version/s: 1.4.5
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Marco Menzel Assignee: markus makela
Resolution: Duplicate Votes: 0
Labels: None
Environment:

Debian Jessie
MaxScale with read/write split to MariaDB Galera Server 10.1


Attachments: JPEG File maxscale-issue-load.jpg     File maxscale.cnf    
Issue Links:
Problem/Incident
is caused by MXS-773 100% CPU on idle MaxScale with MaxInfo Closed
Sprint: 2017-32

 Description   

After MaxScale has been running for some time, the number of connections/sockets in CLOSE_WAIT state increases:

tcp 1 0 max1.mue:49726 galera1.mue:mysql CLOSE_WAIT 1032/maxscale
tcp 1 0 max1.mue:49479 galera1.mue:mysql CLOSE_WAIT 1032/maxscale
tcp 1 0 max1.mue:46006 galera3.mue:mysql CLOSE_WAIT 1032/maxscale
tcp 1 0 max1.mue:46136 galera3.mue:mysql CLOSE_WAIT 1032/maxscale
tcp 1 0 max1.mue:54137 galera1.mue:mysql CLOSE_WAIT 1032/maxscale
tcp 1 0 localhost:mysql localhost:56359 CLOSE_WAIT 1032/maxscale
tcp 1 0 max1.mue:38202 galera2.mue:mysql CLOSE_WAIT 1032/maxscale
tcp 1 0 max1.mue:45324 galera3.mue:mysql CLOSE_WAIT 1032/maxscale
tcp 1 0 localhost:6603 localhost:52505 CLOSE_WAIT 1032/maxscale
tcp 1 0 max1.mue:54058 galera1.mue:mysql CLOSE_WAIT 1032/maxscale
tcp 1 0 max1.mue:47840 galera1.mue:mysql CLOSE_WAIT 1032/maxscale
tcp 1 0 max1.mue:54058 galera1.mue:mysql CLOSE_WAIT 1032/maxscale
tcp 1 0 max1.mue:47840 galera1.mue:mysql CLOSE_WAIT 1032/maxscale

In total, 23860 sockets are in CLOSE_WAIT state! MaxScale becomes unusable: no new connections, timeouts, etc.
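To put a number on listings like the one above, the following is a minimal Python sketch that counts CLOSE_WAIT sockets per remote host. It assumes `netstat -tp`-style columns (protocol, Recv-Q, Send-Q, local address, remote address, state, PID/program). The function name `count_close_wait` and the final ESTABLISHED sample line are illustrative assumptions and are not taken from the report.

```python
from collections import Counter


def count_close_wait(netstat_output: str) -> Counter:
    """Count CLOSE_WAIT sockets per remote host in netstat-style text."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: proto, recv-q, send-q, local, remote, state, pid/program
        if len(fields) >= 6 and fields[0].startswith("tcp") and fields[5] == "CLOSE_WAIT":
            # Strip the trailing ":port" (or ":service") from the remote address
            remote_host = fields[4].rsplit(":", 1)[0]
            counts[remote_host] += 1
    return counts


# First three lines are copied from the report; the last one is a
# hypothetical non-CLOSE_WAIT line added to show that it is ignored.
sample = """\
tcp 1 0 max1.mue:49726 galera1.mue:mysql CLOSE_WAIT 1032/maxscale
tcp 1 0 max1.mue:46006 galera3.mue:mysql CLOSE_WAIT 1032/maxscale
tcp 1 0 localhost:mysql localhost:56359 CLOSE_WAIT 1032/maxscale
tcp 0 0 max1.mue:50000 galera1.mue:mysql ESTABLISHED 1032/maxscale
"""

counts = count_close_wait(sample)
print(sum(counts.values()), dict(counts))
```

Grouping by remote host helps show whether the leaked sockets are spread across all backends and local listeners or concentrated on one of them.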



 Comments   
Comment by markus makela [ 2017-04-19 ]

Do the clients create a lot of short sessions? If so, enabling TCP socket reuse might help with this problem.

Comment by Marco Menzel [ 2017-04-19 ]

Yes, there are some short sessions. But I think TCP socket reuse only helps with TIME-WAIT?
Doesn't CLOSE_WAIT mean that MaxScale is still running but hasn't closed the socket (and the kernel is waiting for it to do so)?
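As an illustration of that distinction, here is a small Python sketch of how a socket enters CLOSE_WAIT: the peer closes first, and the local socket stays in CLOSE_WAIT until the application itself calls `close()`. This is a generic loopback TCP demonstration and not MaxScale code; the function name `read_until_peer_closes` is hypothetical.

```python
import socket


def read_until_peer_closes() -> bytes:
    """Return what recv() yields once the peer has closed its end."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()

    client.close()                 # the peer closes first and sends a FIN
    server_side.settimeout(5)
    data = server_side.recv(1024)  # an empty result signals end-of-stream

    # From the moment the FIN arrives until the close() below, server_side
    # is in CLOSE_WAIT. A program that never reaches this close() leaves
    # the socket in that state for as long as the process runs.
    server_side.close()
    listener.close()
    return data


print(read_until_peer_closes())  # prints b''
```

TIME-WAIT, by contrast, is held by the kernel on the side that closes first and expires on its own, which is why socket reuse settings affect it but not CLOSE_WAIT.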

Comment by markus makela [ 2017-04-19 ]

I think you are right; I mixed it up with TIME-WAIT, which does cause somewhat similar problems. Can you reproduce this with all kinds of clients or just some specific client type?

Also please upload your MaxScale configuration with all sensitive data removed.

Comment by markus makela [ 2017-04-20 ]

Do you use the maxinfo service? One possible reason for these problems is that you might be hitting MXS-773 (I saw your comment there). Are you using maxinfo for monitoring MaxScale? If so, I'd recommend removing the maxinfo part and seeing if this fixes the problem.

I would also recommend testing with only one service. Pick either Read Connection Router or RW Split Router and see if this has any effect on the number of connections in CLOSE_WAIT state.

Comment by Marco Menzel [ 2017-04-20 ]

I can't reproduce it on demand; it happens sporadically, sometimes daily, on different MaxScale nodes. Today:

CRITICAL 05:57:21 [ 1/5 ] CRITICAL - 238 sockets in close-wait state!
WARNING 05:54:21 [ 1/5 ] WARNING - 147 sockets in close-wait state!
WARNING 05:53:21 [ 4/5 ] WARNING - 126 sockets in close-wait state!
WARNING 05:52:21 [ 3/5 ] WARNING - 107 sockets in close-wait state!
WARNING 05:51:21 [ 2/5 ] WARNING - 90 sockets in close-wait state!
WARNING 05:50:21 [ 1/5 ] WARNING - 66 sockets in close-wait state!
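A check that produces alerts like the ones above can be sketched in a few lines of Python. The thresholds here (`warn=50`, `crit=200`) are assumptions chosen only to be consistent with the reported states; the actual thresholds of the monitoring check are not given in this thread, and the function name `classify` is hypothetical.

```python
def classify(count: int, warn: int = 50, crit: int = 200) -> str:
    """Map a CLOSE_WAIT socket count to a monitoring state.

    The default thresholds are illustrative assumptions, not the
    values used by the check quoted in this thread.
    """
    if count >= crit:
        return "CRITICAL"
    if count >= warn:
        return "WARNING"
    return "OK"


# Counts taken from the alerts in this comment
for n in (66, 147, 238):
    print(f"{classify(n)} - {n} sockets in close-wait state!")
```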

Comment by Marco Menzel [ 2017-04-20 ]

I will remove maxinfo.

Comment by Marco Menzel [ 2017-04-26 ]

It seems to work without maxinfo. Will you backport the fix to 1.4.x?

Comment by markus makela [ 2017-04-28 ]

Closing this as a duplicate of MXS-773. Please follow MXS-773 for a possible timetable for a 1.4 update.

Comment by Marco Menzel [ 2017-04-28 ]

Thanks, but MXS-773 is closed?

Comment by markus makela [ 2017-04-28 ]

Yes, it was fixed for 2.0.1. We'll update the fix versions when it has been backported to 1.4.
