[MXS-3331] Could not bind connecting socket to local address Created: 2020-12-09 Updated: 2021-09-12 Resolved: 2021-09-02 |
|
| Status: | Closed |
| Project: | MariaDB MaxScale |
| Component/s: | Core |
| Affects Version/s: | 2.5.5 |
| Fix Version/s: | 2.5.16 |
| Type: | Bug | Priority: | Major |
| Reporter: | Michal | Assignee: | markus makela |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
docker container - buster, mariadb backends 10.3.25 |
||
| Sprint: | MXS-SPRINT-139 |
| Description |
|
Hello, We tried to set up MaxScale 2.5.5 in production after successfully setting it up and running it in our LAB env. The test env is the same as the production env. We had N MaxScale instances running on some servers, and HA was provided by keepalived. keepalived's VIP: 192.168.205.254 LOG :
Config :
Is this normal ? Regards, |
| Comments |
| Comment by markus makela [ 2020-12-09 ] | |||||||||
|
Does it work if you remove local_address from the config? | |||||||||
| Comment by Michal [ 2020-12-09 ] | |||||||||
|
Well, since we were in production yesterday, we did not try editing the configuration by hand (and started rolling back the MaxScale feature). What I can say is that yes, it works both with and without local_address in our LAB env; I can't confirm that it works in production... Do you see any potential problem in the code? According to the documentation, local_address is just the address/interface used to create connections between MaxScale and the backends, isn't it? | |||||||||
| Comment by Michal [ 2020-12-09 ] | |||||||||
|
The only difference I can see is that in LAB we have just a plain interface, and in production a bond. | |||||||||
| Comment by markus makela [ 2020-12-09 ] | |||||||||
|
Yes, it's the address that outbound connections bind to. It's only required if you need to use a specific interface. Most often it's not required. | |||||||||
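To illustrate what binding an outbound connection to a local address means, here is a minimal Python sketch. It is not MaxScale's code; the helper name `connect_from` is hypothetical, and the addresses in the usage note are only examples.

```python
import socket


def connect_from(local_ip, remote_host, remote_port, timeout=5.0):
    """Open a TCP connection whose source address is local_ip.

    Hypothetical helper for illustration only; this is not how MaxScale
    itself is implemented.
    """
    # Port 0 asks the OS to pick a free ephemeral source port on local_ip.
    # create_connection() binds the socket to source_address before connecting.
    return socket.create_connection(
        (remote_host, remote_port),
        timeout=timeout,
        source_address=(local_ip, 0),
    )
```

For example, `connect_from("192.168.205.254", "db-host", 3306)` would open a connection to a backend with the VIP as its source address, assuming that address is configured on the host and `db-host` (a placeholder name) resolves to a backend.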
| Comment by markus makela [ 2020-12-09 ] | |||||||||
|
Looking at the code it seems the error is logged whenever the attempt to bind fails. Based on the actual error returned from the call, it looks like something is already bound to that address. | |||||||||
| Comment by Michal [ 2020-12-09 ] | |||||||||
|
Well, yes, of course there are several services listening on that address, but is this a problem? | |||||||||
| Comment by markus makela [ 2020-12-09 ] | |||||||||
|
My apologies, I only meant that the error states that something is bound to that specific address/port combination. It of course is possible to bind to the same network interface on different ports. | |||||||||
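The address/port distinction can be shown with a small Python sketch. It assumes Linux semantics and no SO_REUSEADDR or SO_REUSEPORT on either socket; `try_bind` is a hypothetical helper, not anything from MaxScale.

```python
import errno
import socket


def try_bind(local_ip, port):
    """Try to bind a TCP socket to (local_ip, port).

    Returns None on success, or the symbolic errno name on failure.
    Hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((local_ip, port))
        return None
    except OSError as exc:
        # Map the numeric errno to its symbolic name, e.g. EADDRINUSE.
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        sock.close()
```

If another socket already holds the same address and port, `try_bind` is expected to return `"EADDRINUSE"`, while the same address with a different free port, or with port 0, succeeds.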
| Comment by Michal [ 2020-12-09 ] | |||||||||
|
No problem, I just have no idea what to do with it. | |||||||||
| Comment by markus makela [ 2020-12-09 ] | |||||||||
|
It's definitely not something we've seen before and I think we'll have to try and reproduce this on our side. | |||||||||
| Comment by Michal [ 2020-12-09 ] | |||||||||
|
If I am correct, allocation is handled by https://man7.org/linux/man-pages/man2/bind.2.html , so allocation is controlled by the OS, isn't it? | |||||||||
| Comment by markus makela [ 2021-08-30 ] | |||||||||
|
Have you had a chance to test if this still happens with the latest 2.5 release? | |||||||||
| Comment by markus makela [ 2021-08-30 ] | |||||||||
|
It seems like a limit on the number of sockets might be getting hit. How many connections are you seeing on average, and how long do they last? This blog post as well as this one suggest that this would be the case. I think you can verify this by monitoring the number of network sockets that are open when you see this error. If this is indeed the case, the fix should be as simple as adding this:
|
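Separately from the fix mentioned in the last comment, the monitoring step it suggests could be sketched roughly as below. This is an illustrative Python sketch, not part of MaxScale: the function name `count_sockets_by_state` is hypothetical, and it assumes a little-endian Linux host where IPv4 sockets are listed in `/proc/net/tcp`.

```python
import socket
import struct

# TCP state codes as they appear in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_sockets_by_state(proc_text, local_ip):
    """Count IPv4 TCP sockets bound to local_ip, grouped by state.

    proc_text is the content of /proc/net/tcp. Assumes a little-endian
    host, where the kernel prints 127.0.0.1 as 0100007F.
    """
    counts = {}
    for line in proc_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        hex_ip, _hex_port = fields[1].split(":")
        # Decode the hex-encoded local address into dotted-quad form.
        ip = socket.inet_ntoa(struct.pack("<I", int(hex_ip, 16)))
        if ip != local_ip:
            continue
        state = TCP_STATES.get(fields[3].upper(), fields[3])
        counts[state] = counts.get(state, 0) + 1
    return counts
```

On the affected host, something like `count_sockets_by_state(open("/proc/net/tcp").read(), "192.168.205.254")` could be run while the error is being logged; a very large number of sockets on that one address would be consistent with the socket limit theory.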