[MDEV-13329] node cannot join cluster if being monitored by maxscale Created: 2017-07-15 Updated: 2018-04-27 Resolved: 2018-04-27 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Galera |
| Affects Version/s: | 10.1, 10.2, 10.3 |
| Fix Version/s: | 10.1.33, 10.2.15, 10.3.7 |
| Type: | Bug | Priority: | Major |
| Reporter: | Andrii Nikitin (Inactive) | Assignee: | Jan Lindström (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Issue Links: |
| Sprint: | 10.2.11 |
| Description |
|
When MaxScale is monitoring all 4 nodes:
When MaxScale is monitoring only one node (m2):
|
| Comments |
| Comment by Andrii Nikitin (Inactive) [ 2017-07-15 ] |
|
EDIT: see also alternative instructions in later comments.
The commands below initialize and verify a four-node cluster on the local machine, on ports 3306 - 3309.
The last command shows wsrep_cluster_size on each node. When it is 4 on every node, the cluster has most likely been initialized properly.
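The check described above can be sketched in Python. This is only an illustration, not the original command from this comment: it assumes a `mysql` client on PATH and passwordless access to the local nodes, and the helper names (`parse_cluster_size`, `query_cluster_size`, `cluster_is_complete`) are hypothetical.

```python
import subprocess

PORTS = range(3306, 3310)  # ports 3306 - 3309, as stated in the comment
EXPECTED_SIZE = 4          # four-node cluster


def parse_cluster_size(output: str) -> int:
    """Extract the value from batch-mode output such as 'wsrep_cluster_size\t4'."""
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "wsrep_cluster_size":
            return int(parts[1])
    raise ValueError("wsrep_cluster_size not found in output")


def query_cluster_size(port: int, host: str = "127.0.0.1") -> int:
    """Ask one node for wsrep_cluster_size (assumes passwordless local access)."""
    cmd = [
        "mysql", f"--host={host}", f"--port={port}", "--protocol=tcp",
        "--batch", "--skip-column-names",
        "-e", "SHOW STATUS LIKE 'wsrep_cluster_size'",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_cluster_size(result.stdout)


def cluster_is_complete(sizes: dict) -> bool:
    """True only if at least one node reported and every node sees all members."""
    return bool(sizes) and all(size == EXPECTED_SIZE for size in sizes.values())
```

With the nodes running, `cluster_is_complete({p: query_cluster_size(p) for p in PORTS})` would be expected to return True once every node has joined.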
Optionally, run the whole sequence several times to confirm that it works reliably.
With MaxScale monitoring in place, the same sequence of cluster commands leaves the monitored nodes in the state 'WSREP has not yet prepared node for application use' forever, while the other nodes work properly. In particular, with the example cnf above, the output will be:
The general log contains only commands like 'SHOW STATUS', and I wasn't able to reproduce the problem when executing such commands against the nodes directly (i.e. without MaxScale). |
| Comment by Andrii Nikitin (Inactive) [ 2017-11-01 ] |
|
I am able to reliably reproduce the problem on a system with 10.0 Galera packages installed and a suitable MaxScale package from the list at https://downloads.mariadb.com/MaxScale/2.1.10/ (e.g. xenial).
Output before the patch is always as below:
Output after patch:
sachin.setiya.007 seppo could you confirm that the patch above is reasonable? The node is undergoing SST, so there is no need to wait forever until all clients have gracefully disconnected. |
| Comment by Andrii Nikitin (Inactive) [ 2017-11-01 ] |
|
sachin.setiya.007 please review the one-line patch in the previous comment. |
| Comment by markus makela [ 2017-11-01 ] |
|
A workaround would be to stop the monitor in MaxScale, which forces its connections to close. |
| Comment by Andrii Nikitin (Inactive) [ 2017-11-14 ] |
|
The problem happens because the node waits for all connections to disconnect gracefully before starting SST. E.g. the following command will attempt to connect and then just remain idle. If a command like this runs on the joining node (e.g. issued by some monitoring software, or left behind by a broken connection that is waiting for a timeout to be detected), the node remains in the 'Initialized' state.
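The original command is not preserved in this export. As a rough illustration of "connect and then remain idle", the sketch below opens a plain TCP connection to a node and holds it open. It is an assumption-laden approximation: `hold_idle_connection` is a hypothetical helper, the connection never authenticates (so the server may drop it after its own connect timeout), and it is not confirmed to reproduce the hang.

```python
import socket
import time


def hold_idle_connection(host: str, port: int, hold_seconds: float) -> bytes:
    """Open a TCP connection, read whatever the server sends first, then stay idle.

    Returns the bytes received before going idle (empty if nothing arrived
    within the socket timeout). The connection is closed on return.
    """
    with socket.create_connection((host, port), timeout=10) as conn:
        try:
            greeting = conn.recv(4096)  # server handshake packet, if any
        except socket.timeout:
            greeting = b""
        # Send nothing and do not disconnect: the client simply sits idle.
        time.sleep(hold_seconds)
        return greeting
```

For example, `hold_idle_connection("127.0.0.1", 3307, 600)` would keep a connection to the node on port 3307 open and silent for up to ten minutes; the port is one of the 3306 - 3309 range used earlier in this report.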
See also https://github.com/AndriiNikitin/bugs/blob/master/MDEV-13329-simple.sh, which produces the following result:
|