[MXS-3028] Node wrongly in Maintenance, Running when the node is actually Down Created: 2020-06-06  Updated: 2021-04-20  Resolved: 2021-04-12

Status: Closed
Project: MariaDB MaxScale
Component/s: galeramon
Affects Version/s: 2.3.19
Fix Version/s: 6.0.0

Type: Bug Priority: Major
Reporter: MikaH Assignee: Esa Korhonen
Resolution: Fixed Votes: 0
Labels: None
Environment:

Centos 7.6


Sprint: MXS-SPRINT-128, MXS-SPRINT-129

 Description   

When a backend Galera node is set to maintenance mode and then stopped, MaxScale does not detect that the node has gone down and wrongly shows Maintenance, Running. It should show Maintenance, Down.

Nothing is written to the MaxScale log when the node in maintenance mode is stopped.

This breaks the logic of our in-house solution, which uses the output of the maxctrl list servers command to check the status of the running Galera cluster.
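
For illustration, a minimal sketch of how a script that reads the server state text might treat Maintenance as "state unknown" instead of trusting the other flags. It assumes the state arrives as a comma-separated string such as "Maintenance, Running"; the function name and return values are hypothetical and are not part of MaxScale or maxctrl.

```python
def classify_server(state: str) -> str:
    """Classify a server from its state text (hypothetical helper).

    Assumption: `state` is a comma-separated list of state names,
    for example "Maintenance, Running" or "Synced, Running".
    """
    flags = {part.strip() for part in state.split(",") if part.strip()}

    # While Maintenance is set, the other flags may be stale,
    # so the real state is reported as unknown.
    if "Maintenance" in flags:
        return "unknown"
    if "Running" in flags:
        return "up"
    return "down"
```

With this approach, "Maintenance, Running" is classified as unknown rather than up, so a stale Running flag cannot be mistaken for a live node.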



 Comments   
Comment by markus makela [ 2020-06-06 ]

Maintenance mode stops all monitoring of the server for as long as it is set, which also stops the monitoring of whether the server is up or down. This is expected behavior and is done to prevent false state changes during server maintenance. If a server is in Maintenance mode, any other status bit shown while it is on does not represent the actual state of the server.

I think we could change it so that when the maintenance mode is on, the only value shown in the output would be Maintenance. This would be more accurate as we don't know (or care, for that matter) what the state of a server in maintenance is.
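
The display rule proposed above could be sketched as follows. This is only an illustration under stated assumptions: the ServerStatus flags, the LABELS table and display_status are hypothetical names, not MaxScale's actual status bits or code.

```python
from enum import Flag, auto


class ServerStatus(Flag):
    # Hypothetical status bits, for illustration only.
    RUNNING = auto()
    SYNCED = auto()
    MAINTENANCE = auto()


# Display labels in the order they should be printed.
LABELS = {
    ServerStatus.MAINTENANCE: "Maintenance",
    ServerStatus.SYNCED: "Synced",
    ServerStatus.RUNNING: "Running",
}


def display_status(status: ServerStatus) -> str:
    # While the maintenance bit is on, the other bits are not refreshed,
    # so only "Maintenance" is shown.
    if ServerStatus.MAINTENANCE in status:
        return "Maintenance"
    names = [label for flag, label in LABELS.items() if flag in status]
    return ", ".join(names) if names else "Down"
```

Under this rule a server with both the maintenance and running bits set is printed as just "Maintenance", which avoids implying that the running bit is current.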

Comment by MikaH [ 2020-06-06 ]

Hmm. If you cannot or do not want to change how maintenance mode works (so that it detects the real status of the node in maintenance mode), it is best, as already mentioned, to remove the state of the node in maintenance mode and show just "Maintenance", not "Maintenance, Running".

A better option, imo, would be to keep monitoring the node(s) in maintenance mode but not route queries to them.

Comment by Esa Korhonen [ 2021-03-31 ]

This seems to depend on the monitor. MariaDBMonitor continues to monitor servers even when they are in [Maintenance] and will show the correct status and gtid. We should perhaps decide on a general policy. In my opinion, monitoring should continue, as not monitoring e.g. a relay slave set to [Maintenance] could cause issues.

Generated at Thu Feb 08 04:18:26 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.