Details
-
Task
-
Status: Open
-
Major
-
Resolution: Unresolved
-
22.08.13, 23.08.6
-
None
Description
On MaxScale instances where cooperative monitoring is enabled, the monitor logs warnings such as "lost the exclusive lock on a majority of its servers", stating that the node can no longer perform failover. These warnings are followed by "Server changed state" notices. The sequence is confusing, and the log gives no indication of what caused it.
2024-05-15 03:23:50 warning: [mariadbmon] 'MariaDB-Monitor' lost the exclusive lock on a majority of its servers. Configured automatic cluster manipulation operations (e.g. failover) can not be performed.
2024-05-15 03:23:50 warning: [mariadbmon] 'MariaDB-Monitor' holds 2 lock(s) without lock majority, and will release them. The monitor of the primary MaxScale must have failed to acquire all server locks.
2024-05-15 03:23:50 notice : Server changed state: test-mariadb-1[test-mariadb-1:3306]: new_slave. [Master, Running] -> [Slave, Running]
...
2024-05-15 03:23:51 notice : Server changed state: test-mariadb-1[test-mariadb-1:3306]: new_master. [Slave, Running] -> [Master, Running]
I understand that the current monitor code has a limitation: the locks may appear to be "lost" without the connection being lost. In reality the connection is lost, but automatic reconnection hides this, so it looks as if the locks were lost without any error. The cause could be a short timeout or a brief network hiccup.
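To illustrate the mechanism described above, here is a minimal toy model in Python. It is not MaxScale code: `FakeServer`, `ReconnectingClient`, `check_lock` and the lock name `monitor_lock` are all hypothetical names invented for this sketch. It assumes locks are tied to a server session (as with MariaDB's `GET_LOCK`, which releases a lock when its connection ends) and shows how a silent reconnect makes a lock vanish without any error, and how comparing connection ids before and after the check could let a log message name the likely cause.

```python
import itertools


class FakeServer:
    """Toy stand-in for a database server with session-scoped named locks."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.live = set()   # ids of currently open connections
        self.locks = {}     # lock name -> id of the owning connection

    def connect(self):
        conn_id = next(self._ids)
        self.live.add(conn_id)
        return conn_id

    def drop(self, conn_id):
        # Ending a session releases every lock that session owned.
        self.live.discard(conn_id)
        self.locks = {n: o for n, o in self.locks.items() if o != conn_id}

    def get_lock(self, conn_id, name):
        if conn_id not in self.live:
            raise ConnectionError("connection is gone")
        return self.locks.setdefault(name, conn_id) == conn_id

    def owns_lock(self, conn_id, name):
        if conn_id not in self.live:
            raise ConnectionError("connection is gone")
        return self.locks.get(name) == conn_id


class ReconnectingClient:
    """Client that hides connection loss by reconnecting automatically."""

    def __init__(self, server):
        self.server = server
        self.conn_id = server.connect()

    def _call(self, method, name):
        try:
            return method(self.conn_id, name)
        except ConnectionError:
            # Silent reconnect: the caller never sees the error.
            self.conn_id = self.server.connect()
            return method(self.conn_id, name)

    def get_lock(self, name):
        return self._call(self.server.get_lock, name)

    def owns_lock(self, name):
        return self._call(self.server.owns_lock, name)


def check_lock(client, name, last_conn_id):
    """Return a log-style message that explains a lost lock when possible."""
    owned = client.owns_lock(name)
    if owned:
        return "lock held"
    if client.conn_id != last_conn_id:
        return (f"lock lost: connection was re-established "
                f"(id {last_conn_id} -> {client.conn_id})")
    return "lock lost: reason unknown"


server = FakeServer()
client = ReconnectingClient(server)
client.get_lock("monitor_lock")
first_id = client.conn_id
server.drop(first_id)  # simulate a brief network hiccup
print(check_lock(client, "monitor_lock", first_id))
```

In this sketch the lock check itself raises no error, which mirrors the confusing behaviour in the log above. The only visible clue is the changed connection id, so recording that id when the lock is acquired is one possible way a monitor could report why the lock went away.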
I am raising this request to improve MaxScale's logging so that this scenario is easier to understand.