[MXS-4538] No valid servers in cluster 'MariaDB-Monitor' Created: 2023-03-02  Updated: 2023-09-26  Resolved: 2023-09-26

Status: Closed
Project: MariaDB MaxScale
Component/s: Core
Affects Version/s: 22.08.3
Fix Version/s: 6.4.11, 22.08.9, 23.02.5, 23.08.2

Type: Bug Priority: Major
Reporter: Dmitry Assignee: markus makela
Resolution: Fixed Votes: 0
Labels: None

Attachments: PNG File listservers.png    
Sprint: MXS-SPRINT-191

 Description   

Hi

I'm using 2 MaxScale 22.08.3 instances in a cluster (the configuration is below). Everything seems to be working fine: traffic is routed properly, I can switch master/slave, and the GTID is the same on both DB servers.

But the command maxctrl show maxscale on db1 shows an error in the Sync section, even though the checksums are the same:

db1:

{                                                                
    "checksum": "3ed123cc4400120c992a540745d3de675c6d4575",      
    "nodes": {                                                  
        "0cca53d01c64": "OK",                                    
        "51d4030d7309": "OK"                                    
    },                                                          
    "origin": "",                                                
    "status": "No valid servers in cluster 'MariaDB-Monitor'.",  
    "version": 1                                                
}        

db2:

{                                                                  
    "checksum": "3ed123cc4400120c992a540745d3de675c6d4575",        
    "nodes": {                                                      
        "0cca53d01c64": "OK",                                      
        "51d4030d7309": "OK"                                        
    },                                                              
    "origin": "",                                                  
    "status": "OK",                                                
    "version": 1                                                    
}                                                                  
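The asymmetry is easy to see by comparing the two payloads programmatically. A minimal Python sketch, using the two JSON documents copied verbatim from the outputs above:

```python
import json

# Sync sections reported by "maxctrl show maxscale" on each node (copied from above).
db1_sync = json.loads("""
{
    "checksum": "3ed123cc4400120c992a540745d3de675c6d4575",
    "nodes": {"0cca53d01c64": "OK", "51d4030d7309": "OK"},
    "origin": "",
    "status": "No valid servers in cluster 'MariaDB-Monitor'.",
    "version": 1
}
""")
db2_sync = json.loads("""
{
    "checksum": "3ed123cc4400120c992a540745d3de675c6d4575",
    "nodes": {"0cca53d01c64": "OK", "51d4030d7309": "OK"},
    "origin": "",
    "status": "OK",
    "version": 1
}
""")

# Both nodes agree on the configuration itself...
assert db1_sync["checksum"] == db2_sync["checksum"]
assert db1_sync["version"] == db2_sync["version"]
# ...yet only db1 reports a (stale) error in its local status field:
print(db1_sync["status"])
print(db2_sync["status"])
```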

maxctrl list servers shows that replication is OK and the databases are up (see the attached listservers.png).

In this thread: https://groups.google.com/g/maxscale/c/ExxVP1VNMu8/m/6B6K7Z-6AAAJ?utm_medium=email&utm_source=footer
I was told that this may be a bug and that I should report it here, so here it is.

The following configuration is used:

# Global parameters
[maxscale]
# Substitute environment variables
substitute_variables=1
threads=1
 
config_sync_cluster=MariaDB-Monitor
config_sync_user=$MARIADB_MAXSCALE_USER
config_sync_password=$MARIADB_MAXSCALE_PASSWORD
admin_secure_gui=false
admin_host=127.0.0.1
max_auth_errors_until_block=0
# Extended logging for troubleshooting. It can be activated dynamically using "maxctrl alter maxscale log_info true"
#log_info=true
 
# Server definitions
[db1]
type=server
address=$MAXSCALE_DB1_ADDRESS
# we use the internal port of the db1 server, not the one exposed by Docker
port=$MAXSCALE_DB1_PORT
protocol=MariaDBBackend
proxy_protocol=1
[db2]
type=server
address=$MAXSCALE_DB2_ADDRESS
# we use the internal port of the db2 server, not the one exposed by Docker
port=$MAXSCALE_DB2_PORT
protocol=MariaDBBackend
proxy_protocol=1
 
 
# Monitor for the servers
[MariaDB-Monitor]
type=monitor
module=mariadbmon
servers=db1,db2
user=$MARIADB_MAXSCALE_USER
password=$MARIADB_MAXSCALE_PASSWORD
monitor_interval=2s
enforce_read_only_slaves=on
 
 
# Service definitions
# https://mariadb.com/kb/en/mariadb-maxscale-2208-readconnroute/
# Readconnroute
[Read-Conn-Service]
type=service
router=readconnroute
router_options=master
servers=db1,db2
user=$MARIADB_MAXSCALE_USER
password=$MARIADB_MAXSCALE_PASSWORD
# Allow login as root
enable_root_user=1
 
# Listener definitions for the services
[Read-Con-Listener]
type=listener
service=Read-Conn-Service
protocol=MariaDBClient
port=3306
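Because substitute_variables=1 is set, the $-prefixed values in the config above are expanded from MaxScale's environment at startup. A sketch of the variables this file expects; the variable names are taken from the config, the values below are placeholders:

```shell
# Placeholder values; substitute_variables=1 makes MaxScale expand these
# when it parses its configuration file at startup.
export MARIADB_MAXSCALE_USER=maxscale
export MARIADB_MAXSCALE_PASSWORD=secret
export MAXSCALE_DB1_ADDRESS=10.0.0.1
export MAXSCALE_DB1_PORT=3306
export MAXSCALE_DB2_ADDRESS=10.0.0.2
export MAXSCALE_DB2_PORT=3306
echo "db1 -> $MAXSCALE_DB1_ADDRESS:$MAXSCALE_DB1_PORT"
echo "db2 -> $MAXSCALE_DB2_ADDRESS:$MAXSCALE_DB2_PORT"
```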



 Comments   
Comment by markus makela [ 2023-03-02 ]

Thank you for reporting this. Have you since then restarted the MaxScale instance which has this error? If you did, does the error persist?

Comment by Dmitry [ 2023-03-02 ]

Hi Markus,
thank you for the quick reply
Honestly, I hadn't restarted it, as it's a production instance. Now that I have, the error is gone. However, I did switch the master and slave DBs via the DB proxy earlier, and the error persisted.

{                                                            
    "checksum": "3ed123cc4400120c992a540745d3de675c6d4575",  
    "nodes": {                                               
        "0cca53d01c64": "OK",                                
        "51d4030d7309": "OK"                                 
    },                                                       
    "origin": "",                                            
    "status": "OK",                                          
    "version": 1                                             
}                                                            

Comment by markus makela [ 2023-09-22 ]

I managed to reproduce this by first making a valid change that was stored in the cluster and then restarting the database. The local sync status is only updated when there's a problem or when a change is committed in the cluster. The code path that restores the status to OK once the sync reconnects to the cluster was missing for the case where the version in MaxScale was the same as the cluster's.
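The behaviour described above can be modelled in a few lines. This is an illustrative sketch, not MaxScale's actual code: the local status is only rewritten on an error or a version bump, so a stale error survives a successful reconnect when the versions already match, and the fix is to also clear it on a successful sync.

```python
class SyncState:
    """Toy model of one node's config-sync status (illustrative only)."""

    def __init__(self):
        self.version = 1
        self.status = "OK"

    def on_sync_attempt(self, cluster_version, ok, clear_on_success=True):
        if not ok:
            # Errors are always recorded.
            self.status = "No valid servers in cluster 'MariaDB-Monitor'."
        elif cluster_version > self.version:
            # A committed change: status refreshed along with the new version.
            self.version = cluster_version
            self.status = "OK"
        elif clear_on_success:
            # The missing piece: restore OK even when the version is unchanged.
            self.status = "OK"

# Buggy behaviour: the error sticks after the cluster is reachable again.
node = SyncState()
node.on_sync_attempt(1, ok=False, clear_on_success=False)
node.on_sync_attempt(1, ok=True, clear_on_success=False)
assert node.status != "OK"  # stale error, as observed on db1

# Fixed behaviour: a successful sync clears the stale status.
node.on_sync_attempt(1, ok=True)
assert node.status == "OK"
```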

Generated at Thu Feb 08 04:29:21 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.