[MXS-836] "Failed to start all MaxScale services" without retrying
Created: 2016-08-24  Updated: 2016-09-06  Resolved: 2016-09-06

Status: Closed
Project: MariaDB MaxScale
Component/s: Core
Affects Version/s: 2.0.0
Fix Version/s: 2.0.1

Type: Bug
Priority: Major
Reporter: Kolbe Kegel (Inactive)
Assignee: markus makela
Resolution: Fixed
Votes: 0
Labels: None


 Description   

MaxScale 2.0.0 does not seem to correctly retry service startup before exiting. I think this worked correctly in 1.4.3.

The log output notes "Failed to start service RW Split Router, retrying in 10 seconds", but MaxScale as a whole in fact shuts down immediately without ever retrying.

MariaDB Corporation MaxScale    /var/log/maxscale/maxscale1.log Wed Aug 24 19:44:22 2016
-----------------------------------------------------------------------
2016-08-24 19:44:22   notice : Working directory: /var/log/maxscale
2016-08-24 19:44:22   notice : MariaDB MaxScale beta-2.0.0 started
2016-08-24 19:44:22   notice : MaxScale is running in process 1587
2016-08-24 19:44:22   notice : Configuration file: /etc/maxscale.cnf
2016-08-24 19:44:22   notice : Log directory: /var/log/maxscale
2016-08-24 19:44:22   notice : Data directory: /var/lib/maxscale/data
2016-08-24 19:44:22   notice : Module directory: /usr/lib64/maxscale
2016-08-24 19:44:22   notice : Service cache: /var/cache/maxscale
2016-08-24 19:44:22   warning: Number of threads set to 4 which is greater than the number of processors available: 1
2016-08-24 19:44:22   notice : Initialise MaxInfo router module V1.0.0.
2016-08-24 19:44:22   notice : Loaded module maxinfo: V1.0.0 from /usr/lib64/maxscale/libmaxinfo.so
2016-08-24 19:44:22   notice : Initialise CLI router module V1.0.0.
2016-08-24 19:44:22   notice : Loaded module cli: V1.0.0 from /usr/lib64/maxscale/libcli.so
2016-08-24 19:44:22   notice : Initializing statemend-based read/write split router module.
2016-08-24 19:44:22   notice : Loaded module readwritesplit: V1.1.0 from /usr/lib64/maxscale/libreadwritesplit.so
2016-08-24 19:44:22   notice : Initialise the MySQL Galera Monitor module V2.0.0.
2016-08-24 19:44:22   notice : Loaded module galeramon: V2.0.0 from /usr/lib64/maxscale/libgaleramon.so
2016-08-24 19:44:22   notice : No query classifier specified, using default 'qc_sqlite'.
2016-08-24 19:44:22   notice : Loaded module qc_sqlite: V1.0.0 from /usr/lib64/maxscale/libqc_sqlite.so
2016-08-24 19:44:22   notice : Encrypted password file /var/lib/maxscale/data/.secrets can't be accessed (No such file or directory). Password encryption is not used.
2016-08-24 19:44:22   error  : [Galera Monitor] Failed to connect to server 'server1' (mdbec-demo-5-db1:3306) when checking monitor user credentials and permissions: Can't connect to MySQL server on 'mdbec-demo-5-db1' (107)
2016-08-24 19:44:22   error  : [Galera Monitor] Failed to connect to server 'server2' (mdbec-demo-5-db2:3306) when checking monitor user credentials and permissions: Can't connect to MySQL server on 'mdbec-demo-5-db2' (107)
2016-08-24 19:44:22   error  : [Galera Monitor] Failed to connect to server 'server3' (mdbec-demo-5-db3:3306) when checking monitor user credentials and permissions: Can't connect to MySQL server on 'mdbec-demo-5-db3' (107)
2016-08-24 19:44:22   error  : [RW Split Router] Failed to connect to server 'server1' (mdbec-demo-5-db1:3306) when checking authentication user credentials and permissions: 2003 Can't connect to MySQL server on 'mdbec-demo-5-db1' (107)
2016-08-24 19:44:22   error  : [RW Split Router] Failed to connect to server 'server2' (mdbec-demo-5-db2:3306) when checking authentication user credentials and permissions: 2003 Can't connect to MySQL server on 'mdbec-demo-5-db2' (107)
2016-08-24 19:44:22   error  : [RW Split Router] Failed to connect to server 'server3' (mdbec-demo-5-db3:3306) when checking authentication user credentials and permissions: 2003 Can't connect to MySQL server on 'mdbec-demo-5-db3' (107)
2016-08-24 19:44:22   error  : Failure loading users data from backend [mdbec-demo-5-db1:3306] for service [RW Split Router]. MySQL error 2003, Can't connect to MySQL server on 'mdbec-demo-5-db1' (107)
2016-08-24 19:44:22   error  : Failure loading users data from backend [mdbec-demo-5-db2:3306] for service [RW Split Router]. MySQL error 2003, Can't connect to MySQL server on 'mdbec-demo-5-db2' (107)
2016-08-24 19:44:22   error  : Failure loading users data from backend [mdbec-demo-5-db3:3306] for service [RW Split Router]. MySQL error 2003, Can't connect to MySQL server on 'mdbec-demo-5-db3' (107)
2016-08-24 19:44:22   error  : Unable to get user data from backend database for service [RW Split Router]. Failed to connect to any of the backend databases.
2016-08-24 19:44:22   error  : Unable to load users for service RW Split Router listening at 0.0.0.0:4006.
2016-08-24 19:44:22   error  : Failure loading users data from backend [mdbec-demo-5-db1:3306] for service [RW Split Router]. MySQL error 2003, Can't connect to MySQL server on 'mdbec-demo-5-db1' (107)
2016-08-24 19:44:22   error  : Failure loading users data from backend [mdbec-demo-5-db2:3306] for service [RW Split Router]. MySQL error 2003, Can't connect to MySQL server on 'mdbec-demo-5-db2' (107)
2016-08-24 19:44:22   error  : Failure loading users data from backend [mdbec-demo-5-db3:3306] for service [RW Split Router]. MySQL error 2003, Can't connect to MySQL server on 'mdbec-demo-5-db3' (107)
2016-08-24 19:44:22   error  : Unable to get user data from backend database for service [RW Split Router]. Failed to connect to any of the backend databases.
2016-08-24 19:44:22   error  : Unable to load users for service RW Split Router listening at /var/lib/maxscale/rwsplit.sock:0.
2016-08-24 19:44:22   notice : Failed to start service RW Split Router, retrying in 10 seconds.
2016-08-24 19:44:22   error  : Failed to start service 'RW Split Router'.
2016-08-24 19:44:22   notice : Loaded module maxscaled: V2.0.0 from /usr/lib64/maxscale/libmaxscaled.so
2016-08-24 19:44:22   notice : Listening connections at /tmp/maxadmin.sock with protocol MaxScale Admin
2016-08-24 19:44:22   error  : maxinfo: failed to get service user details
2016-08-24 19:44:22   notice : Loaded module HTTPD: V1.1.1 from /usr/lib64/maxscale/libHTTPD.so
2016-08-24 19:44:22   notice : Listening connections at 0.0.0.0:8003 with protocol HTTPD
2016-08-24 19:44:22   error  : Error : Failed to start all MaxScale services. Exiting.
2016-08-24 19:44:22   MaxScale is shut down.
-----------------------------------------------------------------------
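
One way to confirm the symptom from a log like the one above is to compare the timestamp of the retry notice with the timestamp of the shutdown message. The following is a small sketch that assumes the timestamp layout and message wording shown in this log; `shutdown_before_retry` is a hypothetical helper and not part of MaxScale.

```python
import re
from datetime import datetime, timedelta

# Assumed wording, taken from the retry notice in the log above.
RETRY_RE = re.compile(r"Failed to start service .*, retrying in (\d+) seconds")
TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def shutdown_before_retry(lines):
    """Return True if the log shows a shutdown before a promised retry was due."""
    retry_due = None
    for line in lines:
        try:
            stamp = datetime.strptime(line[:19], TS_FORMAT)
        except ValueError:
            continue  # header, separator, or blank line without a timestamp
        match = RETRY_RE.search(line)
        if match:
            # Remember when the announced retry should have happened.
            retry_due = stamp + timedelta(seconds=int(match.group(1)))
        elif "MaxScale is shut down" in line and retry_due is not None:
            return stamp < retry_due
    return False
```

Applied to the log above, the retry notice and the shutdown message carry the same timestamp (19:44:22), so the check reports that the process exited before the 10 second retry interval had elapsed.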



 Comments   
Comment by markus makela [ 2016-09-06 ]

This is actually a side effect of the proper detection of failed services. When a service is configured with the retry_on_failure option and fails to start, the service start call wrongly returns an error even though errors should be tolerated at that point.
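
The failure mode described in this comment can be sketched roughly as follows. This is a minimal Python illustration and not MaxScale's actual C code: `Service`, `start_service`, `start_all_services`, and the `tolerate_retry` switch are hypothetical names, and only the `retry_on_failure` option comes from the ticket. The sketch assumes a start routine that schedules a retry and, in the buggy variant, still reports the failure to the caller, which then aborts startup.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    can_start: bool                # hypothetical stand-in for "backends reachable"
    retry_on_failure: bool = True  # option named in the ticket
    retry_scheduled: bool = False


def start_service(svc: Service, tolerate_retry: bool) -> bool:
    """Return True if startup may continue past this service."""
    if svc.can_start:
        return True
    if svc.retry_on_failure:
        svc.retry_scheduled = True  # corresponds to the "retrying in 10 seconds" notice
        # tolerate_retry=False models the reported bug: the failure is still
        # returned to the caller although a retry is pending.
        # tolerate_retry=True models the intended behaviour.
        return tolerate_retry
    return False


def start_all_services(services, tolerate_retry: bool) -> bool:
    """Return False if the process should exit because a service failed."""
    # Every service is attempted before the overall result is evaluated.
    results = [start_service(s, tolerate_retry) for s in services]
    return all(results)
```

With `tolerate_retry=False` the sketch mirrors the log in the description: a retry is scheduled for the failing service, the remaining services are still started, and the overall startup is then reported as failed. With `tolerate_retry=True` the pending retry no longer counts as a startup error.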
