[MXS-2463] maxctrl shows stale information on changing bootstrap node Created: 2019-05-01  Updated: 2019-05-07  Resolved: 2019-05-07

Status: Closed
Project: MariaDB MaxScale
Component/s: xpandmon
Affects Version/s: None
Fix Version/s: 2.4.0

Type: Bug Priority: Major
Reporter: Rahul Joshi (Inactive) Assignee: Johan Wikman
Resolution: Fixed Votes: 0
Labels: None
Environment:

MaxScale server karma136:
OS: CentOS 7
Version: built from develop branch, MariaDB MaxScale 2.3.7 (Commit: 820ff756a76bdd28301a0bd072eb959cba7c35bc)
Clustrix nodes:
OS: CentOS 7
Version: 9.1.4


Attachments: Text File maxscale.log    

 Description   

This is the working-as-expected configuration and maxctrl output:

[root@karma016 ~]# clx s
Cluster Name:    cl0281322e37e3d4b8
Cluster Version: 9.1.4
Cluster Status:   OK
Cluster Size:    3 nodes - 16 CPUs per Node
Current Node:    karma016 - nid 5
 
nid |  Hostname | Status |  IP Address  | TPS |       Used      |  Total
----+-----------+--------+--------------+-----+-----------------+--------
  3 |  karma049 |    OK  |  10.2.15.126 |   0 |  499.3M (0.06%) |  767.0G
  4 |  karma055 |    OK  |  10.2.15.144 |   0 |  496.7M (0.06%) |  767.0G
  5 |  karma016 |    OK  |   10.2.15.19 |  11 |  496.9M (0.06%) |  767.0G
----+-----------+--------+--------------+-----+-----------------+--------
                                           11 |    1.5G (0.06%) |    2.2T
Conf file bootstrap server entry using karma016 IP:
[Bootstrap1-karma016]
type=server
address=10.2.15.19
#karma016
port=3306
protocol=mariadbbackend
ssl=required
ssl_cert=/etc/my.cnf.d/certs/client-cert.pem
ssl_key=/etc/my.cnf.d/certs/client-key.pem
ssl_ca_cert=/etc/my.cnf.d/certs/ca-cert.pem
 
[root@karma136 ~]# maxctrl list servers
┌─────────────────────┬─────────────┬──────┬─────────────┬─────────────────┬──────┐
│ Server              │ Address     │ Port │ Connections │ State           │ GTID │
├─────────────────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
│ @@Clustrix:node-4   │ 10.2.15.144 │ 3306 │ 0           │ Master, Running │      │
├─────────────────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
│ @@Clustrix:node-3   │ 10.2.15.126 │ 3306 │ 0           │ Master, Running │      │
├─────────────────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
│ @@Clustrix:node-5   │ 10.2.15.19  │ 3306 │ 0           │ Master, Running │      │
├─────────────────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
│ Bootstrap1-karma016 │ 10.2.15.19  │ 3306 │ 0           │ Master, Running │      │
└─────────────────────┴─────────────┴──────┴─────────────┴─────────────────┴──────┘

Now, change the bootstrap server entry to a server that does not run Clustrix (karma197, IP 10.2.13.97):

 
Conf file bootstrap server entry using karma197 IP:
[Bootstrap1-karma197]
type=server
address=10.2.13.97
#karma197
port=3306
protocol=mariadbbackend
ssl=required
ssl_cert=/etc/my.cnf.d/certs/client-cert.pem
ssl_key=/etc/my.cnf.d/certs/client-key.pem
ssl_ca_cert=/etc/my.cnf.d/certs/ca-cert.pem
 
[root@karma136 ~]# maxctrl list servers
┌─────────────────────┬─────────────┬──────┬─────────────┬─────────────────┬──────┐
│ Server              │ Address     │ Port │ Connections │ State           │ GTID │
├─────────────────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
│ @@Clustrix:node-4   │ 10.2.15.144 │ 3306 │ 0           │ Master, Running │      │
├─────────────────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
│ @@Clustrix:node-3   │ 10.2.15.126 │ 3306 │ 0           │ Master, Running │      │
├─────────────────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
│ @@Clustrix:node-5   │ 10.2.15.19  │ 3306 │ 0           │ Master, Running │      │
├─────────────────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
│ Bootstrap1-karma197 │ 10.2.13.97  │ 3306 │ 0           │ Down            │      │
└─────────────────────┴─────────────┴──────┴─────────────┴─────────────────┴──────┘

The first three entries, for @@Clustrix:node-3/4/5, come from the old bootstrap server (karma016), which is no longer part of the config file.

Expected:
If connecting to the current bootstrap node fails, MaxScale should try to discover cluster nodes only from the bootstrap nodes in the current configuration, not from all previously discovered nodes.
The maxctrl output should contain only one entry, like:

│ Server              │ Address     │ Port │ Connections │ State           │ GTID │
│ Bootstrap1-karma197 │ 10.2.13.97  │ 3306 │ 0           │ Down            │      │
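The expected fallback behaviour can be sketched as follows (a hypothetical Python sketch of the logic only; the function and variable names are invented and do not come from the MaxScale source):

```python
def discovery_candidates(current_bootstrap, previously_discovered):
    """Pick which hosts the monitor should probe to refresh the cluster view.

    Buggy behaviour reported here: when the configured bootstrap node is
    down, the monitor falls back to every node discovered during a previous
    run, even though the configuration has changed since then.

    Expected behaviour: restrict the fallback to nodes that are still
    bootstrap servers in the current configuration.
    """
    current = set(current_bootstrap)
    # Keep a previously discovered node only if it is still configured.
    still_valid = [node for node in previously_discovered if node in current]
    return still_valid or list(current_bootstrap)
```

With this rule, switching the config to karma197 would leave only 10.2.13.97 as a probe candidate, so maxctrl would report just the single Down entry shown above.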

Relevant logs:

2019-05-01 20:01:40   error  : [clustrixmon] Clustrix: Could either not ping or create connection to 10.2.13.97:3306: Can't connect to MySQL server on '10.2.13.97' (115)
2019-05-01 20:01:40   notice : [clustrixmon] Attempting to find a Clustrix bootstrap node from one of the nodes used during the previous run of MaxScale.
2019-05-01 20:01:40   notice : [clustrixmon] Trying to find out cluster nodes from 10.2.15.19:3306.
2019-05-01 20:01:40   notice : Created server '@@Clustrix:node-5' at 10.2.15.19:3306
2019-05-01 20:01:40   info   : [clustrixmon] Updated Clustrix node in bookkeeping: 5, '10.2.15.19', 3306, 3581.
2019-05-01 20:01:40   notice : Created server '@@Clustrix:node-4' at 10.2.15.144:3306
2019-05-01 20:01:40   info   : [clustrixmon] Updated Clustrix node in bookkeeping: 4, '10.2.15.144', 3306, 3581.
2019-05-01 20:01:40   notice : Created server '@@Clustrix:node-3' at 10.2.15.126:3306
2019-05-01 20:01:40   info   : [clustrixmon] Updated Clustrix node in bookkeeping: 3, '10.2.15.126', 3306, 3581.
2019-05-01 20:01:40   notice : [clustrixmon] Cluster nodes refreshed.
2019-05-01 20:01:40   notice : [clustrixmon] Clustrix: Monitoring Clustrix cluster state using node 10.2.15.126:3306.
2019-05-01 20:01:40   info   : In-memory sqlite database successfully opened for thread 140382505428736.
2019-05-01 20:01:40   info   : In-memory sqlite database successfully opened for thread 140382497036032.
2019-05-01 20:01:40   info   : In-memory sqlite database successfully opened for thread 140382281135872.
2019-05-01 20:01:40   notice : Server changed state: Bootstrap1-karma197[10.2.13.97:3306]: server_down. [Running] -> [Down]

Full logs attached.



 Comments   
Comment by markus makela [ 2019-05-02 ]

I guess this would be solved by never persisting the generated servers.

Comment by Johan Wikman [ 2019-05-02 ]

The connection information (host + port) about the nodes detected at runtime is saved so that the Clustrix monitor can get going even if the bootstrap server is not present when MaxScale starts.

It seems that the bootstrap server used for obtaining the information should be saved as well, so that the stored information is used only if there has been no change in the bootstrap server(s).
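That proposal could look roughly like this (a sketch under assumptions: the cache structure and all names are invented for illustration and are not MaxScale's actual persisted format):

```python
def nodes_from_cache(cache, current_bootstrap):
    """Return cached cluster nodes only if the bootstrap set is unchanged.

    'cache' is assumed to record both the nodes discovered at runtime and
    the bootstrap servers that were configured when they were discovered.
    """
    if set(cache.get("bootstrap_servers", [])) != set(current_bootstrap):
        # Bootstrap configuration changed: the stored nodes may belong to
        # a different cluster, so discard them and rediscover from scratch.
        return []
    return cache.get("nodes", [])
```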

Comment by Johan Wikman [ 2019-05-07 ]

Now it works so that if there has been any change in the bootstrap nodes, then the stored information is not used.

Generated at Thu Feb 08 04:14:19 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.