[MXS-1956] When executing a switchover, former master remains with sleep and read-only connections
Created: 2018-07-04  Updated: 2020-04-14  Resolved: 2020-04-14

Status: Closed
Project: MariaDB MaxScale
Component/s: readconnroute
Affects Version/s: 2.2.9
Fix Version/s: 2.4.8

Type: Bug
Priority: Major
Reporter: Wagner Bianchi (Inactive)
Assignee: Esa Korhonen
Resolution: Won't Fix
Votes: 0
Labels: None

Sprint: MXS-SPRINT-101, MXS-SPRINT-102, MXS-SPRINT-103, MXS-SPRINT-104

 Description   

Folks,

I found an interesting situation. I have a simple environment running a replication cluster with one master and one slave. The database servers and MaxScale are in production, and the application is not stopped while the switchover is performed. Before the switchover, I have 200 threads open on the master and none on the slave. After the switchover, the threads on the former master stay open, and when MaxScale closes them, error messages appear in the error log. The server list before and after the switchover:

#: before switching over
root@dwbm-maxscale01:~# maxadmin list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server             | Address         | Port  | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
wbm-mariadb01      | 192.168.0.11    |  3306 |          39 | Master, Running
wbm-mariadb02      | 192.168.0.12    |  3306 |         112 | Slave, Running
-------------------+-----------------+-------+-------------+--------------------
 
#: after switching over
root@wbm-maxscale01:~# maxadmin list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server             | Address         | Port  | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
wbm-mariadb01      | 192.168.0.11    |  3306 |           5 | Slave, Running
wbm-mariadb02      | 192.168.0.12    |  3306 |          84 | Master, Running
-------------------+-----------------+-------+-------------+--------------------

When those remaining 5 connections start getting closed, I see the following in the error log:

2018-07-04 12:17:35.028   error  : (130) [readconnroute] (log_closed_session): Failed to route MySQL command 1 to backend server.
2018-07-04 12:17:35.028   error  : (130) [mariadbclient] (gw_read_finish_processing): Routing the query failed. Session will be closed.

Can we clean up all sleeping and read-only threads before bringing the new master up? That is, can we kill all sleeping and read-only threads before reporting the switchover as successful?

Thanks!
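
For illustration only, the cleanup asked for above could be approximated by hand on the former master. The sketch below is not a MaxScale feature: `build_kill_statements` and `PROTECTED_USERS` are hypothetical names, the protected account names are assumptions, and the input rows are assumed to mimic the `ID`, `USER`, `COMMAND` and `TIME` columns of `information_schema.PROCESSLIST`. It only builds the `KILL CONNECTION` statements and does not connect to any server.

```python
from typing import Iterable, List, Mapping

# Accounts that must never be killed. These names are assumptions;
# replace them with the replication and monitor users of your setup.
PROTECTED_USERS = {"system user", "event_scheduler", "repl", "maxscale_monitor"}


def build_kill_statements(rows: Iterable[Mapping], min_idle_seconds: int = 0) -> List[str]:
    """Return one KILL CONNECTION statement per idle client connection.

    Each row is expected to carry the PROCESSLIST columns
    ID, USER, COMMAND and TIME (seconds in the current state).
    """
    statements = []
    for row in rows:
        if row["USER"] in PROTECTED_USERS:
            continue  # leave replication and monitoring threads alone
        if row["COMMAND"] != "Sleep":
            continue  # only idle connections are candidates
        if row["TIME"] < min_idle_seconds:
            continue  # too recently active
        statements.append(f"KILL CONNECTION {int(row['ID'])};")
    return statements


if __name__ == "__main__":
    # Made-up sample rows, not taken from the environment in this report.
    sample = [
        {"ID": 11, "USER": "app", "COMMAND": "Sleep", "TIME": 120},
        {"ID": 12, "USER": "app", "COMMAND": "Query", "TIME": 0},
        {"ID": 13, "USER": "system user", "COMMAND": "Sleep", "TIME": 999},
    ]
    for statement in build_kill_statements(sample, min_idle_seconds=60):
        print(statement)
```

The generated statements would then be run on the former master by an account with sufficient privileges, after checking the list manually.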



 Comments   
Comment by markus makela [ 2018-07-04 ]

This is currently expected behavior. The monitor does not trigger anything that would notify the routers that a server has changed state. This means that the router only learns of the new state when the connection is next used.

Due to technical limitations this cannot be fixed in 2.2.
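
The behavior described in this comment can be pictured with a toy model. This is not MaxScale code: `Server` and `RouterSession` are hypothetical classes, and the only assumption carried over from the comment is that a session re-checks its backend's role when it is used, not when the role changes. Note that command 1 in the MySQL protocol is COM_QUIT, so the logged errors are consistent with the failure surfacing only when the client finally closes its idle connection.

```python
class Server:
    """Backend server whose role is updated by a (simulated) monitor."""

    def __init__(self, name: str, role: str) -> None:
        self.name = name
        self.role = role  # "master" or "slave"


class RouterSession:
    """Toy session pinned to one backend and requiring a given role."""

    def __init__(self, server: Server, required_role: str = "master") -> None:
        self.server = server
        self.required_role = required_role
        self.is_open = True

    def route(self, command: str) -> bool:
        """Route a command; the role is checked only here, on use."""
        if not self.is_open:
            raise RuntimeError("session already closed")
        if self.server.role != self.required_role:
            # Comparable to "Routing the query failed. Session will be closed."
            self.is_open = False
            return False
        return True


if __name__ == "__main__":
    old_master = Server("wbm-mariadb01", "master")
    sessions = [RouterSession(old_master) for _ in range(5)]

    # Switchover: the monitor changes the role, but no session is notified.
    old_master.role = "slave"
    print(sum(s.is_open for s in sessions), "sessions still open after switchover")

    # Only when a client sends something does the session notice and close.
    for session in sessions:
        session.route("COM_QUIT")
    print(sum(s.is_open for s in sessions), "sessions open after clients use them")
```

In this model the idle sessions linger on the former master until each one is used, which matches the 5 remaining connections in the report.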

Comment by Esa Korhonen [ 2020-04-14 ]

Leaving unfixed for now, as recent versions already give better log messages. Should be reopened if more is needed.
