[MXS-1672] STOP SLAVE IO_THREAD causes switchover to wait until timeout
Created: 2018-02-16  Updated: 2020-07-03  Resolved: 2020-07-03

Status: Closed
Project: MariaDB MaxScale
Component/s: mariadbmon
Affects Version/s: 2.2.2
Fix Version/s: N/A

Type: Bug
Priority: Minor
Reporter: markus makela
Assignee: Unassigned
Resolution: Incomplete
Votes: 0
Labels: None


Description

When the slave IO thread is stopped with STOP SLAVE IO_THREAD, the monitor correctly detects that replication is not fully functional on that slave. However, if a switchover is performed while a slave has its IO thread stopped, the switchover fails with a timeout: it waits for that slave to catch up to the current GTID, and a slave whose IO thread is stopped receives no new events from the master.

A possible fix would be to check whether the server has the Slave status before executing the MASTER_GTID_WAIT call, so that servers without it are not waited on.
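
The proposed check could look roughly like the following. This is a minimal Python sketch of the idea only, not MaxScale code: the `Server` class, its `status` set and the `servers_to_wait_for` helper are hypothetical names introduced for illustration, and it assumes that a server with a stopped IO thread no longer carries the Slave status.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical stand-in for a monitored server (not a MaxScale type)."""
    name: str
    status: set = field(default_factory=set)


def servers_to_wait_for(servers):
    """Return the servers that should receive a MASTER_GTID_WAIT call.

    Assumption: a server whose IO thread is stopped lacks the "Slave"
    status, so it is skipped instead of being waited on until timeout.
    """
    return [s for s in servers if "Slave" in s.status]


servers = [
    Server("server1", {"Master", "Running"}),
    Server("server2", {"Slave", "Running"}),
    Server("server3", {"Slave", "Running"}),
    Server("server4", {"Running"}),  # IO thread stopped: no Slave status
]

waiting = [s.name for s in servers_to_wait_for(servers)]
print(waiting)
```

With this filter, server4 is left out of the wait, so a stopped IO thread on one slave would no longer block the switchover.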

Steps to reproduce

With the following configuration:

maxscale.cnf

[maxscale]
threads=4
 
[MariaDB-Monitor]
type=monitor
module=mariadbmon
servers=server1,server2,server3,server4
user=maxuser
passwd=maxpwd
monitor_interval=1000
 
[RW-Split-Router]
type=service
router=readwritesplit
servers=server1,server2,server3,server4
user=maxuser
passwd=maxpwd
 
[RW-Split-Listener]
type=listener
service=RW-Split-Router
protocol=MariaDBClient
port=4006
 
[server1]
type=server
address=127.0.0.1
port=3000
protocol=MariaDBBackend
 
[server2]
type=server
address=127.0.0.1
port=3001
protocol=MariaDBBackend
 
[server3]
type=server
address=127.0.0.1
port=3002
protocol=MariaDBBackend
 
[server4]
type=server
address=127.0.0.1
port=3003
protocol=MariaDBBackend

  • Configure server1 as the master and the rest of the servers as slaves
  • Execute STOP SLAVE IO_THREAD on server4
  • Execute maxctrl -t 100000 call command mariadbmon switchover MariaDB-Monitor server2 server1
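
The last two steps can be written as command lines. The Python sketch below only builds and prints them. The helper names are hypothetical, the `mysql` client invocation assumes a login that needs no extra credentials, and the switchover argument order (new master, then current master) is inferred from the steps above, where server1 is the current master.

```python
import shlex

# Address and port of server4, taken from the maxscale.cnf above.
SERVER4_HOST = "127.0.0.1"
SERVER4_PORT = 3003


def stop_io_thread_command(host, port):
    """Build a mysql client command that stops the slave IO thread.

    Assumption: the mysql command-line client is installed; add -u/-p
    options as your setup requires.
    """
    return ["mysql", "-h", host, "-P", str(port), "-e", "STOP SLAVE IO_THREAD"]


def switchover_command(monitor, new_master, current_master, timeout=100000):
    """Build the maxctrl switchover command quoted in the steps above."""
    return [
        "maxctrl", "-t", str(timeout),
        "call", "command", "mariadbmon", "switchover",
        monitor, new_master, current_master,
    ]


stop_cmd = stop_io_thread_command(SERVER4_HOST, SERVER4_PORT)
switch_cmd = switchover_command("MariaDB-Monitor", "server2", "server1")

# Print shell-quoted command lines; pass the lists to subprocess.run to execute.
print(shlex.join(stop_cmd))
print(shlex.join(switch_cmd))
```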

The following error is produced and the switchover fails:

Error: Server at localhost:8989 responded with status code 403 to `POST maxscale/modules/mariadbmon/switchover?MariaDB-Monitor&server1&server2`:{
    "errors": [
        {
            "detail": "MASTER_GTID_WAIT() timed out on slave 'server4'."
        },
        {
            "detail": "read_only disabled on server server2."
        },
        {
            "detail": "Switchover server2 -> server1 failed."
        }
    ]
}
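
To confirm that a failed switchover was caused by this issue, the error body can be scanned for the GTID wait timeout. The following Python sketch does that for the response above. The `gtid_wait_timeouts` helper is a hypothetical name, and the message format is taken from this one report only, so it may differ in other MaxScale versions.

```python
import json

# Error body copied from the report above.
ERROR_BODY = """
{
    "errors": [
        {"detail": "MASTER_GTID_WAIT() timed out on slave 'server4'."},
        {"detail": "read_only disabled on server server2."},
        {"detail": "Switchover server2 -> server1 failed."}
    ]
}
"""


def gtid_wait_timeouts(body):
    """Return the names of slaves reported as timing out in MASTER_GTID_WAIT."""
    prefix = "MASTER_GTID_WAIT() timed out on slave '"
    names = []
    for error in json.loads(body).get("errors", []):
        detail = error.get("detail", "")
        if detail.startswith(prefix):
            # The slave name sits between the single quotes.
            names.append(detail[len(prefix):].split("'", 1)[0])
    return names


timed_out = gtid_wait_timeouts(ERROR_BODY)
print(timed_out)
```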

