When the slave IO thread is stopped with STOP SLAVE IO_THREAD, the monitor correctly detects that replication is not fully functional on that slave. However, if a switchover is performed while a slave has its IO thread stopped, the switchover fails with a timeout, because it waits for that slave to catch up to the current GTID and a slave with a stopped IO thread cannot do so.
A possible fix would be to check whether the server has the Slave status before executing the MASTER_GTID_WAIT call.
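The proposed check could look roughly like the sketch below. It is only an illustration of the idea, not the monitor's actual code: the function name `split_slaves_by_replication` and the input dictionary are hypothetical, and the only real names used are the `Slave_IO_Running` and `Slave_SQL_Running` columns of SHOW SLAVE STATUS. The assumption is that only slaves with both threads running would be passed on to the MASTER_GTID_WAIT step.

```python
from typing import Dict, List, Tuple


def split_slaves_by_replication(
    status_by_server: Dict[str, Dict[str, str]]
) -> Tuple[List[str], List[str]]:
    """Split slaves into (waitable, skipped) lists.

    status_by_server maps a server name to the relevant columns of its
    SHOW SLAVE STATUS output (hypothetical input format). A slave is
    waitable only when both replication threads report "Yes"; any other
    value ("No", "Connecting", missing) puts it in the skipped list.
    """
    waitable: List[str] = []
    skipped: List[str] = []
    for name, status in status_by_server.items():
        io_running = status.get("Slave_IO_Running") == "Yes"
        sql_running = status.get("Slave_SQL_Running") == "Yes"
        if io_running and sql_running:
            waitable.append(name)
        else:
            skipped.append(name)
    return waitable, skipped


# Example matching the scenario in this report: server4 has its IO thread stopped.
statuses = {
    "server2": {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes"},
    "server3": {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes"},
    "server4": {"Slave_IO_Running": "No", "Slave_SQL_Running": "Yes"},
}
waitable, skipped = split_slaves_by_replication(statuses)
print("wait for:", waitable)
print("skip:", skipped)
```

With this kind of filter, server4 would be left out of the GTID wait instead of causing the whole switchover to time out.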
Steps to reproduce
With the following configuration:
maxscale.cnf
[maxscale]
threads=4

[MariaDB-Monitor]
type=monitor
module=mariadbmon
servers=server1,server2,server3,server4
user=maxuser
passwd=maxpwd
monitor_interval=1000

[RW-Split-Router]
type=service
router=readwritesplit
servers=server1,server2,server3,server4
user=maxuser
passwd=maxpwd

[RW-Split-Listener]
type=listener
service=RW-Split-Router
protocol=MariaDBClient
port=4006

[server1]
type=server
address=127.0.0.1
port=3000
protocol=MariaDBBackend

[server2]
type=server
address=127.0.0.1
port=3001
protocol=MariaDBBackend

[server3]
type=server
address=127.0.0.1
port=3002
protocol=MariaDBBackend

[server4]
type=server
address=127.0.0.1
port=3003
protocol=MariaDBBackend
- Configure server1 as the master and the rest of the servers as slaves
- Execute STOP SLAVE IO_THREAD on server4
- Execute maxctrl -t 100000 call command mariadbmon switchover MariaDB-Monitor server2 server1
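The last two steps above could be scripted along the following lines. This is a rough sketch under several assumptions: the helper names are hypothetical, the `mysql` command-line client is assumed to be available, and the database account passed to `stop_io_thread_cmd` is a placeholder for a user with the privileges needed to run STOP SLAVE (the report does not say which account was used). The maxctrl arguments are copied in the same order as in the command above.

```python
from typing import List


def stop_io_thread_cmd(host: str, port: int, user: str, password: str) -> List[str]:
    """Build the mysql client invocation that stops the IO thread on one slave."""
    return [
        "mysql",
        "-h", host,
        "-P", str(port),
        "-u", user,
        f"-p{password}",
        "-e", "STOP SLAVE IO_THREAD",
    ]


def switchover_cmd(monitor: str, first_server: str, second_server: str,
                   timeout_ms: int = 100000) -> List[str]:
    """Build the maxctrl switchover call, keeping the argument order of the report."""
    return [
        "maxctrl", "-t", str(timeout_ms),
        "call", "command", "mariadbmon", "switchover",
        monitor, first_server, second_server,
    ]


# server4 listens on 127.0.0.1:3003 in the configuration above.
# "admin" / "adminpwd" are placeholder credentials, not taken from the report.
commands = [
    stop_io_thread_cmd("127.0.0.1", 3003, "admin", "adminpwd"),
    switchover_cmd("MariaDB-Monitor", "server2", "server1"),
]
for argv in commands:
    print(" ".join(argv))
    # To actually run a step: subprocess.run(argv, check=True) after `import subprocess`
```

The script only prints the commands, so it can be reviewed before anything is executed against the test servers.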
The following error is produced and the switchover fails:
Error: Server at localhost:8989 responded with status code 403 to `POST maxscale/modules/mariadbmon/switchover?MariaDB-Monitor&server1&server2`:{
    "errors": [
        {
            "detail": "MASTER_GTID_WAIT() timed out on slave 'server4'."
        },
        {
            "detail": "read_only disabled on server server2."
        },
        {
            "detail": "Switchover server2 -> server1 failed."
        }
    ]
}