The pinloki switchover test causes the monitor to fail as described below. Rare scenario, not likely to happen in the real world.
niclas The pinloki test in review revealed two monitor TODOs. First (and I think this has come up before), the monitor deduces that a replica is replicating from an "external" server by comparing IPs. So a server at 127.0.0.1 can be external or internal depending on where the IP comes from and how the monitor is configured. It should be consistent.
Second, if the sleep(5) in the test is replaced with test.maxscale().wait_monitor_ticks(5), the monitor ties itself in knots and maxctrl becomes unresponsive.
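To make the first point concrete, here is a minimal sketch of how an address-based "external" check can give different answers for the same server. It is not the actual mariadbmon logic: the function `is_external`, the `aliases` map, and all addresses and ports are assumptions made up for illustration.

```python
# Hypothetical sketch: NOT the actual mariadbmon logic.
def is_external(source, monitored, aliases=None):
    """Return True if the replication source (host, port) matches no monitored server.

    `aliases` optionally maps alternative addresses to one canonical address.
    """
    aliases = aliases or {}
    host, port = source
    canonical = (aliases.get(host, host), port)
    known = {(aliases.get(h, h), p) for h, p in monitored}
    return canonical not in known


# The monitor is configured with loopback addresses (assumed values).
monitored = [("127.0.0.1", 3306), ("127.0.0.1", 3307)]

# The replica reports its source by the machine's LAN address (assumed value).
reported = ("192.168.0.10", 3306)

# Plain comparison: the same server looks external.
print(is_external(reported, monitored))  # True

# With both addresses mapped to one canonical form, the answer is consistent.
print(is_external(reported, monitored, {"192.168.0.10": "127.0.0.1"}))  # False
```

The point of the sketch is only that the result depends on which spelling of the address each side happens to use, which is the inconsistency described above.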
esak The monitor gets stuck?
niclas Something goes awry and the monitor goes into a loop trying to STOP SLAVE, which fails.
I didn't look into it much, I just noticed that something is messed up when the two scenarios play out at the same time.
esak It's likely not an infinite loop; how long it runs probably depends on some timeout settings.
But why does "stop slave" fail?
niclas That's the part that needs to be dug into.
2020-10-22 10:49:20 warning: [mariadbmon] Query 'SET STATEMENT max_statement_time=3 FOR STOP SLAVE '';' failed on 'pinloki': 'Lost connection to MySQL server during query' (2013). Retrying with 86.9 seconds left.
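The log line suggests a retry loop bounded by an overall time budget, with each attempt limited separately by max_statement_time. The sketch below shows that general pattern, assuming a budget and a retry interval that are passed in by the caller. The function `retry_until_deadline` and its parameters are hypothetical and are not taken from the MaxScale source.

```python
import time


def retry_until_deadline(run_query, time_limit_s, retry_interval_s=1.0,
                         clock=time.monotonic, sleep=time.sleep):
    """Call run_query until it succeeds or the time budget is used up.

    Returns (result, attempts). Raises TimeoutError once the budget is spent.
    Hypothetical helper, not the monitor's real implementation.
    """
    deadline = clock() + time_limit_s
    attempts = 0
    while True:
        attempts += 1
        try:
            return run_query(), attempts
        except ConnectionError as err:
            remaining = deadline - clock()
            if remaining <= 0:
                raise TimeoutError(f"gave up after {attempts} attempts") from err
            print(f"Query failed: {err}. Retrying with {remaining:.1f} seconds left.")
            # Never sleep past the deadline.
            sleep(min(retry_interval_s, remaining))


# Demo: a stand-in query that loses its connection twice, then succeeds.
failures_left = [2]


def flaky_stop_slave():
    if failures_left[0] > 0:
        failures_left[0] -= 1
        raise ConnectionError("Lost connection to MySQL server during query")
    return "OK"


print(retry_until_deadline(flaky_stop_slave, time_limit_s=90.0, retry_interval_s=0.0))
```

With a loop of this shape, a query that fails every time looks like an endless loop from the outside, but it ends when the budget runs out.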
esak Could there be some weird deadlock where one thread cannot advance before the other? It's a bit weird, since the monitor runs in its own thread.
niclas I think it is something like that.