Details
Bug
Status: Closed
Major
Resolution: Fixed
2.5.5
None
MXS-SPRINT-141
Description
The pinloki switchover test causes the monitor to fail as described below. Rare scenario, not likely to happen in the real world.
niclas: The pinloki test in review revealed two monitor TODOs. First (which I think has come up before), the monitor deduces that a replica is replicating from an "external" server by comparing IPs. So a server that is 127.0.0.1 can be external or internal depending on where the IP comes from and how the monitor is configured. It should be consistent.
Second, if the sleep(5) in the test is replaced with test.maxscale().wait_monitor_ticks(5), the monitor ties itself in knots and maxctrl becomes unresponsive.
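To illustrate the first point, here is a minimal sketch of how an address-only comparison can misclassify a server when everything runs on 127.0.0.1. It is not the mariadbmon code: the `Endpoint` type, both function names, and the port numbers are hypothetical, and comparing host plus port is only one possible way to make the check consistent.

```python
from typing import NamedTuple, Sequence


class Endpoint(NamedTuple):
    """Hypothetical (host, port) pair describing a server."""
    host: str
    port: int


def is_external_by_ip(master: Endpoint, monitored: Sequence[Endpoint]) -> bool:
    """Address-only check: external only if no monitored server shares the IP."""
    return all(master.host != server.host for server in monitored)


def is_external_by_endpoint(master: Endpoint, monitored: Sequence[Endpoint]) -> bool:
    """Host-and-port check: external unless the exact endpoint is monitored."""
    return master not in monitored


# Assumed test-like setup: every server listens on localhost, on different ports.
monitored = [Endpoint("127.0.0.1", 3000), Endpoint("127.0.0.1", 3001)]
unmonitored_master = Endpoint("127.0.0.1", 4000)

# The address-only check treats the unmonitored server as internal,
# because its IP matches a monitored server.
print(is_external_by_ip(unmonitored_master, monitored))        # False
# The endpoint check classifies it as external.
print(is_external_by_endpoint(unmonitored_master, monitored))  # True
```

With the address-only check, the answer depends on which addresses happen to be configured, which matches the inconsistency described above.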
esak: The monitor gets stuck?
niclas: Something goes awry and the monitor goes into a loop trying to STOP SLAVE, which fails.
I didn't look into it much; I just noticed that something goes wrong when the two scenarios play out at the same time.
esak: It's likely not an infinite loop; how long it lasts depends on some timeout settings.
But why does "stop slave" fail?
niclas: That's the part that needs to be dug into.
2020-10-22 10:49:20 warning: [mariadbmon] Query 'SET STATEMENT max_statement_time=3 FOR STOP SLAVE '';' failed on 'pinloki': 'Lost connection to MySQL server during query' (2013). Retrying with 86.9 seconds left.
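The "Retrying with 86.9 seconds left" part of the log suggests a retry loop bounded by a total time budget rather than an infinite loop. The following is a rough sketch of that pattern, not the monitor's implementation: `retry_with_budget`, its parameters, `FakeClock`, and the budget values are all made up for illustration.

```python
import time
from typing import Callable, Tuple


def retry_with_budget(
    op: Callable[[], bool],
    total_budget_s: float,
    retry_interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[bool, int]:
    """Call op() until it succeeds or the time budget is used up.

    Returns (succeeded, attempts). The loop always terminates, but a caller
    waiting on it is blocked for up to total_budget_s seconds.
    """
    deadline = clock() + total_budget_s
    attempts = 0
    while True:
        attempts += 1
        if op():
            return True, attempts
        remaining = deadline - clock()
        if remaining <= 0:
            return False, attempts
        # Never sleep past the deadline.
        sleep(min(retry_interval_s, remaining))


class FakeClock:
    """Deterministic clock so the example runs instantly."""

    def __init__(self) -> None:
        self.now = 0.0

    def time(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.now += seconds


# An operation that always fails, standing in for the failing query.
fake = FakeClock()
ok, attempts = retry_with_budget(
    lambda: False,
    total_budget_s=5.0,
    retry_interval_s=1.0,
    clock=fake.time,
    sleep=fake.sleep,
)
print(ok, attempts, fake.now)  # False 6 5.0
```

If the monitor behaves along these lines, a long budget combined with a query that keeps failing would look like a hang from the outside even though the loop eventually gives up.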
esak: Could there be some weird deadlock where one thread cannot advance before the other? It's a bit weird, since the monitor runs in its own thread.
niclas: I think it is something like that.