  1. MariaDB MaxScale
  2. MXS-3254

Monitor failover fails


    Details

    • Sprint:
      MXS-SPRINT-141

      Description

      The pinloki switchover test causes the monitor to fail as described below. This is a rare scenario that is unlikely to occur in real-world use.

      niclas: The pinloki test in review revealed two monitor TODOs. First (and I think this has come up before), the monitor deduces that a replica is replicating from an "external" server by comparing IPs. So a server at 127.0.0.1 can be external or internal depending on where the IP comes from and how the monitor is configured. It should be consistent.
      Second, if the sleep(5) in the test is replaced with test.maxscale().wait_monitor_ticks(5), the monitor ties itself in knots and maxctrl becomes unresponsive.
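
To illustrate the first point, here is a minimal sketch of how a plain address comparison can classify the same replication source differently depending on how the monitored servers are configured. This is not mariadbmon code: the function name `is_external`, the `(address, port)` tuples, and the choice to compare ports as well as addresses are all assumptions made for the example.

```python
from ipaddress import ip_address


def is_external(master_host, master_port, monitored):
    """Return True if the replication source matches no monitored server.

    Hypothetical helper: `monitored` is a list of (address, port) pairs as
    they appear in the monitor configuration. Only IP literals are handled.
    """
    source = (ip_address(master_host), master_port)
    return all((ip_address(addr), port) != source for addr, port in monitored)


# The replica reports its source as 127.0.0.1:3306 in both cases.
# Server configured by its LAN address: the loopback source looks external.
print(is_external("127.0.0.1", 3306, [("192.168.0.10", 3306)]))
# Same server configured by loopback address: now it looks internal.
print(is_external("127.0.0.1", 3306, [("127.0.0.1", 3306)]))
```

The two calls describe the same physical setup, and the answer changes only because of how the address was written in the configuration.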
       
      esak: The monitor gets stuck?
       
      niclas: Something goes awry and the monitor goes into a loop trying to STOP SLAVE, which fails.
      I didn't look into it much; I just noticed that something gets messed up when the two scenarios play out at the same time.
       
      esak: It's likely not an infinite loop; it probably depends on some timeout settings.
      But why does "stop slave" fail?
       
      niclas: That's the part that needs to be dug into.
      2020-10-22 10:49:20   warning: [mariadbmon] Query 'SET STATEMENT max_statement_time=3 FOR STOP SLAVE '';' failed on 'pinloki': 'Lost connection to MySQL server during query' (2013). Retrying with 86.9 seconds left.
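
The "Retrying with 86.9 seconds left" part of the warning suggests a retry loop bounded by a total time budget rather than an infinite loop. A minimal sketch of that pattern follows. It is not the monitor's actual implementation: `retry_until_deadline`, its parameters, and the injectable `clock` and `pause` callables are hypothetical names chosen for the example.

```python
import time


def retry_until_deadline(attempt, total_timeout, clock=time.monotonic,
                         pause=lambda: None):
    """Call `attempt` until it returns True or the time budget is spent.

    Returns (succeeded, attempts_made). `clock` and `pause` are injectable
    so the loop can be exercised without real waiting.
    """
    deadline = clock() + total_timeout
    attempts = 0
    while True:
        attempts += 1
        if attempt():
            return True, attempts
        remaining = deadline - clock()
        if remaining <= 0:
            # Budget exhausted: give up instead of looping forever.
            return False, attempts
        print(f"attempt {attempts} failed, retrying with {remaining:.1f} seconds left")
        pause()


# A query that always fails keeps the caller busy for the whole budget,
# which can look like a hang when the budget is long.
fake_time = iter(range(0, 1000, 3))
print(retry_until_deadline(lambda: False, 9, clock=lambda: next(fake_time)))
```

With a budget on the order of 90 seconds and every attempt failing, a loop like this would appear stuck for that whole period even though it terminates eventually.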
       
      esak: Could there be some weird deadlock where one thread cannot advance before the other? It's a bit odd, since the monitor runs in its own thread.
       
      niclas: I think it is something like that.
      

            People

            Assignee:
            markus makela
            Reporter:
            nantti Niclas Antti