MariaDB MaxScale / MXS-2576

Columnstore Monitor inaccurately labels a UM as slave

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.8
    • Fix Version/s: 2.3.12
    • Component/s: Monitor
    • Labels:
      None
    • Sprint:
      MXS-SPRINT-87

      Description

      Columnstore Monitor can inaccurately label a server as a "slave" if the monitor's connection gets dropped in the get_cs_version() function:

      https://github.com/mariadb-corporation/MaxScale/blob/maxscale-2.3.8/server/modules/monitor/csmon/csmon.cc#L56

      The reason is that this function is called inside CsMonitor::update_server_status to determine if the server supports the mcsSystemPrimary function:

      https://github.com/mariadb-corporation/MaxScale/blob/maxscale-2.3.8/server/modules/monitor/csmon/csmon.cc#L111

      If the get_cs_version function returns a value lower than 10200, then MaxScale checks the server's configuration for the "primary" parameter.

      However, if the monitor's connection is dropped inside the get_cs_version function, the function returns 0. Because 0 is lower than 10200, MaxScale falls back to checking the server's configuration for the "primary" parameter. A DBA who expects MaxScale to use the mcsSystemPrimary function will most likely not have set this parameter for the server at all. MaxScale therefore sets the server as a "Slave", since it concludes that some other server is the primary server.
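
      The failure path described above can be modelled in a few lines. This is a simplified sketch, not the csmon.cc source: Role, probe_version and decide_role are hypothetical names, and the assumption is that versions of 10200 or higher rely on the server's own mcsSystemPrimary answer while lower versions compare the server against the configured "primary".

```cpp
#include <string>

// Hypothetical stand-ins for illustration only; not the MaxScale sources.
enum class Role { Master, Slave };

// Models a version probe that reports 0 when the query fails,
// for example because the connection was lost mid-query.
int probe_version(bool connection_ok, int real_version)
{
    return connection_ok ? real_version : 0;
}

// Models the decision described in this issue: a version of at least 10200
// trusts the server's own answer (mcsSystemPrimary), anything lower falls
// back to the "primary" parameter from the configuration.
Role decide_role(int version,
                 bool server_reports_primary,
                 const std::string& server_name,
                 const std::string& configured_primary)
{
    if (version >= 10200)
    {
        return server_reports_primary ? Role::Master : Role::Slave;
    }

    // A failed probe (0) also lands here. With "primary" unset, the
    // comparison fails and the server is treated as a slave.
    return server_name == configured_primary ? Role::Master : Role::Slave;
}
```

      With the "primary" parameter left empty, a healthy probe of 10200 yields Role::Master for srv1, while a dropped connection yields Role::Slave for the same server.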

      Instead of allowing the server to be incorrectly set as a "Slave", Columnstore Monitor should detect that the connection died, and it should label the server as "Down".
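
      One possible shape for that behaviour, sketched under assumptions: the version probe reports failure explicitly instead of returning 0, and the caller maps a failed probe to "Down" before any role logic runs. The names Status, probe_version_checked and decide_status are hypothetical, std::optional requires C++17, and this is not necessarily how the fix in 2.3.12 was implemented.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-ins for illustration only; not the MaxScale sources.
enum class Status { Master, Slave, Down };

// Returns no value when the probe query fails, so a lost connection
// can no longer be mistaken for "a version lower than 10200".
std::optional<int> probe_version_checked(bool connection_ok, int real_version)
{
    if (!connection_ok)
    {
        return std::nullopt;
    }
    return real_version;
}

Status decide_status(std::optional<int> version,
                     bool server_reports_primary,
                     const std::string& server_name,
                     const std::string& configured_primary)
{
    if (!version)
    {
        // The connection died during the probe: report the server as down.
        return Status::Down;
    }

    if (*version >= 10200)
    {
        return server_reports_primary ? Status::Master : Status::Slave;
    }

    // Only genuinely old versions fall back to the configured "primary".
    return server_name == configured_primary ? Status::Master : Status::Slave;
}
```

      In this sketch a dropped connection produces Status::Down, and the server can only become Status::Slave when a probe actually succeeded.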

      Here are some relevant entries from a MaxScale error log that show this happening:

      2019-06-20 23:18:30   error  : Failed to execute query on server 'srv1' ([192.168.1.44]:3306): Lost connection to MySQL server during query
      2019-06-20 23:18:30   notice : Server changed state: srv1[192.168.1.44:3306]: new_slave. [Master, Running] -> [Slave, Running]
      2019-06-20 23:18:35   notice : Server changed state: srv1[192.168.1.44:3306]: new_master. [Slave, Running] -> [Master, Running]
      

            People

            Assignee:
            markus makela
            Reporter:
            Geoff Montee