  MariaDB MaxScale / MXS-1476

priority value ignored when a Galera node rejoins with a lower wsrep_local_index than current master

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version: 2.1.9
    • Fix Version: N/A
    • Component: galeramon
    • Environment: Ubuntu 16.04, MariaDB 10.1.28, 2-node Galera cluster with Galera Arbitrator, MaxScale 2.1.9, mysql client version 15.1 Distrib 10.1.28-MariaDB for debian-linux-gnu (x86_64) using readline 5.2
    • Sprint: 2017-46, 2017-47

    Description

      Hi,

      I'm testing MaxScale, set up to connect only to the master node, on Ubuntu 16.04 fronting a Galera cluster. The cluster comprises two MariaDB 10.1 instances on different servers, with a Galera Arbitrator instance running on the MaxScale server. I'm testing this using the MariaDB client (mysql) from a fourth machine.

      My test scenario is to see what the client experiences if I stop a MariaDB node part-way through a transaction, commit the transaction, then restart the MariaDB node. I start with the "slave" node, to give me a baseline for comparison, before doing it with the master node. However, the baseline case has given me inconsistent results, which I first thought were due to TLS but may actually be something else, as I've now reproduced the problem on a non-TLS connection.
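
      The baseline run looks roughly like this from the client side (a sketch only; the exact statements vary between runs, and the table name test.t1 matches what I use in the comments below):

      START TRANSACTION;
      INSERT INTO test.t1 VALUES (1);
      -- stop the slave MariaDB node here
      COMMIT;
      -- restart the slave MariaDB node, then send anything on the same connection:
      SELECT 1;
      -- in the failing case, this is where the client gets the 2003 error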

      If the slave MariaDB node comes back online with a lower wsrep_local_index value than the master, MaxScale sends a 2003 ("Can't connect to MySQL server") error to the client the next time the client sends anything to the database, sends a QUIT to the master node, and terminates both connections immediately.

      If the slave comes back with a higher wsrep_local_index than the master, this doesn't seem to happen.

      (I can't see a pattern in how the wsrep_local_index value is assigned to Galera nodes rejoining the cluster, other than that nodes seem to prefer keeping their previous value, if any.)
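
      The value on each node can be checked with the standard Galera status variable:

      SHOW GLOBAL STATUS LIKE 'wsrep_local_index';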

      When it disconnects, I see the following lines logged in /var/log/syslog:

      Oct 16 11:45:04 maxscale01 maxscale[10675]: [galeramon] There are no cluster members
      Oct 16 11:45:04 maxscale01 maxscale[10675]: Server changed state: dbnode1[172.100.1.22:3306]: lost_master. [Master, Synced, Running] -> [Running]
      Oct 16 11:45:05 maxscale01 maxscale[10675]: [galeramon] Found cluster members
      Oct 16 11:45:05 maxscale01 maxscale[10675]: Server changed state: dbnode1[172.100.1.22:3306]: new_master. [Running] -> [Master, Synced, Running]
      Oct 16 11:45:05 maxscale01 maxscale[10675]: Server changed state: dbnode2[172.100.1.23:3306]: slave_up. [Down] -> [Slave, Synced, Running]
      

      MaxScale is configured as follows (the commented-out configuration is uncommented when connecting via TLS):

      [dbnode1]
      type=server
      address=172.16.1.22
      port=3306
      protocol=MySQLBackend
      priority=1
       
      [dbnode2]
      type=server
      address=172.16.1.23
      port=3306
      protocol=MySQLBackend
      priority=2
       
      [Galera Monitor]
      type=monitor
      module=galeramon
      servers=dbnode1,dbnode2
      user=galeramon
      passwd=galeramon
      monitor_interval=1000
      available_when_donor=true
      use_priority=true
       
      [Galera Service]
      type=service
      router=readconnroute
      router_options=master
      servers=dbnode1,dbnode2
      user=galeramon
      passwd=galeramon
       
      [MaxAdmin Service]
      type=service
      router=cli
       
      [Galera Listener]
      type=listener
      service=Galera Service
      protocol=MySQLClient
      port=3306
      #ssl=required
      #ssl_version=TLSv12
      #ssl_cert=/etc/mysql/ssl/server-cert.pem
      #ssl_key=/etc/mysql/ssl/server-key.pem
      #ssl_ca_cert=/etc/mysql/ssl/ca-cert.pem
      #ssl_cert_verify_depth=1
       
      [MaxAdmin Listener]
      type=listener
      service=MaxAdmin Service
      protocol=maxscaled
      socket=default
      


        Activity

          markus makela added a comment -

          We still can't reproduce any problems with the priority so we'll close it as Cannot Reproduce.

          markus makela added a comment -

           Hmm, it could be that cutting the network traffic is not enough; I'll continue testing with a complete shutdown.

          Pak Chan added a comment -

          I'm not sure what the "test.galera->block_node(slave);" does (I assume it stops the MariaDB Galera instance being passed in), but when it is unblocked (restarted?), there doesn't appear to be a check to see if its "wsrep_local_index" is lower than the master's. If it isn't, the condition for the failure isn't met.

           (My tests are more manual and involve stopping and restarting the individual MariaDB nodes via systemctl while in the middle of a transaction, e.g. "BEGIN; INSERT INTO test.t1 VALUES (1);". The open transaction probably doesn't make much of a difference; I don't know whether stopping the node via systemctl, rather than blocking its traffic, does.)

          markus makela added a comment -

          Added a test that attempts to reproduce this but it doesn't appear to happen in our environment.

          Pak Chan added a comment -

          No worries; I probably should have mentioned it in the writeup.

          markus makela added a comment -

           Ah, my apologies. I missed the fact that use_priority was used, as the issue doesn't seem to mention it at all.

          Pak Chan added a comment - edited

          So this occurs despite the use_priority=true setting, which has deterministically selected the write node? (The write node has not gone offline in this scenario, only the other node.)

           If this is by design, the documentation should be updated to reflect that the "use_priority" setting may be ignored by MaxScale.

          markus makela added a comment -

          This behavior is expected as the algorithm to find the "master" node is to use the node with the lowest wsrep_local_index value. To prevent a node with a lower index from interrupting the connection, use the disable_master_failback option. This will cause the master status to stick to the node it is assigned even if a node with a lower wsrep_local_index value joins the cluster.

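
           For reference, applying that suggestion to the configuration above would mean adding one line to the monitor section (a sketch only; option name as documented for galeramon, everything else unchanged):

           [Galera Monitor]
           type=monitor
           module=galeramon
           servers=dbnode1,dbnode2
           user=galeramon
           passwd=galeramon
           monitor_interval=1000
           available_when_donor=true
           use_priority=true
           disable_master_failback=true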

          People

            Assignee: markus makela
            Reporter: Pak Chan
            Votes: 0
            Watchers: 2

