MariaDB MaxScale
MXS-2594

Enabling use_priority with no priority set at the server level triggers an election

Description

      Folks,

      While setting up a fresh pre-production environment with MaxScale 2.3.9 and a MariaDB Cluster 10.2 (on Debian 9.7), I ran into an interesting situation I got curious about. The MariaDB Cluster, already running, has three nodes; below are their wsrep_local_index values before starting the MaxScale implementation:

      changed: [prod_mariadb03] => changed=true
        cmd: mysql -e 'show global status like "wsrep_local_index"'
        stdout: |-
          Variable_name   Value
          wsrep_local_index       0
      changed: [prod_mariadb02] => changed=true
        cmd: mysql -e 'show global status like "wsrep_local_index"'
        stdout: |-
          Variable_name   Value
          wsrep_local_index       1
      changed: [prod_mariadb01] => changed=true
        cmd: mysql -e 'show global status like "wsrep_local_index"'
        stdout: |-
          Variable_name   Value
          wsrep_local_index       2
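
      Incidentally, the same check can be run in one shot from the MaxScale host; this is just a convenience sketch, assuming SSH access to the nodes under the hostnames used above:

      for host in prod_mariadb01 prod_mariadb02 prod_mariadb03; do
          echo -n "$host: "
          ssh "$host" "mysql -Nse 'show global status like \"wsrep_local_index\"'"
      done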
      

      Following the implementation steps, here is what was done:

      1. created a basic configuration file with global and service definitions:

      root@prod-maxscale01:~# cat /etc/maxscale.cnf
      [maxscale]
      threads                     = auto
      log_augmentation            = 1
      ms_timestamp                = 1
      admin_host                  = 0.0.0.0
      admin_port                  = 8989
       
      [rwsplit-service]
      type                        = service
      router                      = readwritesplit
      user                        = maxusr
      password                    = A0FE98035CFA5EB978337B739E949878
      causal_reads                = true
      causal_reads_timeout        = 30
      master_reconnection         = true
      max_sescmd_history          = 1000
      prune_sescmd_history        = true
      master_failure_mode         = fail_on_write
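
      To double-check the merged result, the service parameters can be inspected with maxctrl (using the service name defined above):

      maxctrl show service rwsplit-service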
      

      2. created the cluster on MaxScale using the dynamic commands below:

      #: task: creating the monitor
      maxctrl create monitor replication-cluster-monitor galeramon --monitor-user=maxmon --monitor-password=AFB909850E7181E9906159CE45176FAD
       
      #: task: configuring the monitor for the replication cluster
      maxctrl alter monitor replication-cluster-monitor monitor_interval          500 
      maxctrl alter monitor replication-cluster-monitor disk_space_threshold      /var/lib:85
      maxctrl alter monitor replication-cluster-monitor disk_space_check_interval 1000
       
      #: task: create a listener
      maxctrl create listener rwsplit-service replication-rwsplit-listener 3306
       
      #: task: create servers
      maxctrl create server prod_mariadb01 10.136.88.50  3306
      maxctrl create server prod_mariadb02 10.136.69.104 3306
      maxctrl create server prod_mariadb03 10.136.79.28  3306
       
      #: task: link servers with the service
      maxctrl link service rwsplit-service prod_mariadb01
      maxctrl link service rwsplit-service prod_mariadb02
      maxctrl link service rwsplit-service prod_mariadb03
       
      #: task: link servers with the monitor
      maxctrl link monitor replication-cluster-monitor prod_mariadb01
      maxctrl link monitor replication-cluster-monitor prod_mariadb02
      maxctrl link monitor replication-cluster-monitor prod_mariadb03
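       
      #: task: verify the resulting objects (a quick read-only sanity check, not part of the original steps)
      maxctrl list servers
      maxctrl show monitor replication-cluster-monitor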
      

      And then, checking the logs, I noticed that after creating the .secrets file I had forgotten to sort out the file's ownership (chown maxscale:maxscale), which produced some silly errors that were fixed after adjusting it. The point is that, once the GaleraMon monitor was able to read the .secrets file, we can see below the servers coming up online on MaxScale:

      2019-07-08 12:58:11.657   error  : (secrets_readKeys): Access for secrets file [/var/lib/maxscale/.secrets] failed. Error 13, Permission denied.
      2019-07-08 12:58:12.258   notice : (secrets_readKeys): Using encrypted passwords. Encryption key: '/var/lib/maxscale/.secrets'.
      2019-07-08 12:58:12.271   notice : (post_tick): Found cluster members
      2019-07-08 12:58:12.272   notice : (mon_log_state_change): Server changed state: prod_mariadb01[10.136.88.50:3306]: slave_up. [Auth Error, Down] -> [Slave, Synced, Running]
      2019-07-08 12:58:12.272   notice : (mon_log_state_change): Server changed state: prod_mariadb02[10.136.69.104:3306]: slave_up. [Auth Error, Down] -> [Slave, Synced, Running]
      2019-07-08 12:58:12.273   notice : (mon_log_state_change): Server changed state: prod_mariadb03[10.136.79.28:3306]: master_up. [Auth Error, Down] -> [Master, Synced, Running]
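
      For completeness, the ownership fix was along these lines (the chmod line is an extra precaution, not something the error message itself asks for):

      chown maxscale:maxscale /var/lib/maxscale/.secrets
      chmod 600 /var/lib/maxscale/.secrets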
      

      The configuration entries persisted for the dynamic server commands are below:

      [prod_mariadb01]
      type=server
      port=3306
      extra_port=0
      persistpoolmax=0
      persistmaxtime=0
      proxy_protocol=false
      ssl=false
      ssl_version=MAX
      ssl_cert_verify_depth=9
      ssl_verify_peer_certificate=true
      protocol=mariadbbackend
      address=10.136.88.50
       
      [prod_mariadb02]
      type=server
      port=3306
      extra_port=0
      persistpoolmax=0
      persistmaxtime=0
      proxy_protocol=false
      ssl=false
      ssl_version=MAX
      ssl_cert_verify_depth=9
      ssl_verify_peer_certificate=true
      protocol=mariadbbackend
      address=10.136.69.104
       
      [prod_mariadb03]
      type=server
      port=3306
      extra_port=0
      persistpoolmax=0
      persistmaxtime=0
      proxy_protocol=false
      ssl=false
      ssl_version=MAX
      ssl_cert_verify_depth=9
      ssl_verify_peer_certificate=true
      protocol=mariadbbackend
      address=10.136.79.28
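
      These entries are persisted by MaxScale under its data directory (by default /var/lib/maxscale/maxscale.cnf.d/, assuming a stock installation), so reviewing them is as simple as:

      ls -l /var/lib/maxscale/maxscale.cnf.d/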
      

      All right, that's good. However, at this point I felt like enabling priorities, so I could better control which node is next when the current master fails:

      root@prod-maxscale01:~# maxctrl alter monitor replication-cluster-monitor use_priority true
      OK
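
      The change can be confirmed against the monitor parameters (filtering maxctrl's tabular output for the option in question):

      maxctrl show monitor replication-cluster-monitor | grep use_priority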
      

      Great, I enabled priorities for the monitor (galeramon), awesome. And then, tailing the logs, I saw:

      2019-07-08 12:59:34.832   notice : (do_alter_monitor): Updated monitor 'replication-cluster-monitor': use_priority=true
      2019-07-08 12:59:34.838   notice : (load_server_journal): Loaded server states from journal file: /var/lib/maxscale/replication-cluster-monitor/monitor.dat
      2019-07-08 12:59:34.850   notice : (mon_log_state_change): Server changed state: prod_mariadb01[10.136.88.50:3306]: new_master. [Slave, Synced, Running] -> [Master, Synced, Running]
      2019-07-08 12:59:34.851   notice : (mon_log_state_change): Server changed state: prod_mariadb03[10.136.79.28:3306]: new_slave. [Master, Synced, Running] -> [Slave, Synced, Running]
      

      I hadn't set priorities per server yet, as the plan was to enable use_priority on the monitor side first and then set the priorities for the servers. My current master was prod_mariadb03, and I was planning to set priorities as below (after enabling use_priority on the monitor):

      # prod_mariadb03 wsrep_local_index=0 priority=1
      # prod_mariadb02 wsrep_local_index=1 priority=2
      # prod_mariadb01 wsrep_local_index=2 priority=3
      --
      maxctrl alter server prod_mariadb03 priority 1
      maxctrl alter server prod_mariadb02 priority 2
      maxctrl alter server prod_mariadb01 priority 3
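      --
      # and, to confirm each server's priority afterwards (reads parameters only):
      maxctrl show server prod_mariadb03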
      

      This way, I was attempting to avoid triggering an election. Yet, just by enabling the priorities, an election was triggered without any priority set per server.

      Following the theory behind this: if you don't set priorities, galeramon will elect the master based on the wsrep_local_index; and if I enable priority usage on the monitor side, that does not change the wsrep_local_index, nor [should] it add per-server priorities underneath, am I right? Is it expected to have an election in this case?

      Thanks!

      People

        Assignee: markus makela
        Reporter: Wagner Bianchi (Inactive)