  1. MariaDB Server
  2. MDEV-9965

gmcast.listen_addr does not accept hostname instead of IP address

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not a Bug
    • Affects Version/s: 10.1.13
    • Fix Version/s: N/A
    • Component/s: Galera
    • Environment: Ubuntu Xenial (16.04 LTS), Ubuntu Trusty (14.04 LTS)

    Description

      (NB: I consider this a bug rather than a feature request, because the other "wsrep_*_address" options accept hostnames; see below.)

      Setup: A working cluster consisting of three vservers (vserver03..vserver05) using the following settings (excerpt from vserver03; the internal hostnames with prefix "ipv4." are mapped to public IPv4 addresses by means of the local nameserver (unbound)):

      [mysqld]
      wsrep_cluster_name="galera_provider"
      wsrep_node_name=vserver03.provider
      wsrep_node_address=ipv4.vserver03.provider
      ##wsrep_node_address=aaa.aaa.aaa.aaa
      wsrep_cluster_address="gcomm://ipv4.vserver04.provider,ipv4.vserver05.provider"
      ##wsrep_cluster_address="gcomm://bbb.bbb.bbb.bbb,ccc.ccc.ccc.ccc"
      ##wsrep_node_incoming_address=ipv4.vserver03.provider   #defaults to wsrep_node_address
      ##wsrep_sst_receive_address=ipv4.vserver03.provider     #defaults to wsrep_node_address
      wsrep_sst_donor=vserver04.provider,vserver05.provider
       
      wsrep_provider_options="gmcast.listen_addr=tcp://aaa.aaa.aaa.aaa:4567; gcache.size=128M; gcache.name=/tmp/galera.cache; gcache.page_size=128M"
      ##wsrep_provider_options="gmcast.listen_addr=tcp://ipv4.vserver03.provider:4567; gcache.size=128M; gcache.name=/tmp/galera.cache; gcache.page_size=128M"
      

      Switching between IPv4 addresses and the internal hostnames works on all vservers for every option except gmcast.listen_addr. Using anything other than an IP(v4) address there prevents a node from rejoining the cluster (excerpts from the vserver04 syslog; ignore timestamp differences):

      At first sight, everything looks fine:

      Apr 21 11:34:58 vserver04 mysqld[30240]: 2016-04-21 11:34:58 139703548975360 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = ipv4.vserver04.provider; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /tmp/galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.listen_addr = tcp://ipv4.vserver04.provider:4567; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = f
      Apr 21 11:34:58 vserver04 mysqld[30240]: 2016-04-21 11:34:58 139703548975360 [Note] WSREP: (50565ab6, 'tcp://bbb.bbb.bbb.bbb:4567') listening at tcp://bbb.bbb.bbb.bbb:4567
      Apr 21 11:34:58 vserver04 mysqld[30240]: 2016-04-21 11:34:58 139703548975360 [Note] WSREP: (50565ab6, 'tcp://bbb.bbb.bbb.bbb:4567') multicast: , ttl: 1
      [...]
      Apr 21 12:36:03 vserver04 mysqld[10620]: 2016-04-21 12:36:03 139716242458880 [Note] WSREP: gcomm: connecting to group 'galera_provider', peer 'ipv4.vserver05.provider:,ipv4.vserver03.provider:'
      

      The major difference is this: using an IP address triggers a synchronisation, whereas with the internal hostname above the connection attempts time out:

      Apr 21 12:35:14 vserver04 mysqld[9721]: 2016-04-21 12:35:14 140206860740864 [Note] WSREP: view((empty))
      Apr 21 12:35:14 vserver04 mysqld[9721]: 2016-04-21 12:35:14 140206860740864 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
      Apr 21 12:35:14 vserver04 mysqld[9721]: #011 at gcomm/src/pc.cpp:connect():162
      Apr 21 12:35:14 vserver04 mysqld[9721]: 2016-04-21 12:35:14 140206860740864 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: -110 (Connection timed out)
      Apr 21 12:35:14 vserver04 mysqld[9721]: 2016-04-21 12:35:14 140206860740864 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1379: Failed to open channel 'galera_provider' at 'gcomm://ipv4.vserver05.provider,ipv4.vserver03.provider': -110 (Connection timed out)
      Apr 21 12:35:14 vserver04 mysqld[9721]: 2016-04-21 12:35:14 140206860740864 [ERROR] WSREP: gcs connect failed: Connection timed out
      Apr 21 12:35:14 vserver04 mysqld[9721]: 2016-04-21 12:35:14 140206860740864 [ERROR] WSREP: wsrep::connect(gcomm://ipv4.vserver05.provider,ipv4.vserver03.provider) failed: 7
      Apr 21 12:35:14 vserver04 mysqld[9721]: 2016-04-21 12:35:14 140206860740864 [ERROR] Aborting
      Apr 21 12:35:15 vserver04 systemd[1]: mariadb.service: Main process exited, code=exited, status=1/FAILURE
      Apr 21 12:35:15 vserver04 systemd[1]: Failed to start MariaDB database server.
      Apr 21 12:35:15 vserver04 systemd[1]: mariadb.service: Unit entered failed state.
      Apr 21 12:35:15 vserver04 systemd[1]: mariadb.service: Failed with result 'exit-code'.
      

      (The above is reproducible on all nodes.)
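
      Since the IP-based setting works, one possible workaround (an assumption on my part, not something confirmed in this report) is to resolve the hostname at deploy time and write the resulting IPv4 literal into gmcast.listen_addr. A minimal Python sketch follows; the helper name listen_addr_option is hypothetical, and the default port 4567 is taken from the configuration excerpt above:

```python
import socket


def listen_addr_option(host: str, port: int = 4567) -> str:
    """Build a gmcast.listen_addr setting that contains an IPv4 literal.

    `host` may be a hostname or an IPv4 literal. Hostnames go through the
    system resolver; literals are returned unchanged by getaddrinfo.
    Raises socket.gaierror if the name cannot be resolved.
    """
    infos = socket.getaddrinfo(
        host, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each result is (family, type, proto, canonname, sockaddr);
    # for AF_INET the sockaddr is an (address, port) tuple.
    ip = infos[0][4][0]
    return f"gmcast.listen_addr=tcp://{ip}:{port}"
```

      Calling listen_addr_option("ipv4.vserver03.provider") on vserver03 would then yield the string to place inside wsrep_provider_options. If a name maps to several addresses, the sketch simply takes the first one returned.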

            People

              Assignee: jplindst Jan Lindström (Inactive)
              Reporter: m_ueberall Markus Ueberall