[MDEV-9965] gmcast.listen_addr does not accept hostname instead of IP address
Created: 2016-04-21
Updated: 2019-09-19
Resolved: 2019-09-19

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.1.13
Fix Version/s: N/A

Type: Bug
Priority: Major
Reporter: Markus Ueberall
Assignee: Jan Lindström (Inactive)
Resolution: Not a Bug
Votes: 0
Labels: galera
Environment: Ubuntu Xenial (16.04 LTS), Ubuntu Trusty (14.04 LTS)



 Description   

(NB: I consider this a bug rather than a feature request, because the other "wsrep_*_address" options accept hostnames; see below.)

Setup: a working cluster of three vservers (vserver03..vserver05) with the following settings (excerpt from vserver03; the internal hostnames prefixed with "ipv4." are mapped to public IPv4 addresses by the local nameserver, unbound):

[mysqld]
wsrep_cluster_name="galera_provider"
wsrep_node_name=vserver03.provider
wsrep_node_address=ipv4.vserver03.provider
##wsrep_node_address=aaa.aaa.aaa.aaa
wsrep_cluster_address="gcomm://ipv4.vserver04.provider,ipv4.vserver05.provider"
##wsrep_cluster_address="gcomm://bbb.bbb.bbb.bbb,ccc.ccc.ccc.ccc"
##wsrep_node_incoming_address=ipv4.vserver03.provider   #defaults to wsrep_node_address
##wsrep_sst_receive_address=ipv4.vserver03.provider     #defaults to wsrep_node_address
wsrep_sst_donor=vserver04.provider,vserver05.provider
 
wsrep_provider_options="gmcast.listen_addr=tcp://aaa.aaa.aaa.aaa:4567; gcache.size=128M; gcache.name=/tmp/galera.cache; gcache.page_size=128M"
##wsrep_provider_options="gmcast.listen_addr=tcp://ipv4.vserver03.provider:4567; gcache.size=128M; gcache.name=/tmp/galera.cache; gcache.page_size=128M"

Switching between IPv4 addresses and the internal hostnames works on all vservers for every option except gmcast.listen_addr: using anything other than an IP(v4) address there prevents a node from rejoining the cluster (excerpts from the vserver04 syslog; ignore timestamp differences):

Looks good at first sight:

Apr 21 11:34:58 vserver04 mysqld[30240]: 2016-04-21 11:34:58 139703548975360 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = ipv4.vserver04.provider; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /tmp/galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.listen_addr = tcp://ipv4.vserver04.provider:4567; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = f
Apr 21 11:34:58 vserver04 mysqld[30240]: 2016-04-21 11:34:58 139703548975360 [Note] WSREP: (50565ab6, 'tcp://bbb.bbb.bbb.bbb:4567') listening at tcp://bbb.bbb.bbb.bbb:4567
Apr 21 11:34:58 vserver04 mysqld[30240]: 2016-04-21 11:34:58 139703548975360 [Note] WSREP: (50565ab6, 'tcp://bbb.bbb.bbb.bbb:4567') multicast: , ttl: 1
[...]
Apr 21 12:36:03 vserver04 mysqld[10620]: 2016-04-21 12:36:03 139716242458880 [Note] WSREP: gcomm: connecting to group 'galera_provider', peer 'ipv4.vserver05.provider:,ipv4.vserver03.provider:'

The major difference is the following: using an IP address triggers a synchronisation, whereas with the above internal hostname the connection attempts fail (time out):

Apr 21 12:35:14 vserver04 mysqld[9721]: 2016-04-21 12:35:14 140206860740864 [Note] WSREP: view((empty))
Apr 21 12:35:14 vserver04 mysqld[9721]: 2016-04-21 12:35:14 140206860740864 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
Apr 21 12:35:14 vserver04 mysqld[9721]: #011 at gcomm/src/pc.cpp:connect():162
Apr 21 12:35:14 vserver04 mysqld[9721]: 2016-04-21 12:35:14 140206860740864 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: -110 (Connection timed out)
Apr 21 12:35:14 vserver04 mysqld[9721]: 2016-04-21 12:35:14 140206860740864 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1379: Failed to open channel 'galera_provider' at 'gcomm://ipv4.vserver05.provider,ipv4.vserver03.provider': -110 (Connection timed out)
Apr 21 12:35:14 vserver04 mysqld[9721]: 2016-04-21 12:35:14 140206860740864 [ERROR] WSREP: gcs connect failed: Connection timed out
Apr 21 12:35:14 vserver04 mysqld[9721]: 2016-04-21 12:35:14 140206860740864 [ERROR] WSREP: wsrep::connect(gcomm://ipv4.vserver05.provider,ipv4.vserver03.provider) failed: 7
Apr 21 12:35:14 vserver04 mysqld[9721]: 2016-04-21 12:35:14 140206860740864 [ERROR] Aborting
Apr 21 12:35:15 vserver04 systemd[1]: mariadb.service: Main process exited, code=exited, status=1/FAILURE
Apr 21 12:35:15 vserver04 systemd[1]: Failed to start MariaDB database server.
Apr 21 12:35:15 vserver04 systemd[1]: mariadb.service: Unit entered failed state.
Apr 21 12:35:15 vserver04 systemd[1]: mariadb.service: Failed with result 'exit-code'.

(The above is reproducible on all nodes.)
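
Since only a literal IP address worked for gmcast.listen_addr in the setup above, one possible workaround is to resolve the hostname at deployment time and write the resulting address into wsrep_provider_options. The following is a minimal sketch under that assumption, not something the report or the Galera documentation prescribes; the function name listen_addr_option is hypothetical, and the default port 4567 is taken from the configuration excerpt.

```python
import socket


def listen_addr_option(host: str, port: int = 4567) -> str:
    """Return a gmcast.listen_addr setting with `host` resolved to an IPv4 literal.

    Hypothetical helper: it only builds the option string; writing it into
    wsrep_provider_options is left to the deployment tooling.
    """
    # Restrict the lookup to IPv4/TCP, matching the "ipv4." hostnames in the report.
    infos = socket.getaddrinfo(host, port, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr is (ip, port).
    ip = infos[0][4][0]
    return f"gmcast.listen_addr=tcp://{ip}:{port}"


# An address that is already numeric passes through unchanged.
print(listen_addr_option("127.0.0.1"))
```

Called as listen_addr_option("ipv4.vserver03.provider") on vserver03, this would return the value to place in wsrep_provider_options, provided the local nameserver resolves that name. The address is fixed at the moment the helper runs, so it has to be regenerated whenever the hostname's mapping changes.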



 Comments   
Comment by Jan Lindström (Inactive) [ 2019-09-19 ]

This is intended behavior.
