Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Not a Bug
Version: 10.0.16-galera
Description
Hi,
I have installed the following packages on CentOS 6.6, with SELinux disabled and iptables stopped on both servers. The only difference is that the servers are in different locations.
MariaDB-compat-10.0.17-1.el6.x86_64
MariaDB-common-10.0.17-1.el6.x86_64
perl-Pod-Escapes-1.04-136.el6_6.1.x86_64
perl-libs-5.10.1-136.el6_6.1.x86_64
perl-version-0.77-136.el6_6.1.x86_64
perl-Module-Pluggable-3.90-136.el6_6.1.x86_64
perl-Pod-Simple-3.13-136.el6_6.1.x86_64
perl-5.10.1-136.el6_6.1.x86_64
perl-DBI-1.609-4.el6.x86_64
MariaDB-client-10.0.17-1.el6.x86_64
galera-25.3.5-1.rhel6.x86_64
lsof-4.82-4.el6.x86_64
rsync-3.1.1-1.el6.x86_64
MariaDB-Galera-server-10.0.16-1.el6.x86_64
and I got:
150309 19:11:31 [Warning] WSREP: 1.0 (db1): State transfer to 0.0 (db51) failed: -255 (Unknown error 255)
150309 19:11:31 [ERROR] WSREP: gcs/src/gcs_group.c:gcs_group_handle_join_msg():723: Will never receive state. Need to abort.
150309 19:11:31 [Note] WSREP: gcomm: terminating thread
The full log files for the joiner and the donor are attached.
Attachments
- donor.txt (22 kB)
- joiner.txt (16 kB)
Activity
Hi,
This is /etc/sysconfig/selinux on both servers. On both, iptables was stopped with "service iptables stop" and disabled with "chkconfig iptables off".
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
# targeted - Targeted processes are protected,
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
This is the joiner's /etc/my.cnf.d/server.cnf:
#
# These groups are read by MariaDB server.
# Use it for options that only the server (but not clients) should see
#
# See the examples of server my.cnf files in /usr/share/mysql/
#

# this is read by the standalone daemon and embedded servers
[server]

# this is only for the mysqld standalone daemon
[mysqld]

#
# * Galera-related settings
#
[galera]
# Mandatory settings
#wsrep_provider=
#wsrep_cluster_address=
#binlog_format=row
#default_storage_engine=InnoDB
#innodb_autoinc_lock_mode=2
#bind-address=0.0.0.0
#
# Optional setting
#wsrep_slave_threads=1
#innodb_flush_log_at_trx_commit=0

# this is only for embedded server
[embedded]

# This group is only read by MariaDB servers, not by MySQL.
# If you use the same .cnf file for MySQL and MariaDB,
# you can put MariaDB-only options here
[mariadb]
query_cache_size=0
binlog_format=ROW
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address=gcomm://46.20.11.247
wsrep_cluster_name='arescluster'
wsrep_node_address='217.195.204.3'
wsrep_node_name='db51'
wsrep_sst_method=rsync
wsrep_sst_auth=root:FQvhBrEycz
wsrep_debug=On

# This group is only read by MariaDB-10.0 servers.
# If you use the same .cnf file for MariaDB of different versions,
# use this group for options that older servers don't understand
[mariadb-10.0]
And this is the donor's config file:
#
# These groups are read by MariaDB server.
# Use it for options that only the server (but not clients) should see
#
# See the examples of server my.cnf files in /usr/share/mysql/
#

# this is read by the standalone daemon and embedded servers
[server]

# this is only for the mysqld standalone daemon
[mysqld]

#
# * Galera-related settings
#
[galera]
# Mandatory settings
#wsrep_provider=
#wsrep_cluster_address=
#binlog_format=row
#default_storage_engine=InnoDB
#innodb_autoinc_lock_mode=2
#bind-address=0.0.0.0
#
# Optional setting
#wsrep_slave_threads=1
#innodb_flush_log_at_trx_commit=0

# this is only for embedded server
[embedded]

# This group is only read by MariaDB servers, not by MySQL.
# If you use the same .cnf file for MySQL and MariaDB,
# you can put MariaDB-only options here
[mariadb]
query_cache_size=0
binlog_format=ROW
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address=gcomm://217.195.204.3
wsrep_cluster_name='arescluster'
wsrep_node_address='46.20.11.247'
wsrep_node_name='db1'
wsrep_sst_method=rsync
wsrep_sst_auth=root:FQvhBrEycz
wsrep_debug=On

# This group is only read by MariaDB-10.0 servers.
# If you use the same .cnf file for MariaDB of different versions,
# use this group for options that older servers don't understand
[mariadb-10.0]
On both servers, /etc/my.cnf is:
#
# This group is read both both by the client and the server
# use it for options that affect everything
#
[client-server]

#
# include all files from the config directory
#
!includedir /etc/my.cnf.d
I have a 3-node MariaDB Galera cluster in the same datacenter running perfectly,
but when I try to add a remote node it fails.
While it might not be entirely related to the issue, I see that Donor's wsrep_cluster_address
has Joiner's IP address. "wsrep_cluster_address" should hold the IP(s) of the existing node(s)
in the cluster (or gcomm:// for bootstrapping). But since Joiner is not part of the cluster yet,
its IP address should not be used.
Donor:
wsrep_cluster_address=gcomm://217.195.204.3
wsrep_node_address='46.20.11.247'

Joiner:
wsrep_cluster_address=gcomm://46.20.11.247
wsrep_node_address='217.195.204.3'
Can you share the runtime values of wsrep_cluster_address and wsrep_node_address
of all the nodes in the existing cluster?
Can you also try with wsrep_node_address left unset on the joiner node?
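The point about wsrep_cluster_address can be checked mechanically. The following is only a minimal sketch in Python, not part of Galera or MariaDB: the function names gcomm_hosts and unknown_peers are made up for illustration, the set of existing members is an assumption taken from the addresses quoted in this thread, and the parsing handles IPv4 addresses and hostnames only.

```python
def gcomm_hosts(cluster_address):
    """Return the host part of each entry in a gcomm:// address list."""
    prefix = "gcomm://"
    if not cluster_address.startswith(prefix):
        raise ValueError("expected a gcomm:// address")
    # Drop any trailing ?option=value part, then split the comma-separated list.
    rest = cluster_address[len(prefix):].split("?", 1)[0]
    hosts = []
    for entry in rest.split(","):
        entry = entry.strip()
        if entry:
            # Drop an optional :port suffix (IPv4 or hostname only).
            hosts.append(entry.split(":", 1)[0])
    return hosts


def unknown_peers(cluster_address, existing_members):
    """Hosts listed in cluster_address that are not current cluster members."""
    return [h for h in gcomm_hosts(cluster_address) if h not in existing_members]


# Assumption: only the donor (46.20.11.247) is a member of the cluster so far.
existing = {"46.20.11.247"}
print(unknown_peers("gcomm://217.195.204.3", existing))  # donor config: ['217.195.204.3']
print(unknown_peers("gcomm://46.20.11.247", existing))   # joiner config: []
```

With the values above, the donor's setting is flagged because it lists only the joiner, which is not yet part of the cluster, while the joiner's setting lists an existing member.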
Hi,
This is the donor's log file:
150310 17:34:22 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
150310 17:34:22 mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.iyJoYU' --pid-file='/var/lib/mysql/testsms1-recover.pid'
150310 17:34:25 mysqld_safe WSREP: Recovered position d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac:31
150310 17:34:25 [Note] WSREP: wsrep_start_position var submitted: 'd2e0cc7e-c66c-11e4-9640-9baeb4ba3dac:31'
150310 17:34:25 [Note] WSREP: Setting wsrep_ready to 0
150310 17:34:25 [Note] WSREP: Read nil XID from storage engines, skipping position init
150310 17:34:25 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
150310 17:34:25 [Note] WSREP: wsrep_load(): Galera 25.3.5(rXXXX) by Codership Oy <info@codership.com> loaded successfully.
150310 17:34:25 [Note] WSREP: CRC-32C: using hardware acceleration.
150310 17:34:25 [Note] WSREP: Found saved state: d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac:31
150310 17:34:25 [Note] WSREP: Passing config to GCS: base_host = 46.20.11.247; base_port = 4567; cert.log_conflicts = no; debug = no; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 1; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo = false; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = P30S; pc.weight = 1; protone
150310 17:34:25 [Note] WSREP: Service thread queue flushed.
150310 17:34:25 [Note] WSREP: Assign initial position for certification: 31, protocol version: -1
150310 17:34:25 [Note] WSREP: wsrep_sst_grab()
150310 17:34:25 [Note] WSREP: Start replication
150310 17:34:25 [Note] WSREP: 'wsrep-new-cluster' option used, bootstrapping the cluster
150310 17:34:25 [Note] WSREP: Setting initial position to d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac:31
150310 17:34:25 [Note] WSREP: protonet asio version 0
150310 17:34:25 [Note] WSREP: Using CRC-32C (optimized) for message checksums.
150310 17:34:25 [Note] WSREP: backend: asio
150310 17:34:25 [Note] WSREP: GMCast version 0
150310 17:34:25 [Note] WSREP: (eeaf10c9-c73a-11e4-9a7a-be8324bb878e, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
150310 17:34:25 [Note] WSREP: (eeaf10c9-c73a-11e4-9a7a-be8324bb878e, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
150310 17:34:25 [Note] WSREP: EVS version 0
150310 17:34:25 [Note] WSREP: PC version 0
150310 17:34:25 [Note] WSREP: gcomm: bootstrapping new group 'arescluster'
150310 17:34:25 [Note] WSREP: Node eeaf10c9-c73a-11e4-9a7a-be8324bb878e state prim
150310 17:34:25 [Note] WSREP: view(view_id(PRIM,eeaf10c9-c73a-11e4-9a7a-be8324bb878e,1) memb {
eeaf10c9-c73a-11e4-9a7a-be8324bb878e,0
} joined {
} left {
} partitioned {
})
150310 17:34:25 [Note] WSREP: discarding pending addr without UUID: tcp://217.195.204.3:4567
150310 17:34:25 [Note] WSREP: discarding pending addr proto entry 0x7f6d09cbc380
150310 17:34:25 [Note] WSREP: gcomm: connected
150310 17:34:25 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
150310 17:34:25 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
150310 17:34:25 [Note] WSREP: Opened channel 'arescluster'
150310 17:34:25 [Note] WSREP: Waiting for SST to complete.
150310 17:34:25 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1
150310 17:34:25 [Note] WSREP: STATE_EXCHANGE: sent state UUID: eeafdd20-c73a-11e4-8a4c-fba6a590f580
150310 17:34:25 [Note] WSREP: STATE EXCHANGE: sent state msg: eeafdd20-c73a-11e4-8a4c-fba6a590f580
150310 17:34:25 [Note] WSREP: STATE EXCHANGE: got state msg: eeafdd20-c73a-11e4-8a4c-fba6a590f580 from 0 (db1)
150310 17:34:25 [Note] WSREP: Quorum results:
version = 3,
component = PRIMARY,
conf_id = 0,
members = 1/1 (joined/total),
act_id = 31,
last_appl. = -1,
protocols = 0/5/3 (gcs/repl/appl),
group UUID = d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac
150310 17:34:25 [Note] WSREP: Flow-control interval: [16, 16]
150310 17:34:25 [Note] WSREP: Restored state OPEN -> JOINED (31)
150310 17:34:25 [Note] WSREP: New cluster view: global state: d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac:31, view# 1: Primary, number of nodes: 1, my index: 0, protocol version 3
150310 17:34:25 [Note] WSREP: SST complete, seqno: 31
150310 17:34:25 [Note] WSREP: Member 0.0 (db1) synced with group.
150310 17:34:25 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 31)
150310 17:34:25 [Note] InnoDB: Using mutexes to ref count buffer pool pages
150310 17:34:25 [Note] InnoDB: The InnoDB memory heap is disabled
150310 17:34:25 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
150310 17:34:25 [Note] InnoDB: Memory barrier is not used
150310 17:34:25 [Note] InnoDB: Compressed tables use zlib 1.2.3
150310 17:34:25 [Note] InnoDB: Using Linux native AIO
150310 17:34:25 [Note] InnoDB: Using CPU crc32 instructions
150310 17:34:25 [Note] InnoDB: Initializing buffer pool, size = 128.0M
150310 17:34:25 [Note] InnoDB: Completed initialization of buffer pool
150310 17:34:25 [Note] InnoDB: Highest supported file format is Barracuda.
150310 17:34:25 [Note] InnoDB: 128 rollback segment(s) are active.
150310 17:34:25 [Note] InnoDB: Waiting for purge to start
150310 17:34:25 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.22-71.0 started; log sequence number 1764806
150310 17:34:25 [Note] Plugin 'FEEDBACK' is disabled.
150310 17:34:25 [Note] WSREP: Initial TC log open: dummy
150310 17:34:25 [Note] Server socket created on IP: '::'.
150310 17:34:25 [Note] Event Scheduler: Loaded 0 events
150310 17:34:25 [Note] WSREP: Set WSREPXid for InnoDB: d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac:31
150310 17:34:25 [Note] /usr/sbin/mysqld: ready for connections.
Version: '10.0.16-MariaDB-wsrep' socket: '/var/lib/mysql/mysql.sock' port: 3306 MariaDB Server, wsrep_25.10.r4144
150310 17:34:25 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
150310 17:34:25 [Note] WSREP: REPL Protocols: 5 (3, 1)
150310 17:34:25 [Note] WSREP: Service thread queue flushed.
150310 17:34:25 [Note] WSREP: Assign initial position for certification: 31, protocol version: 3
150310 17:34:25 [Note] WSREP: Service thread queue flushed.
150310 17:34:25 [Note] WSREP: Synchronized with group, ready for connections
150310 17:34:25 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
150310 17:34:25 [Note] WSREP: Nobody is waiting for SST.
And this is the joiner's log. I left wsrep_node_address blank; the server starts, but it is not in sync with the donor:
150310 19:45:36 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
150310 19:45:36 mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.h6BC81' --pid-file='/var/lib/mysql/sms51-recover.pid'
150310 19:45:38 mysqld_safe WSREP: Recovered position 00000000-0000-0000-0000-000000000000:-1
150310 19:45:38 [Note] WSREP: wsrep_start_position var submitted: '00000000-0000-0000-0000-000000000000:-1'
150310 19:45:38 [Note] WSREP: Setting wsrep_ready to 0
150310 19:45:38 [Note] WSREP: Read nil XID from storage engines, skipping position init
150310 19:45:38 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
150310 19:45:38 [Note] WSREP: wsrep_load(): Galera 25.3.5(rXXXX) by Codership Oy <info@codership.com> loaded successfully.
150310 19:45:38 [Note] WSREP: CRC-32C: using hardware acceleration.
150310 19:45:38 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
150310 19:45:38 [Note] WSREP: Passing config to GCS: base_host = 217.195.204.3; base_port = 4567; cert.log_conflicts = no; debug = no; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 1; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo = false; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = P30S; pc.weight = 1; proton
150310 19:45:38 [Note] WSREP: Service thread queue flushed.
150310 19:45:38 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
150310 19:45:38 [Note] WSREP: wsrep_sst_grab()
150310 19:45:38 [Note] WSREP: Start replication
150310 19:45:38 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
150310 19:45:38 [Note] WSREP: protonet asio version 0
150310 19:45:38 [Note] WSREP: Using CRC-32C (optimized) for message checksums.
150310 19:45:38 [Note] WSREP: backend: asio
150310 19:45:38 [Note] WSREP: GMCast version 0
150310 19:45:38 [Note] WSREP: (437a9fd5-c74d-11e4-8bcd-0341ea070235, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
150310 19:45:38 [Note] WSREP: (437a9fd5-c74d-11e4-8bcd-0341ea070235, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
150310 19:45:38 [Note] WSREP: EVS version 0
150310 19:45:38 [Note] WSREP: PC version 0
150310 19:45:38 [Note] WSREP: gcomm: connecting to group 'arescluster', peer ''
150310 19:45:38 [Note] WSREP: Node 437a9fd5-c74d-11e4-8bcd-0341ea070235 state prim
150310 19:45:38 [Note] WSREP: view(view_id(PRIM,437a9fd5-c74d-11e4-8bcd-0341ea070235,1) memb {
437a9fd5-c74d-11e4-8bcd-0341ea070235,0
} joined {
} left {
} partitioned {
})
150310 19:45:38 [Note] WSREP: gcomm: connected
150310 19:45:38 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
150310 19:45:38 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
150310 19:45:38 [Note] WSREP: Opened channel 'arescluster'
150310 19:45:38 [Note] WSREP: Waiting for SST to complete.
150310 19:45:38 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1
150310 19:45:38 [Note] WSREP: Starting new group from scratch: 437aec2b-c74d-11e4-9d7c-9b550749b996
150310 19:45:38 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 437afcd3-c74d-11e4-94e7-7f30219d598d
150310 19:45:38 [Note] WSREP: STATE EXCHANGE: sent state msg: 437afcd3-c74d-11e4-94e7-7f30219d598d
150310 19:45:38 [Note] WSREP: STATE EXCHANGE: got state msg: 437afcd3-c74d-11e4-94e7-7f30219d598d from 0 (db51)
150310 19:45:38 [Note] WSREP: Quorum results:
version = 3,
component = PRIMARY,
conf_id = 0,
members = 1/1 (joined/total),
act_id = 0,
last_appl. = -1,
protocols = 0/5/3 (gcs/repl/appl),
group UUID = 437aec2b-c74d-11e4-9d7c-9b550749b996
150310 19:45:38 [Note] WSREP: Flow-control interval: [16, 16]
150310 19:45:38 [Note] WSREP: Restored state OPEN -> JOINED (0)
150310 19:45:38 [Note] WSREP: New cluster view: global state: 437aec2b-c74d-11e4-9d7c-9b550749b996:0, view# 1: Primary, number of nodes: 1, my index: 0, protocol version 3
150310 19:45:38 [Note] WSREP: SST complete, seqno: 0
150310 19:45:38 [Note] WSREP: Member 0.0 (db51) synced with group.
150310 19:45:38 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 0)
150310 19:45:38 [Note] InnoDB: Using mutexes to ref count buffer pool pages
150310 19:45:38 [Note] InnoDB: The InnoDB memory heap is disabled
150310 19:45:38 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
150310 19:45:38 [Note] InnoDB: Memory barrier is not used
150310 19:45:38 [Note] InnoDB: Compressed tables use zlib 1.2.3
150310 19:45:38 [Note] InnoDB: Using Linux native AIO
150310 19:45:38 [Note] InnoDB: Using CPU crc32 instructions
150310 19:45:38 [Note] InnoDB: Initializing buffer pool, size = 128.0M
150310 19:45:38 [Note] InnoDB: Completed initialization of buffer pool
150310 19:45:38 [Note] InnoDB: Highest supported file format is Barracuda.
150310 19:45:38 [Note] InnoDB: 128 rollback segment(s) are active.
150310 19:45:38 [Note] InnoDB: Waiting for purge to start
150310 19:45:38 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.22-71.0 started; log sequence number 1617962
150310 19:45:38 [Note] Plugin 'FEEDBACK' is disabled.
150310 19:45:38 [Note] WSREP: Initial TC log open: dummy
150310 19:45:38 [Note] Server socket created on IP: '::'.
150310 19:45:38 [Note] Event Scheduler: Loaded 0 events
150310 19:45:38 [Note] WSREP: Set WSREPXid for InnoDB: 437aec2b-c74d-11e4-9d7c-9b550749b996:0
150310 19:45:38 [Note] /usr/sbin/mysqld: ready for connections.
Version: '10.0.16-MariaDB-wsrep' socket: '/var/lib/mysql/mysql.sock' port: 3306 MariaDB Server, wsrep_25.10.r4144
150310 19:45:38 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
150310 19:45:38 [Note] WSREP: REPL Protocols: 5 (3, 1)
150310 19:45:38 [Note] WSREP: Service thread queue flushed.
150310 19:45:38 [Note] WSREP: Assign initial position for certification: 0, protocol version: 3
150310 19:45:38 [Note] WSREP: Service thread queue flushed.
150310 19:45:38 [Note] WSREP: Synchronized with group, ready for connections
150310 19:45:38 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
150310 19:45:38 [Note] WSREP: Nobody is waiting for SST.
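One detail visible in the two logs above: the donor reports group UUID d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac, while the joiner logs "Starting new group from scratch" and reports group UUID 437aec2b-c74d-11e4-9d7c-9b550749b996, so the two servers appear to be running as separate one-node clusters. A minimal Python sketch for comparing the group UUIDs in two log files follows; the helper name group_uuids is made up for illustration, and the two sample strings are shortened stand-ins for the full logs.

```python
import re

# Matches lines such as "group UUID = d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac"
GROUP_UUID = re.compile(r"group UUID\s*=\s*([0-9a-f-]{36})")


def group_uuids(log_text):
    """Return the distinct group UUIDs found in a log, in order of appearance."""
    seen = []
    for match in GROUP_UUID.finditer(log_text):
        if match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# Shortened stand-ins for the donor and joiner logs quoted in this thread.
donor_log = "group UUID = d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac"
joiner_log = "group UUID = 437aec2b-c74d-11e4-9d7c-9b550749b996"

same_cluster = set(group_uuids(donor_log)) == set(group_uuids(joiner_log))
print(same_cluster)  # False
```

Nodes that belong to the same cluster report the same group UUID, so a mismatch like this one suggests the joiner never connected to the donor in that run.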
I think I wasn't clear enough in my last comment: let's not use 'wsrep_node_address' at all in the joiner's config file (just comment it out).
By doing so, we will let mysqld guess the node's address and set it.
Also, please post the runtime values of 'wsrep_cluster_address' and 'wsrep_node_address'.
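"Runtime values" here means the values the running server reports, as opposed to what is written in the config files. One way to collect them is to run a SHOW GLOBAL VARIABLES query through the mysql command-line client. The Python sketch below is only an illustration under assumptions: the function names are made up, the mysql client is assumed to be on the PATH, and any credentials have to be supplied through mysql_args.

```python
import subprocess

QUERY = (
    "SHOW GLOBAL VARIABLES WHERE Variable_name IN "
    "('wsrep_cluster_address', 'wsrep_node_address')"
)


def parse_variables(output):
    """Parse 'name<TAB>value' lines as printed by mysql -N -B."""
    values = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        name, _, value = line.partition("\t")
        values[name] = value
    return values


def runtime_wsrep_addresses(mysql_args=("mysql",)):
    """Run QUERY through the mysql client and return the values as a dict.

    mysql_args is a hypothetical hook for extra client options,
    for example ("mysql", "-u", "root").
    """
    # -N skips column names, -B selects tab-separated batch output.
    cmd = list(mysql_args) + ["-N", "-B", "-e", QUERY]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return parse_variables(result.stdout)


# Example of the parsing step on sample client output (values from this thread).
sample = "wsrep_cluster_address\tgcomm://46.20.11.247\nwsrep_node_address\t\n"
print(parse_variables(sample))
```

Running runtime_wsrep_addresses() on each node and comparing the results with the config files shows whether the server picked up the intended settings; an empty wsrep_node_address means the server guessed its own address.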
Hi,
Yes, it's my fault: you were talking about wsrep_cluster_address at the top, but at the bottom you said wsrep_node_address. I don't clearly understand what the runtime values are. Do you mean the config files?

The logs and config files below are from my test servers, just 2 nodes that I set up. The donor is in the same datacenter as the working 3 nodes and is synced. My production databases and tables are on the donor node. The joiner is in a different datacenter.

I have only been able to run the joiner in two ways. In the first, I copied aria_log.00000001, aria_log_control, galera.cache, grastate.dat, ib_logfile0, ib_logfile1, ibdata1 and rsync_sst_complete from a working node, and after a reboot mysql started normally. I checked the sync and it was working: table additions, deletes and inserts were applied on the joiner node. But 1 day later, when I rebooted, mysql started an SST and it failed. In the second, I set up the joiner node in datacenter A, started mysql and then moved it to datacenter B. It worked again, but when I rebooted a few hours later it started an SST and failed again.

My joiner config file, [mariadb] section:
[mariadb]
query_cache_size=0
binlog_format=ROW
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address=gcomm://46.20.11.247,217.195.204.3
wsrep_cluster_name='arescluster'
wsrep_node_name='db51'
wsrep_sst_method=rsync
wsrep_sst_auth=root:FQvhBrEycz
wsrep_debug=On
Joiner log file:
150311 11:18:48 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
150311 11:18:48 mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.D5PU8c' --pid-file='/var/lib/mysql/sms51-recover.pid'
150311 11:18:50 mysqld_safe WSREP: Recovered position 437aec2b-c74d-11e4-9d7c-9b550749b996:0
150311 11:18:50 [Note] WSREP: wsrep_start_position var submitted: '437aec2b-c74d-11e4-9d7c-9b550749b996:0'
150311 11:18:50 [Note] WSREP: Setting wsrep_ready to 0
150311 11:18:50 [Note] WSREP: Read nil XID from storage engines, skipping position init
150311 11:18:50 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
150311 11:18:50 [Note] WSREP: wsrep_load(): Galera 25.3.5(rXXXX) by Codership Oy <info@codership.com> loaded successfully.
150311 11:18:50 [Note] WSREP: CRC-32C: using hardware acceleration.
150311 11:18:50 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
150311 11:18:50 [Note] WSREP: Passing config to GCS: base_host = 192.168.1.1; base_port = 4567; cert.log_conflicts = no; debug = no; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 1; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo = false; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = P30S; pc.weight = 1; protonet
150311 11:18:50 [Note] WSREP: Service thread queue flushed.
150311 11:18:50 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
150311 11:18:50 [Note] WSREP: wsrep_sst_grab()
150311 11:18:50 [Note] WSREP: Start replication
150311 11:18:50 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
150311 11:18:50 [Note] WSREP: protonet asio version 0
150311 11:18:50 [Note] WSREP: Using CRC-32C (optimized) for message checksums.
150311 11:18:50 [Note] WSREP: backend: asio
150311 11:18:50 [Note] WSREP: GMCast version 0
150311 11:18:50 [Note] WSREP: (a14a65da-c7cf-11e4-9e6a-4716280818a2, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
150311 11:18:50 [Note] WSREP: (a14a65da-c7cf-11e4-9e6a-4716280818a2, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
150311 11:18:50 [Note] WSREP: EVS version 0
150311 11:18:50 [Note] WSREP: PC version 0
150311 11:18:50 [Note] WSREP: gcomm: connecting to group 'arescluster', peer '46.20.11.247:,217.195.204.3:'
150311 11:18:50 [Warning] WSREP: (a14a65da-c7cf-11e4-9e6a-4716280818a2, 'tcp://0.0.0.0:4567') address 'tcp://217.195.204.3:4567' points to own listening address, blacklisting
150311 11:18:50 [Note] WSREP: (a14a65da-c7cf-11e4-9e6a-4716280818a2, 'tcp://0.0.0.0:4567') address 'tcp://217.195.204.3:4567' pointing to uuid a14a65da-c7cf-11e4-9e6a-4716280818a2 is blacklisted, skipping
150311 11:18:52 [Note] WSREP: declaring 1ab47b34-c7bd-11e4-becb-c6c739f575b5 stable
150311 11:18:53 [Note] WSREP: Node 1ab47b34-c7bd-11e4-becb-c6c739f575b5 state prim
150311 11:18:53 [Note] WSREP: view(view_id(PRIM,1ab47b34-c7bd-11e4-becb-c6c739f575b5,2) memb {
1ab47b34-c7bd-11e4-becb-c6c739f575b5,0
a14a65da-c7cf-11e4-9e6a-4716280818a2,0
} joined {
} left {
} partitioned {
})
150311 11:18:54 [Note] WSREP: gcomm: connected
150311 11:18:54 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
150311 11:18:54 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
150311 11:18:54 [Note] WSREP: Opened channel 'arescluster'
150311 11:18:54 [Note] WSREP: Waiting for SST to complete.
150311 11:18:54 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
150311 11:18:54 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
150311 11:18:54 [Note] WSREP: STATE EXCHANGE: sent state msg: 51bfcc40-c7bd-11e4-b397-27449ac6ea48
150311 11:18:54 [Note] WSREP: STATE EXCHANGE: got state msg: 51bfcc40-c7bd-11e4-b397-27449ac6ea48 from 0 (db1)
150311 11:18:54 [Note] WSREP: STATE EXCHANGE: got state msg: 51bfcc40-c7bd-11e4-b397-27449ac6ea48 from 1 (db51)
150311 11:18:54 [Note] WSREP: Quorum results:
version = 3,
component = PRIMARY,
conf_id = 1,
members = 1/2 (joined/total),
act_id = 31,
last_appl. = -1,
protocols = 0/5/3 (gcs/repl/appl),
group UUID = d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac
150311 11:18:54 [Note] WSREP: Flow-control interval: [23, 23]
150311 11:18:54 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 31)
150311 11:18:54 [Note] WSREP: State transfer required:
Group state: d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac:31
Local state: 00000000-0000-0000-0000-000000000000:-1
150311 11:18:54 [Note] WSREP: New cluster view: global state: d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac:31, view# 2: Primary, number of nodes: 2, my index: 1, protocol version 3
150311 11:18:54 [Warning] WSREP: Gap in state sequence. Need state transfer.
150311 11:18:54 [Note] WSREP: Setting wsrep_ready to 0
150311 11:18:54 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '192.168.1.1' --auth 'root:FQvhBrEycz' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --parent '15051' '' '
150311 11:18:54 [Note] WSREP: Prepared SST request: rsync|192.168.1.1:4444/rsync_sst
150311 11:18:54 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
150311 11:18:54 [Note] WSREP: REPL Protocols: 5 (3, 1)
150311 11:18:54 [Note] WSREP: Service thread queue flushed.
150311 11:18:54 [Note] WSREP: Assign initial position for certification: 31, protocol version: 3
150311 11:18:54 [Note] WSREP: Service thread queue flushed.
150311 11:18:54 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac): 1 (Operation not permitted)
at galera/src/replicator_str.cpp:prepare_for_IST():447. IST will be unavailable.
150311 11:18:54 [Note] WSREP: Member 1.0 (db51) requested state transfer from 'any'. Selected 0.0 (db1)(SYNCED) as donor.
150311 11:18:54 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 31)
150311 11:18:54 [Note] WSREP: Requesting state transfer: success, donor: 0
150311 11:19:28 [Note] WSREP: (a14a65da-c7cf-11e4-9e6a-4716280818a2, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://46.20.11.247:4567
150311 11:19:29 [Note] WSREP: (a14a65da-c7cf-11e4-9e6a-4716280818a2, 'tcp://0.0.0.0:4567') reconnecting to 1ab47b34-c7bd-11e4-becb-c6c739f575b5 (tcp://46.20.11.247:4567), attempt 0
150311 11:19:29 [Note] WSREP: evs::proto(a14a65da-c7cf-11e4-9e6a-4716280818a2, OPERATIONAL, view_id(REG,1ab47b34-c7bd-11e4-becb-c6c739f575b5,2)) suspecting node: 1ab47b34-c7bd-11e4-becb-c6c739f575b5
150311 11:19:30 [Note] WSREP: evs::proto(a14a65da-c7cf-11e4-9e6a-4716280818a2, GATHER, view_id(REG,1ab47b34-c7bd-11e4-becb-c6c739f575b5,2)) suspecting node: 1ab47b34-c7bd-11e4-becb-c6c739f575b5
150311 11:19:30 [Note] WSREP: evs::proto(a14a65da-c7cf-11e4-9e6a-4716280818a2, GATHER, view_id(REG,1ab47b34-c7bd-11e4-becb-c6c739f575b5,2)) suspecting node: 1ab47b34-c7bd-11e4-becb-c6c739f575b5
150311 11:19:31 [Note] WSREP: (a14a65da-c7cf-11e4-9e6a-4716280818a2, 'tcp://0.0.0.0:4567') address 'tcp://217.195.204.3:4567' pointing to uuid a14a65da-c7cf-11e4-9e6a-4716280818a2 is blacklisted, skipping
150311 11:19:31 [Note] WSREP: (a14a65da-c7cf-11e4-9e6a-4716280818a2, 'tcp://0.0.0.0:4567') turning message relay requesting off
150311 11:19:31 [Note] WSREP: evs::proto(a14a65da-c7cf-11e4-9e6a-4716280818a2, GATHER, view_id(REG,1ab47b34-c7bd-11e4-becb-c6c739f575b5,2)) suspecting node: 1ab47b34-c7bd-11e4-becb-c6c739f575b5
150311 11:19:35 [Warning] WSREP: subsequent views have same members, prev view view(view_id(REG,1ab47b34-c7bd-11e4-becb-c6c739f575b5,2) memb {
1ab47b34-c7bd-11e4-becb-c6c739f575b5,0
a14a65da-c7cf-11e4-9e6a-4716280818a2,0
} joined {
} left {
} partitioned {
}) current view view(view_id(REG,1ab47b34-c7bd-11e4-becb-c6c739f575b5,5) memb {
1ab47b34-c7bd-11e4-becb-c6c739f575b5,0
a14a65da-c7cf-11e4-9e6a-4716280818a2,0
} joined {
} left {
} partitioned {
})
150311 11:19:35 [Note] WSREP: declaring 1ab47b34-c7bd-11e4-becb-c6c739f575b5 stable
150311 11:19:35 [Note] WSREP: Node 1ab47b34-c7bd-11e4-becb-c6c739f575b5 state prim
150311 11:19:35 [Note] WSREP: view(view_id(PRIM,1ab47b34-c7bd-11e4-becb-c6c739f575b5,5) memb {
1ab47b34-c7bd-11e4-becb-c6c739f575b5,0
a14a65da-c7cf-11e4-9e6a-4716280818a2,0
} joined {
} left {
} partitioned {
})
150311 11:19:35 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
150311 11:19:35 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
150311 11:19:35 [Note] WSREP: STATE EXCHANGE: sent state msg: 6af140ad-c7bd-11e4-a103-8f5301f46cce
150311 11:19:36 [Note] WSREP: STATE EXCHANGE: got state msg: 6af140ad-c7bd-11e4-a103-8f5301f46cce from 0 (db1)
150311 11:19:36 [Note] WSREP: STATE EXCHANGE: got state msg: 6af140ad-c7bd-11e4-a103-8f5301f46cce from 1 (db51)
150311 11:19:36 [Note] WSREP: Quorum results:
version = 3,
component = PRIMARY,
conf_id = 2,
members = 1/2 (joined/total),
act_id = 31,
last_appl. = 0,
protocols = 0/5/3 (gcs/repl/appl),
group UUID = d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac
150311 11:19:36 [Note] WSREP: Flow-control interval: [23, 23]
150311 11:19:57 [Note] WSREP: (a14a65da-c7cf-11e4-9e6a-4716280818a2, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://46.20.11.247:4567
150311 11:19:58 [Note] WSREP: (a14a65da-c7cf-11e4-9e6a-4716280818a2, 'tcp://0.0.0.0:4567') reconnecting to 1ab47b34-c7bd-11e4-becb-c6c739f575b5 (tcp://46.20.11.247:4567), attempt 0
150311 11:19:59 [Note] WSREP: evs::proto(a14a65da-c7cf-11e4-9e6a-4716280818a2, OPERATIONAL, view_id(REG,1ab47b34-c7bd-11e4-becb-c6c739f575b5,5)) suspecting node: 1ab47b34-c7bd-11e4-becb-c6c739f575b5
150311 11:19:59 [Note] WSREP: (a14a65da-c7cf-11e4-9e6a-4716280818a2, 'tcp://0.0.0.0:4567') address 'tcp://217.195.204.3:4567' pointing to uuid a14a65da-c7cf-11e4-9e6a-4716280818a2 is blacklisted, skipping
150311 11:19:59 [Note] WSREP: (a14a65da-c7cf-11e4-9e6a-4716280818a2, 'tcp://0.0.0.0:4567') turning message relay requesting off
150311 11:19:59 [Warning] WSREP: 0.0 (db1): State transfer to 1.0 (db51) failed: -255 (Unknown error 255)
150311 11:19:59 [ERROR] WSREP: gcs/src/gcs_group.c:gcs_group_handle_join_msg():723: Will never receive state. Need to abort.
150311 11:19:59 [Note] WSREP: gcomm: terminating thread
150311 11:19:59 [Note] WSREP: gcomm: joining thread
150311 11:19:59 [Note] WSREP: gcomm: closing backend
150311 11:20:00 [Warning] WSREP: subsequent views have same members, prev view view(view_id(REG,1ab47b34-c7bd-11e4-becb-c6c739f575b5,5) memb {
1ab47b34-c7bd-11e4-becb-c6c739f575b5,0
a14a65da-c7cf-11e4-9e6a-4716280818a2,0
} joined {
} left {
} partitioned {
}) current view view(view_id(REG,1ab47b34-c7bd-11e4-becb-c6c739f575b5,6) memb {
1ab47b34-c7bd-11e4-becb-c6c739f575b5,0
a14a65da-c7cf-11e4-9e6a-4716280818a2,0
} joined {
} left {
} partitioned {
})
150311 11:20:00 [Note] WSREP: declaring 1ab47b34-c7bd-11e4-becb-c6c739f575b5 stable
150311 11:20:00 [Note] WSREP: Node 1ab47b34-c7bd-11e4-becb-c6c739f575b5 state prim
150311 11:20:00 [Note] WSREP: view(view_id(NON_PRIM,1ab47b34-c7bd-11e4-becb-c6c739f575b5,6) memb {
a14a65da-c7cf-11e4-9e6a-4716280818a2,0
} joined {
} left {
} partitioned {
1ab47b34-c7bd-11e4-becb-c6c739f575b5,0
})
150311 11:20:00 [Note] WSREP: view((empty))
150311 11:20:00 [Note] WSREP: gcomm: closed
150311 11:20:00 [Note] WSREP: /usr/sbin/mysqld: Terminated.
150311 11:20:00 mysqld_safe mysqld from pid file /var/lib/mysql/sms51.pid ended
WSREP_SST: [ERROR] Parent mysqld process (PID:15051) terminated unexpectedly. (20150311 11:20:01.165)
WSREP_SST: [INFO] Joiner cleanup. (20150311 11:20:01.167)
WSREP_SST: [INFO] Joiner cleanup done. (20150311 11:20:01.673)
donor config file
[mariadb]
query_cache_size=0
binlog_format=ROW
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address=gcomm://217.195.204.3,46.20.11.247
wsrep_cluster_name='arescluster'
wsrep_node_address='46.20.11.247'
wsrep_node_name='db1'
wsrep_sst_method=rsync
wsrep_sst_auth=root:FQvhBrEycz
wsrep_debug=On
donor log file
150311 09:06:11 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
150311 09:06:11 mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.9NV3rJ' --pid-file='/var/lib/mysql/testsms1-recover.pid'
150311 09:06:13 mysqld_safe WSREP: Recovered position d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac:31
150311 9:06:13 [Note] WSREP: wsrep_start_position var submitted: 'd2e0cc7e-c66c-11e4-9640-9baeb4ba3dac:31'
150311 9:06:13 [Note] WSREP: Setting wsrep_ready to 0
150311 9:06:13 [Note] WSREP: Read nil XID from storage engines, skipping position init
150311 9:06:13 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
150311 9:06:13 [Note] WSREP: wsrep_load(): Galera 25.3.5(rXXXX) by Codership Oy <info@codership.com> loaded successfully.
150311 9:06:13 [Note] WSREP: CRC-32C: using hardware acceleration.
150311 9:06:13 [Note] WSREP: Found saved state: d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac:31
150311 9:06:13 [Note] WSREP: Passing config to GCS: base_host = 46.20.11.247; base_port = 4567; cert.log_conflicts = no; debug = no; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 1; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo = false; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = P30S; pc.weight = 1; protone
150311 9:06:13 [Note] WSREP: Service thread queue flushed.
150311 9:06:13 [Note] WSREP: Assign initial position for certification: 31, protocol version: -1
150311 9:06:13 [Note] WSREP: wsrep_sst_grab()
150311 9:06:13 [Note] WSREP: Start replication
150311 9:06:13 [Note] WSREP: 'wsrep-new-cluster' option used, bootstrapping the cluster
150311 9:06:13 [Note] WSREP: Setting initial position to d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac:31
150311 9:06:13 [Note] WSREP: protonet asio version 0
150311 9:06:13 [Note] WSREP: Using CRC-32C (optimized) for message checksums.
150311 9:06:13 [Note] WSREP: backend: asio
150311 9:06:13 [Note] WSREP: GMCast version 0
150311 9:06:13 [Note] WSREP: (1ab47b34-c7bd-11e4-becb-c6c739f575b5, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
150311 9:06:13 [Note] WSREP: (1ab47b34-c7bd-11e4-becb-c6c739f575b5, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
150311 9:06:13 [Note] WSREP: EVS version 0
150311 9:06:13 [Note] WSREP: PC version 0
150311 9:06:13 [Note] WSREP: gcomm: bootstrapping new group 'arescluster'
150311 9:06:13 [Note] WSREP: Node 1ab47b34-c7bd-11e4-becb-c6c739f575b5 state prim
150311 9:06:13 [Note] WSREP: view(view_id(PRIM,1ab47b34-c7bd-11e4-becb-c6c739f575b5,1) memb {
1ab47b34-c7bd-11e4-becb-c6c739f575b5,0
} joined {
} left {
} partitioned {
})
150311 9:06:13 [Note] WSREP: discarding pending addr without UUID: tcp://217.195.204.3:4567
150311 9:06:13 [Note] WSREP: discarding pending addr proto entry 0x7fc0444bc380
150311 9:06:13 [Note] WSREP: discarding pending addr without UUID: tcp://46.20.11.247:4567
150311 9:06:13 [Note] WSREP: discarding pending addr proto entry 0x7fc0444bc440
150311 9:06:13 [Note] WSREP: gcomm: connected
150311 9:06:13 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
150311 9:06:13 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
150311 9:06:13 [Note] WSREP: Opened channel 'arescluster'
150311 9:06:13 [Note] WSREP: Waiting for SST to complete.
150311 9:06:13 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1
150311 9:06:13 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 1ab56617-c7bd-11e4-8a89-ee67ad422b2c
150311 9:06:13 [Note] WSREP: STATE EXCHANGE: sent state msg: 1ab56617-c7bd-11e4-8a89-ee67ad422b2c
150311 9:06:13 [Note] WSREP: STATE EXCHANGE: got state msg: 1ab56617-c7bd-11e4-8a89-ee67ad422b2c from 0 (db1)
150311 9:06:13 [Note] WSREP: Quorum results:
version = 3,
component = PRIMARY,
conf_id = 0,
members = 1/1 (joined/total),
act_id = 31,
last_appl. = -1,
protocols = 0/5/3 (gcs/repl/appl),
group UUID = d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac
150311 9:06:13 [Note] WSREP: Flow-control interval: [16, 16]
150311 9:06:13 [Note] WSREP: Restored state OPEN -> JOINED (31)
150311 9:06:13 [Note] WSREP: New cluster view: global state: d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac:31, view# 1: Primary, number of nodes: 1, my index: 0, protocol version 3
150311 9:06:13 [Note] WSREP: SST complete, seqno: 31
150311 9:06:13 [Note] WSREP: Member 0.0 (db1) synced with group.
150311 9:06:13 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 31)
150311 9:06:13 [Note] InnoDB: Using mutexes to ref count buffer pool pages
150311 9:06:13 [Note] InnoDB: The InnoDB memory heap is disabled
150311 9:06:13 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
150311 9:06:13 [Note] InnoDB: Memory barrier is not used
150311 9:06:13 [Note] InnoDB: Compressed tables use zlib 1.2.3
150311 9:06:13 [Note] InnoDB: Using Linux native AIO
150311 9:06:13 [Note] InnoDB: Using CPU crc32 instructions
150311 9:06:13 [Note] InnoDB: Initializing buffer pool, size = 128.0M
150311 9:06:13 [Note] InnoDB: Completed initialization of buffer pool
150311 9:06:13 [Note] InnoDB: Highest supported file format is Barracuda.
150311 9:06:13 [Note] InnoDB: 128 rollback segment(s) are active.
150311 9:06:13 [Note] InnoDB: Waiting for purge to start
150311 9:06:14 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.22-71.0 started; log sequence number 1765170
150311 9:06:14 [Note] Plugin 'FEEDBACK' is disabled.
150311 9:06:14 [Note] WSREP: Initial TC log open: dummy
150311 9:06:14 [Note] Server socket created on IP: '::'.
150311 9:06:14 [Note] Event Scheduler: Loaded 0 events
150311 9:06:14 [Note] WSREP: Set WSREPXid for InnoDB: d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac:31
150311 9:06:14 [Note] /usr/sbin/mysqld: ready for connections.
Version: '10.0.16-MariaDB-wsrep' socket: '/var/lib/mysql/mysql.sock' port: 3306 MariaDB Server, wsrep_25.10.r4144
150311 9:06:14 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
150311 9:06:14 [Note] WSREP: REPL Protocols: 5 (3, 1)
150311 9:06:14 [Note] WSREP: Service thread queue flushed.
150311 9:06:14 [Note] WSREP: Assign initial position for certification: 31, protocol version: 3
150311 9:06:14 [Note] WSREP: Service thread queue flushed.
150311 9:06:14 [Note] WSREP: Synchronized with group, ready for connections
150311 9:06:14 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
150311 9:06:14 [Note] WSREP: Nobody is waiting for SST.
150311 9:07:45 [Note] WSREP: declaring a14a65da-c7cf-11e4-9e6a-4716280818a2 stable
150311 9:07:45 [Note] WSREP: Node 1ab47b34-c7bd-11e4-becb-c6c739f575b5 state prim
150311 9:07:46 [Note] WSREP: view(view_id(PRIM,1ab47b34-c7bd-11e4-becb-c6c739f575b5,2) memb {
1ab47b34-c7bd-11e4-becb-c6c739f575b5,0
a14a65da-c7cf-11e4-9e6a-4716280818a2,0
} joined {
} left {
} partitioned {
})
150311 9:07:46 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 2
150311 9:07:46 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 51bfcc40-c7bd-11e4-b397-27449ac6ea48
150311 9:07:46 [Note] WSREP: STATE EXCHANGE: sent state msg: 51bfcc40-c7bd-11e4-b397-27449ac6ea48
150311 9:07:47 [Note] WSREP: STATE EXCHANGE: got state msg: 51bfcc40-c7bd-11e4-b397-27449ac6ea48 from 0 (db1)
150311 9:07:47 [Note] WSREP: STATE EXCHANGE: got state msg: 51bfcc40-c7bd-11e4-b397-27449ac6ea48 from 1 (db51)
150311 9:07:47 [Note] WSREP: Quorum results:
version = 3,
component = PRIMARY,
conf_id = 1,
members = 1/2 (joined/total),
act_id = 31,
last_appl. = 0,
protocols = 0/5/3 (gcs/repl/appl),
group UUID = d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac
150311 9:07:47 [Note] WSREP: Flow-control interval: [23, 23]
150311 9:07:47 [Note] WSREP: New cluster view: global state: d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac:31, view# 2: Primary, number of nodes: 2, my index: 0, protocol version 3
150311 9:07:47 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
150311 9:07:47 [Note] WSREP: REPL Protocols: 5 (3, 1)
150311 9:07:47 [Note] WSREP: Service thread queue flushed.
150311 9:07:47 [Note] WSREP: Assign initial position for certification: 31, protocol version: 3
150311 9:07:47 [Note] WSREP: Service thread queue flushed.
150311 9:07:47 [Note] WSREP: Member 1.0 (db51) requested state transfer from 'any'. Selected 0.0 (db1)(SYNCED) as donor.
150311 9:07:47 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 31)
150311 9:07:47 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
150311 9:07:47 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'donor' --address '192.168.1.1:4444/rsync_sst' --auth 'root:FQvhBrEycz' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' '' --gtid 'd2e0cc7e-c66c-11e4-9640-9baeb4ba3dac:31''
150311 9:07:47 [Note] WSREP: sst_donor_thread signaled with 0
150311 9:07:47 [Note] WSREP: Flushing tables for SST...
150311 9:07:47 [Note] WSREP: Provider paused at d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac:31 (5)
150311 9:07:47 [Note] WSREP: Tables flushed.
150311 9:08:20 [Note] WSREP: (1ab47b34-c7bd-11e4-becb-c6c739f575b5, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://217.195.204.3:4567
150311 9:08:21 [Note] WSREP: (1ab47b34-c7bd-11e4-becb-c6c739f575b5, 'tcp://0.0.0.0:4567') reconnecting to a14a65da-c7cf-11e4-9e6a-4716280818a2 (tcp://217.195.204.3:4567), attempt 0
150311 9:08:22 [Note] WSREP: evs::proto(1ab47b34-c7bd-11e4-becb-c6c739f575b5, OPERATIONAL, view_id(REG,1ab47b34-c7bd-11e4-becb-c6c739f575b5,2)) suspecting node: a14a65da-c7cf-11e4-9e6a-4716280818a2
150311 9:08:23 [Note] WSREP: evs::proto(1ab47b34-c7bd-11e4-becb-c6c739f575b5, GATHER, view_id(REG,1ab47b34-c7bd-11e4-becb-c6c739f575b5,2)) suspecting node: a14a65da-c7cf-11e4-9e6a-4716280818a2
150311 9:08:23 [Note] WSREP: evs::proto(1ab47b34-c7bd-11e4-becb-c6c739f575b5, GATHER, view_id(REG,1ab47b34-c7bd-11e4-becb-c6c739f575b5,2)) suspecting node: a14a65da-c7cf-11e4-9e6a-4716280818a2
150311 9:08:24 [Note] WSREP: evs::proto(1ab47b34-c7bd-11e4-becb-c6c739f575b5, GATHER, view_id(REG,1ab47b34-c7bd-11e4-becb-c6c739f575b5,2)) suspecting node: a14a65da-c7cf-11e4-9e6a-4716280818a2
150311 9:08:24 [Note] WSREP: (1ab47b34-c7bd-11e4-becb-c6c739f575b5, 'tcp://0.0.0.0:4567') turning message relay requesting off
150311 9:08:24 [Note] WSREP: evs::proto(1ab47b34-c7bd-11e4-becb-c6c739f575b5, GATHER, view_id(REG,1ab47b34-c7bd-11e4-becb-c6c739f575b5,2)) suspecting node: a14a65da-c7cf-11e4-9e6a-4716280818a2
150311 9:08:28 [Warning] WSREP: subsequent views have same members, prev view view(view_id(REG,1ab47b34-c7bd-11e4-becb-c6c739f575b5,2) memb {
1ab47b34-c7bd-11e4-becb-c6c739f575b5,0
a14a65da-c7cf-11e4-9e6a-4716280818a2,0
} joined {
} left {
} partitioned {
}) current view view(view_id(REG,1ab47b34-c7bd-11e4-becb-c6c739f575b5,5) memb {
1ab47b34-c7bd-11e4-becb-c6c739f575b5,0
a14a65da-c7cf-11e4-9e6a-4716280818a2,0
} joined {
} left {
} partitioned {
})
150311 9:08:28 [Note] WSREP: declaring a14a65da-c7cf-11e4-9e6a-4716280818a2 stable
150311 9:08:28 [Note] WSREP: Node 1ab47b34-c7bd-11e4-becb-c6c739f575b5 state prim
150311 9:08:28 [Note] WSREP: view(view_id(PRIM,1ab47b34-c7bd-11e4-becb-c6c739f575b5,5) memb {
1ab47b34-c7bd-11e4-becb-c6c739f575b5,0
a14a65da-c7cf-11e4-9e6a-4716280818a2,0
} joined {
} left {
} partitioned {
})
150311 9:08:28 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 2
150311 9:08:28 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 6af140ad-c7bd-11e4-a103-8f5301f46cce
150311 9:08:28 [Note] WSREP: STATE EXCHANGE: sent state msg: 6af140ad-c7bd-11e4-a103-8f5301f46cce
150311 9:08:28 [Note] WSREP: STATE EXCHANGE: got state msg: 6af140ad-c7bd-11e4-a103-8f5301f46cce from 0 (db1)
150311 9:08:28 [Note] WSREP: STATE EXCHANGE: got state msg: 6af140ad-c7bd-11e4-a103-8f5301f46cce from 1 (db51)
150311 9:08:28 [Note] WSREP: Quorum results:
version = 3,
component = PRIMARY,
conf_id = 2,
members = 1/2 (joined/total),
act_id = 31,
last_appl. = 0,
protocols = 0/5/3 (gcs/repl/appl),
group UUID = d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac
150311 9:08:28 [Note] WSREP: Flow-control interval: [23, 23]
150311 9:08:49 [Note] WSREP: (1ab47b34-c7bd-11e4-becb-c6c739f575b5, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://217.195.204.3:4567
150311 9:08:50 [Note] WSREP: (1ab47b34-c7bd-11e4-becb-c6c739f575b5, 'tcp://0.0.0.0:4567') reconnecting to a14a65da-c7cf-11e4-9e6a-4716280818a2 (tcp://217.195.204.3:4567), attempt 0
rsync: failed to connect to 192.168.1.1 (192.168.1.1): Connection timed out (110)
rsync error: error in socket IO (code 10) at clientserver.c(128) [sender=3.1.1]
WSREP_SST: [ERROR] rsync returned code 10: (20150311 09:08:50.963)
150311 9:08:50 [ERROR] WSREP: Failed to read from: wsrep_sst_rsync --role 'donor' --address '192.168.1.1:4444/rsync_sst' --auth 'root:FQvhBrEycz' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' '' --gtid 'd2e0cc7e-c66c-11e4-9640-9baeb4ba3dac:31'
150311 9:08:50 [ERROR] WSREP: Process completed with error: wsrep_sst_rsync --role 'donor' --address '192.168.1.1:4444/rsync_sst' --auth 'root:FQvhBrEycz' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' '' --gtid 'd2e0cc7e-c66c-11e4-9640-9baeb4ba3dac:31': 255 (Unknown error 255)
150311 9:08:50 [Note] WSREP: resuming provider at 5
150311 9:08:50 [Note] WSREP: Provider resumed.
150311 9:08:50 [ERROR] WSREP: Command did not run: wsrep_sst_rsync --role 'donor' --address '192.168.1.1:4444/rsync_sst' --auth 'root:FQvhBrEycz' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' '' --gtid 'd2e0cc7e-c66c-11e4-9640-9baeb4ba3dac:31'
150311 9:08:50 [Note] WSREP: New cluster view: global state: d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac:31, view# 3: Primary, number of nodes: 2, my index: 0, protocol version 3
150311 9:08:50 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
150311 9:08:50 [Note] WSREP: REPL Protocols: 5 (3, 1)
150311 9:08:50 [Note] WSREP: Service thread queue flushed.
150311 9:08:50 [Note] WSREP: Assign initial position for certification: 31, protocol version: 3
150311 9:08:50 [Note] WSREP: Service thread queue flushed.
150311 9:08:51 [Note] WSREP: evs::proto(1ab47b34-c7bd-11e4-becb-c6c739f575b5, OPERATIONAL, view_id(REG,1ab47b34-c7bd-11e4-becb-c6c739f575b5,5)) suspecting node: a14a65da-c7cf-11e4-9e6a-4716280818a2
150311 9:08:51 [Note] WSREP: evs::proto(1ab47b34-c7bd-11e4-becb-c6c739f575b5, GATHER, view_id(REG,1ab47b34-c7bd-11e4-becb-c6c739f575b5,5)) suspecting node: a14a65da-c7cf-11e4-9e6a-4716280818a2
150311 9:08:52 [Note] WSREP: (1ab47b34-c7bd-11e4-becb-c6c739f575b5, 'tcp://0.0.0.0:4567') turning message relay requesting off
150311 9:08:52 [Note] WSREP: evs::proto(1ab47b34-c7bd-11e4-becb-c6c739f575b5, GATHER, view_id(REG,1ab47b34-c7bd-11e4-becb-c6c739f575b5,5)) suspecting node: a14a65da-c7cf-11e4-9e6a-4716280818a2
150311 9:08:52 [Warning] WSREP: 0.0 (db1): State transfer to 1.0 (db51) failed: -255 (Unknown error 255)
150311 9:08:52 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 31)
150311 9:08:52 [Warning] WSREP: subsequent views have same members, prev view view(view_id(REG,1ab47b34-c7bd-11e4-becb-c6c739f575b5,5) memb {
1ab47b34-c7bd-11e4-becb-c6c739f575b5,0
a14a65da-c7cf-11e4-9e6a-4716280818a2,0
} joined {
} left {
} partitioned {
}) current view view(view_id(REG,1ab47b34-c7bd-11e4-becb-c6c739f575b5,6) memb {
1ab47b34-c7bd-11e4-becb-c6c739f575b5,0
a14a65da-c7cf-11e4-9e6a-4716280818a2,0
} joined {
} left {
} partitioned {
})
150311 9:08:52 [Note] WSREP: declaring a14a65da-c7cf-11e4-9e6a-4716280818a2 stable
150311 9:08:53 [Note] WSREP: Node 1ab47b34-c7bd-11e4-becb-c6c739f575b5 state prim
150311 9:08:53 [Warning] WSREP: 1ab47b34-c7bd-11e4-becb-c6c739f575b5 sending install message failed: Resource temporarily unavailable
150311 9:08:53 [Note] WSREP: Node 1ab47b34-c7bd-11e4-becb-c6c739f575b5 state prim
150311 9:08:53 [Note] WSREP: view(view_id(PRIM,1ab47b34-c7bd-11e4-becb-c6c739f575b5,7) memb {
1ab47b34-c7bd-11e4-becb-c6c739f575b5,0
} joined {
} left {
} partitioned {
a14a65da-c7cf-11e4-9e6a-4716280818a2,0
})
150311 9:08:53 [Note] WSREP: forgetting a14a65da-c7cf-11e4-9e6a-4716280818a2 (tcp://217.195.204.3:4567)
150311 9:08:53 [Note] WSREP: deleting entry tcp://217.195.204.3:4567
150311 9:08:53 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1
150311 9:08:53 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 7996b011-c7bd-11e4-8872-0a7861f822e3
150311 9:08:53 [Warning] WSREP: SYNC message from member 0 in non-primary configuration. Ignored.
150311 9:08:53 [Note] WSREP: STATE EXCHANGE: sent state msg: 7996b011-c7bd-11e4-8872-0a7861f822e3
150311 9:08:53 [Note] WSREP: STATE EXCHANGE: got state msg: 7996b011-c7bd-11e4-8872-0a7861f822e3 from 0 (db1)
150311 9:08:53 [Note] WSREP: Quorum results:
version = 3,
component = PRIMARY,
conf_id = 3,
members = 1/1 (joined/total),
act_id = 31,
last_appl. = 0,
protocols = 0/5/3 (gcs/repl/appl),
group UUID = d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac
150311 9:08:53 [Note] WSREP: Flow-control interval: [16, 16]
150311 9:08:53 [Note] WSREP: Member 0.0 (db1) synced with group.
150311 9:08:53 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 31)
150311 9:08:53 [Note] WSREP: New cluster view: global state: d2e0cc7e-c66c-11e4-9640-9baeb4ba3dac:31, view# 4: Primary, number of nodes: 1, my index: 0, protocol version 3
150311 9:08:53 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
150311 9:08:53 [Note] WSREP: REPL Protocols: 5 (3, 1)
150311 9:08:53 [Note] WSREP: Service thread queue flushed.
150311 9:08:53 [Note] WSREP: Assign initial position for certification: 31, protocol version: 3
150311 9:08:53 [Note] WSREP: Service thread queue flushed.
150311 9:08:53 [Note] WSREP: Synchronized with group, ready for connections
150311 9:08:53 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
150311 9:08:58 [Note] WSREP: cleaning up a14a65da-c7cf-11e4-9e6a-4716280818a2 (tcp://217.195.204.3:4567)
Hi,
Today I found that our ISP's router was randomly blocking port 4444. Once they allowed that port, the problem was solved.
Thank you.
The donor log contains:
rsync: failed to connect to 217.195.204.3 (217.195.204.3): Connection timed out (110)
rsync error: error in socket IO (code 10) at clientserver.c(128) [sender=3.1.1]
WSREP_SST: [ERROR] rsync returned code 10: (20150309 17:00:25.683)
150309 17:00:25 [ERROR] WSREP: Failed to read from: wsrep_sst_rsync --role 'donor' --address '217.195.204.3:4444/rsync_sst' --auth 'root:FQvhBrEycz' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' '' --gtid 'd2e0cc7e-c66c-11e4-9640-9baeb4ba3dac:0'
150309 17:00:25 [ERROR] WSREP: Process completed with error: wsrep_sst_rsync --role 'donor' --address '217.195.204.3:4444/rsync_sst' --auth 'root:FQvhBrEycz' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' '' --gtid 'd2e0cc7e-c66c-11e4-9640-9baeb4ba3dac:0': 255 (Unknown error 255)
This shows that the rsync client on the donor cannot reach rsyncd (the rsync daemon) on the joiner node.
It's quite likely a configuration issue.
Could you share your cluster configuration (the my.cnf files)? Please also check that wsrep_cluster_address
has the right IP addresses. Also, how did you disable SELinux and the firewalls? Would it be possible for you
to share this information?
Thanks!
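One way to check reachability from the donor is a plain TCP probe of the joiner's SST port. The following is only a minimal sketch: the function name `probe` is hypothetical, and the address and port used in the note below are taken from the log lines above. The rsync daemon on the joiner normally listens on that port only while an SST is in progress, so the probe is meaningful only while the joiner is waiting for the state transfer.

```python
import socket


def probe(host: str, port: int, timeout: float = 5.0) -> str:
    """Try a TCP connection and report how the attempt ended.

    Returns "open" if the connection succeeded, "refused" if the remote
    host actively rejected it, "timeout" if nothing answered in time,
    or "error: ..." for any other socket-level failure.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host is reachable, but nothing is listening on the port.
        return "refused"
    except (socket.timeout, TimeoutError):
        # No answer at all: typical of a firewall or router dropping packets.
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"
```

Running `probe("217.195.204.3", 4444)` on the donor while the joiner is starting would be expected to return "open" if the path is clear. A "timeout" result matches the "Connection timed out (110)" reported by rsync and points to packets being dropped somewhere between the two nodes, while "refused" suggests the joiner is reachable but rsyncd is not listening at that moment.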