The same error was observed when joining a new node to the cluster with server 10.2.12 on CentOS 7.4:
2018-01-22 15:25:21 140167331440384 [Note] WSREP: STATE EXCHANGE: got state msg: b1f7499a-ff77-11e7-85b9-ee24c1b0321e from 1 (t4w5)
2018-01-22 15:25:21 140167331440384 [Note] WSREP: Quorum results:
version = 4,
component = PRIMARY,
conf_id = 253,
members = 1/2 (joined/total),
act_id = 31,
last_appl. = -1,
protocols = 0/7/3 (gcs/repl/appl),
group UUID = ca2bb6d4-ff5d-11e7-8871-7f48103f96c0
2018-01-22 15:25:21 140167331440384 [Note] WSREP: Flow-control interval: [23, 23]
2018-01-22 15:25:21 140167331440384 [Note] WSREP: Trying to continue unpaused monitor
2018-01-22 15:25:21 140167331440384 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 31)
2018-01-22 15:25:21 140167538136832 [Note] WSREP: State transfer required:
Group state: ca2bb6d4-ff5d-11e7-8871-7f48103f96c0:31
Local state: 00000000-0000-0000-0000-000000000000:-1
2018-01-22 15:25:21 140167538136832 [Note] WSREP: New cluster view: global state: ca2bb6d4-ff5d-11e7-8871-7f48103f96c0:31, view# 254: Primary, number of nodes: 2, my index: 0, protocol version 3
2018-01-22 15:25:21 140167538136832 [Warning] WSREP: Gap in state sequence. Need state transfer.
2018-01-22 15:25:21 140167323047680 [Note] WSREP: Running: 'wsrep_sst_mariabackup --role 'joiner' --address '192.168.104.195' --datadir '/var/lib/mysql/' --parent '14715' '' '
WSREP_SST: [INFO] Streaming with xbstream (20180122 15:25:21.348)
WSREP_SST: [INFO] Using socat as streamer (20180122 15:25:21.351)
WSREP_SST: [INFO] Stale sst_in_progress file: /var/lib/mysql//sst_in_progress (20180122 15:25:21.357)
2018-01-22 15:25:21 140167538136832 [Note] WSREP: Prepared SST request: mariabackup|192.168.104.195:4444/xtrabackup_sst//1
2018-01-22 15:25:21 140167538136832 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2018-01-22 15:25:21 140167538136832 [Note] WSREP: REPL Protocols: 7 (3, 2)
2018-01-22 15:25:21 140167538136832 [Note] WSREP: Assign initial position for certification: 31, protocol version: 3
2018-01-22 15:25:21 140167580100352 [Note] WSREP: Service thread queue flushed.
2018-01-22 15:25:21 140167538136832 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (ca2bb6d4-ff5d-11e7-8871-7f48103f96c0): 1 (Operation not permitted)
at galera/src/replicator_str.cpp:prepare_for_IST():482. IST will be unavailable.
2018-01-22 15:25:21 140167331440384 [Note] WSREP: Member 0.0 (t4w5) requested state transfer from 'any'. Selected 1.0 (t4w5)(SYNCED) as donor.
2018-01-22 15:25:21 140167331440384 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 31)
2018-01-22 15:25:21 140167538136832 [Note] WSREP: Requesting state transfer: success, donor: 1
2018-01-22 15:25:21 140167538136832 [Note] WSREP: GCache history reset: 00000000-0000-0000-0000-000000000000:0 -> ca2bb6d4-ff5d-11e7-8871-7f48103f96c0:31
WSREP_SST: [INFO] Evaluating timeout -k 110 100 socat -u TCP-LISTEN:4444,reuseaddr stdio | mbstream -x; RC=( ${PIPESTATUS[@]} ) (20180122 15:25:21.412)
2018/01/22 15:25:21 socat[15000] E bind(6, {AF=2 0.0.0.0:4444}, 16): Address already in use
WSREP_SST: [ERROR] Error while getting data from donor node: exit codes: 1 0 (20180122 15:25:21.422)
WSREP_SST: [ERROR] Cleanup after exit with status:32 (20180122 15:25:21.425)
2018-01-22 15:25:21 140167323047680 [ERROR] WSREP: Process completed with error: wsrep_sst_mariabackup --role 'joiner' --address '192.168.104.195' --datadir '/var/lib/mysql/' --parent '14715' '' : 32 (Broken pipe)
2018-01-22 15:25:21 140167323047680 [ERROR] WSREP: Failed to read uuid:seqno and wsrep_gtid_domain_id from joiner script.
2018-01-22 15:25:21 140167672174720 [ERROR] WSREP: SST failed: 32 (Broken pipe)
2018-01-22 15:25:21 140167672174720 [ERROR] Aborting
2018-01-22 15:25:21 140167331440384 [Warning] WSREP: 1.0 (t4w5): State transfer to 0.0 (t4w5) failed: -32 (Broken pipe)
2018-01-22 15:25:21 140167331440384 [ERROR] WSREP: gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():736: Will never receive state. Need to abort.
Note 2:
Adding port 4444/tcp as a permanent rule in the firewall (or iptables) avoids the errors:
socat[15000] E bind(6, {AF=2 0.0.0.0:4444}, 16): Address already in use
WSREP_SST: [ERROR] Error while getting data from donor node: exit codes: 1 0 (20180122 15:25:21.422)
WSREP_SST: [ERROR] Cleanup after exit with status:32 (20180122 15:25:21.425)
2018-01-22 15:25:21 140167323047680 [ERROR] WSREP: Process completed with error:
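Before retrying the join, it may help to check whether something is still listening on the SST port. A minimal sketch in Python, assuming port 4444 as in the log above; `port_is_free` is a hypothetical helper written for illustration and is not part of MariaDB or Galera. It sets SO_REUSEADDR to roughly mirror the `reuseaddr` option in the socat command line shown in the log.

```python
import errno
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can bind host:port, False if it is in use."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Same idea as socat's "reuseaddr": lingering TIME_WAIT sockets
        # should not count as "in use", but an active listener still does.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        return True
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return False
        raise  # anything else (e.g. permission problems) is unexpected
    finally:
        sock.close()


if __name__ == "__main__":
    # 4444 is the SST port from the log; adjust if your setup differs.
    print("port 4444 free:", port_is_free(4444))
```

If this reports the port as busy on the joiner, the bind failure is probably coming from a process that is already listening there rather than from packet filtering.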
Hi,
Experiencing the same problem, nodes can't join on Ubuntu Xenial.
mysqld stops with:
socat[21170] E bind(6, {AF=2 0.0.0.0:4444}, 16): Address already in use
and socat, xbstream, and wsrep_sst_xtrabackup-v2 keep running, reparented to 'init'.
As soon as you kill all the SST processes, systemd restarts mysqld and the problem is back: mysqld stops and the SST processes keep running.
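To see which leftover helpers are still around, something like the following could be used. This is only a sketch: the helper names are the ones mentioned in this thread, the functions are hypothetical, and it assumes the orphans were reparented to PID 1 (on some systems a different reaper process may adopt them) and that `ps` accepts the POSIX `-e` and `-o` options.

```python
import subprocess

# Command-name prefixes taken from the logs and comments in this thread.
SST_HELPERS = ("socat", "xbstream", "mbstream", "wsrep_sst_")


def orphaned_sst_processes(rows):
    """Keep (pid, command) for SST helpers whose parent is PID 1."""
    return [
        (pid, comm)
        for pid, ppid, comm in rows
        if ppid == 1 and comm.startswith(SST_HELPERS)
    ]


def read_process_table():
    """Read (pid, ppid, command) rows from ps, with headers suppressed."""
    out = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "ppid=", "-o", "comm="],
        check=True, capture_output=True, text=True,
    ).stdout
    rows = []
    for line in out.splitlines():
        parts = line.split(None, 2)
        if len(parts) == 3:
            rows.append((int(parts[0]), int(parts[1]), parts[2]))
    return rows


if __name__ == "__main__":
    for pid, comm in orphaned_sst_processes(read_process_table()):
        print(pid, comm)
```

It only lists candidates; whether to kill them is left to the operator, since an unrelated socat reparented to init would match as well.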
10.1.28 does not seem to contain this fix.
In sql/wsrep_utils.cc:
}
err_ = posix_spawnattr_setflags (&attr, POSIX_SPAWN_SETSIGDEF |
POSIX_SPAWN_SETSIGMASK |
POSIX_SPAWN_USEVFORK);
{