[MDEV-23503] node fails to join galera cluster with SST failures Created: 2020-08-18  Updated: 2020-09-15  Resolved: 2020-09-15

Status: Closed
Project: MariaDB Server
Component/s: Galera SST
Affects Version/s: 5.5.54
Fix Version/s: N/A

Type: Bug
Priority: Major
Reporter: Raghunath Dhandapani
Assignee: Unassigned
Resolution: Incomplete
Votes: 0
Labels: galera, need_feedback


 Description   

The second node fails to join the cluster; the state snapshot transfer (SST) fails on both the donor and the joiner.

Donor node log

200817 11:22:19 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 18723213042)
200817 11:22:19 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
200817 11:22:19 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'donor' --address 'x.x.x.90:4444/xtrabackup_sst//1' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid '3bf24a64-e806-11e5-8238-ea129650fffe:18723213042''
200817 11:22:19 [Note] WSREP: sst_donor_thread signaled with 0
200817 11:22:20 [ERROR] WSREP: Failed to read from: wsrep_sst_xtrabackup-v2 --role 'donor' --address 'x.x.x.x:4444/xtrabackup_sst//1' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid '3bf24a64-e806-11e5-8238-ea129650fffe:18723213042'
200817 11:22:20 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'donor' --address 'x.x.x.90:4444/xtrabackup_sst//1' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid '3bf24a64-e806-11e5-8238-ea129650fffe:18723213042': 1 (Operation not permitted)
200817 11:22:20 [ERROR] WSREP: Command did not run: wsrep_sst_xtrabackup-v2 --role 'donor' --address 'x.x.x.90:4444/xtrabackup_sst//1' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid '3bf24a64-e806-11e5-8238-ea129650fffe:18723213042'
200817 11:22:20 [Warning] WSREP: 1.0 (db201.asp.com): State transfer to 0.0 (db202.asp.com) failed: -1 (Operation not permitted)
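
A note on reading the "1 (Operation not permitted)" suffix in the donor log above: the number looks like the exit status of the SST script, and the text in parentheses looks like that number rendered as a system error string. This is an assumption about how the server formats the message, not something the log confirms. If it holds, "Operation not permitted" only means the script exited with status 1 and does not by itself indicate a permissions problem. A minimal Python sketch of that mapping (the helper name describe_exit_status is hypothetical):

```python
import errno
import os


def describe_exit_status(status: int) -> str:
    """Render a numeric status the way an errno-style message would show it."""
    # errno.errorcode maps numbers to symbolic names such as EPERM or EPIPE.
    name = errno.errorcode.get(status, "unknown")
    # os.strerror returns the platform's message text for that number.
    return f"{status} ({os.strerror(status)}) [{name}]"


if __name__ == "__main__":
    for status in (1, 32):
        print(describe_exit_status(status))
```

On Linux, statuses 1 and 32 map to EPERM and EPIPE respectively, so the script's own output is the place to look for the underlying cause.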

Joiner node

WSREP_SST: [INFO] Stale sst_in_progress file: /var/lib/mysql//sst_in_progress (20200817 11:22:08.120)
WSREP_SST: [INFO] Evaluating timeout -s9 100 socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} ) (20200817 11:22:08.164)
2020/08/17 11:22:08 socat[48257] E bind(3, {AF=2 0.0.0.0:4444}, 16): Address already in use
WSREP_SST: [ERROR] Error while getting data from donor node: exit codes: 1 0 (20200817 11:22:08.175)
WSREP_SST: [ERROR] Cleanup after exit with status:32 (20200817 11:22:08.179)
200817 11:22:10 [Note] WSREP: (cab8fc95, 'tcp://0.0.0.0:4567') turning message relay requesting off
200817 11:22:19 [Note] WSREP: Prepared SST request: xtrabackup-v2|x.x.x.90:4444/xtrabackup_sst//1
200817 11:22:19 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
200817 11:22:19 [Note] WSREP: REPL Protocols: 7 (3, 2)
200817 11:22:19 [Note] WSREP: Assign initial position for certification: 18723211619, protocol version: 3
200817 11:22:19 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address 'x.x.x.90' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --parent '47975': 32 (Broken pipe)
200817 11:22:19 [ERROR] WSREP: Failed to read uuid:seqno from joiner script.
200817 11:22:19 [Note] WSREP: Service thread queue flushed.
200817 11:22:19 [ERROR] WSREP: SST failed: 32 (Broken pipe)
200817 11:22:19 [ERROR] Aborting

200817 11:22:19 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (3bf24a64-e806-11e5-8238-ea129650fffe): 1 (Operation not permitted)
at galera/src/replicator_str.cpp:prepare_for_IST():482. IST will be unavailable.
200817 11:22:19 [Note] WSREP: Member 0.0 (db202.asp.com) requested state transfer from 'any'. Selected 1.0 (db201.asp.com)(SYNCED) as donor.
200817 11:22:19 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 18723213042)
200817 11:22:19 [Note] WSREP: Requesting state transfer: success, donor: 1
200817 11:22:19 [Note] WSREP: GCache history reset: old(00000000-0000-0000-0000-000000000000:0) -> new(3bf24a64-e806-11e5-8238-ea129650fffe:18723211619)
200817 11:22:20 [Warning] WSREP: 1.0 (db201.asp.com): State transfer to 0.0 (db202.asp.com) failed: -1 (Operation not permitted)
200817 11:22:20 [ERROR] WSREP: gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():736: Will never receive state. Need to abort.
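
The joiner log above shows socat failing to bind TCP port 4444 with "Address already in use", next to a stale sst_in_progress file. That suggests, though the log does not confirm it, that something left over from an earlier SST attempt was still holding the port. A minimal Python sketch for checking whether the port can be bound before restarting the joiner; the function name port_is_free is hypothetical, and the port number and bind address are taken from the log:

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can currently bind to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # Same idea as socat's "reuseaddr" option in the log: ignore
        # lingering TIME_WAIT entries, but still fail on a live listener.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            s.bind((host, port))
        except OSError:
            return False
        return True


if __name__ == "__main__":
    # 4444 is the SST port shown in the joiner log.
    print("port 4444 free:", port_is_free(4444))
```

If this reports the port as busy on the joiner while no SST is running, identifying the process that holds it would be the next step.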



 Comments   
Comment by Raghunath Dhandapani [ 2020-08-18 ]

SELinux and the firewall are disabled on all cluster nodes.

The cluster uses wsrep_sst_method = xtrabackup-v2.

Packages installed:
MariaDB-client-5.5.54-1.el6.x86_64
MariaDB-common-5.5.54-1.el6.x86_64
MariaDB-compat-5.5.54-1.el6.x86_64
MariaDB-devel-5.5.54-1.el6.x86_64
MariaDB-Galera-server-5.5.54-1.el6.x86_64
MariaDB-shared-5.5.54-1.el6.x86_64
MariaDB-test-5.5.54-1.el6.x86_64
galera-25.3.19-1.rhel6.el6.x86_64
percona-xtrabackup-2.3.10-1.el6.x86_64

Comment by Elena Stepanova [ 2020-08-18 ]

The 5.5 release line is no longer supported. Please upgrade to a more recent version and check whether the problem still exists.

While choosing which version to upgrade to, please take into account that 10.1 will go EOL this autumn.
