[MDEV-31467] wsrep_sst_mariabackup not working on FreeBSD Created: 2023-06-13  Updated: 2023-12-07  Resolved: 2023-10-17

Status: Closed
Project: MariaDB Server
Component/s: Galera SST
Affects Version/s: 10.6.13
Fix Version/s: 10.4.32, 10.5.23, 10.6.16, 10.10.7, 10.11.6, 11.0.4, 11.1.3, 11.2.2, 11.3.1

Type: Bug
Priority: Critical
Reporter: Thomas Babut
Assignee: Julius Goryavsky
Resolution: Fixed
Votes: 1
Labels: crash, galera
Environment: FreeBSD, all current versions

Description

With wsrep_sst_method=mariabackup the SST/IST process fails at this stage:

WSREP_SST: [INFO] mariabackup SST started on joiner (20230613 09:04:17.N)
WSREP_SST: [INFO] SSL configuration: CA='', CAPATH='', CERT='', KEY='', MODE='DISABLED', encrypt='0' (20230613 09:04:17.N)
WSREP_SST: [INFO] Streaming with mbstream (20230613 09:04:18.N)
WSREP_SST: [INFO] Using socat as streamer (20230613 09:04:18.N)
 
Usage: timeout [--signal sig | -s sig] [--preserve-status] [--kill-after time | -k time] [--foreground] <duration> <command> <arg ...>
WSREP_SST: [INFO] Evaluating timeout -s9 300 socat -u TCP-LISTEN:4444,reuseaddr stdio | '/usr/local//bin/mbstream' -x; RC=( ${PIPESTATUS[@]} ) (20230613 09:04:18.N)
 
2023-06-13  9:04:18 2 [Note] WSREP: ####### IST uuid:7f4bd4bc-73ee-11ed-8cc1-77200a87198f f: 203400, l: 203407, STRv: 3
2023-06-13  9:04:18 2 [Note] WSREP: IST receiver addr using tcp://192.168.1.1:4568
2023-06-13  9:04:18 2 [Note] WSREP: Prepared IST receiver for 203400-203407, listening at: tcp://192.168.1.1:4568
2023-06-13  9:04:18 0 [Note] WSREP: Member 1.0 (galera1) requested state transfer from '*any*'. Selected 0.0 (galera3)(SYNCED) as donor.
2023-06-13  9:04:18 0 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 203407)
2023-06-13  9:04:18 2 [Note] WSREP: Requesting state transfer: success, donor: 0
2023-06-13  9:04:18 0 [Warning] WSREP: 0.0 (galera3): State transfer to 1.0 (galera1) failed: -2 (No such file or directory)
2023-06-13  9:04:18 0 [ERROR] WSREP: /wrkdirs/usr/ports/databases/galera26/work/galera-release_26.4.14/gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():1172: Will never receive state. Need to abort.
2023-06-13  9:04:18 0 [Note] WSREP: gcomm: terminating thread
2023-06-13  9:04:18 0 [Note] WSREP: gcomm: joining thread
2023-06-13  9:04:18 0 [Note] WSREP: gcomm: closing backend
2023-06-13  9:04:18 0 [Note] WSREP: view(view_id(NON_PRIM,76dfdac7-93fe,53) memb {
        82d2bc20-a998,0
} joined {
} left {
} partitioned {
        76dfdac7-93fe,0
        8f298c44-9cca,0
})
2023-06-13  9:04:18 0 [Note] WSREP: PC protocol downgrade 1 -> 0
2023-06-13  9:04:18 0 [Note] WSREP: view((empty))
2023-06-13  9:04:18 0 [Note] WSREP: gcomm: closed
2023-06-13  9:04:18 0 [Note] WSREP: mariadbd: Terminated.

Judging by the usage message printed in the log above, the timeout command on FreeBSD seems to expect a different invocation than on Linux.



Comments
Comment by Daniel Black [ 2023-07-06 ]

Is this the right patch?

diff --git a/scripts/wsrep_sst_mariabackup.sh b/scripts/wsrep_sst_mariabackup.sh
index d6334052f24..54e92bfba7b 100644
--- a/scripts/wsrep_sst_mariabackup.sh
+++ b/scripts/wsrep_sst_mariabackup.sh
@@ -799,9 +799,9 @@ recv_joiner()
     if [ $tmt -gt 0 ]; then
         if [ -n "$(commandex timeout)" ]; then
             if timeout --help | grep -qw -F -- '-k'; then
-                ltcmd="timeout -k $(( tmt+10 )) $tmt $tcmd"
+                ltcmd="timeout -k $(( tmt+10 ))s ${tmt}s $tcmd"
             else
-                ltcmd="timeout -s9 $tmt $tcmd"
+                ltcmd="timeout -s 9 $tmt $tcmd"
             fi
         fi
     fi
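
For illustration only, the selection logic in the patched hunk can be sketched as a standalone shell function. The function name build_timeout_cmd and the idea of passing the usage text in as an argument are assumptions made for this sketch; they are not part of wsrep_sst_mariabackup.sh, and this is not necessarily what the final fix looks like.

```shell
#!/bin/sh
# Hypothetical helper (not in wsrep_sst_mariabackup.sh): mirrors the branch
# in the proposed patch and prints the resulting command line.
#   $1   timeout in seconds
#   $2   usage/help text of the local timeout binary
#   $3.. command to wrap
build_timeout_cmd() {
    tmt=$1
    help_text=$2
    shift 2
    # Same probe as in the script: look for a standalone "-k" option.
    if printf '%s\n' "$help_text" | grep -qw -F -- '-k'; then
        # -k N: send KILL if the command is still running N seconds
        # after the initial signal; durations carry an explicit "s" suffix.
        printf '%s\n' "timeout -k $(( tmt + 10 ))s ${tmt}s $*"
    else
        # No -k support detected: send signal 9 when the timeout expires,
        # with the signal given as a separate argument ("-s 9", not "-s9").
        printf '%s\n' "timeout -s 9 $tmt $*"
    fi
}
```

A possible call, assuming the usage text is captured from both stdout and stderr (that capture is also an assumption of this sketch): build_timeout_cmd 300 "$(timeout --help 2>&1)" socat -u TCP-LISTEN:4444,reuseaddr stdio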

Comment by Julius Goryavsky [ 2023-10-17 ]

Fixed, https://github.com/MariaDB/server/commit/073a088f3190c6da63df979a154853d4b5309e50
