[MDEV-26101] Galera WSREP SST broken under FreeBSD Created: 2021-07-07  Updated: 2021-09-14  Resolved: 2021-08-16

Status: Closed
Project: MariaDB Server
Component/s: Galera, Galera SST, wsrep
Affects Version/s: 10.6.3, 10.2, 10.3, 10.4, 10.5, 10.7
Fix Version/s: 10.2.41, 10.3.32, 10.4.22, 10.5.13, 10.6.5, 10.7.0

Type: Bug Priority: Major
Reporter: Vincent Milum Jr Assignee: Julius Goryavsky
Resolution: Fixed Votes: 2
Labels: galera, rsync, sst
Environment:

FreeBSD 13.0-RELEASE


Attachments: File wsrep_sst_rsync_freebsd.diff    
Issue Links:
Relates
relates to MDEV-10432 Remove bash/lsof dependencies from sc... Closed

 Description   

This bug is also being tracked in the FreeBSD Bugzilla: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=256112

I created fresh FreeBSD VMs with fresh installs of MariaDB 10.6.3

I started the first node with "mysqld_safe --wsrep-new-cluster" and tried to have the second node connect.

No data was changed on either node, just trying to get them to communicate.

The first node remains in the "sync" state the entire time, never entering the "donor" state.

The second node starts rsync, and is stuck listening for a connection indefinitely.

Using the default configs for FreeBSD, with default bind-address commented out, and then this cnf file added on both nodes:

[galera]
wsrep_on=ON
wsrep_provider = /usr/local/lib/libgalera_smm.so
wsrep_provider_options = "gmcast.segment=202;pc.weight=1"
wsrep_cluster_name = "testdb"
wsrep_cluster_address = "gcomm://192.168.20.132,192.168.20.122"
wsrep_sst_method = rsync
 
[mysqld]
binlog_format=ROW
bind-address=0.0.0.0

On the first node (donor):

2021-07-07  4:44:25 0 [Note] WSREP: Loading provider /usr/local/lib/libgalera_smm.so initial position: 00000000-0000-0000-0000-000000000000:-1
2021-07-07  4:44:25 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/local/lib/libgalera_smm.so'
2021-07-07  4:44:25 0 [Note] WSREP: wsrep_load(): Galera 4.6(rrelease_26.4.6) by Codership Oy <info@codership.com> loaded successfully.
2021-07-07  4:44:25 0 [Note] WSREP: CRC-32C: using hardware acceleration.
2021-07-07  4:44:25 0 [Note] WSREP: Found saved state: 4e704a59-dedc-11eb-a825-dfcedc24af98:-1, safe_to_bootstrap: 1
2021-07-07  4:44:25 0 [Note] WSREP: GCache DEBUG: opened preamble:
Version: 2
UUID: 4e704a59-dedc-11eb-a825-dfcedc24af98
Seqno: -1 - -1
Offset: -1
Synced: 0
2021-07-07  4:44:25 0 [Note] WSREP: Recovering GCache ring buffer: version: 2, UUID: 4e704a59-dedc-11eb-a825-dfcedc24af98, offset: -1
2021-07-07  4:44:25 0 [Note] WSREP: GCache::RingBuffer initial scan...  0.0% (        0/134217752 bytes) complete.
2021-07-07  4:44:25 0 [Note] WSREP: GCache::RingBuffer initial scan...100.0% (134217752/134217752 bytes) complete.
2021-07-07  4:44:25 0 [Note] WSREP: Recovering GCache ring buffer: found gapless sequence 1-4
2021-07-07  4:44:25 0 [Note] WSREP: GCache::RingBuffer unused buffers scan...  0.0% (   0/1264 bytes) complete.
2021-07-07  4:44:25 0 [Note] WSREP: GCache::RingBuffer unused buffers scan...100.0% (1264/1264 bytes) complete.
2021-07-07  4:44:25 0 [Note] WSREP: GCache DEBUG: RingBuffer::recover(): found 3/7 locked buffers
2021-07-07  4:44:25 0 [Note] WSREP: GCache DEBUG: RingBuffer::recover(): free space: 134217016/134217728
2021-07-07  4:44:25 0 [Note] WSREP: Passing config to GCS: base_dir = /var/db/mysql/; base_host = 192.168.20.132; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = yes; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/db/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = galera.cache; gcache.page_size = 128M; gcache.recover = yes; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 202; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum 
2021-07-07  4:44:25 0 [Note] WSREP: Start replication
2021-07-07  4:44:25 0 [Note] WSREP: Connecting with bootstrap option: 1
2021-07-07  4:44:25 0 [Note] WSREP: Setting GCS initial position to 00000000-0000-0000-0000-000000000000:-1
2021-07-07  4:44:25 0 [Note] WSREP: protonet asio version 0
2021-07-07  4:44:25 0 [Note] WSREP: Using CRC-32C for message checksums.
2021-07-07  4:44:25 0 [Note] WSREP: backend: asio
2021-07-07  4:44:25 0 [Note] WSREP: gcomm thread scheduling priority set to other:0 
2021-07-07  4:44:25 0 [Note] WSREP: restore pc from disk successfully
2021-07-07  4:44:25 0 [Note] WSREP: GMCast version 0
2021-07-07  4:44:25 0 [Note] WSREP: (dff40154-8099, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2021-07-07  4:44:25 0 [Note] WSREP: (dff40154-8099, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2021-07-07  4:44:25 0 [Note] WSREP: EVS version 1
2021-07-07  4:44:25 0 [Note] WSREP: gcomm: bootstrapping new group 'testdb'
2021-07-07  4:44:25 0 [Note] WSREP: start_prim is enabled, turn off pc_recovery
2021-07-07  4:44:25 0 [Note] WSREP: EVS version upgrade 0 -> 1
2021-07-07  4:44:25 0 [Note] WSREP: PC protocol upgrade 0 -> 1
2021-07-07  4:44:25 0 [Note] WSREP: Node dff40154-8099 state prim
2021-07-07  4:44:25 0 [Note] WSREP: view(view_id(PRIM,dff40154-8099,3) memb {
	dff40154-8099,202
} joined {
} left {
} partitioned {
})
2021-07-07  4:44:25 0 [Note] WSREP: save pc into disk
2021-07-07  4:44:25 0 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.20.122:4567
2021-07-07  4:44:25 0 [Note] WSREP: discarding pending addr proto entry 0x42435460
2021-07-07  4:44:25 0 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.20.132:4567
2021-07-07  4:44:25 0 [Note] WSREP: discarding pending addr proto entry 0x42435540
2021-07-07  4:44:25 0 [Note] WSREP: clear restored view
2021-07-07  4:44:25 0 [Note] WSREP: gcomm: connected
2021-07-07  4:44:25 0 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
2021-07-07  4:44:25 0 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
2021-07-07  4:44:25 0 [Note] WSREP: Opened channel 'testdb'
2021-07-07  4:44:25 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1
2021-07-07  4:44:25 0 [Note] WSREP: Starting new group from scratch: 01bbfaaf-dede-11eb-b363-16edbc96f9c8
2021-07-07  4:44:25 0 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 01bc0065-dede-11eb-85ce-9a71a683bc8e
2021-07-07  4:44:25 1 [Note] WSREP: Starting applier thread 1
2021-07-07  4:44:25 0 [Note] WSREP: STATE EXCHANGE: sent state msg: 01bc0065-dede-11eb-85ce-9a71a683bc8e
2021-07-07  4:44:25 0 [Note] WSREP: STATE EXCHANGE: got state msg: 01bc0065-dede-11eb-85ce-9a71a683bc8e from 0 (arm-galera-10-6-2)
2021-07-07  4:44:25 0 [Note] WSREP: Quorum results:
	version    = 6,
	component  = PRIMARY,
	conf_id    = 0,
	members    = 1/1 (joined/total),
	act_id     = 0,
	last_appl. = 0,
	protocols  = 2/10/4 (gcs/repl/appl),
	vote policy= 0,
	group UUID = 01bbfaaf-dede-11eb-b363-16edbc96f9c8
2021-07-07  4:44:25 0 [Note] WSREP: Flow-control interval: [16, 16]
2021-07-07  4:44:25 0 [Note] WSREP: Restored state OPEN -> JOINED (1)
2021-07-07  4:44:25 0 [Note] WSREP: Member 0.202 (arm-galera-10-6-2) synced with group.
2021-07-07  4:44:25 2 [Note] WSREP: Starting rollbacker thread 2
2021-07-07  4:44:25 0 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 1)
2021-07-07  4:44:25 1 [Note] WSREP: ####### processing CC 1, local, ordered
2021-07-07  4:44:25 1 [Note] WSREP: Process first view: 01bbfaaf-dede-11eb-b363-16edbc96f9c8 my uuid: dff40154-dedd-11eb-8099-3206581b79af
2021-07-07  4:44:25 1 [Note] WSREP: Server arm-galera-10-6-2 connected to cluster at position 01bbfaaf-dede-11eb-b363-16edbc96f9c8:1 with ID dff40154-dedd-11eb-8099-3206581b79af
2021-07-07  4:44:25 1 [Note] WSREP: Server status change disconnected -> connected
2021-07-07  4:44:25 1 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2021-07-07  4:44:25 1 [Note] WSREP: ####### My UUID: dff40154-dedd-11eb-8099-3206581b79af
2021-07-07  4:44:25 1 [Note] WSREP: Cert index reset to 00000000-0000-0000-0000-000000000000:-1 (proto: 10), state transfer needed: no
2021-07-07  4:44:25 0 [Note] WSREP: Service thread queue flushed.
2021-07-07  4:44:25 1 [Note] WSREP: ####### Assign initial position for certification: 00000000-0000-0000-0000-000000000000:-1, protocol version: -1
2021-07-07  4:44:25 1 [Note] WSREP: REPL Protocols: 10 (5)
2021-07-07  4:44:25 1 [Note] WSREP: ####### Adjusting cert position: -1 -> 1
2021-07-07  4:44:25 0 [Note] WSREP: Service thread queue flushed.
2021-07-07  4:44:25 1 [Note] WSREP: GCache history reset: 4e704a59-dedc-11eb-a825-dfcedc24af98:4 -> 01bbfaaf-dede-11eb-b363-16edbc96f9c8:0
2021-07-07  4:44:25 1 [Note] WSREP: GCache DEBUG: RingBuffer::seqno_reset(): discarded 712 bytes
2021-07-07  4:44:25 1 [Note] WSREP: GCache DEBUG: RingBuffer::seqno_reset(): found 0/1 locked buffers
2021-07-07  4:44:25 1 [Note] WSREP: ================================================
View:
  id: 01bbfaaf-dede-11eb-b363-16edbc96f9c8:1
  status: primary
  protocol_version: 4
  capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
  final: no
  own_index: 0
  members(1):
	0: dff40154-dedd-11eb-8099-3206581b79af, arm-galera-10-6-2
=================================================
2021-07-07  4:44:25 1 [Note] WSREP: Server status change connected -> joiner
2021-07-07  4:44:25 1 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2021-07-07  4:44:25 1 [Note] WSREP: Server status change joiner -> initializing
2021-07-07  4:44:25 1 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2021-07-07  4:44:25 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2021-07-07  4:44:25 0 [Note] InnoDB: Number of pools: 1
2021-07-07  4:44:25 0 [Note] InnoDB: Using generic crc32 instructions
2021-07-07  4:44:25 0 [Note] InnoDB: Initializing buffer pool, total size = 134217728, chunk size = 134217728
2021-07-07  4:44:25 0 [Note] InnoDB: Completed initialization of buffer pool
2021-07-07  4:44:25 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=78394,78394
2021-07-07  4:44:25 0 [Note] InnoDB: Starting final batch to recover 4 pages from redo log.
2021-07-07  4:44:25 0 [Note] InnoDB: 128 rollback segments are active.
2021-07-07  4:44:25 0 [Note] InnoDB: Removed temporary tablespace data file: "./ibtmp1"
2021-07-07  4:44:25 0 [Note] InnoDB: Creating shared tablespace for temporary tables
2021-07-07  4:44:25 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2021-07-07  4:44:25 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2021-07-07  4:44:25 0 [Note] InnoDB: 10.6.3 started; log sequence number 82877; transaction id 77
2021-07-07  4:44:25 0 [Note] InnoDB: Loading buffer pool(s) from /var/db/mysql/ib_buffer_pool
2021-07-07  4:44:25 0 [Note] Plugin 'FEEDBACK' is disabled.
2021-07-07  4:44:25 0 [Note] Recovering after a crash using tc.log
2021-07-07  4:44:25 0 [Note] Starting table crash recovery...
2021-07-07  4:44:25 0 [Note] Crash table recovery finished.
2021-07-07  4:44:25 0 [Note] InnoDB: Buffer pool(s) load completed at 210707  4:44:25
2021-07-07  4:44:25 0 [Note] Server socket created on IP: '0.0.0.0'.
2021-07-07  4:44:25 0 [Note] WSREP: wsrep_init_schema_and_SR 0x0
2021-07-07  4:44:25 0 [Note] WSREP: Server initialized
2021-07-07  4:44:25 0 [Note] WSREP: Server status change initializing -> initialized
2021-07-07  4:44:25 0 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2021-07-07  4:44:25 1 [Note] WSREP: Bootstrapping a new cluster, setting initial position to 00000000-0000-0000-0000-000000000000:-1
2021-07-07  4:44:26 4 [Note] WSREP: Recovered cluster id 4e704a59-dedc-11eb-a825-dfcedc24af98
2021-07-07  4:44:26 1 [Note] WSREP: Server status change initialized -> joined
2021-07-07  4:44:26 1 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2021-07-07  4:44:26 1 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2021-07-07  4:44:26 0 [Note] /usr/local/libexec/mariadbd: ready for connections.
Version: '10.6.3-MariaDB'  socket: '/var/run/mysql/mysql.sock'  port: 3306  FreeBSD Ports
2021-07-07  4:44:26 1 [Note] WSREP: Lowest cert index boundary for CC from group: 1
2021-07-07  4:44:26 1 [Note] WSREP: Min available from gcache for CC from group: 1
2021-07-07  4:44:26 1 [Note] WSREP: Server arm-galera-10-6-2 synced with group
2021-07-07  4:44:26 1 [Note] WSREP: Server status change joined -> synced
2021-07-07  4:44:26 1 [Note] WSREP: Synchronized with group, ready for connections
2021-07-07  4:44:26 1 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2021-07-07  4:45:08 0 [Note] WSREP: (dff40154-8099, 'tcp://0.0.0.0:4567') connection established to 1b87f978-a60b tcp://192.168.20.122:4567
2021-07-07  4:45:08 0 [Note] WSREP: (dff40154-8099, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: 
2021-07-07  4:45:09 0 [Note] WSREP: declaring 1b87f978-a60b at tcp://192.168.20.122:4567 stable
2021-07-07  4:45:09 0 [Note] WSREP: Node dff40154-8099 state prim
2021-07-07  4:45:09 0 [Note] WSREP: view(view_id(PRIM,1b87f978-a60b,4) memb {
	1b87f978-a60b,202
	dff40154-8099,202
} joined {
} left {
} partitioned {
})
2021-07-07  4:45:09 0 [Note] WSREP: save pc into disk
2021-07-07  4:45:09 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
2021-07-07  4:45:09 0 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
2021-07-07  4:45:09 0 [Note] WSREP: STATE EXCHANGE: sent state msg: 1c291b89-dede-11eb-9bde-0648d8449482
2021-07-07  4:45:09 0 [Note] WSREP: STATE EXCHANGE: got state msg: 1c291b89-dede-11eb-9bde-0648d8449482 from 0 (arm-galera-10-6-1)
2021-07-07  4:45:09 0 [Note] WSREP: STATE EXCHANGE: got state msg: 1c291b89-dede-11eb-9bde-0648d8449482 from 1 (arm-galera-10-6-2)
2021-07-07  4:45:09 0 [Note] WSREP: Quorum results:
	version    = 6,
	component  = PRIMARY,
	conf_id    = 1,
	members    = 1/2 (joined/total),
	act_id     = 1,
	last_appl. = 0,
	protocols  = 2/10/4 (gcs/repl/appl),
	vote policy= 0,
	group UUID = 01bbfaaf-dede-11eb-b363-16edbc96f9c8
2021-07-07  4:45:09 0 [Note] WSREP: Flow-control interval: [23, 23]
2021-07-07  4:45:09 1 [Note] WSREP: ####### processing CC 2, local, ordered
2021-07-07  4:45:09 1 [Note] WSREP: ####### My UUID: dff40154-dedd-11eb-8099-3206581b79af
2021-07-07  4:45:09 1 [Note] WSREP: Skipping cert index reset
2021-07-07  4:45:09 1 [Note] WSREP: REPL Protocols: 10 (5)
2021-07-07  4:45:09 1 [Note] WSREP: ####### Adjusting cert position: 1 -> 2
2021-07-07  4:45:09 0 [Note] WSREP: Service thread queue flushed.
2021-07-07  4:45:09 1 [Note] WSREP: ================================================
View:
  id: 01bbfaaf-dede-11eb-b363-16edbc96f9c8:2
  status: primary
  protocol_version: 4
  capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
  final: no
  own_index: 1
  members(2):
	0: 1b87f978-dede-11eb-a60b-b74d48a2b703, arm-galera-10-6-1
	1: dff40154-dedd-11eb-8099-3206581b79af, arm-galera-10-6-2
=================================================
2021-07-07  4:45:09 1 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2021-07-07  4:45:09 1 [Note] WSREP: Lowest cert index boundary for CC from group: 2
2021-07-07  4:45:09 1 [Note] WSREP: Min available from gcache for CC from group: 1
2021-07-07  4:45:11 0 [Note] WSREP: (dff40154-8099, 'tcp://0.0.0.0:4567') turning message relay requesting off

On the second node (joiner):

2021-07-07  4:45:08 0 [Note] WSREP: Loading provider /usr/local/lib/libgalera_smm.so initial position: 00000000-0000-0000-0000-000000000000:-1
2021-07-07  4:45:08 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/local/lib/libgalera_smm.so'
2021-07-07  4:45:09 0 [Note] WSREP: wsrep_load(): Galera 4.6(rrelease_26.4.6) by Codership Oy <info@codership.com> loaded successfully.
2021-07-07  4:45:09 0 [Note] WSREP: CRC-32C: using hardware acceleration.
2021-07-07  4:45:09 0 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootstrap: 1
2021-07-07  4:45:09 0 [Note] WSREP: GCache DEBUG: opened preamble:
Version: 2
UUID: 00000000-0000-0000-0000-000000000000
Seqno: -1 - -1
Offset: -1
Synced: 0
2021-07-07  4:45:09 0 [Note] WSREP: Skipped GCache ring buffer recovery: could not determine history UUID.
2021-07-07  4:45:09 0 [Note] WSREP: Passing config to GCS: base_dir = /var/db/mysql/; base_host = 192.168.20.122; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = yes; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/db/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = galera.cache; gcache.page_size = 128M; gcache.recover = yes; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 202; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum 
2021-07-07  4:45:09 0 [Note] WSREP: Start replication
2021-07-07  4:45:09 0 [Note] WSREP: Connecting with bootstrap option: 0
2021-07-07  4:45:09 0 [Note] WSREP: Setting GCS initial position to 00000000-0000-0000-0000-000000000000:-1
2021-07-07  4:45:09 0 [Note] WSREP: protonet asio version 0
2021-07-07  4:45:09 0 [Note] WSREP: Using CRC-32C for message checksums.
2021-07-07  4:45:09 0 [Note] WSREP: backend: asio
2021-07-07  4:45:09 0 [Note] WSREP: gcomm thread scheduling priority set to other:0 
2021-07-07  4:45:09 0 [Warning] WSREP: access file(/var/db/mysql//gvwstate.dat) failed(No such file or directory)
2021-07-07  4:45:09 0 [Note] WSREP: restore pc from disk failed
2021-07-07  4:45:09 0 [Note] WSREP: GMCast version 0
2021-07-07  4:45:09 0 [Note] WSREP: (1b87f978-a60b, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2021-07-07  4:45:09 0 [Note] WSREP: (1b87f978-a60b, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2021-07-07  4:45:09 0 [Note] WSREP: EVS version 1
2021-07-07  4:45:09 0 [Note] WSREP: gcomm: connecting to group 'testdb', peer '192.168.20.132:,192.168.20.122:'
2021-07-07  4:45:09 0 [Note] WSREP: (1b87f978-a60b, 'tcp://0.0.0.0:4567') Found matching local endpoint for a connection, blacklisting address tcp://192.168.20.122:4567
2021-07-07  4:45:09 0 [Note] WSREP: (1b87f978-a60b, 'tcp://0.0.0.0:4567') connection established to dff40154-8099 tcp://192.168.20.132:4567
2021-07-07  4:45:09 0 [Note] WSREP: (1b87f978-a60b, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: 
2021-07-07  4:45:09 0 [Note] WSREP: EVS version upgrade 0 -> 1
2021-07-07  4:45:09 0 [Note] WSREP: declaring dff40154-8099 at tcp://192.168.20.132:4567 stable
2021-07-07  4:45:09 0 [Note] WSREP: PC protocol upgrade 0 -> 1
2021-07-07  4:45:09 0 [Note] WSREP: Node dff40154-8099 state prim
2021-07-07  4:45:09 0 [Note] WSREP: view(view_id(PRIM,1b87f978-a60b,4) memb {
	1b87f978-a60b,202
	dff40154-8099,202
} joined {
} left {
} partitioned {
})
2021-07-07  4:45:09 0 [Note] WSREP: save pc into disk
2021-07-07  4:45:10 0 [Note] WSREP: gcomm: connected
2021-07-07  4:45:10 0 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
2021-07-07  4:45:10 0 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
2021-07-07  4:45:10 0 [Note] WSREP: Opened channel 'testdb'
2021-07-07  4:45:10 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 2
2021-07-07  4:45:10 0 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 1c291b89-dede-11eb-9bde-0648d8449482
2021-07-07  4:45:10 0 [Note] WSREP: STATE EXCHANGE: sent state msg: 1c291b89-dede-11eb-9bde-0648d8449482
2021-07-07  4:45:10 0 [Note] WSREP: STATE EXCHANGE: got state msg: 1c291b89-dede-11eb-9bde-0648d8449482 from 0 (arm-galera-10-6-1)
2021-07-07  4:45:10 0 [Note] WSREP: STATE EXCHANGE: got state msg: 1c291b89-dede-11eb-9bde-0648d8449482 from 1 (arm-galera-10-6-2)
2021-07-07  4:45:10 0 [Note] WSREP: Quorum results:
	version    = 6,
	component  = PRIMARY,
	conf_id    = 1,
	members    = 1/2 (joined/total),
	act_id     = 1,
	last_appl. = 0,
	protocols  = 2/10/4 (gcs/repl/appl),
	vote policy= 0,
	group UUID = 01bbfaaf-dede-11eb-b363-16edbc96f9c8
2021-07-07  4:45:10 0 [Note] WSREP: Flow-control interval: [23, 23]
2021-07-07  4:45:10 0 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 2)
2021-07-07  4:45:10 2 [Note] WSREP: Starting rollbacker thread 2
2021-07-07  4:45:10 1 [Note] WSREP: Starting applier thread 1
2021-07-07  4:45:10 1 [Note] WSREP: ####### processing CC 2, local, ordered
2021-07-07  4:45:10 1 [Note] WSREP: Process first view: 01bbfaaf-dede-11eb-b363-16edbc96f9c8 my uuid: 1b87f978-dede-11eb-a60b-b74d48a2b703
2021-07-07  4:45:10 1 [Note] WSREP: Server arm-galera-10-6-1 connected to cluster at position 01bbfaaf-dede-11eb-b363-16edbc96f9c8:2 with ID 1b87f978-dede-11eb-a60b-b74d48a2b703
2021-07-07  4:45:10 1 [Note] WSREP: Server status change disconnected -> connected
2021-07-07  4:45:10 1 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2021-07-07  4:45:10 1 [Note] WSREP: ####### My UUID: 1b87f978-dede-11eb-a60b-b74d48a2b703
2021-07-07  4:45:10 1 [Note] WSREP: Cert index reset to 00000000-0000-0000-0000-000000000000:-1 (proto: 10), state transfer needed: yes
2021-07-07  4:45:10 0 [Note] WSREP: Service thread queue flushed.
2021-07-07  4:45:10 1 [Note] WSREP: ####### Assign initial position for certification: 00000000-0000-0000-0000-000000000000:-1, protocol version: -1
2021-07-07  4:45:10 1 [Note] WSREP: State transfer required: 
	Group state: 01bbfaaf-dede-11eb-b363-16edbc96f9c8:2
	Local state: 00000000-0000-0000-0000-000000000000:-1
2021-07-07  4:45:10 1 [Note] WSREP: Server status change connected -> joiner
2021-07-07  4:45:10 1 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2021-07-07  4:45:10 0 [Note] WSREP: Joiner monitor thread started to monitor
2021-07-07  4:45:10 0 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '192.168.20.122' --datadir '/var/db/mysql/' --defaults-extra-file '/usr/local/etc/mysql/my.cnf' --parent '989' --mysqld-args --defaults-extra-file=/usr/local/etc/mysql/my.cnf --basedir=/usr/local --datadir=/var/db/mysql --plugin-dir=/usr/local/lib/mysql/plugin --log-error=/var/log/mysql/mysqld.err --pid-file=/var/run/mysql/mysqld.pid --socket=/var/run/mysql/mysql.sock --port=3306'
2021-07-07  4:45:12 0 [Note] WSREP: (1b87f978-a60b, 'tcp://0.0.0.0:4567') turning message relay requesting off



 Comments   
Comment by Vincent Milum Jr [ 2021-08-08 ]

Since this has sat idle for a month, I decided to delve further into this myself. Log files were certainly unhelpful in this one.

Turns out that the non-"lsof" path in the rsync sst script is once again broken. This has been something that has broken countless times over the year, and would be very much appreciated if there was actually a unit test for non-lsof users, or make it a hard requirement that this is no longer supported.

For reference: MDEV-10432

Comment by Vincent Milum Jr [ 2021-08-08 ]

This is the commit that broke sockstat by changing the parameters passed into it

https://github.com/MariaDB/server/commit/8fef2b8667f30e4562ac006f992f859d1defdc0e#diff-215ae7f8a5944630ef06ff325fede6c5209198b8a6283d01e58596c75e14a051

Comment by drkspc [ 2021-08-13 ]

From what I see, this issue affects all versions of MariaDB on FreeBSD.

wsrep_sst_rsync_freebsd.diff
This patch got me back into a working state. In addition to changing the parameters ('-s' to actually display the socket status) I found it necessary to adjust the filter applied to the output of sockstat. I did not test this against Linux's ss, though.

Comment by Jan Lindström (Inactive) [ 2021-08-16 ]

ok to push but please check sockstat usage (see comment on git). Why fix version contains 10.2-10.5 if only 10.6 is affected or is title misleading?

Comment by Vincent Milum Jr [ 2021-08-16 ]

Because the initial bug report was written up when I was porting 10.6 to FreeBSD. It was after the fact that it was discovered that the Galera regression that was introduced was also pushed to the older branches too. The title on this bug can probably be updated to reflect that now.

Comment by Julius Goryavsky [ 2021-08-16 ]

jplindst Our SST scripts are now common across all versions and this flaw is also present in all at the same time - it was introduced when the unification of checks for the presence of a previous instance of socat or rsync was made, because earlier sockstat was not supported for Linux, and lsof was not supported for FreeBSD. But during unification, I did not take into account that this utility gives a slightly different output in FreeBSD - initially I thought that it is identical in terms of basic options and output in modern systems, but this is not so - in FreeBSD sockstat is more advanced and supports additional options, while it does not have 100% compatible by output format.

Comment by Julius Goryavsky [ 2021-08-16 ]

drkspc Thanks for diff, I have been more careful to embed the changes so that they only affect FreeBSD, since the Linux sockstat implementation has different options and a different output format. At first I had the intention of dynamically distinguishing them, but judging by the sockstat documentation for FreeBSD, it does not support the "-h" option by which one could get a list of its options for soft checking support of "-s", so I added system name checking via uname, as was done before, when sockstat was only supported for FreeBSD (but at that time it was not supported at all for Linux in our scripts): https://github.com/MariaDB/server/commit/50428b3995d24233d8ba3e76ebb51a18761f335d

Comment by Julius Goryavsky [ 2021-08-16 ]

Fixed, https://github.com/MariaDB/server/commit/50428b3995d24233d8ba3e76ebb51a18761f335d

Generated at Thu Feb 08 09:42:45 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.