[MDEV-23647] Garbd can't initiate SST anymore in 10.5 Created: 2020-09-01  Updated: 2021-04-05  Resolved: 2021-01-21

Status: Closed
Project: MariaDB Server
Component/s: Galera, Galera Arbitrator garbd, Galera SST
Affects Version/s: 10.5.5
Fix Version/s: 10.4.18, 10.5.9

Type: Bug Priority: Critical
Reporter: Hartmut Holzgraefe Assignee: Jan Lindström (Inactive)
Resolution: Fixed Votes: 0
Labels: None


 Description   

The Galera Arbitrator binary garbd can be used to make a regular cluster node act as a donor towards the arbitrator by passing the --sst and --donor options.

Up to MariaDB 10.4 this works fine, but with 10.5 the SST request no longer causes the requested wsrep_sst_... script to be executed.

The 10.5 donor node only logs:

Aug 21 15:10:27 node-1 mariadbd[2902]: 2020-08-21 15:10:27 0 [Note] WSREP: Member 0.0 (garbd) requested state transfer from 'node-1'. Selected 1.0 (node-1)(SYNCED) as donor.
Aug 21 15:10:27 node-1 mariadbd[2902]: 2020-08-21 15:10:27 0 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 3)
Aug 21 15:10:27 node-1 mariadbd[2902]: 2020-08-21 15:10:27 1 [Note] WSREP: ================================================
Aug 21 15:10:27 node-1 mariadbd[2902]: View:
Aug 21 15:10:27 node-1 mariadbd[2902]:   id: a0e49d83-e3bd-11ea-905f-ff2c5b67f9c2:3
Aug 21 15:10:27 node-1 mariadbd[2902]:   status: primary
Aug 21 15:10:27 node-1 mariadbd[2902]:   protocol_version: 4
Aug 21 15:10:27 node-1 mariadbd[2902]:   capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
Aug 21 15:10:27 node-1 mariadbd[2902]:   final: no
Aug 21 15:10:27 node-1 mariadbd[2902]:   own_index: 1
Aug 21 15:10:27 node-1 mariadbd[2902]:   members(3):
Aug 21 15:10:27 node-1 mariadbd[2902]:         0: 71303bfe-e3c0-11ea-b367-d2112cd7bec7, garbd
Aug 21 15:10:27 node-1 mariadbd[2902]:         1: a0e2712c-e3bd-11ea-9f04-db21411dccbc, node-1
Aug 21 15:10:27 node-1 mariadbd[2902]:         2: d2324935-e3bd-11ea-a3b6-da0d74e66e47, node-2
Aug 21 15:10:27 node-1 mariadbd[2902]: =================================================
Aug 21 15:10:27 node-1 mariadbd[2902]: 2020-08-21 15:10:27 1 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
Aug 21 15:10:27 node-1 mariadbd[2902]: 2020-08-21 15:10:27 0 [Note] WSREP: 0.0 (garbd): State transfer from 1.0 (node-1) complete.

With 10.4 the donor log shows the actual wsrep SST script being executed, as in the following line; nothing like this appears in the 10.5 donor log:

Aug 21 15:35:07 node-1 mysqld[2778]: 2020-08-21 15:35:07 0 [Note] WSREP: Running: 'wsrep_sst_backup --role 'donor' --address '' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --gtid 'b21e7239-e3c1-11ea-a1cb-f36fe61c8c1c:0' --gtid-domain-id '0' --binlog 'node-1-bin' --mysqld-args --wsrep-new-cluster --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1'
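The difference between the two donor logs can be checked mechanically. The following is a minimal Python sketch, not part of the original report: it looks for the "requested state transfer" and "WSREP: Running: 'wsrep_sst_..." messages quoted above. The function names are hypothetical, and the patterns assume the log wording stays exactly as shown in this report.

```python
import re

# Patterns derived from the donor log lines quoted in this report.
# Assumption: the message wording is the same in other builds.
REQUEST = re.compile(r"WSREP: Member \S+ \([^)]*\) requested state transfer")
RUNNING = re.compile(r"WSREP: Running: '(wsrep_sst_\w+)")


def find_sst_scripts(log_lines):
    """Return the names of all wsrep_sst_* scripts the donor logged as running."""
    scripts = []
    for line in log_lines:
        match = RUNNING.search(line)
        if match:
            scripts.append(match.group(1))
    return scripts


def request_without_script(log_lines):
    """True if a state transfer was requested but no SST script was logged."""
    lines = list(log_lines)
    requested = any(REQUEST.search(line) for line in lines)
    return bool(requested and not find_sst_scripts(lines))
```

Fed the 10.5 donor log from this report, request_without_script() reports the symptom described here; fed the 10.4 line, find_sst_scripts() picks out wsrep_sst_backup.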

And on the garbd side, for the same 10.5 run:

2020-08-21 15:10:27.576  INFO: Shifting OPEN -> PRIMARY (TO: 3)
2020-08-21 15:10:27.577  INFO: Sending state transfer request: 'backup', size: 6
2020-08-21 15:10:27.578  INFO: Member 0.0 (garbd) requested state transfer from 'node-1'. Selected 1.0 (node-1)(SYNCED) as donor.
2020-08-21 15:10:27.578  INFO: Shifting PRIMARY -> JOINER (TO: 3)
2020-08-21 15:10:27.578  INFO: Closing send monitor...
2020-08-21 15:10:27.578  INFO: Closed send monitor.
2020-08-21 15:10:27.578  INFO: gcomm: terminating thread
2020-08-21 15:10:27.578  INFO: gcomm: joining thread
2020-08-21 15:10:27.579  INFO: gcomm: closing backend
2020-08-21 15:10:27.580  INFO: 0.0 (garbd): State transfer from 1.0 (node-1) complete.
2020-08-21 15:10:27.580  INFO: Shifting JOINER -> JOINED (TO: 3)
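The garbd log above shows a state transfer request of 'backup' being sent with 'node-1' as the requested donor. As an illustration only, the sketch below assembles a garbd command line of that kind, assuming garbd's --address, --group, --donor and --sst options. The cluster address, port 4567 and the group name example_cluster are placeholders, not values taken from this report, and the helper name is hypothetical.

```python
import shlex


def garbd_sst_command(address, group, donor, sst_request):
    """Build an argv list asking `donor` to run the given SST method for garbd."""
    return [
        "garbd",
        "--address", address,   # cluster address to connect to
        "--group", group,       # cluster (group) name
        "--donor", donor,       # node that should act as SST donor
        "--sst", sst_request,   # SST request string, e.g. 'backup'
    ]


# Placeholder values; only 'node-1' and 'backup' appear in the logs above.
cmd = garbd_sst_command("gcomm://node-1:4567", "example_cluster", "node-1", "backup")
print(shlex.join(cmd))
```

Returning an argv list instead of a single string keeps the arguments safe to pass to subprocess.run() without shell quoting problems.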



 Comments   
Comment by Mario Karuza (Inactive) [ 2020-12-21 ]

Hartmut, you mentioned that this was working on 10.4. Which MariaDB and Galera versions were you using when it worked?
