[MDEV-15466] SST not working after upgrade from 10.2.11 Created: 2018-03-05  Updated: 2018-07-25  Resolved: 2018-07-25

Status: Closed
Project: MariaDB Server
Component/s: Galera, Galera SST, wsrep
Affects Version/s: 10.2.13
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Cătălin Nicolescu Assignee: Jan Lindström (Inactive)
Resolution: Not a Bug Votes: 0
Labels: galera, mariabackup, rsync, wsrep, xtrabackup
Environment:

CentOS 7 minimal
MariaDB and Percona repos configured
percona-xtrabackup-24


Attachments: donor-mariabackup-innobackup.backup.log, donor-mariabackup-mysql-error.log, donor-rsync-mysql-error.log, donor-xtrabackup-mysql-error.log, joiner-mariabackup-mysql-error.log, joiner-rsync-mysql-error.log, joiner-xtrabackup-mysql-error.log, server.cnf
Issue Links:
Relates
relates to MDEV-15607 mysqld crashed few after node is bein... Closed
relates to MDEV-15436 If log_bin and log_bin_index is diffe... Closed
relates to MDEV-15453 IST failed during upgrade of version ... Closed

 Description   

gcache.size was large enough that the previous upgrades did not require an SST, so the problem may predate this upgrade.

2-node + garb setup, running since 10.2.9.

Yesterday, after a blackout, I tried restarting the cluster. The 2nd node was badly damaged, so I wiped the entire datadir in order to do a full SST.

The initial config used rsync as the SST method. I also tried mariabackup and xtrabackup-v2.

Both servers have identical configs, except for node names.
Firewall and SELinux are both off.



 Comments   
Comment by Jan Lindström (Inactive) [ 2018-06-15 ]

Hi, thank you for your bug report and detailed logs. The joiner is killed during SST, at least in the mariabackup case:
WSREP_SST: [INFO] Waiting for SST streaming to complete! (20180305 10:52:16.898)
2018-03-05 10:52:18 140327973275392 [Note] WSREP: (80ad3090, 'tcp://0.0.0.0:4567') turning message relay requesting off
WSREP_SST: [ERROR] Removing /var/lib/mysql//.sst/xtrabackup_galera_info file due to signal (20180305 10:53:44.684)
Could it be due to systemd?

Comment by Cătălin Nicolescu [ 2018-06-15 ]

Yes.
Running systemctl edit mariadb.service and overriding with

[Service]
TimeoutStartSec=0

did the trick for version 10.2.15, but I think this was the issue all along.
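
For hosts where an interactive editor is inconvenient, the same override can probably be written as a drop-in file directly. The following is a minimal sketch, not the reporter's exact procedure: write_timeout_dropin is a hypothetical helper name, and the default directory assumes the standard drop-in location that systemctl edit uses for mariadb.service.

```shell
#!/bin/sh
# Sketch: write the TimeoutStartSec override non-interactively.
# write_timeout_dropin is a hypothetical helper, not part of MariaDB or systemd.
write_timeout_dropin() {
    # $1: drop-in directory (assumption: defaults to the path used by
    # "systemctl edit mariadb.service")
    dropin_dir="${1:-/etc/systemd/system/mariadb.service.d}"
    mkdir -p "$dropin_dir" || return 1
    # TimeoutStartSec=0 disables the start timeout, so a long SST is not
    # interrupted by systemd while the joiner is still starting.
    printf '[Service]\nTimeoutStartSec=0\n' > "$dropin_dir/override.conf"
}

# Example (as root), followed by a reload so systemd re-reads unit files:
#   write_timeout_dropin && systemctl daemon-reload
```

After the reload, systemctl show mariadb.service -p TimeoutStartUSec should report the effective start timeout, which is one way to confirm the override was picked up before retrying the SST.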
