Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5, 10.6
-
None
Description
After the busy-port detection code in the SST scripts was improved, the rsync SST script began to occasionally report a busy-port error. This happens most often when several tests run in parallel or when a node restarts quickly after a failure:
2021-05-25 7:26:53 0 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '127.0.0.1:16020' --datadir '/dev/shm/bb-10.4-merge/mysql-test/var/1/mysqld.5/data/' --defaults-file '/dev/shm/bb-10.4-merge/mysql-test/var/1/my.cnf' --defaults-group-suffix '.5' --parent '107939' --binlog 'mysqld-bin' --binlog-index 'mysqld-bin.index' --mysqld-args --defaults-group-suffix=.5 --defaults-file=/dev/shm/bb-10.4-merge/mysql-test/var/1/my.cnf --log-output=file --innodb --innodb-cmpmem --innodb-cmp-per-index --innodb-trx --innodb-locks --innodb-lock-waits --innodb-metrics --innodb-buffer-pool-stats --innodb-buffer-page --innodb-buffer-page-lru --innodb-sys-columns --innodb-sys-fields --innodb-sys-foreign --innodb-sys-foreign-cols --innodb-sys-indexes --innodb-sys-tables --innodb-sys-virtual --core-file --loose-debug-sync-timeout=300'
WSREP_SST: [ERROR] rsync or stunnel daemon port '16020' has been taken by another program (20210525 07:26:53.410)
WSREP_SST: [INFO] Joiner cleanup. rsync PID: 109214 (20210525 07:26:53.412)
/dev/shm/bb-10.4-merge/scripts/wsrep_sst_rsync: line 41: kill: (109214) - No such process
WSREP_SST: [INFO] Joiner cleanup done. (20210525 07:26:53.415)
2021-05-25 7:26:53 0 [ERROR] WSREP: Failed to read 'ready <addr>' from: wsrep_sst_rsync --role 'joiner' --address '127.0.0.1:16020' --datadir '/dev/shm/bb-10.4-merge/mysql-test/var/1/mysqld.5/data/' --defaults-file '/dev/shm/bb-10.4-merge/mysql-test/var/1/my.cnf' --defaults-group-suffix '.5' --parent '107939' --binlog 'mysqld-bin' --binlog-index 'mysqld-bin.index' --mysqld-args --defaults-group-suffix=.5 --defaults-file=/dev/shm/bb-10.4-merge/mysql-test/var/1/my.cnf --log-output=file --innodb --innodb-cmpmem --innodb-cmp-per-index --innodb-trx --innodb-locks --innodb-lock-waits --innodb-metrics --innodb-buffer-pool-stats --innodb-buffer-page --innodb-buffer-page-lru --innodb-sys-columns --innodb-sys-fields --innodb-sys-foreign --innodb-sys-foreign-cols --innodb-sys-indexes --innodb-sys-tables --innodb-sys-virtual --core-file --loose-debug-sync-timeout=300
Read: '(null)'
2021-05-25 7:26:53 0 [ERROR] WSREP: Process completed with error: wsrep_sst_rsync --role 'joiner' --address '127.0.0.1:16020' --datadir '/dev/shm/bb-10.4-merge/mysql-test/var/1/mysqld.5/data/' --defaults-file '/dev/shm/bb-10.4-merge/mysql-test/var/1/my.cnf' --defaults-group-suffix '.5' --parent '107939' --binlog 'mysqld-bin' --binlog-index 'mysqld-bin.index' --mysqld-args --defaults-group-suffix=.5 --defaults-file=/dev/shm/bb-10.4-merge/mysql-test/var/1/my.cnf --log-output=file --innodb --innodb-cmpmem --innodb-cmp-per-index --innodb-trx --innodb-locks --innodb-lock-waits --innodb-metrics --innodb-buffer-pool-stats --innodb-buffer-page --innodb-buffer-page-lru --innodb-sys-columns --innodb-sys-fields --innodb-sys-foreign --innodb-sys-foreign-cols --innodb-sys-indexes --innodb-sys-tables --innodb-sys-virtual --core-file --loose-debug-sync-timeout=300: 16 (Device or resource busy)
2021-05-25 7:26:53 2 [ERROR] WSREP: Failed to prepare for 'rsync' SST. Unrecoverable.
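For illustration, a busy-port check of the kind that produces the error above could be sketched as follows. This is only a minimal sketch and not the code from wsrep_sst_rsync: the function names port_in_listen_output and port_is_busy are hypothetical, and it assumes that "ss -nlt" prints the socket state in the first column and the local address:port in the fourth.

```shell
# Hypothetical helpers, not taken from the real SST scripts.

# Reads "ss -nlt"-style lines on stdin and succeeds (exit 0) if some
# socket in the LISTEN state has local port $1.
port_in_listen_output() {
    awk -v port="$1" '
        $1 == "LISTEN" {
            # The port is the text after the last ":" of the local address,
            # which also covers IPv6 forms such as [::]:22.
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Assumption: the ss utility is installed. If it is missing, the pipeline
# sees no input and the port is reported as free.
port_is_busy() {
    ss -nlt 2>/dev/null | port_in_listen_output "$1"
}
```

A caller might run port_is_busy 16020 before starting the rsync daemon. Such a check can still race with a previous daemon that is in the middle of shutting down, which matches the quick-restart scenario described here.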
MDEV-25818: RSYNC SST failed due to busy port
This commit reduces the likelihood of getting a busy port on
quick restarts with rsync SST (problem MDEV-25818) and fixes
a number of other flaws in SST scripts, adds new functionality,
and also synchronizes the xtrabackup-v2 script with the
mariabackup script (the latter applies only to the 10.2 branch):
1) SST via rsync: rsync and stunnel do not always get enough
time to exit cleanly by handling SIGTERM. These utilities
are now given more time to complete normally (via normal SIGTERM
processing) before we move on to using "kill -9";
2) SST via rsync: attempts to terminate an rsync or stunnel process
(via "kill" utility) are only made if it has not terminated on
its own;
3) SST via rsync: if a combination of stunnel and rsync is used,
then we need to wait for both utilities to finish or stop, not
just one of them;
4) The config file and pid file for stunnel are now deleted after
successful completion of SST on the donor node;
5) The configs and pid files from rsync and stunnel should not be
deleted unless these utilities succeed (or are successfully
terminated) on the joiner node;
6) The configs and pid files are now excluded from transfer via rsync;
7) Spaces in paths are now valid for config files as well (when
used with SST via rsync or mariabackup / xtrabackup[-v2]);
8) SST via mariabackup: added preliminary verification of keys and
certificates that are used when establishing a connection using
SSL (to avoid long timeouts and improve diagnostics) - by analogy
with how it is done for the xtrabackup-v2 (plus check for CA file),
while that check is skipped if the user does not have openssl
installed (or does not have diff utility);
9) Added backup-threads=<n> configuration option which adds
"--parallel=<n>" for mariabackup / xtrabackup at backup and
move-back stages;
10) Added encrypt-threads and encrypt-chunk-size configuration
options for xbcrypt management (when xbcrypt is used);
11) Small optimization: checking the socat version and adding
a file with parameters for 2048-bit Diffie-Hellman (if necessary)
is done only if the user has not specified "dhparam=" in the
"sockopt" option value;
12) SST via rsync now supports "backup-threads" configuration option
(in server-related sections or in the "[sst]");
13) Determining the number of available processors is now supported
for FreeBSD + mariabackup/xtrabackup: before that we might have
problems with "--compact" (rebuild indexes) or qpress on FreeBSD;
14) The check_pid() function should not raise an error state in
the rare cases when the pid file was created, but it is empty,
or if it is deleted right during the check, or when zero is read
from the pid file;
15) Improved templates that are used to check if a requested socket
is "listening" when using the ss utility;
16) Shortened some other templates for socket state utilities;
17) Temporary files created by mariabackup / xtrabackup are moved
to a separate subdirectory inside tmpdir (so they don't get
mixed with other temporary files, which can make debugging
more difficult);
18) 10.2 only: the script for SST via xtrabackup-v2 has been brought
in full compliance with all the bugfixes made for mariabackup (as
it previously contained many flaws compared to the updated script
for mariabackup).
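Items 1, 2 and 14 above can be illustrated with a short shell sketch. It is not the code from the commit: the function names read_live_pid and stop_gracefully, the 10-second default timeout and the one-second polling interval are all assumptions made for this example.

```shell
# Hypothetical helpers; names and timeouts are illustrative only and are
# not taken from the real wsrep_sst_* scripts.

# Print the PID stored in pid file $1 if it names a live process.
# A missing, empty, non-numeric or zero-valued pid file is treated as
# "not running" (return 1) rather than as an error.
read_live_pid() {
    pid_file="$1"
    [ -r "$pid_file" ] || return 1
    pid=""
    # The file may vanish between the test above and this read.
    { read -r pid < "$pid_file"; } 2>/dev/null || :
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$pid" -gt 0 ] || return 1
    kill -0 "$pid" 2>/dev/null || return 1
    printf '%s\n' "$pid"
}

# Stop the process named by pid file $1: send SIGTERM, poll for up to
# $2 seconds (default 10), and use SIGKILL only if it is still alive.
stop_gracefully() {
    timeout="${2:-10}"
    # Nothing to do if the process already terminated on its own.
    target=$(read_live_pid "$1") || return 0
    kill -TERM "$target" 2>/dev/null || :
    waited=0
    while kill -0 "$target" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            kill -KILL "$target" 2>/dev/null || :
            break
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

When stunnel and rsync are used together (item 3), such a helper would be called once per pid file, for example stop_gracefully /path/to/rsync.pid followed by stop_gracefully /path/to/stunnel.pid, where both paths are placeholders.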
10.2: https://github.com/MariaDB/server/commit/3bfbd805adf4c0504f230b673fa213ed97301e94
10.6: https://github.com/MariaDB/server/commit/87cd77599a00c1d806c8d703e21bc4578e3e5e79
Tests:
http://buildbot.askmonty.org/buildbot/grid?category=main&branch=bb-10.2-MDEV-25818
http://buildbot.askmonty.org/buildbot/grid?category=main&branch=bb-10.6-MDEV-25818-galera
Galera BB (10.2):
http://buildbot.askmonty.org/buildbot/grid?category=main&branch=bb-10.2-galera