[MDEV-13580] SIGTERM must terminate sst
Created: 2017-08-18
Updated: 2021-12-23

Status: Open
Project: MariaDB Server
Component/s: Galera SST
Affects Version/s: 10.1.26, 10.2.7
Fix Version/s: 10.2

Type: Bug
Priority: Major
Reporter: Andrii Nikitin (Inactive)
Assignee: Julius Goryavsky
Resolution: Unresolved
Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-13376 Stopping mariadb.service on joiner no... Closed

Description

The script below downloads the .tar archive, installs a local cluster on ports 3307 and 3308, and demonstrates that the mysqld process keeps running during a (long) SST transfer even though it has received SIGTERM.

set -e
M7VER=10.2.7
 
# just use current directory if called from framework
if [ ! -f common.sh ] ; then
  [ -d mariadb-environs ] || git clone http://github.com/AndriiNikitin/mariadb-environs
  cd mariadb-environs
  ./get_plugin.sh galera
fi
 
function onExit {
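  # restore the original SST script, but only if the backup was actually created
  if [ -f _depot/m-tar/${M7VER}/bin/wsrep_sst_mysqldump.orig ] ; then
    mv _depot/m-tar/${M7VER}/bin/wsrep_sst_mysqldump.orig _depot/m-tar/${M7VER}/bin/wsrep_sst_mysqldump
  fi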
  if [ "$inited" != 1 ] ; then
    echo FAIL - failed to initialize
  elif [ ! -f m2*/dt/p.id ] ; then
    echo FAIL - the node is not running
  elif [ "$passed" != 1 ] ; then
    echo FAIL - sst wasn\'t cancelled on SIGTERM after 45 sec
    echo process $(cat m2*/dt/p.id) is still up at $(date +"%T")
    ps auxww | grep mysqld | grep $(cat m2*/dt/p.id)
    echo shutdown initiated at:
    grep -i shutdown m2*/dt/error.log
    echo last lines in log:
    tail -n 4 m2*/dt/error.log
  fi
}
trap onExit EXIT
 
echo CLEANING UP ...
[ -f c1/cleanup.sh ] && c1/cleanup.sh || :
 
echo GENERATE TEMPLATES ...
_template/plant_cluster.sh c1
echo m1 > c1/nodes.lst
echo m2 >> c1/nodes.lst
c1/replant.sh $M7VER
 
echo DOWNLOAD BINARIES
./build_or_download.sh m1
./build_or_download.sh m2
 
echo HACK SST SCRIPT ...
# one more backup of sst script just in case
[ -f _depot/m-tar/${M7VER}/bin/wsrep_sst_mysqldump.orig1 ] || cp _depot/m-tar/${M7VER}/bin/wsrep_sst_mysqldump _depot/m-tar/${M7VER}/bin/wsrep_sst_mysqldump.orig1
 
# the mysqldump script is handled in a special way inside SST, so we use that name to be extra sure
# back up the script and replace it with a simple sleep loop
cp _depot/m-tar/${M7VER}/bin/wsrep_sst_mysqldump _depot/m-tar/${M7VER}/bin/wsrep_sst_mysqldump.orig
echo '#!/bin/bash' > _depot/m-tar/${M7VER}/bin/wsrep_sst_mysqldump
echo 'for i in {1..150}; do sleep 1; done ' >> _depot/m-tar/${M7VER}/bin/wsrep_sst_mysqldump
 
echo INIT NEW CLUSTER ...
c1/gen_cnf.sh
c1/install_db.sh
c1/galera_setup_acl.sh
 
. c1/galera_start_new.sh wsrep_sst_method=mysqldump
 
echo WAITING A BIT ...
m2*/status.sh && inited=1
sleep 15
echo SENDING SIGTERM AT $(date +"%T")
kill $(cat m2*/dt/p.id)
 
echo WAIT MORE ...
sleep 45
echo CHECK IF SST IS STILL UP
set +e
if kill -0 $(cat m2*/dt/p.id) ; then
  exit 1
else
  echo PASS
  passed=1
fi
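
The fixed `sleep 45` followed by a single `kill -0` check could also be written as a polling loop that returns as soon as the process exits. The following is only a minimal sketch: the helper name `term_and_wait` and its arguments are hypothetical and are not part of the original script or of mariadb-environs.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not from the original script):
# send SIGTERM to a PID and poll once per second until it exits.
# Usage: term_and_wait PID [TIMEOUT_SECONDS]
# Returns 0 if the process is gone within the timeout, 1 if it is still up.
term_and_wait() {
  local pid=$1 timeout=${2:-45} waited=0
  # Ignore a failing kill here; the loop below decides whether the process is gone.
  kill -TERM "$pid" 2>/dev/null || true
  # kill -0 sends no signal; it only checks that the PID can still be signalled.
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

In the script above it could be used as `term_and_wait "$(cat m2*/dt/p.id)" 45 && passed=1`, which keeps the same 45-second limit but ends the wait as soon as mysqld is gone.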

Output:

SENDING SIGTERM AT 08:13:17
WAIT MORE ...
CHECK IF SST IS STILL UP
FAIL - sst wasn't cancelled on SIGTERM after 45 sec
process 10869 is still up at 08:14:02
buildbot 10869  6.7  2.6 2426468 108380 pts/0  Sl+  08:12   0:05 /home/buildbot/mariadb-environs/m2-10.2.7/../_depot/m-tar/10.2.7/bin/mysqld --defaults-file=/home/buildbot/mariadb-environs/m2-10.2.7/my.cnf --basedir=/home/buildbot/mariadb-environs/_depot/m-tar/10.2.7 --datadir=/home/buildbot/mariadb-environs/m2-10.2.7/dt --plugin-dir=/home/buildbot/mariadb-environs/m2-10.2.7/../_depot/m-tar/10.2.7/lib/plugin --wsrep_provider=/usr/lib/galera/libgalera_smm.so --wsrep_on=ON --log-error=/home/buildbot/mariadb-environs/m2-10.2.7/dt/error.log --pid-file=/home/buildbot/mariadb-environs/m2-10.2.7/dt/p.id --socket=/home/buildbot/mariadb-environs/m2-10.2.7/dt/my.sock --port=3308 --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1
shutdown initiated at:
2017-08-18  8:12:25 139997963806464 [Note] /home/buildbot/mariadb-environs/m2-10.2.7/../_depot/m-tar/10.2.7/bin/mysqld (root[root] @ localhost []): Normal shutdown

