[MDEV-703] LP:870310 - killall -9 in init-script Created: 2011-10-07  Updated: 2014-01-02  Resolved: 2013-06-14

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 10.0.2, 5.5.31
Fix Version/s: 10.0.4, 5.5.32

Type: Bug Priority: Minor
Reporter: Erkan Yanar Assignee: Sergei Golubchik
Resolution: Fixed Votes: 0
Labels: Launchpad

Attachments: XML File LPexportBug870310.xml    
Issue Links:
Relates
relates to MDEV-5110 init script for mariadb galera cluster Closed

 Description   

From /etc/init.d/mysql

#v+
  'stop')
        # * As a passwordless mysqladmin (e.g. via ~/.my.cnf) must be possible
        # at least for cron, we can rely on it here, too. (although we have 
        # to specify it explicit as e.g. sudo environments points to the normal
        # users home and not /root)
        log_daemon_msg "Stopping MariaDB database server" "mysqld"
        if ! mysqld_status check_dead nowarn; then
          set +e
          shutdown_out=`$MYADMIN shutdown 2>&1`; r=$?
          set -e
          if [ "$r" -ne 0 ]; then
            log_end_msg 1
            [ "$VERBOSE" != "no" ] && log_failure_msg "Error: $shutdown_out"
            log_daemon_msg "Killing MariaDB database server by signal" "mysqld"
            killall -15 mysqld
            server_down=
            for i in 1 2 3 4 5 6 7 8 9 10; do
              sleep 1
              if mysqld_status check_dead nowarn; then server_down=1; break; fi
            done
            if test -z "$server_down"; then killall -9 mysqld; fi
          fi
        fi
#v-

First, I see no reason to use anything other than killall -15.
But:
Imagine the server's connections are exhausted, so mysqladmin cannot connect and the shutdown command fails. After the killall -15 the server then has 10 seconds to stop. After that, a SIGKILL is sent.
If the shutdown takes longer than that, which is likely for bigger databases, e.g. with delay_key_write enabled, the database can end up corrupted.
In any case, a recovery is forced on the next start.
So this is more likely to hurt bigger installations.
mysqladmin is also likely to fail if you have loaded a dump from another server (the credentials in /etc/mysql/debian.cnf will then no longer match those in the server).
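A minimal sketch of the alternative argued for above: send SIGTERM, wait up to a configurable timeout, and never escalate to SIGKILL. This is not the packaged init script. STOP_TIMEOUT and wait_for_exit are hypothetical names, and a background sleep stands in for mysqld so the sketch can run anywhere. It signals by PID with kill rather than by name with killall.

```shell
#!/bin/sh
# Sketch only: STOP_TIMEOUT and wait_for_exit are hypothetical names,
# not part of the packaged init script.
STOP_TIMEOUT="${STOP_TIMEOUT:-300}"   # seconds to wait after SIGTERM

# Poll until the process with the given PID is gone or the timeout expires.
# Returns 0 if the process exited, 1 if it is still running.
wait_for_exit() {
    pid=$1
    timeout=$2
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Stand-in for mysqld (assumption: the real script would look up the
# server PID instead of starting a process here).
sleep 60 &
server_pid=$!

kill -15 "$server_pid"
if wait_for_exit "$server_pid" "$STOP_TIMEOUT"; then
    result="stopped"
else
    # Deliberately no SIGKILL: report the failure and let the admin decide.
    result="still running"
fi
echo "$result"
```

With a timeout in the range of minutes instead of 10 seconds, a large server gets the chance to flush and close its tables, and a stop that really hangs is reported instead of being turned into a forced crash recovery.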



 Comments   
Comment by Rasmus Johansson (Inactive) [ 2011-10-07 ]

Launchpad bug id: 870310

Comment by Kristian Nielsen [ 2013-03-01 ]

For what it is worth, this comes from original Debian MySQL packages.
I just checked, and the same killall -9 is still present in the latest
mysql-server-5.5 package on Debian wheezy.

Nevertheless, I agree that this code looks very wrong.

Comment by Elena Stepanova [ 2013-06-03 ]

It's 30 seconds now, but I think we still regularly observe the consequences. Most of the production logs that we receive from users contain numerous errors about possibly corrupted tables. We attributed them to previous crashes, but their number suggests something more generic, and this kill fits. Shutdown becomes slower with each major version, servers become bigger, and 30 seconds is not enough for a big, busy production server, so every time it gets shut down, it ends up killed.

Is it possible to make it somewhat smarter, or at least to increase the timeout further? After all, if somebody wants to kill the server immediately, they are unlikely to use the script; they will just shoot it.

Comment by Sergei Golubchik [ 2013-06-14 ]

increased the timeout

Comment by erkan yanar [ 2014-01-02 ]

In fact, killall was never executed and still is not (admittedly, I only took a quick look).
From the code:
#v+
shutdown_out=`$MYADMIN shutdown 2>&1`; r=$?
set -e
if [ "$r" -ne 0 ]; then
log_end_msg 1
....
killall -15 mysqld
#v-
Because log_end_msg 1 runs under set -e, the script simply terminates at that point, and no killall is ever executed.
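The effect can be reproduced in isolation with a small sketch. It assumes, as this comment implies, that log_end_msg returns the value it is given. The function below is a stand-in with that behaviour, not the real LSB helper, and the echo lines only mark where the killall calls would be.

```shell
#!/bin/sh
# Run the relevant control flow in a child shell and capture what it prints.
# log_end_msg is a stand-in that returns its argument (assumption).
demo_output=$(sh -c '
    set -e
    log_end_msg() { return "$1"; }
    r=1
    if [ "$r" -ne 0 ]; then
        log_end_msg 1
        echo "killall would run here"
    fi
    echo "end of script"
' 2>&1) && demo_status=0 || demo_status=$?

# Neither echo is reached: the child shell exits as soon as log_end_msg
# returns non-zero, because set -e is active.
echo "output: [$demo_output] status: $demo_status"
```

Under that assumption, one way to keep the script alive would be to neutralise the return value, for example with `log_end_msg 1 || true`, so that the lines after it can run.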

Generated at Thu Feb 08 06:30:43 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.