MariaDB Server / MDEV-703

LP:870310 - killall -9 in init-script

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 10.0.2, 5.5.31
    • Fix Version/s: 10.0.4, 5.5.32
    • None

    Description

      From /etc/init.d/mysql

      #v+
        'stop')
              # * As a passwordless mysqladmin (e.g. via ~/.my.cnf) must be possible
              # at least for cron, we can rely on it here, too. (although we have 
              # to specify it explicit as e.g. sudo environments points to the normal
              # users home and not /root)
              log_daemon_msg "Stopping MariaDB database server" "mysqld"
              if ! mysqld_status check_dead nowarn; then
                set +e
                shutdown_out=`$MYADMIN shutdown 2>&1`; r=$?
                set -e
                if [ "$r" -ne 0 ]; then
                  log_end_msg 1
                  [ "$VERBOSE" != "no" ] && log_failure_msg "Error: $shutdown_out"
                  log_daemon_msg "Killing MariaDB database server by signal" "mysqld"
                  killall -15 mysqld
                  server_down=
                  for i in 1 2 3 4 5 6 7 8 9 10; do
                    sleep 1
                    if mysqld_status check_dead nowarn; then server_down=1; break; fi
                  done
                  if test -z "$server_down"; then killall -9 mysqld; fi
                fi
              fi
      #v-

      First, I see no reason to use anything other than killall -15.
      But:
      Imagine the server's connections are exhausted, so mysqladmin shutdown fails. After the killall -15 the server then has 10 seconds to stop; after that a SIGKILL is sent.
      If the shutdown takes longer than that, which is likely for bigger databases using e.g. delay_key_write, the database will be corrupted afterwards.
      In any case, a recovery is forced.
      So this is more likely to hurt bigger installations.
      mysqladmin is also likely to fail if you have loaded a dump from another server, because /etc/mysql/debian.cnf then no longer matches the credentials in the server.

          Activity

            ratzpo Rasmus Johansson (Inactive) added a comment - Launchpad bug id: 870310

            knielsen Kristian Nielsen added a comment -

            For what it is worth, this comes from the original Debian MySQL packages.
            I just checked, and the same killall -9 is still present in the latest
            mysql-server-5.5 package on Debian wheezy.

            Nevertheless, I agree that this code looks very wrong.

            elenst Elena Stepanova added a comment -

            It's 30 seconds now, but I think we still regularly observe the consequences. Most of the production logs that we receive from users contain numerous errors about possibly corrupted tables. We attributed them to previous crashes, but their number suggests a more generic cause, and this kill fits. Shutdown becomes slower with each major version and servers become bigger, so 30 seconds is not enough for a big, busy production server: every time it gets shut down, it ends up killed.

            Is it possible to make it somewhat smarter, or at least to increase the timeout further? After all, if somebody wants to kill the server immediately, they're unlikely to use the script; they will just shoot.
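A minimal sketch of what a "smarter" stop could look like, not taken from the packaged init script. It assumes the caller already knows the server's PID; the helper name stop_gracefully, the variable STOP_TIMEOUT and the 300-second default are all hypothetical. The point is only that the timeout becomes configurable and that SIGKILL is never sent automatically.

```shell
#!/bin/sh
# stop_gracefully PID [TIMEOUT_SECONDS]
# Hypothetical helper (not part of the Debian/MariaDB init script).
# Sends SIGTERM, then polls once per second until the process is gone.
# It never sends SIGKILL: on timeout it returns 1 and leaves the
# decision to the caller.
stop_gracefully() {
  pid=$1
  # STOP_TIMEOUT and the 300 s default are assumptions for illustration.
  timeout=${2:-${STOP_TIMEOUT:-300}}

  # Nothing to do if the process is already gone.
  kill -0 "$pid" 2>/dev/null || return 0
  kill -TERM "$pid" 2>/dev/null || return 0

  waited=0
  while kill -0 "$pid" 2>/dev/null; do
    # Give up (without killing) once the timeout is reached.
    [ "$waited" -ge "$timeout" ] && return 1
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

A caller could log a failure when the function returns 1 and leave the server running, so that an administrator decides whether a forced kill is worth the recovery it triggers.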

            serg Sergei Golubchik added a comment - increased the timeout
            erkules erkan yanar added a comment -

            In fact, killall was never executed, and still is not (from a quick look only).
            From the code:
            #v+
            shutdown_out=`$MYADMIN shutdown 2>&1`; r=$?
            set -e
            if [ "$r" -ne 0 ]; then
              log_end_msg 1
              ....
              killall -15 mysqld
            #v-
            Because log_end_msg 1 returns a non-zero status while set -e is in effect, the script simply terminates there and no killall is ever executed.
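The effect described in this comment can be reproduced in isolation. The sketch below assumes that log_end_msg returns its argument as its exit status, which is what the comment implies; fake_log_end_msg is a hypothetical stand-in, and both demo functions run the branch in a child shell so that the calling shell survives.

```shell
#!/bin/sh
# Demonstrates how `set -e` ends a script at the first failing command
# inside an if-body. fake_log_end_msg is a hypothetical stand-in that,
# by assumption, returns its argument like the LSB log_end_msg helper.

demo_unguarded() {
  sh -c '
    fake_log_end_msg() { return "$1"; }
    set -e
    r=1
    if [ "$r" -ne 0 ]; then
      fake_log_end_msg 1          # non-zero status: the shell exits here
      echo "fallback reached"     # never printed
    fi
  '
}

demo_guarded() {
  sh -c '
    fake_log_end_msg() { return "$1"; }
    set -e
    r=1
    if [ "$r" -ne 0 ]; then
      fake_log_end_msg 1 || true  # status is swallowed, execution continues
      echo "fallback reached"
    fi
  '
}
```

Only the guarded variant prints "fallback reached"; the unguarded one exits with a non-zero status before reaching the line where the init script would call killall.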

            People

              Assignee: serg Sergei Golubchik
              Reporter: erkanyanar Erkan Yanar