Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-13376

Stopping mariadb.service on joiner node during SST does not stop state transfer

    XMLWordPrintable

Details

    Description

      Example cluster with only two nodes: (normally would also have a third node)
      mysql1 = IP 10.0.0.2 (JOINER)
      mysql2 = IP 10.0.0.3 (DONOR)

      wsrep_sst_method=mysqldump
      (database has to be big enough for SST to take well over three minutes)

      DONOR node started with "galera_new_cluster", JOINER node started with "systemctl start mariadb.service" after deleting grastate.dat to make sure node needs SST.
      After SST has started, run "systemctl stop mariadb.service" on JOINER node and wait three minutes:

      root@mysql1:/# systemctl stop mariadb.service
      root@mysql1:/# 
      

      No errors, thus you would assume the node has been stopped successfully, right?

      In reality, what happens is SST keeps on running (confirmed by "ps aux | grep mysql") until fully finished, which could take a looong time depending on database size, after which JOINER node completes its shutdown and DONOR finally returns to SYNCED state.

      Now since the shutdown was not actually completed and no error was returned by systemctl, the user could try running "systemctl start mariadb.service" and end up with the new mysqld process failing to start. (I didn't try this but I assume it will fail gracefully like any other time)
      Knowing that mysqld has not actually stopped, one can of course kill it manually to interrupt the SST.

      When I run "systemctl stop", I would expect the node to be stopped (and thus SST to be interrupted), so the donor could return back to SYNCED state ASAP. Since interrupting an SST, the joining node's database consistency doesn't really matter as it will get a new SST anyway when started next time. If I was stopping a node in the middle of an IST, the current "non-interrupting" way could be preferred to possibly avoid SST on next startup in case a busy cluster. (didn't test this but I assume currently also IST wouldn't be interrupted)

      Attached logs from both nodes showing what happened.
      Added line "-- waiting for SST to finish, 12 minutes in this case" for clarity. Now imagine finding the cluster at that point in logs without knowing what has happened. You see one node tried to make an SST and apparently shutting down, donor node having formed new quorum alone and "doing nothing" (since it's still in DONOR/DESYNCED state although not that obvious with the new quorum). Without checking what processes are running, it's not that clear that the SST is still running.

      TL;DR; I would suggest the systemctl command should at least return failure notice similar to when startup fails:

      root@mysql1:/# systemctl start mariadb.service
      Job for mariadb.service failed. See 'systemctl status mariadb.service' and 'journalctl -xn' for details.
      

      Thus the user would immediately know to check status, which correctly shows something has failed:

      root@mysql1:/# systemctl status mariadb.service -l
      ● mariadb.service - MariaDB database server
         Loaded: loaded (/lib/systemd/system/mariadb.service; disabled)
        Drop-In: /etc/systemd/system/mariadb.service.d
                 └─migrated-from-my.cnf-settings.conf
         Active: failed (Result: timeout) since Sun 2017-07-23 14:24:58 EEST; 8s ago
        Process: 2650 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
        Process: 2647 ExecStartPost=/etc/mysql/debian-start (code=exited, status=0/SUCCESS)
        Process: 2250 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= ||   VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ]   && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exited, status=0/SUCCESS)
        Process: 2244 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
        Process: 2241 ExecStartPre=/usr/bin/install -m 755 -o mysql -g root -d /var/run/mysqld (code=exited, status=0/SUCCESS)
       Main PID: 2597
         Status: "Taking your SQL requests now..."
      Jul 23 14:21:10 mysql1 mysqld[13061]: 2017-07-23 14:21:10 140440152819584 [Note] /usr/sbin/mysqld (mysqld 10.2.7-MariaDB-10.2.7+maria~jessie-log) starting as process 13061 ...
      Jul 23 14:21:12 mysql1 systemd[1]: Started MariaDB database server.
      Jul 23 14:21:57 mysql1 systemd[1]: Stopping MariaDB database server...
      Jul 23 14:23:27 mysql1 systemd[1]: mariadb.service stop-sigterm timed out. Skipping SIGKILL.
      Jul 23 14:24:58 mysql1 systemd[1]: mariadb.service stop-final-sigterm timed out. Skipping SIGKILL. Entering failed mode.
      Jul 23 14:24:58 mysql1 systemd[1]: Stopped MariaDB database server.
      Jul 23 14:24:58 mysql1 systemd[1]: Unit mariadb.service entered failed state.
      

      It would also be nice to get a message in logs stating what happens to the SST (or IST), e.g. "waiting until SST has finished" or "SST interrupted, node in inconsistent state, new SST required".

      Attachments

        1. joiner.log
          18 kB
        2. donor.log
          5 kB

        Issue Links

          Activity

            People

              anikitin Andrii Nikitin (Inactive)
              juhapyy Juha Pyy
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.