MariaDB Server / MDEV-17571

Make systemd timeout behavior more compatible with long Galera SSTs

    Details

      Description

      SSTs can take several hours in many cases, but the current default value of TimeoutStartSec causes systemd to time out the joiner node's startup in less than a couple of minutes. It might make sense to disable the systemd service's start timeout by default instead.

      Depending on the systemd version, disabling this means setting either:

      TimeoutStartSec=infinity (if systemd version >=229)

      TimeoutStartSec=0 (if systemd version <=228)
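
The version rule above can be sketched as a small helper that renders a systemd drop-in. This is only an illustration: the function names are hypothetical, and the drop-in path mentioned in the comment is an assumption that depends on how the service unit is named on a given system.

```python
def timeout_disable_value(systemd_version: int) -> str:
    """Return the TimeoutStartSec value that disables the start timeout."""
    # Per the description above: "infinity" for systemd >= 229, "0" for <= 228.
    return "infinity" if systemd_version >= 229 else "0"


def render_dropin(systemd_version: int) -> str:
    """Render the text of a [Service] drop-in that disables the start timeout."""
    return "[Service]\nTimeoutStartSec={}\n".format(
        timeout_disable_value(systemd_version)
    )


if __name__ == "__main__":
    # Assumed usage: save the output as a drop-in such as
    # /etc/systemd/system/mariadb.service.d/timeout.conf (hypothetical path),
    # then run `systemctl daemon-reload`.
    print(render_dropin(229), end="")
```

The installed systemd version can be read from the first line of `systemctl --version`.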

      The following documentation section describes the current behavior:

      https://mariadb.com/kb/en/library/introduction-to-state-snapshot-transfers-ssts/#ssts-and-systemd

      If we don't want to disable the systemd timeout by default, then it might make more sense to extend the timeout during SSTs by doing the following:

      If a service of Type=notify sends "EXTEND_TIMEOUT_USEC=…", this may cause the start time to be extended beyond TimeoutStartSec=. The first receipt of this message must occur before TimeoutStartSec= is exceeded, and once the start time has extended beyond TimeoutStartSec=, the service manager will allow the service to continue to start, provided the service repeats "EXTEND_TIMEOUT_USEC=…" within the interval specified until the service startup status is finished by "READY=1". (see sd_notify(3)).

      https://www.freedesktop.org/software/systemd/man/systemd.service.html
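
The mechanism quoted above might look roughly like the sketch below, which speaks the notification protocol directly: one datagram per message, sent to the socket named by the NOTIFY_SOCKET environment variable. It is not MariaDB's implementation (the server goes through its own service_manager_extend_timeout helper). The function names, the 120-second extension, and the 30-second poll interval are all assumptions chosen for illustration.

```python
import os
import socket
import time


def sd_notify(message: str) -> bool:
    """Send one datagram to the socket named by $NOTIFY_SOCKET.

    Returns False when the variable is unset (not started by systemd).
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    # A leading "@" denotes a Linux abstract-namespace socket.
    if path.startswith("@"):
        address = b"\0" + path[1:].encode("utf-8")
    else:
        address = path
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), address)
    return True


def extend_timeout(seconds: int, status: str = "") -> bool:
    """Ask for `seconds` more start time, optionally with a status line."""
    lines = []
    if status:
        lines.append("STATUS=" + status)
    lines.append("EXTEND_TIMEOUT_USEC={}".format(seconds * 1_000_000))
    return sd_notify("\n".join(lines))


def start_with_long_sst(sst_done, extend_seconds=120, poll_seconds=30,
                        sleep=time.sleep) -> None:
    """Keep extending the start timeout until sst_done() is true, then send READY=1."""
    while not sst_done():
        extend_timeout(extend_seconds, "waiting for SST")
        # The poll interval must stay below extend_seconds so that each
        # extension is repeated before the previous one runs out.
        sleep(poll_seconds)
    sd_notify("READY=1")
```

The point of the loop is that no single extension has to cover the whole SST. Each message only needs to outlast the gap until the next one, so a transfer of unknown length cannot outrun the timeout as long as the sender keeps running, and the first message arrives before TimeoutStartSec= expires.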

      This approach appears to have been attempted while fixing MDEV-15607, but users are still seeing timeouts during SSTs, so it may not be working properly. This seems to be the relevant commit:

      https://github.com/mariadb/server/commit/be5698265a4195586142d1a34fdd1cce9d95d8a1

      The relevant service_manager_extend_timeout function seems to be defined here:

      https://github.com/MariaDB/server/blob/be5698265a4195586142d1a34fdd1cce9d95d8a1/include/my_service_manager.h#L30

      It sends the EXTEND_TIMEOUT_USEC notification message mentioned in the systemd manual.

      People

      • Assignee: jplindst Jan Lindström
      • Reporter: claudio.nanni Claudio Nanni