MariaDB Server / MDEV-17571

Make systemd timeout behavior more compatible with long Galera SSTs

Details

    Description

      SSTs can take several hours in many cases, but with the current default value of TimeoutStartSec, systemd makes the joiner node time out after about 90 seconds. It might make sense to disable the systemd service's startup timeout by default instead.

      Depending on the systemd version, disabling the startup timeout means setting either TimeoutStartSec=0 (if systemd version <=228) or TimeoutStartSec=infinity (if systemd version >=229).
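      The version rule above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the drop-in location mentioned afterwards is an assumption about a typical installation, not something this issue specifies.

```python
def disabled_start_timeout(systemd_version: int) -> str:
    """Return the TimeoutStartSec value that disables the startup timeout."""
    # Per the description above: "infinity" for systemd >= 229, "0" for <= 228.
    return "infinity" if systemd_version >= 229 else "0"


def render_dropin(systemd_version: int) -> str:
    """Render a [Service] override that disables the startup timeout."""
    value = disabled_start_timeout(systemd_version)
    return "[Service]\nTimeoutStartSec={}\n".format(value)
```

      The rendered text would normally go into a drop-in file for the service (for example under /etc/systemd/system/mariadb.service.d/, assuming the unit is named mariadb.service), followed by a systemctl daemon-reload.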

      In systemd 236 and later, the startup timeout can be extended by setting EXTEND_TIMEOUT_USEC:

      If a service of Type=notify sends "EXTEND_TIMEOUT_USEC=…", this may cause the start time to be extended beyond TimeoutStartSec=. The first receipt of this message must occur before TimeoutStartSec= is exceeded, and once the start time has extended beyond TimeoutStartSec=, the service manager will allow the service to continue to start, provided the service repeats "EXTEND_TIMEOUT_USEC=…" within the interval specified until the service startup status is finished by "READY=1". (see sd_notify(3)).

      https://www.freedesktop.org/software/systemd/man/systemd.service.html
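      To make the mechanism in the quoted manual text concrete, here is a minimal Python sketch of the notification protocol. It is not MariaDB's implementation: it writes a datagram to the socket named by the NOTIFY_SOCKET environment variable, which is how sd_notify(3) delivers messages. The helper names and the status text are hypothetical.

```python
import os
import socket


def sd_notify(message: str) -> bool:
    """Send one notification datagram to the service manager, if present."""
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False  # not running under a notify-aware service manager
    if path.startswith("@"):
        path = "\0" + path[1:]  # "@" prefix means Linux abstract namespace
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), path)
    return True


def extend_start_timeout(seconds: int, status: str) -> bool:
    """Ask systemd (236 or later) for `seconds` more startup time."""
    usec = seconds * 1_000_000  # the message carries microseconds
    return sd_notify("STATUS={}\nEXTEND_TIMEOUT_USEC={}".format(status, usec))
```

      A long-running startup step would call extend_start_timeout() repeatedly, each time before the previously requested interval runs out, and finish with sd_notify("READY=1").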

      It looks like this approach was used to extend the startup timeout during SSTs while fixing MDEV-15607. It looks like this is the relevant commit:

      https://github.com/mariadb/server/commit/be5698265a4195586142d1a34fdd1cce9d95d8a1

      The relevant service_manager_extend_timeout function seems to be defined here:

      https://github.com/MariaDB/server/blob/be5698265a4195586142d1a34fdd1cce9d95d8a1/include/my_service_manager.h#L30

      It sends the EXTEND_TIMEOUT_USEC notification mentioned in the systemd manual. Note that this is an sd_notify message, not an environment variable.

      However, a lot of users are still seeing startup timeouts during SSTs. The cause seems to be that most systemd installations are not yet using version 236 or later.

      The following documentation sections describe the current behavior:

      https://mariadb.com/kb/en/library/introduction-to-state-snapshot-transfers-ssts/#ssts-and-systemd

      https://mariadb.com/kb/en/library/systemd/#configuring-the-systemd-service-timeout

          Activity

            GeoffMontee Geoff Montee (Inactive) added a comment -

            Loosely related: MDEV-17934

            GeoffMontee Geoff Montee (Inactive) added a comment -

            I just noticed that EXTEND_TIMEOUT_USEC was added in systemd version 236:

            https://lists.freedesktop.org/archives/systemd-devel/2017-December/039996.html

            The most common OS that we tend to see for MariaDB with Galera is RHEL 7, and that still has systemd version 219:

            [ec2-user@ip-172-30-0-249 ~]$ cat /etc/redhat-release
            Red Hat Enterprise Linux Server release 7.2 (Maipo)
            [ec2-user@ip-172-30-0-249 ~]$ sudo yum info systemd
            Loaded plugins: amazon-id, rhui-lb, search-disabled-repos
            Repodata is over 2 weeks old. Install yum-cron? Or run: yum makecache fast
            Installed Packages
            Name        : systemd
            Arch        : x86_64
            Version     : 219
            Release     : 19.el7
            Size        : 21 M
            Repo        : installed
            From repo   : anaconda
            Summary     : A System and Service Manager
            URL         : http://www.freedesktop.org/wiki/Software/systemd
            License     : LGPLv2+ and MIT and GPLv2+
            Description : systemd is a system and service manager for Linux, compatible with
                        : SysV and LSB init scripts. systemd provides aggressive parallelization
                        : capabilities, uses socket and D-Bus activation for starting services,
                        : offers on-demand starting of daemons, keeps track of processes using
                        : Linux cgroups, supports snapshotting and restoring of the system
                        : state, maintains mount and automount points and implements an
                        : elaborate transactional dependency-based service control logic. It can
                        : work as a drop-in replacement for sysvinit.
             
            Available Packages
            Name        : systemd
            Arch        : x86_64
            Version     : 219
            Release     : 62.el7
            Size        : 5.1 M
            Repo        : rhui-REGION-rhel-server-releases/7Server/x86_64
            Summary     : A System and Service Manager
            URL         : http://www.freedesktop.org/wiki/Software/systemd
            License     : LGPLv2+ and MIT and GPLv2+
            Description : systemd is a system and service manager for Linux, compatible with
                        : SysV and LSB init scripts. systemd provides aggressive parallelization
                        : capabilities, uses socket and D-Bus activation for starting services,
                        : offers on-demand starting of daemons, keeps track of processes using
                        : Linux cgroups, supports snapshotting and restoring of the system
                        : state, maintains mount and automount points and implements an
                        : elaborate transactional dependency-based service control logic. It can
                        : work as a drop-in replacement for sysvinit.
            

            So even if MDEV-15607 fixed this problem for systemd versions 236 and above, a lot of users are using systemd versions that are much older, so they would not benefit from that functionality. That explains a lot.
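            As a rough illustration of the version check implied here, the following Python sketch parses the first line of `systemctl --version` output and tests it against the 236 threshold. The function names are hypothetical, and the sample output in the usage note is an assumption about what RHEL 7 prints.

```python
import re


def parse_systemd_version(version_output: str) -> int:
    """Extract the major version from `systemctl --version` output."""
    match = re.match(r"systemd (\d+)", version_output)
    if match is None:
        raise ValueError("unrecognized systemctl --version output")
    return int(match.group(1))


def supports_extend_timeout(systemd_version: int) -> bool:
    """EXTEND_TIMEOUT_USEC was added in systemd 236."""
    return systemd_version >= 236
```

            With output beginning "systemd 219", as on the RHEL 7 host above, supports_extend_timeout() returns False, so the MDEV-15607 mechanism cannot help there.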

            GeoffMontee Geoff Montee (Inactive) added a comment - - edited

            This Percona blog post is relevant:

            https://www.percona.com/blog/2019/02/12/debugging-mariadb-galera-cluster-sst-problems-a-tale-of-a-funny-experience/

            Somehow, they're under the impression that this timeout used to be 900 seconds in MariaDB:

            The MariaDB init script has changed its timeout from 900 seconds to 90 while MySQL Community and Percona Server has this value set to 15 mins.

            But as far as I can tell from the git commit history, MariaDB's systemd unit file has never explicitly defined TimeoutSec or TimeoutStartSec, and it has used the systemd default value of 90 seconds for TimeoutStartSec since systemd support was added in MariaDB 10.1.

            I guess the 900 seconds is a reference to the service startup timeout in mysql.server, which is used as the init script on distributions that don't support systemd.

            https://github.com/MariaDB/server/blob/32062cc61cd00e4cd3b7939c8a09f9c3ac34ec76/support-files/mysql.server.sh#L48

            It looks like Percona XtraDB Cluster sets an infinite timeout by default in its systemd unit file:

            https://github.com/percona/percona-xtradb-cluster/blob/3f1af140a962e731d6cf9ae83c18c30cf26165d0/scripts/systemd/mysqld.service.in#L39

            Should we set TimeoutStartSec to a value higher than 90 seconds by default for systemd versions that don't support EXTEND_TIMEOUT_USEC? It probably wouldn't hurt to at least set it to the old 900-second value from mysql.server that some users are used to.


            jplindst Jan Lindström (Inactive) added a comment -

            axel Is there a way to get the required TimeoutStartSec and TimeoutSec settings into the mariadb.service file?

            axel Axel Schwenke added a comment -

            I've read a lot of code and documentation lately. Let me try to summarize things:

            1. The MariaDB Server systemd unit has never set a timeout for server startup. The old init script set it to 900 seconds, but for systemd we rely on the systemd default, which is 90 seconds.
            2. Those 90 seconds are generally considered too short; 900 seconds (the old value) seems more appropriate.
            3. There is a way to extend the timeout dynamically (from the running SST). This was implemented in MDEV-15607, but it requires a recent version of systemd (236 or later) that not all users have. I also see that with the switch to Galera 4 the respective code was removed from sql/wsrep_sst.cc (commit 36a2a185fe18d31a644da46cfabd9757a379280c).

            From the above I conclude the following steps:

            1. We should explicitly set a timeout of 900 seconds in our systemd unit file. This is trivial, but it will only help users whose SST takes longer than 90 seconds but less than 900 seconds.
            2. For all other users, we have already documented how to bypass the timeout by setting it to 'infinite'. This is not a general solution, but rather a "last resort" action. It should only be applied by users who strictly need it.
            3. The general solution should be along the lines of MDEV-15607.
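
            The three cases in this summary can be expressed as a small Python sketch. It only illustrates the reasoning: the function name and the returned labels are hypothetical, and the two constants are the 90-second and 900-second values discussed above.

```python
SYSTEMD_DEFAULT_TIMEOUT = 90   # seconds, systemd default for TimeoutStartSec
PROPOSED_UNIT_TIMEOUT = 900    # seconds, the old mysql.server value


def timeout_advice(sst_seconds: float) -> str:
    """Classify an expected SST duration against the two timeouts."""
    if sst_seconds < SYSTEMD_DEFAULT_TIMEOUT:
        return "fits the systemd default"
    if sst_seconds < PROPOSED_UNIT_TIMEOUT:
        return "fits only with TimeoutStartSec=900"
    # Longer SSTs need either a disabled timeout (the "last resort")
    # or dynamic extension along the lines of MDEV-15607.
    return "needs a disabled timeout or EXTEND_TIMEOUT_USEC"
```

            For example, a 10-minute SST (600 seconds) falls into the second case, while a multi-hour SST falls into the third.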

            People

              jplindst Jan Lindström (Inactive)
              claudio.nanni Claudio Nanni