MariaDB Server / MDEV-17934

Make systemd timeout behavior more compatible with longer Galera recovery times

Details

    Description

      When Galera is enabled, MariaDB's systemd service executes the "galera_recovery" script as an ExecStartPre operation. See the following:

      https://github.com/MariaDB/server/blob/ce8716a1ed786ff971b5e15c88385d50b649ec7f/support-files/mariadb.service.in#L71

      The MariaDB systemd service has a default TimeoutStartSec value of 90 seconds, so startup fails if this ExecStartPre step takes longer than that. For example, see the following failure from a syslog, where systemd terminates the start-pre operation exactly 90 seconds after startup begins:

      Sep 13 15:48:28 server1 systemd[1]: Starting MariaDB 10.2.16 database server...
      Sep 13 15:49:58 server1 systemd[1]: mariadb.service: Start-pre operation timed out. Terminating.
      Sep 13 15:49:58 server1 systemd[1]: Failed to start MariaDB 10.2.16 database server.
      Sep 13 15:49:58 server1 systemd[1]: mariadb.service: Unit entered failed state.
      Sep 13 15:49:58 server1 systemd[1]: mariadb.service: Failed with result 'timeout'.
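
      The log above shows the start-pre step being killed by the start timeout. As an illustration only (this report does not propose it as the fix), a static override of TimeoutStartSec could be written as a systemd drop-in. The sketch below assumes a drop-in file name of timeout.conf and a value of 900 seconds; both are hypothetical and chosen for the example, and the function name is likewise made up here.

```shell
#!/bin/sh
# Sketch: write a systemd drop-in that raises TimeoutStartSec for a unit.
# The file name "timeout.conf" and any value passed in (e.g. 900) are
# illustrative assumptions, not values taken from this report.

# write_timeout_dropin DIR SECONDS
# Creates DIR if needed and writes DIR/timeout.conf with a [Service]
# section containing only the TimeoutStartSec setting.
write_timeout_dropin() {
    dropin_dir=$1
    seconds=$2
    mkdir -p "$dropin_dir" || return 1
    printf '[Service]\nTimeoutStartSec=%s\n' "$seconds" > "$dropin_dir/timeout.conf"
}

# Usage on a live host (requires root; not executed by this sketch):
#   write_timeout_dropin /etc/systemd/system/mariadb.service.d 900
#   systemctl daemon-reload
```

      A drop-in leaves the packaged unit file untouched, so the override would survive package upgrades. Whether a fixed value is acceptable depends on how long crash recovery can take on a given host.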
      

      galera_recovery has to perform server startup, so this step can take a while, especially if the server previously crashed and has to perform crash recovery. However, it looks like systemd timeouts should have been extended during server startup as part of MDEV-14705. Despite that, server versions with the fix for MDEV-14705 still see timeouts during ExecStartPre. Is it likely that important long-running startup functions were missed?

      See also MDEV-17571 as another case where systemd timeout extensions didn't seem to work as intended.

          Activity

            I just noticed that EXTEND_TIMEOUT_USEC was added in systemd version 236:

            https://lists.freedesktop.org/archives/systemd-devel/2017-December/039996.html

            The most common OS that we tend to see for MariaDB with Galera is RHEL 7, and that still has systemd version 219:

            [ec2-user@ip-172-30-0-249 ~]$ cat /etc/redhat-release
            Red Hat Enterprise Linux Server release 7.2 (Maipo)
            [ec2-user@ip-172-30-0-249 ~]$ sudo yum info systemd
            Loaded plugins: amazon-id, rhui-lb, search-disabled-repos
            Repodata is over 2 weeks old. Install yum-cron? Or run: yum makecache fast
            Installed Packages
            Name        : systemd
            Arch        : x86_64
            Version     : 219
            Release     : 19.el7
            Size        : 21 M
            Repo        : installed
            From repo   : anaconda
            Summary     : A System and Service Manager
            URL         : http://www.freedesktop.org/wiki/Software/systemd
            License     : LGPLv2+ and MIT and GPLv2+
            Description : systemd is a system and service manager for Linux, compatible with
                        : SysV and LSB init scripts. systemd provides aggressive parallelization
                        : capabilities, uses socket and D-Bus activation for starting services,
                        : offers on-demand starting of daemons, keeps track of processes using
                        : Linux cgroups, supports snapshotting and restoring of the system
                        : state, maintains mount and automount points and implements an
                        : elaborate transactional dependency-based service control logic. It can
                        : work as a drop-in replacement for sysvinit.
             
            Available Packages
            Name        : systemd
            Arch        : x86_64
            Version     : 219
            Release     : 62.el7
            Size        : 5.1 M
            Repo        : rhui-REGION-rhel-server-releases/7Server/x86_64
            Summary     : A System and Service Manager
            URL         : http://www.freedesktop.org/wiki/Software/systemd
            License     : LGPLv2+ and MIT and GPLv2+
            Description : systemd is a system and service manager for Linux, compatible with
                        : SysV and LSB init scripts. systemd provides aggressive parallelization
                        : capabilities, uses socket and D-Bus activation for starting services,
                        : offers on-demand starting of daemons, keeps track of processes using
                        : Linux cgroups, supports snapshotting and restoring of the system
                        : state, maintains mount and automount points and implements an
                        : elaborate transactional dependency-based service control logic. It can
                        : work as a drop-in replacement for sysvinit.
            

            So even if MDEV-14705 fixed this problem for systemd versions 236 and above, a lot of users are using systemd versions that are much older, so they would not benefit from that functionality. That explains a lot.
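
            To check which side of that boundary a host falls on, the version can be read from the first line of `systemctl --version`. The following is a minimal sketch; the function names are hypothetical, and the threshold of 236 is taken from the mailing list post linked above.

```shell
#!/bin/sh
# Sketch: decide whether a host's systemd is new enough for
# EXTEND_TIMEOUT_USEC. The threshold comes from the comment above;
# the function names are made up for this example.
MIN_EXTEND_TIMEOUT_VERSION=236

# parse_systemd_version TEXT
# Prints the numeric version from the first line of `systemctl --version`
# output, e.g. "systemd 219" or "systemd 245 (245.4-4ubuntu3)".
# Any non-digit suffix on the version field is dropped.
parse_systemd_version() {
    printf '%s\n' "$1" | awk 'NR == 1 && $1 == "systemd" {
        v = $2
        sub(/[^0-9].*/, "", v)
        print v
    }'
}

# supports_extend_timeout VERSION
# Exits 0 if VERSION >= 236, non-zero otherwise (including when VERSION
# is empty or not a number).
supports_extend_timeout() {
    [ "$1" -ge "$MIN_EXTEND_TIMEOUT_VERSION" ] 2>/dev/null
}

# Usage on a live host (not executed by this sketch):
#   v=$(parse_systemd_version "$(systemctl --version)")
#   if supports_extend_timeout "$v"; then
#       echo "systemd $v: EXTEND_TIMEOUT_USEC available"
#   else
#       echo "systemd $v: EXTEND_TIMEOUT_USEC not available"
#   fi
```

            On the RHEL 7 host shown above this would report version 219, which is below the threshold.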

            GeoffMontee Geoff Montee (Inactive) added the comment above.

            elenst Elena Stepanova added a comment -

            Since ratzpo previously assigned MDEV-17571 to himself, I'm assigning this one to him as well.

            axel Axel Schwenke added a comment -

            This is a duplicate of MDEV-17571


            People

              ratzpo Rasmus Johansson (Inactive)
              GeoffMontee Geoff Montee (Inactive)