[MDEV-23044] systemctl enable sporadically turns the service into inactive on SUSE Created: 2020-06-29  Updated: 2023-04-27

Status: Open
Project: MariaDB Server
Component/s: Packaging
Affects Version/s: 10.1, 10.2, 10.3, 10.4
Fix Version/s: 10.4, 10.5

Type: Bug Priority: Minor
Reporter: Elena Stepanova Assignee: Elena Stepanova
Resolution: Unresolved Votes: 0
Labels: None


 Description   

The following sequence of commands sporadically ends with mariadb service reported as inactive (while the server is still running):

sudo systemctl status mariadb
sudo systemctl start mariadb
sudo systemctl status mariadb --no-pager
sudo systemctl enable mariadb
sudo systemctl status mariadb --no-pager

Usually it goes all right and the service stays active. But in about 20% of cases it ends like this:

buildbot@opensuse150-amd64:~> sudo systemctl enable mariadb
Created symlink /etc/systemd/system/mysql.service → /usr/lib/systemd/system/mariadb.service.
Created symlink /etc/systemd/system/mysqld.service → /usr/lib/systemd/system/mariadb.service.
Created symlink /etc/systemd/system/multi-user.target.wants/mariadb.service → /usr/lib/systemd/system/mariadb.service.
buildbot@opensuse150-amd64:~> sudo systemctl status mariadb --no-pager
● mariadb.service - MariaDB 10.4.12 database server
   Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/mariadb.service.d
           └─migrated-from-my.cnf-settings.conf
   Active: inactive (dead)
     Docs: man:mysqld(8)
           https://mariadb.com/kb/en/library/systemd/
 Main PID: 1795 (mysqld)
   Status: "Taking your SQL requests now..."
    Tasks: 30 (limit: 4915)
   CGroup: /system.slice/mariadb.service
           └─1795 /usr/sbin/mysqld

It happens both in buildbot where the commands are scripted, and upon manual execution, when the commands are typed in. So, if a race condition is involved, it's unlikely to be related to the speed of the command execution, it's more likely to be with a single command (systemctl enable would be my main suspect).

In buildbot it happens reguarly, always on SUSE (opensuse150, sles150, sles123). Quite possibly it's a SUSE problem and not MariaDB's, but I couldn't find anything by an online search.

Syslog doesn't show anything suspicious and looks the same regardless the outcome. Here is the full output of the commands above (executed immediately after starting a VM with pre-installed 10.4.12, service not enabled yet):

+ bash
buildbot@opensuse150-amd64:~> sudo systemctl status mariadb
● mariadb.service - MariaDB 10.4.12 database server
   Loaded: loaded (/usr/lib/systemd/system/mariadb.service; disabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/mariadb.service.d
           └─migrated-from-my.cnf-settings.conf
   Active: inactive (dead)
     Docs: man:mysqld(8)
           https://mariadb.com/kb/en/library/systemd/
buildbot@opensuse150-amd64:~> sudo systemctl start mariadb 
buildbot@opensuse150-amd64:~> sudo systemctl status mariadb --no-pager
● mariadb.service - MariaDB 10.4.12 database server
   Loaded: loaded (/usr/lib/systemd/system/mariadb.service; disabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/mariadb.service.d
           └─migrated-from-my.cnf-settings.conf
   Active: active (running) since Mon 2020-06-29 18:35:06 UTC; 5s ago
     Docs: man:mysqld(8)
           https://mariadb.com/kb/en/library/systemd/
  Process: 1826 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
  Process: 1785 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= ||   VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ]   && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exited, status=0/SUCCESS)
  Process: 1784 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
 Main PID: 1795 (mysqld)
   Status: "Taking your SQL requests now..."
    Tasks: 30 (limit: 4915)
   CGroup: /system.slice/mariadb.service
           └─1795 /usr/sbin/mysqld
 
Jun 29 18:35:06 opensuse150-amd64 mysqld[1795]: 2020-06-29 18:35:06 0 [Note] InnoDB: 10.4.12 started; log sequence number 60999; transaction id 21
Jun 29 18:35:06 opensuse150-amd64 mysqld[1795]: 2020-06-29 18:35:06 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
Jun 29 18:35:06 opensuse150-amd64 mysqld[1795]: 2020-06-29 18:35:06 0 [Note] Plugin 'FEEDBACK' is disabled.
Jun 29 18:35:06 opensuse150-amd64 mysqld[1795]: 2020-06-29 18:35:06 0 [Note] Server socket created on IP: '::'.
Jun 29 18:35:06 opensuse150-amd64 mysqld[1795]: 2020-06-29 18:35:06 0 [Note] InnoDB: Buffer pool(s) load completed at 200629 18:35:06
Jun 29 18:35:06 opensuse150-amd64 mysqld[1795]: 2020-06-29 18:35:06 0 [Note] Reading of all Master_info entries succeeded
Jun 29 18:35:06 opensuse150-amd64 mysqld[1795]: 2020-06-29 18:35:06 0 [Note] Added new Master_info '' to hash table
Jun 29 18:35:06 opensuse150-amd64 mysqld[1795]: 2020-06-29 18:35:06 0 [Note] /usr/sbin/mysqld: ready for connections.
Jun 29 18:35:06 opensuse150-amd64 mysqld[1795]: Version: '10.4.12-MariaDB'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MariaDB Server
Jun 29 18:35:06 opensuse150-amd64 systemd[1]: Started MariaDB 10.4.12 database server.
buildbot@opensuse150-amd64:~> sudo systemctl enable mariadb
Created symlink /etc/systemd/system/mysql.service → /usr/lib/systemd/system/mariadb.service.
Created symlink /etc/systemd/system/mysqld.service → /usr/lib/systemd/system/mariadb.service.
Created symlink /etc/systemd/system/multi-user.target.wants/mariadb.service → /usr/lib/systemd/system/mariadb.service.
buildbot@opensuse150-amd64:~> sudo systemctl status mariadb --no-pager
● mariadb.service - MariaDB 10.4.12 database server
   Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/mariadb.service.d
           └─migrated-from-my.cnf-settings.conf
   Active: inactive (dead)
     Docs: man:mysqld(8)
           https://mariadb.com/kb/en/library/systemd/
 Main PID: 1795 (mysqld)
   Status: "Taking your SQL requests now..."
    Tasks: 30 (limit: 4915)
   CGroup: /system.slice/mariadb.service
           └─1795 /usr/sbin/mysqld
 
Jun 29 18:35:06 opensuse150-amd64 mysqld[1795]: 2020-06-29 18:35:06 0 [Note] InnoDB: 10.4.12 started; log sequence number 60999; transaction id 21
Jun 29 18:35:06 opensuse150-amd64 mysqld[1795]: 2020-06-29 18:35:06 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
Jun 29 18:35:06 opensuse150-amd64 mysqld[1795]: 2020-06-29 18:35:06 0 [Note] Plugin 'FEEDBACK' is disabled.
Jun 29 18:35:06 opensuse150-amd64 mysqld[1795]: 2020-06-29 18:35:06 0 [Note] Server socket created on IP: '::'.
Jun 29 18:35:06 opensuse150-amd64 mysqld[1795]: 2020-06-29 18:35:06 0 [Note] InnoDB: Buffer pool(s) load completed at 200629 18:35:06
Jun 29 18:35:06 opensuse150-amd64 mysqld[1795]: 2020-06-29 18:35:06 0 [Note] Reading of all Master_info entries succeeded
Jun 29 18:35:06 opensuse150-amd64 mysqld[1795]: 2020-06-29 18:35:06 0 [Note] Added new Master_info '' to hash table
Jun 29 18:35:06 opensuse150-amd64 mysqld[1795]: 2020-06-29 18:35:06 0 [Note] /usr/sbin/mysqld: ready for connections.
Jun 29 18:35:06 opensuse150-amd64 mysqld[1795]: Version: '10.4.12-MariaDB'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MariaDB Server
Jun 29 18:35:06 opensuse150-amd64 systemd[1]: Started MariaDB 10.4.12 database server.
buildbot@opensuse150-amd64:~> uptime
 18:35:36  up   0:01,  1 user,  load average: 0.48, 0.20, 0.08



 Comments   
Comment by Elena Stepanova [ 2020-06-29 ]

The failure was happening in buildbot close to the end of minor-upgrade-* tests, when the test was doing this after upgrade:

systemctl status mariadb.service --no-pager
systemctl status mariadb --no-pager
sudo systemctl enable mariadb
systemctl is-enabled mariadb
systemctl status mysql --no-pager

The systemctl enable command was there to work around MDEV-19432 (Systemd service does not get re-enabled after upgrade). Since the correct solution discovered in MDEV-19432 turned out to be relying on the previous enabled/disabled status rather than forcing any of them, the systemctl enable command has been moved to the first part of the test, to be executed after the initial installation of the server. It remains to be seen whether it will make the sporadic failure caused by the problem described here go away.

Comment by Elena Stepanova [ 2020-07-02 ]

The action described in the previous comment didn't help the test. Now it fails in a weirder way. After the clean installation of an "old" version is done and the service is started and enabled, the server continues working seemingly normally, but when upgrade is performed, it reports something like

(2/2) Installing: MariaDB-server-10.1.46-1.x86_64 [\]
(2/2) Installing: MariaDB-server-10.1.46-1.x86_64 [|]
(2/2) Installing: MariaDB-server-10.1.46-1.x86_64 [/]
(2/2) Installing: MariaDB-server-10.1.46-1.x86_64 [done]
There are some running programs that might use files deleted by recent upgrade. You may wish to check and restart some of them. Run 'zypper ps -s' to list these programs.

and the server isn't restarted after upgrade.

To avoid further problems, systemctl enable will not be run for SUSE in minor upgrade tests.

Comment by Sergei Golubchik [ 2021-11-02 ]

Is it still happening? Haven't seen it for a while

Generated at Thu Feb 08 09:19:25 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.