[MDEV-19170] Systemd not working the same as with 10.3.13 Created: 2019-04-04  Updated: 2019-05-03

Status: Open
Project: MariaDB Server
Component/s: Configuration
Affects Version/s: 10.3.14
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Jack Palmadesso Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: systemd
Environment:

CENTOS 7.6 Kernel: 3.10.0-957.10.1.el7.x86_64



 Description   

We use our own internal LDAP infrastructure to manage application users. The mysql user is in LDAP but is not a local user on any db server. In the previous version (10.3.13) I had this working in the mariadb.service unit file:

[Unit]
Description=MariaDB 10.3.13 database server
Documentation=man:mysqld(8)
Documentation=https://mariadb.com/kb/en/library/systemd/
*Requires=nslcd.service*
After=*nslcd.service* network.target

Items in bold above were added so that the mariadb service would not start until the ldap service (nslcd.service) was up and running. This worked until the other day when I upgraded a few machines to 10.3.14. Since then this no longer works. The mariadb.service tries to start and fails because it cannot find the mysql user. Workaround is to add a local user on the affected system.

I've compared the unit files from 10.3.13 and 10.3.14 and they are identical except for the Description field, which should not matter.



 Comments   
Comment by Elena Stepanova [ 2019-04-06 ]

serg,
Can it be related to cmake changes in 10.3.14?

Comment by Daniel Black [ 2019-04-09 ]

Only difference in service file is the version number:

[root@b3848d0bc8a4 /]# cp  /lib/systemd/system/mariadb.service  /tmp
[root@b3848d0bc8a4 /]# yum install MariaDB-server-10.3.14
Loaded plugins: fastestmirror, ovl
Loading mirror speeds from cached hostfile
 * base: mirror.aarnet.edu.au
 * extras: mirror.aarnet.edu.au
 * updates: mirror.aarnet.edu.au
Resolving Dependencies
--> Running transaction check
---> Package MariaDB-server.x86_64 0:10.3.13-1.el7.centos will be updated
---> Package MariaDB-server.x86_64 0:10.3.14-1.el7.centos will be an update
--> Finished Dependency Resolution
 
Dependencies Resolved
 
===================================================================================================================================================================
 Package                                  Arch                             Version                                         Repository                         Size
===================================================================================================================================================================
Updating:
 MariaDB-server                           x86_64                           10.3.14-1.el7.centos                            mariadb                            24 M
 
Transaction Summary
===================================================================================================================================================================
Upgrade  1 Package
 
Total download size: 24 M
Is this ok [y/d/N]: y
Downloading packages:
Delta RPMs disabled because /usr/bin/applydeltarpm not installed.
MariaDB-server-10.3.14-1.el7.centos.x86_64.rpm                                                                                              |  24 MB  00:00:49     
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Updating   : MariaDB-server-10.3.14-1.el7.centos.x86_64                                                                                                      1/2 
  Cleanup    : MariaDB-server-10.3.13-1.el7.centos.x86_64                                                                                                      2/2 
  Verifying  : MariaDB-server-10.3.14-1.el7.centos.x86_64                                                                                                      1/2 
  Verifying  : MariaDB-server-10.3.13-1.el7.centos.x86_64                                                                                                      2/2 
 
Updated:
  MariaDB-server.x86_64 0:10.3.14-1.el7.centos                                                                                                                     
 
Complete!
[root@b3848d0bc8a4 /]# diff /tmp/mariadb.service  /lib/systemd/system/mariadb.service      
16c16
< Description=MariaDB 10.3.13 database server
---
> Description=MariaDB 10.3.14 database server

To make a change to the systemd unit like the one you have without editing the packaged file (see https://mariadb.com/kb/en/library/systemd/), create /etc/systemd/system/mariadb.service.d/nslcd.conf:

[Unit]
Requires=nslcd.service
After=nslcd.service

And then run `systemctl daemon-reload`. The drop-in then takes effect without overriding the distro version of the unit file.
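The drop-in steps above could be scripted roughly as follows. This is only a sketch: the function name `write_nslcd_dropin` and its optional directory argument are made up for illustration (the argument exists so the function can be tried against a scratch directory first).

```shell
#!/bin/sh
# Hypothetical helper: writes the nslcd.conf drop-in shown above into DIR.
# DIR defaults to the real drop-in directory for mariadb.service.
write_nslcd_dropin() {
    dir=${1:-/etc/systemd/system/mariadb.service.d}
    mkdir -p "$dir" || return 1
    # Quoted heredoc delimiter: the content is written literally.
    cat > "$dir/nslcd.conf" <<'EOF'
[Unit]
Requires=nslcd.service
After=nslcd.service
EOF
}
```

As root, `write_nslcd_dropin && systemctl daemon-reload` would apply it, and `systemctl cat mariadb.service` should then list the drop-in below the packaged unit.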

For the failing case what does:

`systemd-analyze critical-chain mariadb.service` show?

Could it be nslcd.service claiming it is started to systemd but not responding to requests? What does `journalctl -u nslcd.service`/ `systemctl status nslcd.service` show?

and for comparison, what does `journalctl -u mariadb.service` / `systemctl status mariadb.service` show?

Comment by Sergei Golubchik [ 2019-04-09 ]

elenst, yes, it can

Comment by Jack Palmadesso [ 2019-04-09 ]

DEV-DCV-Database usorla7vd0115x (~)-92> systemd-analyze critical-chain mariadb.service
The time after the unit is active or started is printed after the "@" character.
The time the unit takes to start is printed after the "+" character.
RESULTS OF: systemd-analyze critical-chain mariadb.service
mariadb.service +1.141s
└─nslcd.service @19.757s +101ms
  └─firewall.service @15.447s +4.305s
    └─network-online.target @15.442s
      └─network.target @15.402s
        └─rdma.service @15.761s +1.303s
          └─system.slice
            └─-.slice

Journal entry after reboot:
-- Reboot --
Apr 03 15:00:05 usorla7vd0115x systemd[1]: Starting MariaDB 10.3.14 database server...
Apr 03 15:00:05 usorla7vd0115x systemd[1]: mariadb.service: control process exited, code=exited status=216
Apr 03 15:00:05 usorla7vd0115x systemd[1]: Failed to start MariaDB 10.3.14 database server.
Apr 03 15:00:05 usorla7vd0115x systemd[1]: Unit mariadb.service entered failed state.
Apr 03 15:00:05 usorla7vd0115x systemd[1]: mariadb.service failed.
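
For reference, status=216 in systemd's exit-code table is GROUP (failure to determine or change group credentials), which would be consistent with the mysql account not resolving at that moment. A quick way to check resolution by hand might look like the sketch below; `check_account` is a hypothetical helper name, and it assumes `id` and `getent` are available, as they are on CentOS 7.

```shell
#!/bin/sh
# Hypothetical helper: reports whether USER and GROUP resolve through NSS
# (local files, LDAP via nslcd, ...). GROUP defaults to the user name.
check_account() {
    user=$1
    group=${2:-$1}
    rc=0
    if ! id -u "$user" >/dev/null 2>&1; then
        echo "user '$user' does not resolve" >&2
        rc=1
    fi
    if ! getent group "$group" >/dev/null 2>&1; then
        echo "group '$group' does not resolve" >&2
        rc=1
    fi
    return "$rc"
}
```

Running `check_account mysql mysql` shortly after boot, and again once nslcd has been up for a while, would show whether the lookup only fails early on.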

Comment by Daniel Black [ 2019-04-10 ]

I thought the 101ms for nslcd.service looked quite quick.

docker run -ti centos/systemd bash
[root@51d3d56ea4b7 /]# yum update
[root@51d3d56ea4b7 /]# yum install nss-pam-ldapd
..
Installed:
  nss-pam-ldapd.x86_64 0:0.8.13-16.el7                                                                                                                             
..
[root@51d3d56ea4b7 /]#  more /usr/lib/systemd/system/nslcd.service
[Unit]
Description=Naming services LDAP client daemon.
After=syslog.target network.target named.service dirsrv.target slapd.service
Documentation=man:nslcd(8) man:nslcd.conf(5)
 
[Service]
Type=forking
PIDFile=/var/run/nslcd/nslcd.pid
ExecStart=/usr/sbin/nslcd
RestartSec=10s
Restart=on-failure
 
[Install]
WantedBy=multi-user.target

Looking at the upstream source for 0.8.13, the pid file is created a few lines after the process has daemonised, and the socket is only started a few lines of code after that.

The systemd documentation for Type=forking says 'The parent process is expected to exit when start-up is complete and all communication channels are set up'. With nslcd this isn't the case, so systemd could legitimately start the mariadb service before the socket is listening, creating the failure you have observed. Even later versions of nslcd create the socket after forking.

I suggest changing nslcd.service to Type=oneshot and appending an ExecStart= that runs some script that waits for the socket to listen. I'm not sure why this manifested during the 10.3.14 upgrade for you; however, it looks like the nslcd service file isn't correct for the behaviour nslcd exhibits.
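
A minimal sketch of such a wait script follows. Note that it swaps in a different check: rather than probing the nslcd socket directly, it polls until the account actually resolves, which is the condition mariadb.service depends on. The function name `wait_for_user` and the 30 second default timeout are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: wait_for_user USER [TIMEOUT_SECONDS]
# Polls once per second until USER resolves through NSS, or gives up
# after TIMEOUT_SECONDS (default 30) and returns non-zero.
wait_for_user() {
    user=$1
    timeout=${2:-30}
    elapsed=0
    while ! id -u "$user" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "timed out after ${timeout}s waiting for user '$user'" >&2
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

Saved as a script that ends with `wait_for_user mysql 30`, it could be referenced from the extra ExecStart= line; the script's location is left open here since none is given in this report.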

Comment by Jack Palmadesso [ 2019-04-10 ]

Thanks, working on it. For reference, here are the versions of the software involved. I've confirmed that with the previous version (10.3.13) it works as expected. The versions of systemd and nss-pam-ldapd are identical. The only difference is the version of the DB software.

Distribution .................. CentOS 7.6.1810
Kernel Version ................ 3.10.0-957.10.1.el7.x86_64
Kernel Architecture ........... x86_64

MariaDB-client-10.3.14-1.el7.centos.x86_64
MariaDB-server-10.3.14-1.el7.centos.x86_64
MariaDB-common-10.3.14-1.el7.centos.x86_64
MariaDB-compat-10.3.14-1.el7.centos.x86_64

nss-pam-ldapd-0.8.13-16.el7.x86_64

systemd-python-219-62.el7_6.5.x86_64
systemd-219-62.el7_6.5.x86_64
systemd-sysv-219-62.el7_6.5.x86_64
systemd-libs-219-62.el7_6.5.i686
systemd-libs-219-62.el7_6.5.x86_64

Comment by Sergei Golubchik [ 2019-04-10 ]

I don't think you or Daniel need to look into it in detail; it's quite probably a server bug in 10.3.14, which will be fixed in 10.3.15.

Comment by Sergei Golubchik [ 2019-05-03 ]

Now it looks like I was wrong and it's not a server bug.

Back to square one...

Generated at Thu Feb 08 08:49:35 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.