[MDEV-7694] MariaDB - mysqld cannot start when restarting from systemd Created: 2015-03-10  Updated: 2016-02-12  Resolved: 2015-07-25

Status: Closed
Project: MariaDB Server
Component/s: Scripts & Clients
Affects Version/s: 10.0.16
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Nirbhay Choubey (Inactive) Assignee: Sergei Golubchik
Resolution: Duplicate Votes: 0
Labels: SUSE, systemd
Environment:

SLES-12, x86_64


Attachments: File mysqld.log-20150307    
Issue Links:
PartOf
is part of MDEV-427 Provide a systemd script for MariaDB Closed

 Description   

MariaDB - mysqld cannot start when restarting from systemd

s390vsl048:~/tmpmariadb # systemctl restart mysql.service
s390vsl048:~/tmpmariadb # mysql -u ssluser -pssluser -D test --ssl-cipher=DHE-RSA-AES256-SHA:AES128-SHA --ssl-ca=/etc/mysql/certs/ca-cert.pem --ssl-cert=/etc/mysql/certs/client-cert.pem --ssl-key=/etc/mysql/certs/client-key.pem
ERROR 2013 (HY000): Lost connection to MySQL server at 'reading initial communication packet', system error: 104 "Connection reset by peer"

s390vsl048:~/tmpmariadb # systemctl status mysql.service
mysql.service - LSB: Start the MySQL database server
   Loaded: loaded (/etc/init.d/mysql)
   Active: active (exited) since Fri 2015-03-06 17:57:33 CET; 3min 21s ago
  Process: 6875 ExecStop=/etc/init.d/mysql stop (code=exited, status=0/SUCCESS)
  Process: 6942 ExecStart=/etc/init.d/mysql start (code=exited, status=0/SUCCESS)
 
Mar 06 17:57:33 s390vsl048 mysql[6942]: Starting service MySQL Starting service MySQL ..done

s390vsl048:~/tmpmariadb # systemctl restart mysql.service
Job for mysql.service failed. See "systemctl status mysql.service" and "journalctl -xn" for details.

s390vsl048:~/tmpmariadb # journalctl -xn
-- Logs begin at Fri 2015-03-06 17:46:11 CET, end at Fri 2015-03-06 18:01:30 CET. --
Mar 06 17:57:33 s390vsl048 mysql[6942]: Starting service MySQL Starting service MySQL ..done
Mar 06 18:00:01 s390vsl048 kernel: type=1006 audit(1425661201.794:45): pid=7221 uid=0 old auid=4294967295 new auid=0 old ses=4294967295 new ses=4 res=1
Mar 06 18:00:01 s390vsl048 cron[7221]: pam_unix(crond:session): session opened for user root by (uid=0)
Mar 06 18:00:01 s390vsl048 CRON[7221]: pam_unix(crond:session): session closed for user root
Mar 06 18:01:30 s390vsl048 systemd[1]: Stopping LSB: Start the MySQL database server...
-- Subject: Unit mysql.service has begun shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit mysql.service has begun shutting down.
Mar 06 18:01:30 s390vsl048 mysql[7289]: Shutting down service MySQL ..done
Mar 06 18:01:30 s390vsl048 mysql[7318]: Starting service MySQL Starting service MySQL chmod: cannot access '/var/run/mysql/mysqld.pid': No such file or directory
Mar 06 18:01:30 s390vsl048 systemd[1]: Failed to start LSB: Start the MySQL database server.
-- Subject: Unit mysql.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit mysql.service has failed.
-- 
-- The result is failed.
Mar 06 18:01:30 s390vsl048 systemd[1]: Unit mysql.service entered failed state.
Mar 06 18:01:30 s390vsl048 mysql[7318]: ..failed

== mysqld.log ==

150306 17:57:32 [Note] /usr/sbin/mysqld: Shutdown complete
 
150306 17:57:32 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
150306 17:57:32 [Note] InnoDB: Using mutexes to ref count buffer pool pages
150306 17:57:32 [Note] InnoDB: The InnoDB memory heap is disabled
150306 17:57:32 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
150306 17:57:32 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
150306 17:57:32 [Note] InnoDB: Compressed tables use zlib 1.2.8
150306 17:57:32 [Note] InnoDB: Using Linux native AIO
150306 17:57:32 [Note] InnoDB: Not using CPU crc32 instructions
150306 17:57:32 [Note] InnoDB: Initializing buffer pool, size = 128.0M
150306 17:57:32 [Note] InnoDB: Completed initialization of buffer pool
150306 17:57:32 [Note] InnoDB: Highest supported file format is Barracuda.
150306 17:57:33 [Note] InnoDB: 128 rollback segment(s) are active.
150306 17:57:33 [Note] InnoDB: Waiting for purge to start
150306 17:57:33 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.22-71.0 started; log sequence number 1616797
150306 17:57:33 [Note] Server socket created on IP: '::'.
150306 17:57:33 [Note] Event Scheduler: Loaded 0 events
150306 17:57:33 [Note] /usr/sbin/mysqld: ready for connections.
Version: '10.0.16-MariaDB'  socket: '/var/run/mysql/mysql.sock'  port: 3306  openSUSE package
150306 17:57:33 mysqld_safe Number of processes running now: 1
150306 17:57:33 mysqld_safe mysqld process hanging, pid 7111 - killed
150306 17:57:33 mysqld_safe mysqld restarted
150306 17:57:33 [ERROR] mysqld: Can't lock aria control file '/var/lib/mysql/aria_log_control' for exclusive use, error: 11. Will retry for 30 seconds
150306 17:57:34 mysqld_safe mysqld from pid file /var/run/mysql/mysqld.pid ended
150306 17:57:35 [Note] InnoDB: Using mutexes to ref count buffer pool pages
150306 17:57:35 [Note] InnoDB: The InnoDB memory heap is disabled
150306 17:57:35 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
150306 17:57:35 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
150306 17:57:35 [Note] InnoDB: Compressed tables use zlib 1.2.8
150306 17:57:35 [Note] InnoDB: Using Linux native AIO
150306 17:57:35 [Note] InnoDB: Not using CPU crc32 instructions
150306 17:57:35 [ERROR] mysqld: Can't create/write to file '/var/tmp/mysql.DjJntU/ibuao7B9' (Errcode: 2 "No such file or directory")
2015-03-06 17:57:35 3fffd051730  InnoDB: Error: unable to create temporary file; errno: 2
150306 17:57:35 [ERROR] Plugin 'InnoDB' init function returned error.
150306 17:57:35 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
150306 17:57:35 [ERROR] Unknown/unsupported storage engine: InnoDB
150306 17:57:35 [ERROR] Aborting
 
150306 17:57:35 [Note] /usr/sbin/mysqld: Shutdown complete

Workaround: - use stop and start
================================

s390vsl048:~/tmpmariadb # systemctl stop mysql.service
s390vsl048:~/tmpmariadb # systemctl status mysql.service
mysql.service - LSB: Start the MySQL database server
   Loaded: loaded (/etc/init.d/mysql)
   Active: failed (Result: exit-code) since Fri 2015-03-06 18:01:30 CET; 54s ago
  Process: 7289 ExecStop=/etc/init.d/mysql stop (code=exited, status=0/SUCCESS)
  Process: 7318 ExecStart=/etc/init.d/mysql start (code=exited, status=1/FAILURE)
 
Mar 06 18:01:30 s390vsl048 mysql[7318]: Starting service MySQL Starting service MySQL chmod: cannot access '/var/run/mysql/mysqld.pid': No such file or directory
Mar 06 18:01:30 s390vsl048 systemd[1]: Failed to start LSB: Start the MySQL database server.
Mar 06 18:01:30 s390vsl048 systemd[1]: Unit mysql.service entered failed state.
Mar 06 18:01:30 s390vsl048 mysql[7318]: ..failed

s390vsl048:~/tmpmariadb # systemctl start mysql.service
s390vsl048:~/tmpmariadb # systemctl status mysql.service
mysql.service - LSB: Start the MySQL database server
   Loaded: loaded (/etc/init.d/mysql)
   Active: active (running) since Fri 2015-03-06 18:02:29 CET; 1s ago
  Process: 7289 ExecStop=/etc/init.d/mysql stop (code=exited, status=0/SUCCESS)
  Process: 7542 ExecStart=/etc/init.d/mysql start (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/mysql.service
           |-7366 /bin/sh /usr/bin/mysqld_safe --mysqld=mysqld --user=mysql --pid-file=/var/run/mysql/mysqld.pid --socket=/var/run/mysql/mysql.sock --datadir=/var/lib/mysql...
           `-7487 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --user=mysql --log-error=/var/log/mysql/mysqld.log --pid-fil...
 
Mar 06 18:02:29 s390vsl048 mysql[7542]: ..done
s390vsl048:~/tmpmariadb # mysql -u ssluser -pssluser -D test --ssl-cipher=DHE-RSA-AES256-SHA:AES128-SHA --ssl-ca=/etc/mysql/certs/ca-cert.pem --ssl-cert=/etc/mysql/certs/client-cert.pem --ssl-key=/etc/mysql/certs/client-key.pem
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 2
Server version: 10.0.16-MariaDB openSUSE package

Note:
Cannot reproduce on every system but looks like on s390 is 100 reproducible.

https://bugzilla.suse.com/show_bug.cgi?id=921115



 Comments   
Comment by Daniel Black [ 2015-03-11 ]

https://github.com/MariaDB/server/pull/26 changes the service definition to avoid pid files

probably needs galera related socket activation for the galera ports. I'm tempted to wait until the first part is merged.

Comment by Elena Stepanova [ 2015-03-13 ]

The error log related to the restart failure is missing (the quoted fragment precedes it, timestamps don't much).
So, there is no proof that it's just the pid file problem. If there was something else, like aria_log_control or tmpdir as in the error log above, then probably the change to service definition won't help.

Comment by Elena Stepanova [ 2015-03-16 ]

nirbhay_c,
Do you think we could get matching logs?

Comment by Nirbhay Choubey (Inactive) [ 2015-03-16 ]

elenst You mean the entire log? with complete sequence?

Comment by Elena Stepanova [ 2015-03-17 ]

nirbhay_c,
I meant the part of the error log between 17:57:35 and 18:02:29; specifically the fragment that corresponds this activity:

Mar 06 18:01:30 s390vsl048 mysql[7318]: Starting service MySQL Starting service MySQL chmod: cannot access '/var/run/mysql/mysqld.pid': No such file or directory

Comment by Viktor Kijasev [ 2015-03-23 ]

Full log

Comment by Elena Stepanova [ 2015-04-09 ]

So, the log confirms that mysqld was actually started all right, it's the script that fails to recognize it and produces the error.
Most likely danblack is right and the problem is around pid files, apparently a race condition of some sort.
I can imitate a somewhat similar (yet not completely the same) issue by tampering with mysqld_safe where it checks for the pid file.
I will try to see if I can get closer to the initial problem.

Is it still happening? If it is, can you run ps -ef | grep mysqld after the failed attempt to restart the service, and do it a couple times with a few seconds interval?
The goal will be to see if mysqld_safe is running, and how many instances of it are running.

Thanks.

Comment by Viktor Kijasev [ 2015-04-15 ]

Hi Elena,
(sorry for late answer)
Yes it is still happening:

s390vsl020:~/tmpmariadb # systemctl restart mysql.service 
Job for mysql.service failed. See "systemctl status mysql.service" and "journalctl -xn" for details.
s390vsl020:~/tmpmariadb # systemctl status mysql.service
mysql.service - LSB: Start the MySQL database server
   Loaded: loaded (/etc/init.d/mysql)
   Active: failed (Result: exit-code) since Wed 2015-04-15 11:39:41 CEST; 9s ago
  Process: 32954 ExecStop=/etc/init.d/mysql stop (code=exited, status=0/SUCCESS)
  Process: 32981 ExecStart=/etc/init.d/mysql start (code=exited, status=1/FAILURE)
 
Apr 15 11:39:41 s390vsl020 mysql[32981]: Starting service MySQL Starting service MySQL chmod: cannot access '/var/run/mysql/mysqld.pid': No such file or directory
Apr 15 11:39:41 s390vsl020 systemd[1]: Unit mysql.service entered failed state.
Apr 15 11:39:41 s390vsl020 mysql[32981]: ..failed
s390vsl020:~/tmpmariadb # journalctl -xn
-- Logs begin at Thu 2015-04-09 16:18:57 CEST, end at Wed 2015-04-15 11:39:41 CEST. --
Apr 15 11:30:01 s390vsl020 CRON[32475]: pam_unix(crond:session): session closed for user root
Apr 15 11:38:49 s390vsl020 systemd[1]: Configuration file /usr/lib/systemd/system/wdmd.service is marked executable. Please remove executable permission bits. Proceeding anyway.
Apr 15 11:38:56 s390vsl020 systemd[1]: Stopping LSB: Start the MySQL database server...
-- Subject: Unit mysql.service has begun shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit mysql.service has begun shutting down.
Apr 15 11:38:58 s390vsl020 mysql[32677]: Shutting down service MySQL ..done
Apr 15 11:38:58 s390vsl020 mysql[32727]: Starting service MySQL Starting service MySQL ..done
Apr 15 11:39:40 s390vsl020 systemd[1]: Stopping LSB: Start the MySQL database server...
-- Subject: Unit mysql.service has begun shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit mysql.service has begun shutting down.
Apr 15 11:39:40 s390vsl020 mysql[32954]: Shutting down service MySQL ..done
Apr 15 11:39:41 s390vsl020 mysql[32981]: Starting service MySQL Starting service MySQL chmod: cannot access '/var/run/mysql/mysqld.pid': No such file or directory
Apr 15 11:39:41 s390vsl020 systemd[1]: Unit mysql.service entered failed state.
Apr 15 11:39:41 s390vsl020 mysql[32981]: ..failed
s390vsl020:~/tmpmariadb # ps -ef | grep mysqld
root     33031     1  0 11:39 ?        00:00:00 /bin/sh /usr/bin/mysqld_safe --mysqld=mysqld --user=mysql --pid-file=/var/run/mysql/mysqld.pid --socket=/var/run/mysql/mysql.sock --datadir=/var/lib/mysql
mysql    33140 33031  0 11:39 ?        00:00:00 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --user=mysql --log-error=/var/log/mysql/mysqld.log --pid-file=/var/run/mysql/mysqld.pid --socket=/var/run/mysql/mysql.sock
root     33193 29773  0 11:41 pts/0    00:00:00 grep --color=auto mysqld
s390vsl020:~/tmpmariadb # ps -ef | grep mysqld
root     33031     1  0 11:39 ?        00:00:00 /bin/sh /usr/bin/mysqld_safe --mysqld=mysqld --user=mysql --pid-file=/var/run/mysql/mysqld.pid --socket=/var/run/mysql/mysql.sock --datadir=/var/lib/mysql
mysql    33140 33031  0 11:39 ?        00:00:00 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --user=mysql --log-error=/var/log/mysql/mysqld.log --pid-file=/var/run/mysql/mysqld.pid --socket=/var/run/mysql/mysql.sock
root     33211 29773  0 11:42 pts/0    00:00:00 grep --color=auto mysqld
s390vsl020:~/tmpmariadb # ps -ef | grep mysqld
root     33031     1  0 11:39 ?        00:00:00 /bin/sh /usr/bin/mysqld_safe --mysqld=mysqld --user=mysql --pid-file=/var/run/mysql/mysqld.pid --socket=/var/run/mysql/mysql.sock --datadir=/var/lib/mysql
mysql    33140 33031  0 11:39 ?        00:00:00 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --user=mysql --log-error=/var/log/mysql/mysqld.log --pid-file=/var/run/mysql/mysqld.pid --socket=/var/run/mysql/mysql.sock
root     33215 29773  0 11:42 pts/0    00:00:00 grep --color=auto mysqld
s390vsl020:~/tmpmariadb # ps -ef | grep mysqld
root     33031     1  0 11:39 ?        00:00:00 /bin/sh /usr/bin/mysqld_safe --mysqld=mysqld --user=mysql --pid-file=/var/run/mysql/mysqld.pid --socket=/var/run/mysql/mysql.sock --datadir=/var/lib/mysql
mysql    33140 33031  0 11:39 ?        00:00:00 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --user=mysql --log-error=/var/log/mysql/mysqld.log --pid-file=/var/run/mysql/mysqld.pid --socket=/var/run/mysql/mysql.sock
root     33219 29773  0 11:42 pts/0    00:00:00 grep --color=auto mysqld
s390vsl020:~/tmpmariadb # ps -ef | grep mysqld
root     33031     1  0 11:39 ?        00:00:00 /bin/sh /usr/bin/mysqld_safe --mysqld=mysqld --user=mysql --pid-file=/var/run/mysql/mysqld.pid --socket=/var/run/mysql/mysql.sock --datadir=/var/lib/mysql
mysql    33140 33031  0 11:39 ?        00:00:00 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --user=mysql --log-error=/var/log/mysql/mysqld.log --pid-file=/var/run/mysql/mysqld.pid --socket=/var/run/mysql/mysql.sock
root     33222 29773  0 11:42 pts/0    00:00:00 grep --color=auto mysqld
s390vsl020:~/tmpmariadb # ps -ef | grep mysqld
root     33031     1  0 11:39 ?        00:00:00 /bin/sh /usr/bin/mysqld_safe --mysqld=mysqld --user=mysql --pid-file=/var/run/mysql/mysqld.pid --socket=/var/run/mysql/mysql.sock --datadir=/var/lib/mysql
mysql    33140 33031  0 11:39 ?        00:00:00 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --user=mysql --log-error=/var/log/mysql/mysqld.log --pid-file=/var/run/mysql/mysqld.pid --socket=/var/run/mysql/mysql.sock
root     33226 29773  0 11:43 pts/0    00:00:00 grep --color=auto mysqld
s390vsl020:~/tmpmariadb # date;ps -ef | grep mysqld
Wed Apr 15 11:43:22 CEST 2015
root     33031     1  0 11:39 ?        00:00:00 /bin/sh /usr/bin/mysqld_safe --mysqld=mysqld --user=mysql --pid-file=/var/run/mysql/mysqld.pid --socket=/var/run/mysql/mysql.sock --datadir=/var/lib/mysql
mysql    33140 33031  0 11:39 ?        00:00:00 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --user=mysql --log-error=/var/log/mysql/mysqld.log --pid-file=/var/run/mysql/mysqld.pid --socket=/var/run/mysql/mysql.sock
root     33234 29773  0 11:43 pts/0    00:00:00 grep --color=auto mysqld

Comment by Viktor Kijasev [ 2015-04-15 ]

Also I was suspicious before, it can be related to setting up ssl connection - like creating new ssl user, generating certificates, opening session as ssl user etc.
After that I saw this problem.
And last time after the reboot of machine I could not reproduce it again.
(however I may be wrong)

Comment by Elena Stepanova [ 2015-04-16 ]

So, there is nothing wrong with the actual server startup, mysqld_safe starts once, survives, and starts mysqld which also survives. The whole problem is simply the service script not being able to chmod on the pid file, apparently because the pid file doesn't get created in time.

I don't know if it can be related to SSL – I suppose hypothetically it's possible if some SSL-related activity happens upon startup before creating the pid file, thus delaying it a little bit. I didn't check it though.

Daniel's changes look rather invasive, I don't know if they'll make it to the 10.0 which is well post-GA.

On the other hand, this particular problem can probably have a simple workaround. Currently chmod on the pid file is the very last thing the service script does, then it exits with the status of the operation. While with the socket file there is a wait, and even after the wait its absence is not considered an error, with the pid file it's much simpler – a single attempt to chmod and then exit. I suppose adding at least a timed wait on the pid file should highly reduce the probability of the failure, if not eliminate it completely.

Comment by Daniel Black [ 2015-04-17 ]

Daniel's changes look rather invasive

Predominately in the socket activation mode of systemd.

Removing 'Requires=%p.socket' from the mariadb.service would make this run in the Type=Notify mode which also bypasses the pid file/mysqld_safe issues and all of the socket activation code (https://github.com/MariaDB/server/pull/26/files#diff-e7d51f293adc9257284fbdec858056dfR6149 because sd_listen_fds() return 0). The storage engine notify commands could also be removed from 10.0 if this is too much.

some SSL-related activity happens upon startup before creating the pid file, thus delaying it a little bit. I didn't check it though.

init_ssl is called in mysqd just before the bit that creates the pid. While it initializes a few variables and reads the ssl server side key/certs it doesn't seem to be that intensive.

The whole problem is simply the service script not being able to chmod on the pid file, apparently because the pid file doesn't get created in time

Looks like it too me too. And why is it doing a chmod on a pid file anyway? No idea.

Comment by Sergei Golubchik [ 2015-07-25 ]

As far as I understand, it's not a MariaDB bug, both mysqld and mysqld_safe work correctly. This is an issue with the systemd service script, but we do not have systemd service script (yet). You're having a problem with a third-party systemd service script, we cannot fix it.

But there is MDEV-427 which is about a native MariaDB systemd support. When it will be released, you will be able to use our systemd service script, which will not have this issue.

Generated at Thu Feb 08 07:21:34 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.