[MDEV-11949] Debian upgrade fails, stops the database every time i run apt-get install. massive downtimes on server. Created: 2017-01-31  Updated: 2017-04-29  Resolved: 2017-04-29

Status: Closed
Project: MariaDB Server
Component/s: Platform Debian
Affects Version/s: 10.1.21
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: robert andersson Assignee: Sergey Vojtovich
Resolution: Not a Bug Votes: 0
Labels: None
Environment:

Linux gimli 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux



 Description   

TLDR: The debian package install script is causing me hours of downtime.

The problem started when we upgraded from 10.1.20 to 10.1.21

After 2 hours the upgrade seems to have failed with this message:

Setting up mariadb-server-10.1 (10.1.21+maria-1jessie) ...
Job for mariadb.service failed. See 'systemctl status mariadb.service' and 'journalctl -xn' for details.
invoke-rc.d: initscript mysql, action "start" failed.
dpkg: error processing package mariadb-server-10.1 (--configure):
subprocess installed post-installation script returned error exit status 1
dpkg: dependency problems prevent configuration of mariadb-server:
mariadb-server depends on mariadb-server-10.1 (= 10.1.21+maria-1jessie); however:
Package mariadb-server-10.1 is not configured yet.
 
dpkg: error processing package mariadb-server (--configure):
dependency problems - leaving unconfigured

After that error i was able to start mariadb and it worked fine..

Fast forward to today ..
i run apt-get upgrade and apt-get decides to "set up" mariadb, (probably because it was left "unconfigured" after yesterdays upgrade..

Setting up mariadb-server-10.1 (10.1.21+maria-1~jessie) ...

The script then proceeds to stop my mariadb,, stopping my mariadb takes 1 hour roughly,, this is 1 hour of unscheduled downtime...

here's some logs from today havoc..

017-01-31 11:37:17 139690684753856 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
2017-01-31 11:37:18 139690684753856 [ERROR] InnoDB: Unable to lock ./ibdata1, error: 11
2017-01-31 11:37:18 139690684753856 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
2017-01-31 11:37:19 139690684753856 [ERROR] InnoDB: Unable to lock ./ibdata1, error: 11
2017-01-31 11:37:19 139690684753856 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
2017-01-31 11:37:20 139690684753856 [ERROR] InnoDB: Unable to lock ./ibdata1, error: 11
2017-01-31 11:37:20 139690684753856 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
2017-01-31 11:37:21 139690684753856 [ERROR] InnoDB: Unable to lock ./ibdata1, error: 11
2017-01-31 11:37:21 139690684753856 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
2017-01-31 11:37:22 139690684753856 [ERROR] InnoDB: Unable to lock ./ibdata1, error: 11
2017-01-31 11:37:22 139690684753856 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
2017-01-31 11:37:23 139690684753856 [ERROR] InnoDB: Unable to lock ./ibdata1, error: 11
2017-01-31 11:37:23 139690684753856 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
2017-01-31 11:37:23 139813703478016 [Note] InnoDB: Waiting for page_cleaner to finish flushing of buffer pool
2017-01-31 11:37:24 139690684753856 [ERROR] InnoDB: Unable to lock ./ibdata1, error: 11
2017-01-31 11:37:24 139690684753856 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
2017-01-31 11:37:25 139690684753856 [ERROR] InnoDB: Unable to lock ./ibdata1, error: 11
2017-01-31 11:37:25 139690684753856 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
2017-01-31 11:37:26 139690684753856 [ERROR] InnoDB: Unable to lock ./ibdata1, error: 11
2017-01-31 11:37:26 139690684753856 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
2017-01-31 11:37:27 139690684753856 [ERROR] InnoDB: Unable to lock ./ibdata1, error: 11
2017-01-31 11:37:27 139690684753856 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
2017-01-31 11:37:28 139690684753856 [ERROR] InnoDB: Unable to lock ./ibdata1, error: 11
2017-01-31 11:37:28 139690684753856 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
2017-01-31 11:37:28 139690684753856 [Note] InnoDB: Unable to open the first data file
2017-01-31 11:37:28 7f0c45a4a7c0  InnoDB: Operating system error number 11 in a file operation.
InnoDB: Error number 11 means 'Resource temporarily unavailable'.
InnoDB: Some operating system error numbers are described at
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/operating-system-error-codes.html
2017-01-31 11:37:28 139690684753856 [ERROR] InnoDB: Can't open './ibdata1'
2017-01-31 11:37:28 139690684753856 [ERROR] InnoDB: Could not open or create the system tablespace. If you tried to add new data files to the system tablespace, and it failed here, you should now edit innodb_data_file_path in my.cnf back to what it was, and remove the new ibdata files InnoDB created in this failed attempt. InnoDB only wrote those files full of zeros, but did not yet use them in any way. But be careful: do not remove old data files which contain your precious data!
2017-01-31 11:37:28 139690684753856 [ERROR] Plugin 'InnoDB' init function returned error.
2017-01-31 11:37:28 139690684753856 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2017-01-31 11:37:28 139690684753856 [ERROR] Unknown/unsupported storage engine: InnoDB
2017-01-31 11:37:28 139690684753856 [ERROR] Aborting
 
2017-01-31 11:38:24 139813703478016 [Note] InnoDB: Waiting for page_cleaner to finish flushing of buffer pool
2017-01-31 11:39:24 139813703478016 [Note] InnoDB: Waiting for page_cleaner to finish flushing of buffer pool
2017-01-31 11:40:24 139813703478016 [Note] InnoDB: Waiting for page_cleaner to finish flushing of buffer pool
2017-01-31 11:41:24 139813703478016 [Note] InnoDB: Waiting for page_cleaner to finish flushing of buffer pool
2017-01-31 11:42:24 139813703478016 [Note] InnoDB: Waiting for page_cleaner to finish flushing of buffer pool
2017-01-31 11:43:25 139813703478016 [Note] InnoDB: Waiting for page_cleaner to finish flushing of buffer pool
2017-01-31 11:44:25 139813703478016 [Note] InnoDB: Waiting for page_cleaner to finish flushing of buffer pool
2017-01-31 11:45:25 139813703478016 [Note] InnoDB: Waiting for page_cleaner to finish flushing of buffer pool
2017-01-31 11:46:25 139813703478016 [Note] InnoDB: Waiting for page_cleaner to finish flushing of buffer pool
2017-01-31 11:47:25 139813703478016 [Note] InnoDB: Waiting for page_cleaner to finish flushing of buffer pool

notice how it's shutting down "Waiting for page_cleaner to finish flushing of buffer pool" .. yet debian tries to start new instances at the same time.. because the script is too stupid to realize it's in the process of shutting down ....

PLEASE make the install script understand that some databases takes hours to shut down and prompt for shutdown before just doing it!!

You can read about yesterdays upgrade havoc in the comments here https://github.com/MariaDB/server/commit/ab1e6fefd869242d962cb91a006f37bb9ad534a7



 Comments   
Comment by robert andersson [ 2017-01-31 ]

This time the setup succeeded and now everything is fine.

I still think that there should be a prompt before shutting down, starting.
I still think that there's a bug where more instances are started during shutdown. (this might be a systemd bug i dont know)

The reason the upgrade failed was because my systemd service file was overwritten by the installer, and i have set ProtectHome=false
because we have the data dir in /home/mariadb ... this is probably our fault

so after the upgrade failed i changed the system mysql.service file back to have ProtectHome=false
and then i started the Db all good..

and today when i ran apt-get upgrade it redid the entire upgrade procedure (2 hour downtime again..) and this time it suceeded because it managed to start the db after the upgrade because this time it did not replace the mysql.service file.

Comment by Otto Kekäläinen [ 2017-02-02 ]

This does not solve your issue, but keep in mind that you cannot edit system files that are replace on every update. If you want to customize a systemd service, either add files to /etc/systemd/system/mariadb.service.d/ or disable the mariadb.service completely and write your own my-db.service and start/stop the server using that.

Also, if you have a very big and critical system, maybe you should consider getting a support contract to have a dedicated DB admn who takes care of this to avoid downtime? https://mariadb.org/about/service-providers/

Comment by robert andersson [ 2017-02-03 ]

@otto thanks, i will look into making the systemd script better.
As for a dedicated db admin and downtime. I dont see how a dedicated db admin would have made the first upgrade have less downtime... I admit the second downtime could had been avoided by a more competent admin. (Systemd is still rather new to me)

Comment by OndÅ™ej Surý (Inactive) [ 2017-04-29 ]

robertz The correct way how to change the systemd units is to use overrides. systemctl edit <unit> will create a one for you. Also as usual, the ArchLinux wiki has a more comprehensible documentation: https://wiki.archlinux.org/index.php/systemd#Editing_provided_units

I believe there's no bug on the MariaDB packages side, as direct editing of systemd units were never supported.

Generated at Thu Feb 08 07:53:55 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.