[MDEV-23050] MariaDB Server failure on Proxmox (with Ubuntu 20.04 and Systemd) Created: 2020-06-30  Updated: 2021-01-28  Resolved: 2020-09-23

Status: Closed
Project: MariaDB Server
Component/s: N/A
Affects Version/s: 10.3.22, 10.5.4
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Florent Hazard Assignee: Daniel Black
Resolution: Not a Bug Votes: 1
Labels: server, systemd
Environment:

Ubuntu 20.04 Focal on Proxmox 5.4


Attachments: Text File MariaDB-Error-syslog.txt     Text File MariaDB-Error.txt     Text File proxmox-5.4.txt     Text File proxmox-mdev-23050-install.txt    
Issue Links:
Relates
relates to MDEV-23321 debian upgrade shouldn't start previo... Open

 Description   

Hello,

Just after installing MariaDB server & client, I am getting the following errors:

Setting up libdbd-mariadb-perl (1.11-3ubuntu2) ...
Setting up mariadb-client-core-10.5 (1:10.5.4+maria~focal) ...
Setting up mariadb-client-10.5 (1:10.5.4+maria~focal) ...
Setting up mariadb-client (1:10.5.4+maria~focal) ...
Setting up mariadb-server-10.5 (1:10.5.4+maria~focal) ...
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
Created symlink /etc/systemd/system/multi-user.target.wants/mariadb.service → /lib/systemd/system/mariadb.service.
Job for mariadb.service failed because of unavailable resources or another system error.
See "systemctl status mariadb.service" and "journalctl -xe" for details.
Setting up mariadb-server (1:10.5.4+maria~focal) ...
Processing triggers for systemd (245.4-4ubuntu3.1) ...
Processing triggers for man-db (2.9.1-1) ...
Processing triggers for libc-bin (2.31-0ubuntu9) ...

systemctl status mariadb.service
Failed to dump process list for 'mariadb.service', ignoring: Input/output error
● mariadb.service - MariaDB 10.5.4 database server
Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/mariadb.service.d
└─migrated-from-my.cnf-settings.conf
Active: failed (Result: resources)
Docs: man:mariadbd(8)
https://mariadb.com/kb/en/library/systemd/
CGroup: /system.slice/mariadb.service

Jun 30 06:43:57 web3 systemd[1]: mariadb.service: Will not start SendSIGKILL=no service of type KillMode=control-group or mixed while processes exist
Jun 30 06:43:57 web3 systemd[1]: mariadb.service: Failed to run 'start-pre' task: Device or resource busy
Jun 30 06:43:57 web3 systemd[1]: mariadb.service: Failed with result 'resources'.
Jun 30 06:43:57 web3 systemd[1]: Failed to start MariaDB 10.5.4 database server.

Service does not want to start. The issue seems related to Systemd.
I am able to run mariadb server manually using

cd '/usr' ; /usr/bin/mysqld_safe --datadir='/var/lib/mysql'

I tried with a brand new container on my proxmox server:

pct create 131 local:vztmpl/ubuntu-20.04-standard_20.04-1_amd64.tar.gz -hostname "web3.domain.com" -arch amd64 -cores 2 -memory 512 -swap 512 -net0 bridge=vmbr1,name=eth0,ip=10.0.1.1/24,gw=10.0.1.254 -onboot 1 -ostype ubuntu -rootfs local:30
 
pct start 131
pct enter 131
 
apt-get update
apt-get full-upgrade
 
apt-get -y install traceroute aptitude byobu software-properties-common nano curl
 
curl -LsS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup | bash
apt-get -y install mariadb-server mariadb-client

AskUbuntu: https://askubuntu.com/questions/1254782/unable-to-install-working-mariadb



 Comments   
Comment by Daniel Black [ 2020-06-30 ]

The error is because an existing process exists in the mariadb.service cgroup at the point at which the attempted start happens.

If systemd did start the service in that situation, and the existing process was a mariadb process, the new instance would be running on the same data directory and port. This is why the service is configured this way.

The `journalctl -n 50 -u mariadb.service` output will probably contain a warning indicating which process it was. Can you include this?

Can you try:

export DEBIAN_SCRIPT_DEBUG=1
apt-get install ....

Seeing what happens in the packaging scripts https://github.com/MariaDB/server/blob/10.5/debian/mariadb-server-10.5.preinst#L36 would be useful.

note: yes, I'm to blame for this error https://github.com/systemd/systemd/pull/11457

Comment by Florent Hazard [ 2020-06-30 ]

Here is the complete debug trace for maria db install.

I think there is no other mariadb process...

root@web3:~# ps aux | grep mysql
root 13639 0.0 0.1 3308 668 ? S+ 09:37 0:00 grep --color=auto mysql
root@web3:~# ps aux | grep mariadb
root 13642 0.0 0.1 3308 664 ? S+ 09:37 0:00 grep --color=auto mariadb

root@web3:~# netstat -antu
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 127.0.0.53:53 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN
tcp 0 0 10.0.1.1:33066 91.189.88.XX:80 TIME_WAIT
tcp6 0 0 :::22 :::* LISTEN
tcp6 0 0 ::1:25 :::* LISTEN
udp 0 0 127.0.0.53:53 0.0.0.0:*

(Ip addresses are anonymized.)
MariaDB-Error.txt

Comment by Otto Kekäläinen [ 2020-06-30 ]

Please paste the output of `journalctl -n 50 -u mariadb.service` or /var/log/journal/mariadb....log or /var/log/syslog. There should be more error messages somewhere.

Comment by Florent Hazard [ 2020-06-30 ]

journalctl -n 50 -u mariadb.service

Jun 30 09:31:56 web3 systemd[1]: mariadb.service: Will not start SendSIGKILL=no service of type KillMode=control-group or mixed while processes exist
Jun 30 09:31:56 web3 systemd[1]: mariadb.service: Failed to run 'start-pre' task: Device or resource busy
Jun 30 09:31:56 web3 systemd[1]: mariadb.service: Failed with result 'resources'.
Jun 30 09:31:56 web3 systemd[1]: Failed to start MariaDB 10.5.4 database server.

ll /var/log/mysql/

=> Empty

ll /var/log/journal/maria*

=> No such file or directory

less +G /var/log/syslog

MariaDB-Error-syslog.txt

Comment by Otto Kekäläinen [ 2020-06-30 ]

syslog has:

Jun 30 12:53:27 web3 systemd[1]: Reloading.
Jun 30 12:53:27 web3 systemd[1]: /lib/systemd/system/dbus.socket:5: ListenStream= references a path below legacy directory /var/run/, updating /var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please update the unit file accordingly.
Jun 30 12:53:28 web3 systemd[1]: Reloading.
Jun 30 12:53:28 web3 systemd[1]: /lib/systemd/system/dbus.socket:5: ListenStream= references a path below legacy directory /var/run/, updating /var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please update the unit file accordingly.
Jun 30 12:53:28 web3 systemd[1]: Reloading.
Jun 30 12:53:28 web3 systemd[1]: /lib/systemd/system/dbus.socket:5: ListenStream= references a path below legacy directory /var/run/, updating /var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please update the unit file accordingly.
Jun 30 12:53:28 web3 systemd[1]: mariadb.service: Will not start SendSIGKILL=no service of type KillMode=control-group or mixed while processes exist
Jun 30 12:53:28 web3 systemd[1]: mariadb.service: Failed to run 'start-pre' task: Device or resource busy
Jun 30 12:53:28 web3 systemd[1]: mariadb.service: Failed with result 'resources'.
Jun 30 12:53:28 web3 systemd[1]: Failed to start MariaDB 10.5.4 database server.
Jun 30 12:53:29 web3 systemd[1]: Reloading.
Jun 30 12:53:29 web3 systemd[1]: /lib/systemd/system/dbus.socket:5: ListenStream= references a path below legacy directory /var/run/, updating /var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please update the unit file accordingly.

Is this something danblack recognizes?

Comment by Daniel Black [ 2020-07-01 ]

The dbus messages seem unrelated; overall, it seems to be the same message as on the console.

I was hoping to see messages of the form:
https://github.com/systemd/systemd/blob/master/src/core/unit.c#L5888 "Found left-over process...." in the journal for the service. `journalctl --priority=debug -u mariadb.service -n 100` to get all messages, including the warning, as a guess.

notably from the trace provided:

+ stop_server
+ pgrep -x --ns 13249 mariadbd
+ return

So no existing mariadbd is found. If it was there as mysqld (waiting on this fix https://github.com/MariaDB/server/pull/1600/files#diff-6f983df6d1ad84e9a9cf71cd78af39caR22) it seems odd that it went away before you looked at `ps aux`.

Note the pgrep is also restricted with --ns to the post install script namespace. This is currently flawed as the cgroup in which mariadbd is started is a different namespace (< otto note):

dan@fstn4-p1:~$ systemctl status mariadb.service
● mariadb.service - MariaDB 10.5.5 database server
   Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/mariadb.service.d
           └─migrated-from-my.cnf-settings.conf
   Active: active (running) since Tue 2020-06-30 13:54:43 AEST; 22h ago
     Docs: man:mariadbd(8)
           https://mariadb.com/kb/en/library/systemd/
  Process: 19491 ExecStartPost=/etc/mysql/debian-start (code=exited, status=0/SUCCESS)
  Process: 19489 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
  Process: 19465 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= ||   VAR=`cd /usr/bin/..; /usr/bin/galera_recovery`; [ $? -eq
  Process: 19463 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
  Process: 19462 ExecStartPre=/usr/bin/install -m 755 -o mysql -g root -d /var/run/mysqld (code=exited, status=0/SUCCESS)
 Main PID: 19474 (mariadbd)
   Status: "Taking your SQL requests now..."
   CGroup: /system.slice/mariadb.service
           └─19474 /usr/sbin/mariadbd
 
Jun 30 13:54:43 fstn4-p1 mariadbd[19474]: 2020-06-30 13:54:43 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
Jun 30 13:54:43 fstn4-p1 mariadbd[19474]: 2020-06-30 13:54:43 0 [Note] Plugin 'FEEDBACK' is disabled.
Jun 30 13:54:43 fstn4-p1 mariadbd[19474]: 2020-06-30 13:54:43 0 [Note] InnoDB: Buffer pool(s) load completed at 200630 13:54:43
Jun 30 13:54:43 fstn4-p1 mariadbd[19474]: 2020-06-30 13:54:43 0 [Warning] Failed to create a socket for IPv6 '::': errno: 97.
Jun 30 13:54:43 fstn4-p1 mariadbd[19474]: 2020-06-30 13:54:43 0 [Note] Server socket created on IP: '0.0.0.0'.
Jun 30 13:54:43 fstn4-p1 mariadbd[19474]: 2020-06-30 13:54:43 0 [Note] Reading of all Master_info entries succeeded
Jun 30 13:54:43 fstn4-p1 mariadbd[19474]: 2020-06-30 13:54:43 0 [Note] Added new Master_info '' to hash table
Jun 30 13:54:43 fstn4-p1 mariadbd[19474]: 2020-06-30 13:54:43 0 [Note] /usr/sbin/mariadbd: ready for connections.
Jun 30 13:54:43 fstn4-p1 mariadbd[19474]: Version: '10.5.5-MariaDB-1:10.5.5+maria~bionic'  socket: '/run/mysqld/mysqld.sock'  port: 3306  mariad
Jun 30 13:54:43 fstn4-p1 systemd[1]: Started MariaDB 10.5.5 database server.
 
dan@fstn4-p1:~$ ps -ef | grep mariadb
mysql     19474      1  0 Jun30 ?        00:00:19 /usr/sbin/mariadbd
dan       62981  62942  0 12:40 pts/0    00:00:00 grep --color=auto mariadb
dan@fstn4-p1:~$ pgrep -x --ns $$  mariadbd
dan@fstn4-p1:~$ 
dan@fstn4-p1:~$ pgrep -x --ns $$  /usr/sbin/mariadbd
dan@fstn4-p1:~$ 
dan@fstn4-p1:~$ sudo bash
root@fstn4-p1:~# pgrep -x --ns $$  mariadbd
root@fstn4-p1:~# 
root@fstn4-p1:~# pgrep -x --ns $$  /usr/sbin/mariadbd
root@fstn4-p1:~# 

I could be wrong somewhere about the relation between cgroups and namespaces, however the pgrep output is consistently empty.

root@fstn4-p1:~# ls -la /proc/self/ns
total 0
dr-x--x--x 2 root root 0 Jul  1 12:56 .
dr-xr-xr-x 8 root root 0 Jul  1 12:56 ..
lrwxrwxrwx 1 root root 0 Jul  1 12:56 cgroup -> 'cgroup:[4026531835]'
lrwxrwxrwx 1 root root 0 Jul  1 12:56 ipc -> 'ipc:[4026531839]'
lrwxrwxrwx 1 root root 0 Jul  1 12:56 mnt -> 'mnt:[4026531840]'
lrwxrwxrwx 1 root root 0 Jul  1 12:56 net -> 'net:[4026531872]'
lrwxrwxrwx 1 root root 0 Jul  1 12:56 pid -> 'pid:[4026531836]'
lrwxrwxrwx 1 root root 0 Jul  1 12:56 pid_for_children -> 'pid:[4026531836]'
lrwxrwxrwx 1 root root 0 Jul  1 12:56 user -> 'user:[4026531837]'
lrwxrwxrwx 1 root root 0 Jul  1 12:56 uts -> 'uts:[4026531838]'
root@fstn4-p1:~# 
root@fstn4-p1:~# ls -la /proc/$(pidof mariadbd)/ns
total 0
dr-x--x--x 2 mysql mysql 0 Jul  1 12:40 .
dr-xr-xr-x 8 mysql mysql 0 Jun 30 13:54 ..
lrwxrwxrwx 1 mysql mysql 0 Jul  1 12:56 cgroup -> 'cgroup:[4026531835]'
lrwxrwxrwx 1 mysql mysql 0 Jul  1 12:40 ipc -> 'ipc:[4026531839]'
lrwxrwxrwx 1 mysql mysql 0 Jul  1 12:40 mnt -> 'mnt:[4026532286]'
lrwxrwxrwx 1 mysql mysql 0 Jul  1 12:40 net -> 'net:[4026531872]'
lrwxrwxrwx 1 mysql mysql 0 Jul  1 12:40 pid -> 'pid:[4026531836]'
lrwxrwxrwx 1 mysql mysql 0 Jul  1 12:56 pid_for_children -> 'pid:[4026531836]'
lrwxrwxrwx 1 mysql mysql 0 Jul  1 12:40 user -> 'user:[4026531837]'
lrwxrwxrwx 1 mysql mysql 0 Jul  1 12:40 uts -> 'uts:[4026531838]'
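
Comparing the two listings above by eye, only the mnt entry differs (4026531840 for the shell, 4026532286 for mariadbd). A minimal sketch of automating that comparison follows. The function name `ns_diff` and its optional third argument are hypothetical and exist only for illustration; it relies on nothing beyond `readlink` and the `/proc/<pid>/ns` symlinks shown above.

```shell
# Report every namespace type whose identifier differs between two processes.
# proc_root is a hypothetical override (default /proc) so the function can be
# exercised against a fake directory tree. Reading another user's
# /proc/<pid>/ns entries normally requires root.
ns_diff() {
    pid_a=$1
    pid_b=$2
    proc_root=${3:-/proc}
    for link in "$proc_root/$pid_a"/ns/*; do
        name=${link##*/}
        a=$(readlink "$link")
        b=$(readlink "$proc_root/$pid_b/ns/$name")
        if [ "$a" != "$b" ]; then
            printf '%s differs: %s vs %s\n' "$name" "$a" "$b"
        fi
    done
}
```

On the host above this would be called as `ns_diff $$ "$(pidof mariadbd)"`. If pgrep compares every namespace type unless told otherwise (an assumption worth checking against the pgrep man page and its `--nslist` option), a differing mnt namespace alone could explain the empty `pgrep --ns` output.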

One way to look at the cgroup contents (I assume in the same way that starting a service checks) is:

$ systemd-cgls --unit mariadb.service
Unit mariadb.service (/system.slice/mariadb.service):
└─19474 /usr/sbin/mariadbd

Alternately:

$ cat /sys/fs/cgroup/systemd/system.slice/mariadb.service/cgroup.procs
19474
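
Since `cgroup.procs` only gives bare PIDs, a small helper that resolves each one to a command name may make leftovers easier to spot. This is a sketch under assumptions: the function name `cgroup_procs` and its optional second argument are made up for illustration, and a PID that cannot be resolved is merely flagged, not explained.

```shell
# List "PID COMMAND" for every PID in a cgroup.procs file.
# proc_root is a hypothetical second argument (default /proc) so the function
# can be pointed at a test directory.
cgroup_procs() {
    procs_file=$1
    proc_root=${2:-/proc}
    while read -r pid || [ -n "$pid" ]; do
        [ -n "$pid" ] || continue
        if [ -r "$proc_root/$pid/comm" ]; then
            printf '%s %s\n' "$pid" "$(cat "$proc_root/$pid/comm")"
        else
            # Listed in the cgroup but not resolvable through /proc.
            printf '%s (not visible)\n' "$pid"
        fi
    done < "$procs_file"
}
```

Example call with the path used above: `cgroup_procs /sys/fs/cgroup/systemd/system.slice/mariadb.service/cgroup.procs`.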

Another guess at what could be running:

 root@fstn4-p1:~# ps -u mysql
   PID TTY          TIME CMD
 19474 ?        00:00:19 mariadbd

Given the debian output has:

+ '[' '!' -d /var/lib/mysql ']'
+ '[' '!' -L /var/lib/mysql ']'
+ mkdir -Z /var/lib/mysql

I assume this is a fresh install, which makes most assumptions about an existing process circumstantial. There was no adduser/addgroup executed to create the mysql user, however maybe that was added by the previous attempt.

Comment by Florent Hazard [ 2020-07-01 ]

I'm almost sure there is no running mariadb process.

ps -u mysql => Empty
ps aux => No mariadb or mysql
top => No mariadb or mysql (but there is all other services)
systemd-cgls --unit mariadb.service => Empty
pidof mariadbd => empty

This is a fresh install, I gave you all instructions to create it on a Proxmox 5.4
On the same proxmox server, with Ubuntu 18.04, this is ok for MariaDB 10.2 10.3 10.4 & 10.5 for my former installs.

I am 99.9% sure this is a false positive, the script launching mariadb detect an existing process but there is no one. But why is it detecting this ?

For my case, I think the best quick fix for now is to create my own service.
This is working fine with SendSIGKILL=yes & systemctl daemon-reload

Comment by Otto Kekäläinen [ 2020-07-06 ]

Unfortunately I don't have time to install Proxmox to try to replicate this. Does faust want to take a look or should we deem Proxmox marginal until more users report issues with it?

Comment by Alexandros Ioannides [ 2020-07-11 ]

I am facing the exact same issue.

Proxmox 5.4
Ubuntu 20.04
MariaDB 10.4 & 10.5

I tried a lot of things but it was impossible to find a workaround.

Comment by Alexandros Ioannides [ 2020-07-20 ]

Any updates? Can this bug be fixed?

Comment by Florent Hazard [ 2020-07-20 ]

The temporary fix is to set SendSIGKILL to yes.

nano /lib/systemd/system/mariadb.service
SendSIGKILL=yes
 
systemctl daemon-reload
service mysql restart
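
Files under /lib/systemd/system belong to the package and are replaced on upgrade, so the same workaround may be easier to keep as a drop-in. The sketch below is one way to write it; the function name, the `sendsigkill.conf` file name and the optional second argument are hypothetical. Note that it disables the very check described earlier in this ticket, so it is only reasonable when no other mariadbd can be using the same data directory.

```shell
# Write a drop-in that sets SendSIGKILL=yes for one unit, leaving the packaged
# unit file untouched. dropin_root defaults to /etc/systemd/system; the
# override argument is a hypothetical convenience for testing.
write_sendsigkill_dropin() {
    unit=$1
    dropin_root=${2:-/etc/systemd/system}
    dir="$dropin_root/$unit.d"
    mkdir -p "$dir" || return 1
    printf '[Service]\nSendSIGKILL=yes\n' > "$dir/sendsigkill.conf"
}
```

Usage, as root: `write_sendsigkill_dropin mariadb.service && systemctl daemon-reload && systemctl restart mariadb.service`. `systemctl show mariadb.service | grep KILL` should then report the new value.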

Comment by Dashamir Hoxha [ 2020-08-22 ]

I have exactly the same problem with Docker containers.
However I have noticed that if mariadb is already running in another container, then it fails to start on a second one. However if there is no other mariadb running on any container, then it starts without a problem. As Florent Hazard said above: "the script launching mariadb detect an existing process but there is no one". It seems that it detects a process on another container, which it shouldn't do.

So, another workaround might be to use only one docker container that runs mariadb.

Of course, the workaround described above by Florent (`SendSIGKILL=yes`) does work and maybe is more suitable. Thanks Florent.

Comment by Daniel Black [ 2020-08-24 ]

Seems this could be fixed focusing on the right things to stop MDEV-23321.

dashohoxha, I'm not sure why you'd run mariadb under systemd in a container; however, it does seem to provide an easier test case to set up. Can you attach a minimal Dockerfile for this?

Comment by Dashamir Hoxha [ 2020-08-24 ]

@danblack I use Docker containers as lightweight virtual machines, by running systemd inside them.
It is not easy to attach a minimal Dockerfile, but here are the steps to install the container: https://gitlab.com/docker-scripts/mariadb#installation
I can also write a Katacoda scenario that reproduces the problem, if this helps.

Comment by Alexandros Ioannides [ 2020-09-16 ]

Any proper fixes on this?

Comment by Daniel Black [ 2020-09-17 ]

Well, I've got a Proxmox Virtual Environment 6.2-11 VM installed. The instructions omitted downloading the template (p200+ of the manual) and assumed a storage configuration. Please treat the reader here (me) as a real Proxmox newbie when writing instructions, because I am.

created

root@proxmox:~#  pct create 131 local:vztmpl/ubuntu-20.04-standard_20.04-1_amd64.tar.gz -hostname "web3.domain.com" -arch amd64 -cores 2 -memory 512 -swap 512 -net0 bridge=vmbr0,name=eth0,ip=192.168.122.192/24,gw=192.168.122.1 -onboot 1 -ostype ubuntu -rootfs local-lvm:30
  WARNING: You have not turned on protection against thin pools running out of space.
  WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
  Logical volume "vm-131-disk-0" created.
  WARNING: Sum of all thin volume sizes (30.00 GiB) exceeds the size of thin pool pve/data and the size of whole volume group (<19.50 GiB).
mke2fs 1.44.5 (15-Dec-2018)
Discarding device blocks: done                            
Creating filesystem with 7864320 4k blocks and 1966080 inodes
Filesystem UUID: 4830e771-91dd-42f5-b0de-b51e91adda97
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 
	4096000
 
Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (32768 blocks): done
Multiple mount protection is enabled with update interval 5 seconds.
Writing superblocks and filesystem accounting information: done   
 
extracting archive '/var/lib/vz/template/cache/ubuntu-20.04-standard_20.04-1_amd64.tar.gz'
Total bytes read: 669050880 (639MiB, 194MiB/s)
Creating SSH host key 'ssh_host_rsa_key' - this may take some time ...
done: SHA256:0rF3ZFxs+cT7816UY3qQW9weNNWqN5o5f7SdNInNQBE root@web3
Creating SSH host key 'ssh_host_ecdsa_key' - this may take some time ...
done: SHA256:4niQPFVMFotb0e7NVyJTVjMKZcYjYNLSTN+Lrs2Yw+0 root@web3
Creating SSH host key 'ssh_host_ed25519_key' - this may take some time ...
done: SHA256:jfB8xpfUaTxrxZCwKn8U1gAwF3t9v2SNUPDvSRPo3to root@web3
Creating SSH host key 'ssh_host_dsa_key' - this may take some time ...
done: SHA256:0Juqejpdqw6h40fPHPpIML3iEwI5MZb0iZIIFvuqCww root@web3
root@proxmox:~# pct start 131
root@proxmox:~# pct enter 131
...
root@web3:~# curl -LsS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup | bash
[info] Repository file successfully written to /etc/apt/sources.list.d/mariadb.list
[info] Adding trusted package signing keys...
[info] Running apt-get update...
[info] Done adding trusted package signing keys
root@web3:~# export DEBIAN_SCRIPT_DEBUG=1
root@web3:~# apt-get -y install mariadb-server    
...
(see attached) /proxmox-mdev-23050-install.txt
---
+ systemctl --system daemon-reload
+ deb-systemd-invoke start mariadb.service
Setting up mariadb-server (1:10.5.5+maria~focal) ...
Processing triggers for systemd (245.4-4ubuntu3) ...
Processing triggers for man-db (2.9.1-1) ...
Processing triggers for libc-bin (2.31-0ubuntu9) ...
root@web3:~# systemctl status mariadb.service
● mariadb.service - MariaDB 10.5.5 database server
     Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/mariadb.service.d
             └─migrated-from-my.cnf-settings.conf
     Active: active (running) since Thu 2020-09-17 07:36:24 UTC; 3min 31s ago
       Docs: man:mariadbd(8)
             https://mariadb.com/kb/en/library/systemd/
   Main PID: 4167 (mariadbd)
     Status: "Taking your SQL requests now..."
      Tasks: 9 (limit: 4915)
     Memory: 80.2M
     CGroup: /system.slice/mariadb.service
             └─4167 /usr/sbin/mariadbd
 
Sep 17 07:36:29 web3 /etc/mysql/debian-start[4188]: mysql
Sep 17 07:36:29 web3 /etc/mysql/debian-start[4188]: performance_schema
Sep 17 07:36:29 web3 /etc/mysql/debian-start[4188]: Phase 6/7: Checking and upgrading tables
Sep 17 07:36:29 web3 /etc/mysql/debian-start[4188]: Processing databases
Sep 17 07:36:29 web3 /etc/mysql/debian-start[4188]: information_schema
Sep 17 07:36:29 web3 /etc/mysql/debian-start[4188]: performance_schema
Sep 17 07:36:29 web3 /etc/mysql/debian-start[4188]: Phase 7/7: Running 'FLUSH PRIVILEGES'
Sep 17 07:36:29 web3 /etc/mysql/debian-start[4188]: OK
Sep 17 07:36:29 web3 /etc/mysql/debian-start[4812]: Checking for insecure root accounts.
Sep 17 07:36:29 web3 /etc/mysql/debian-start[4816]: Triggering myisam-recover for all MyISAM tables and aria-recover for all Aria tables
root@web3:~# mysql 
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 43
Server version: 10.5.5-MariaDB-1:10.5.5+maria~focal mariadb.org binary distribution
 
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
 
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
 
MariaDB [(none)]> select version();
+-------------------------------------+
| version()                           |
+-------------------------------------+
| 10.5.5-MariaDB-1:10.5.5+maria~focal |
+-------------------------------------+
1 row in set (0.001 sec)
MariaDB [(none)]> Bye
root@web3:~# systemd-cgls --unit mariadb.service
Unit mariadb.service (/system.slice/mariadb.service):
└─4167 /usr/sbin/mariadbd
root@web3:~# systemctl show mariadb.service| grep KILL
SendSIGKILL=no

root@proxmox:~# lxc-ls -f
NAME STATE   AUTOSTART GROUPS IPV4            IPV6 UNPRIVILEGED 
131  RUNNING 0         -      192.168.122.192 -    false      

I haven't done anything special: I changed the IP, default route and interface to match the local install, and used the local-lvm volume for the container.

Please tell me what I've done differently.

Comment by Alexandros Ioannides [ 2020-09-17 ]

In my case, problem only exists on Proxmox 5.

Comment by Daniel Black [ 2020-09-17 ]

So maybe I really need proxmox-5.4?

dashohoxha if Katacoda is easy I'll take it, sure.

Comment by Alexandros Ioannides [ 2020-09-17 ]

To replicate the issue, yes, and install Ubuntu 20.04 alongside MariaDB 10.5.

Comment by Dashamir Hoxha [ 2020-09-17 ]

> if Katacoda is easy I'll take it, sure

I am going to prepare soon a scenario that replicates the problem. Stay tuned.

Comment by Dashamir Hoxha [ 2020-09-17 ]

Here it is: https://katacoda.com/dashohoxha/scenarios/mariadb

Comment by Alexandros Ioannides [ 2020-09-20 ]

Any updates? Thanks for trying to resolve this issue.

By the way, just to make this more clear, issue can't be replicated with Ubuntu 18.04. Only with 20.04.

Comment by Daniel Black [ 2020-09-21 ]

Got a bit caught up with the MariaDB Server Fest that was running last week but I do want to get this resolved before the next release.

Thanks for the 18.04 counter case.

Looking closer it might be `mariadb.service: Failed to run 'start-pre' task: Device or resource busy`

https://github.com/MariaDB/server/blob/10.5/cmake/systemd.cmake#L49 (which I've wanted to remove for a while - https://github.com/MariaDB/server/pull/1105)
https://github.com/MariaDB/server/blob/10.5/support-files/mariadb.service.in#L79 (which is in a similar state of neglect with https://github.com/MariaDB/server/pull/1143 waiting)

With this gone, PermissionsStartOnly=true (root permissions on start-pre tasks) can be removed, which might be the culprit.

Comment by Daniel Black [ 2020-09-22 ]

So on Proxmox, it only shows up once you create a second instance of mariadb.

Here I've got two Proxmox guests running, with different addresses etc. but otherwise the same:

root@proxmox54:~# ps -ef | grep init
root         1     0  0 21:55 ?        00:00:01 /sbin/init
root      1644  1579  0 21:55 ?        00:00:00 /sbin/init
root      3357     2  0 22:00 ?        00:00:00 [ext4lazyinit]
root      3409  3349  1 22:00 ?        00:00:00 /sbin/init
root      4254  3333  0 22:00 pts/2    00:00:00 grep init
 
root@proxmox54:~# cat /proc/1644/cgroup 
12:blkio:/lxc/131/ns
11:memory:/lxc/131/ns
10:freezer:/lxc/131/ns
9:rdma:/lxc/131/ns
8:devices:/lxc/131/ns
7:hugetlb:/lxc/131/ns
6:cpu,cpuacct:/lxc/131/ns
5:perf_event:/lxc/131/ns
4:cpuset:/lxc/131/ns
3:pids:/lxc/131/ns
2:net_cls,net_prio:/lxc/131/ns
1:name=systemd:/lxc/131/ns/init.scope
0::/init.scope
 
root@proxmox54:~# cat /proc/3409/cgroup 
12:blkio:/lxc/121/ns
11:memory:/lxc/121/ns
10:freezer:/lxc/121/ns
9:rdma:/lxc/121/ns
8:devices:/lxc/121/ns
7:hugetlb:/lxc/121/ns
6:cpu,cpuacct:/lxc/121/ns
5:perf_event:/lxc/121/ns
4:cpuset:/lxc/121/ns
3:pids:/lxc/121/ns
2:net_cls,net_prio:/lxc/121/ns
1:name=systemd:/lxc/121/ns
0::/init.scope
 
root@proxmox54:~# ps -ef | egrep '(mysqld|mariadbd)'
110       2269  1644  0 21:55 ?        00:00:00 /usr/sbin/mariadbd
root      4485  3333  0 22:01 pts/2    00:00:00 grep -E (mysqld|mariadbd)
root@proxmox54:~# cat /proc/2269/cgroup 
12:blkio:/lxc/131/ns
11:memory:/lxc/131/ns/system.slice/mariadb.service
10:freezer:/lxc/131/ns
9:rdma:/lxc/131/ns
8:devices:/lxc/131/ns/system.slice/mariadb.service
7:hugetlb:/lxc/131/ns
6:cpu,cpuacct:/lxc/131/ns
5:perf_event:/lxc/131/ns
4:cpuset:/lxc/131/ns
3:pids:/lxc/131/ns/system.slice/mariadb.service
2:net_cls,net_prio:/lxc/131/ns
1:name=systemd:/lxc/131/ns/system.slice/mariadb.service
0::/system.slice/mariadb.service
 
root@proxmox54:~# pct enter 121
 
root@gopher:~# apt-get update
root@gopher:~# apt-get install mariadb-server
...
Job for mariadb.service failed because of unavailable resources or another system error.
See "systemctl status mariadb.service" and "journalctl -xe" for details.
Setting up libhttp-message-perl (6.22-1) ...
Setting up libcgi-pm-perl (4.46-1) ...
Setting up libhtml-template-perl (2.97-1) ...
Setting up mariadb-server (1:10.3.22-1ubuntu1) ...
Setting up libcgi-fast-perl (1:2.15-1) ...
Processing triggers for systemd (245.4-4ubuntu3) ...
Processing triggers for man-db (2.9.1-1) ...
Processing triggers for libc-bin (2.31-0ubuntu9) ...
root@gopher:~# 
root@gopher:~# systemctl status mariadb.service 
Failed to dump process list for 'mariadb.service', ignoring: Input/output error
● mariadb.service - MariaDB 10.3.22 database server
     Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)
     Active: failed (Result: resources)
       Docs: man:mysqld(8)
             https://mariadb.com/kb/en/library/systemd/
     CGroup: /system.slice/mariadb.service
 
Sep 22 12:09:38 gopher systemd[1]: mariadb.service: Will not start SendSIGKILL=no service of type KillMode=control-group or mixed while processes exist
Sep 22 12:09:38 gopher systemd[1]: mariadb.service: Failed to run 'start-pre' task: Device or resource busy
Sep 22 12:09:38 gopher systemd[1]: mariadb.service: Failed with result 'resources'.
Sep 22 12:09:38 gopher systemd[1]: Failed to start MariaDB 10.3.22 database server.

dashohoxha, the same applies in your setup (I had trouble copying the Katacoda screen, but it is essentially the same): the systemd init of both docker containers shared the 0 cgroup namespace.

So apologies for not believing that SendSIGKILL=no was the problem. It kind of is, because if two different systemd instances share the same base namespace, they create the same hierarchy for all services under it.

I did a test with apache2: installed it, ran systemctl edit apache2.service, and appended:

apache2 edit

[Service]
SendSIGKILL=no

I did this in two different instances, and only one can start.

So not a mariadb problem. Definitely a problem I triggered https://github.com/systemd/systemd/pull/11457.

Proxmox 6.2 separates namespaces:
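
Because the behaviour depends on the unit setting and not on MariaDB, it can help to see which other units on a host set SendSIGKILL to a false value. The sketch below greps unit directories for it; the function name `units_without_sigkill` is hypothetical, and the match is purely textual, so it does not account for a later drop-in overriding the value (`systemctl show -p SendSIGKILL <unit>` remains the authoritative answer).

```shell
# Print unit files that textually set SendSIGKILL to a false value.
# Pass one or more unit directories; missing directories are skipped.
units_without_sigkill() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        grep -rlE '^[[:space:]]*SendSIGKILL[[:space:]]*=[[:space:]]*(no|false|off|0)[[:space:]]*$' "$dir"
    done | sort
}
```

Example: `units_without_sigkill /lib/systemd/system /etc/systemd/system`.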

root@proxmox:~# cat /proc/1641/cgroup 
12:perf_event:/lxc/131/ns
11:hugetlb:/lxc/131/ns
10:memory:/lxc/131/ns
9:cpuset:/lxc/131/ns
8:freezer:/lxc/131/ns
7:rdma:/lxc/131/ns
6:blkio:/lxc/131/ns
5:pids:/lxc/131/ns
4:devices:/lxc/131/ns
3:net_cls,net_prio:/lxc/131/ns
2:cpu,cpuacct:/lxc/131/ns
1:name=systemd:/lxc/131/ns/init.scope
0::/lxc/131/ns/init.scope
 
root@proxmox:~# cat /proc/2968/cgroup 
12:perf_event:/lxc/121/ns
11:hugetlb:/lxc/121/ns
10:memory:/lxc/121/ns/init.scope
9:cpuset:/lxc/121/ns
8:freezer:/lxc/121/ns
7:rdma:/lxc/121/ns
6:blkio:/lxc/121/ns
5:pids:/lxc/121/ns/init.scope
4:devices:/lxc/121/ns/init.scope
3:net_cls,net_prio:/lxc/121/ns
2:cpu,cpuacct:/lxc/121/ns
1:name=systemd:/lxc/121/ns/init.scope
0::/lxc/121/ns/init.scope

For proxmox-5.4 users I suggest asking support how to get each container having its own cgroup namespace like 6.2, because it's at least an information leak between instances otherwise, and in special cases like this a DoS.
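
The difference between the two hosts is visible in the `0::` line of each container init's /proc/<pid>/cgroup: on 5.4 both report `/init.scope`, on 6.2 each has its own `/lxc/<id>/ns/init.scope`. A rough check along those lines is sketched below. Both function names are hypothetical, and the comparison only looks at the unified (`0::`) entry, which is an assumption about where the clash originates based on the output in this ticket.

```shell
# Print the path of the unified ("0::") entry from a /proc/<pid>/cgroup file.
unified_cgroup() {
    sed -n 's/^0:://p' "$1"
}

# Exit 0 when both files carry a unified entry and the paths are identical,
# i.e. the two processes sit in the same place in the unified hierarchy.
shares_unified_cgroup() {
    a=$(unified_cgroup "$1")
    b=$(unified_cgroup "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

On the Proxmox 5.4 host above, run as root: `shares_unified_cgroup /proc/1644/cgroup /proc/3409/cgroup && echo "containers share a cgroup"`.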

What to do on the systemd side? I'm going to sleep on it and probably raise an FYI bug, but I'm not sure what they can do about it.
https://github.com/systemd/systemd/blob/bc2ed3bbf01ff75a317f019ccaddf3f87d1ea6bd/src/core/service.c#L2093 , try to access the other process?

Comment by Daniel Black [ 2020-09-23 ]

This has been reported as a security vulnerability to Proxmox, Red Hat (systemd owners) and Ubuntu (for docker package). While it's public here please keep further attention to a minimum while the respective teams process the implications and deliver a solution.

Comment by Daniel Black [ 2020-09-23 ]

Not so much "Not a bug" as "Not our bug". Progress of the above entities will guide whether MariaDB changes its packaging to circumvent the problem.

In the meantime docker can run containers with an explicit parent cgroup --cgroup-parent=/..., and proxmox users can either upgrade or find some other cgroup solution. Feel free to share here if you'd like.

Comment by Daniel Black [ 2020-09-23 ]

Me:
> I'm not sure why I
> didn't think the running of two systemd instances in the same cgroup
> would happen.

Response from systemd folks:

That's simply not supported. We expressly document that this is not
allowed. Linux cgroups operate on a delegation model, where each
subtree has a single writer. If you allow multiple systemd instances
manage the very same subtree then everything falls apart and you get to
keep the pieces.

Docker shouldn't even allow this. If they do, it's a bug in Docker. ([and Proxmox])

https://systemd.io/CGROUP_DELEGATION

Comment by Daniel Black [ 2020-09-23 ]


I tested the upstream Docker nightly 19.03.13 and it still shares the common cgroupv2 unified hierarchy between containers.

Comment by Daniel Black [ 2020-09-23 ]

Proxmox support pointed me at the following to say 5.x is EOL

https://forum.proxmox.com/threads/proxmox-ve-support-lifecycle.35755/

Comment by Daniel Black [ 2020-09-23 ]

dashohoxha, recommend podman instead, they seem to have a better clue about cgroups and systemd which seems to be the basis of your platform. It should avoid other subtle errors too. Symlinking `ln -s /usr/bin/podman /opt/docker-scripts/bin/docker` mostly worked.

Thank you so much for katacoda too. It helped a lot.

Comment by Dashamir Hoxha [ 2020-09-24 ]

[danblack] thanks for recommending podman, it seems promising.
However it doesn't support yet `--network-alias`, and this seems to be a blocker for docker-scripts. I will keep an eye on it anyway.

Comment by Daniel Black [ 2021-01-28 ]

FYI: going by the docker-20.10 release notes and this commit https://github.com/moby/moby/commit/816fbcd306274b9561c62ae076bdff71062ebe85, dashohoxha, you should hopefully be able to run your systemd in containers with MariaDB without conflict now.

Comment by Daniel Black [ 2021-01-28 ]

If it doesn't, I'm sorry. Maybe you'll have better luck reporting a bug there directly.
