Details
- Bug
- Status: Closed
- Critical
- Resolution: Fixed
- 10.3(EOL)
- CentOS 7.9 with systemd 219
Description
During SST, the MariaDB systemd service sometimes fails with:
"Got notification message from PID xxxx, but reception only permitted for main PID yyyy"
I found a page describing the same issue:
https://community.centminmod.com/threads/mariadb-failed-to-restart-at-first-install.18780/
The root cause analysis given there is:
1. The issue may occur when the system approaches its limit on available PIDs, so that PIDs get recycled.
2. systemd cannot always determine the cgroup from which a notification was sent, typically when services spawn many short-lived processes.
3. In that case, systemd takes the PID of the process that sent the notification, searches the list of processes it monitors, and delivers the notification to the service that once had a process with that PID attached to it. If PIDs were recycled, that may be the wrong service.
Usually this has no impact, because the typical systemd notification is just an "I'm alive" message.
If a service defined in systemd with the Type=notify and NotifyAccess=all settings shuts down quickly, before its STOPPING=1 notification has been processed by systemd, then systemd may deliver the notification to another service and make that service shut down instead.
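The lookup described in the root cause analysis can be illustrated with a small toy model. This is only a sketch under stated assumptions, not systemd's actual code: the `Unit` and `ToyManager` classes, the unit names `victim.service` and `quick.service`, and all PID values are hypothetical. It models only the fallback step, where the manager attributes a notification to whichever unit once had the sender's PID attached.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Unit:
    """Hypothetical stand-in for a systemd service unit."""
    name: str
    notify_access: str = "main"          # mirrors NotifyAccess=main / all
    main_pid: Optional[int] = None
    known_pids: Set[int] = field(default_factory=set)  # PIDs once attached
    state: str = "running"


class ToyManager:
    """Toy model of the PID-based fallback lookup (assumption, not real code)."""

    def __init__(self, units: List[Unit]) -> None:
        self.units = units

    def unit_for_pid(self, pid: int) -> Optional[Unit]:
        # Fallback: pick the first unit that once had this PID attached.
        # With recycled PIDs this entry may be stale.
        for unit in self.units:
            if pid in unit.known_pids:
                return unit
        return None

    def notify(self, pid: int, message: str) -> str:
        unit = self.unit_for_pid(pid)
        if unit is None:
            return "dropped: no unit found for PID"
        if unit.notify_access == "main" and pid != unit.main_pid:
            return (f"{unit.name}: Got notification message from PID {pid}, "
                    f"but reception only permitted for main PID {unit.main_pid}")
        if message == "STOPPING=1":
            unit.state = "stopping"
        return f"{unit.name}: accepted {message}"


# Case 1: NotifyAccess=main, notification arrives from a non-main PID.
mariadb = Unit("mariadb.service", "main", main_pid=1000, known_pids={1000, 4242})
case1 = ToyManager([mariadb]).notify(4242, "READY=1")

# Case 2: PID 4242 once belonged to victim.service and was later recycled
# for quick.service, which exited before its cgroup could be looked up.
victim = Unit("victim.service", "all", main_pid=2000, known_pids={2000, 4242})
quick = Unit("quick.service", "all", main_pid=4242)
case2 = ToyManager([victim, quick]).notify(4242, "STOPPING=1")
```

In the first case the toy manager rejects the message with the same wording as the reported error. In the second case the stale PID entry causes `victim.service` to be marked as stopping even though `quick.service` sent the notification, which is the misdelivery described above.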
So maybe setting
NotifyAccess=all
is a solution, but I can't estimate the possible drawbacks.
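If one wanted to try this setting without editing the packaged unit file, a systemd drop-in is one possible way to do it. The fragment below is a sketch only: the unit name `mariadb.service` and the drop-in file name are assumptions and may differ on a given installation, and whether the setting is appropriate is exactly the open question raised above.

```ini
# Hypothetical drop-in file (path and unit name are assumptions):
#   /etc/systemd/system/mariadb.service.d/notify-access.conf
# After creating it, reload and restart:
#   systemctl daemon-reload
#   systemctl restart mariadb.service

[Service]
# Accept sd_notify messages from any process in the service's cgroup,
# not only from the main PID.
NotifyAccess=all
```

The effective value can be checked afterwards with `systemctl show mariadb.service -p NotifyAccess`.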
Issue Links
- relates to
  - MDEV-15607 mysqld crashed few after node is being joined with sst (Closed)
  - MDEV-27613 Fixing debian to only run the full mysql_upgrade process when necessary (Stalled)
- links to