[MDEV-24669] implement innodb_fatal_semaphore_wait_threshold as systemd watchdog task
Created: 2021-01-25
Updated: 2023-11-28

Status: Open
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Fix Version/s: 10.11

Type: Task
Priority: Major
Reporter: Daniel Black
Assignee: Unassigned
Resolution: Unresolved
Votes: 0
Labels: Papercut

Issue Links:
Relates
relates to MDEV-24911 Missing warning before [ERROR] [FATAL... Open
relates to MDEV-25434 mariadb container to have HEALTHCHECK Closed

 Description   

From marko:

Using the interfaces:
sd_event_get_watchdog (3) - Enable event loop watchdog support
sd_event_set_watchdog (3) - Enable event loop watchdog support
sd_watchdog_enabled (3) - Check whether the service manager expects watchdog keep-alive notifications from a service

Rough idea: If this interface is available, we can merge srv_monitor_task and srv_master_callback, which would keep petting the watchdog in systemd. Currently a main reason to have a separate srv_monitor_task is to be able to enforce innodb_fatal_semaphore_wait_threshold when srv_master_callback is stuck waiting for a mutex. (BTW, pthread_mutex_timedwait cannot be relied on; in my experience, it can instantly return EBUSY without waiting for the timeout to expire.)
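
A rough sketch of the keep-alive idea above, written in Python for brevity rather than in the server's own C++. It speaks the documented notify-socket protocol directly (the NOTIFY_SOCKET, WATCHDOG_USEC and WATCHDOG_PID environment variables) instead of calling libsystemd. The names watchdog_interval, notify and master_tick are hypothetical, as is the notion of a single "longest semaphore wait" value handed to the merged task; none of this is existing server code.

```python
import os
import socket
from typing import Mapping, Optional


def watchdog_interval(env: Mapping[str, str] = os.environ,
                      pid: Optional[int] = None) -> Optional[float]:
    """Return the watchdog timeout in seconds, or None if not requested.

    Follows the documented behaviour of sd_watchdog_enabled(3): the service
    manager sets WATCHDOG_USEC and, optionally, WATCHDOG_PID.
    """
    usec = env.get("WATCHDOG_USEC")
    if not usec or not usec.isdigit() or int(usec) == 0:
        return None
    wanted_pid = env.get("WATCHDOG_PID")
    if wanted_pid and wanted_pid != str(os.getpid() if pid is None else pid):
        return None                   # the watchdog is meant for another process
    return int(usec) / 1_000_000


def notify(message: str, env: Mapping[str, str] = os.environ) -> bool:
    """Send one datagram to $NOTIFY_SOCKET; return False if it is unset."""
    path = env.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):          # abstract-namespace socket (Linux)
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode(), path)
    return True


def master_tick(longest_wait_s: float, fatal_threshold_s: float,
                env: Mapping[str, str] = os.environ) -> bool:
    """One iteration of a hypothetical merged master/monitor task.

    The watchdog is petted only while the longest semaphore wait is below
    the threshold. If this task itself hangs, no ping is sent either, and
    the service manager acts once the watchdog timeout elapses.
    """
    if longest_wait_s >= fatal_threshold_s:
        return False                  # withhold the keep-alive
    return notify("WATCHDOG=1", env)
```

In this sketch the task would call master_tick at some fraction of watchdog_interval() (half is the usual recommendation), passing the configured innodb_fatal_semaphore_wait_threshold as fatal_threshold_s. A unit file would also need Type=notify and WatchdogSec= set for systemd to export the variables at all.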

As noted in MDEV-24426, the idle 10.6 server keeps waking up 4.6 times per second. That is a bit too often in my opinion.



 Comments   
Comment by Daniel Black [ 2022-03-31 ]

Container/k8s healthchecks are largely polled, so a status variable that can be queried would be a good addition here.

Generated at Thu Feb 08 09:31:45 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.