Details
Type: Bug
Status: Needs Feedback
Priority: Major
Resolution: Unresolved
Affects Version/s: 10.11.6
Environment: Debian 12 / Stable.
Description
Every ~2-7 days, around midnight, one of our SQL servers deadlocks almost completely. The log, as expected, just stops abruptly with no indication of what's wrong.
While this happens, I can still connect with the 'root' account over a unix socket. The active queries (show processlist) seem independent.
It is not the queries that are deadlocking; the program itself is. No queries are processed or completed, because the program internally waits endlessly for mutexes.
I researched possible causes; the most common appears to be calling async-signal-unsafe functions in signal handlers.
I'm not too well versed in gdb, and I don't know how to reproduce the problem (that's our entire issue), but it does occur periodically. The output of `info threads` in gdb looks a bit like this:
- About 300 threads stuck at `0x7f7bf8fa86c0 (LWP 684969) "mariadbd" syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38`
- 10-12 threads inside `__futex_abstimed_wait_common64`
- 5-6 threads in `__GI___poll`
- About twenty entries like this (function unknown): `Thread 0x7f7bf838d6c0 (LWP 329146) "iou-wrk-298768" 0x0000000000000000 in ?? ()`
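Since the counts above are rough estimates, here is a minimal sketch of how the `info threads` output could be tallied exactly next time the hang occurs. It assumes the output has been saved as text and that each line has the shape of the two lines quoted above. The names `tally_threads` and `FRAME_RE` are made up for this illustration, and the regular expression has only been checked against those two line shapes, so it may need adjusting for other gdb versions.

```python
import re
from collections import Counter

# Matches the thread name and innermost function of a gdb `info threads` line.
# The "0x... in" address prefix is optional because gdb omits it for some frames.
FRAME_RE = re.compile(
    r'"(?P<name>[^"]+)"\s+(?:0x[0-9a-f]+\s+in\s+)?(?P<func>\S+)\s*\('
)


def tally_threads(info_threads_text):
    """Count threads by (thread name, innermost function)."""
    counts = Counter()
    for line in info_threads_text.splitlines():
        match = FRAME_RE.search(line)
        if not match:
            continue  # header lines or anything that is not a thread entry
        # Strip a trailing numeric suffix so "iou-wrk-298768" groups as "iou-wrk".
        name = re.sub(r"-\d+$", "", match.group("name"))
        counts[(name, match.group("func"))] += 1
    return counts


# The two example lines quoted in this report.
SAMPLE = (
    'Thread 0x7f7bf8fa86c0 (LWP 684969) "mariadbd" syscall () '
    "at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38\n"
    'Thread 0x7f7bf838d6c0 (LWP 329146) "iou-wrk-298768" '
    "0x0000000000000000 in ?? ()\n"
)

if __name__ == "__main__":
    for (name, func), n in tally_threads(SAMPLE).most_common():
        print(f"{n:5d}  {name:12s} {func}")
```

The input could be captured non-interactively with something like `gdb -p <pid> -batch -ex "info threads" -ex "thread apply all bt" > threads.txt`, where `<pid>` is the mariadbd process ID. The innermost frame alone will not show which mutex is contended, so the full backtraces from `thread apply all bt` are probably the more useful attachment.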