Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- 10.11.9
- None
- Debian 12.9 (ARM)
Description
We are facing serious issues with MariaDB server hanging on queries, seemingly at random.
We mostly see this in our production environments; we suspect this is because these servers host larger databases than our other workloads.
When mariadb-server hangs, it is not possible to stop it (short of using kill -9). It does not seem to matter whether the server is actively replicating or idle. The point at which the server hangs seems to be random, but since it's usually mid-transaction, we face data loss each time this happens.
There is nothing obvious to go on in the server logs or system logs. The systems are not resource constrained in any way. All are AWS EC2 instances deployed in various regions.
2025-01-19 22:08:26 0 [Note] /usr/sbin/mariadbd (initiated by: unknown): Normal shutdown
2025-01-19 22:08:46 0 [Warning] /usr/sbin/mariadbd: Thread 8349 (user : 'root') did not exit
The only change I can identify on our systems around the time this issue arose is normal system patching, which updated Debian from 12.8 to 12.9. I've included the package versions below.
libsmartcols1:arm64 (2.38.1-5+deb12u2, 2.38.1-5+deb12u3)
udev:arm64 (252.31-1~deb12u1, 252.33-1~deb12u1)
python3.11:arm64 (3.11.2-6+deb12u4, 3.11.2-6+deb12u5)
openssh-client:arm64 (1:9.2p1-2+deb12u3, 1:9.2p1-2+deb12u4)
libnss-myhostname:arm64 (252.31-1~deb12u1, 252.33-1~deb12u1)
libpam-systemd:arm64 (252.31-1~deb12u1, 252.33-1~deb12u1)
ucf:arm64 (3.0043+nmu1, 3.0043+nmu1+deb12u1)
libavahi-common-data:arm64 (0.8-10, 0.8-10+deb12u1)
libtiff6:arm64 (4.5.0-6+deb12u1, 4.5.0-6+deb12u2)
libsystemd0:arm64 (252.31-1~deb12u1, 252.33-1~deb12u1)
libmount1:arm64 (2.38.1-5+deb12u2, 2.38.1-5+deb12u3)
openssh-server:arm64 (1:9.2p1-2+deb12u3, 1:9.2p1-2+deb12u4)
python3-urllib3:arm64 (1.26.12-1, 1.26.12-1+deb12u1)
libpython3.11-minimal:arm64 (3.11.2-6+deb12u4, 3.11.2-6+deb12u5)
util-linux:arm64 (2.38.1-5+deb12u2, 2.38.1-5+deb12u3)
util-linux-extra:arm64 (2.38.1-5+deb12u2, 2.38.1-5+deb12u3)
systemd:arm64 (252.31-1~deb12u1, 252.33-1~deb12u1)
libudev1:arm64 (252.31-1~deb12u1, 252.33-1~deb12u1)
fdisk:arm64 (2.38.1-5+deb12u2, 2.38.1-5+deb12u3)
python3-pkg-resources:arm64 (66.1.1-1, 66.1.1-1+deb12u1)
qemu-utils:arm64 (1:7.2+dfsg-7+deb12u7, 1:7.2+dfsg-7+deb12u12)
libfdisk1:arm64 (2.38.1-5+deb12u2, 2.38.1-5+deb12u3)
eject:arm64 (2.38.1-5+deb12u2, 2.38.1-5+deb12u3)
libuuid1:arm64 (2.38.1-5+deb12u2, 2.38.1-5+deb12u3)
uuid-runtime:arm64 (2.38.1-5+deb12u2, 2.38.1-5+deb12u3)
systemd-resolved:arm64 (252.31-1~deb12u1, 252.33-1~deb12u1)
base-files:arm64 (12.4+deb12u8, 12.4+deb12u9)
python3-jinja2:arm64 (3.1.2-1, 3.1.2-1+deb12u1)
libpython3.11-stdlib:arm64 (3.11.2-6+deb12u4, 3.11.2-6+deb12u5)
libavahi-common3:arm64 (0.8-10, 0.8-10+deb12u1)
mount:arm64 (2.38.1-5+deb12u2, 2.38.1-5+deb12u3)
libglib2.0-0:arm64 (2.74.6-2+deb12u4, 2.74.6-2+deb12u5)
openssh-sftp-server:arm64 (1:9.2p1-2+deb12u3, 1:9.2p1-2+deb12u4)
python3.11-minimal:arm64 (3.11.2-6+deb12u4, 3.11.2-6+deb12u5)
libsystemd-shared:arm64 (252.31-1~deb12u1, 252.33-1~deb12u1)
systemd-sysv:arm64 (252.31-1~deb12u1, 252.33-1~deb12u1)
libblkid1:arm64 (2.38.1-5+deb12u2, 2.38.1-5+deb12u3)
linux-image-cloud-arm64:arm64 (6.1.119-1, 6.1.123-1)
bsdutils:arm64 (1:2.38.1-5+deb12u2, 1:2.38.1-5+deb12u3)
libavahi-client3:arm64 (0.8-10, 0.8-10+deb12u1)
bsdextrautils:arm64 (2.38.1-5+deb12u2, 2.38.1-5+deb12u3)
linux-libc-dev:arm64 (6.1.119-1, 6.1.123-1)
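With a list of upgrades this long, it can help to pick out the kernel packages mechanically. The following is a minimal sketch, assuming each entry has the `name:arch (old, new)` shape seen in the list above; the names `Upgrade`, `parse_upgrade`, and `kernel_upgrades` are hypothetical helpers, not part of apt or any MariaDB tooling.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class Upgrade(NamedTuple):
    name: str
    arch: str
    old: str
    new: str


# Assumed entry shape: "name:arch (old_version, new_version)".
# Versions may themselves contain a colon (epoch), e.g. "1:9.2p1-2+deb12u3".
_ENTRY = re.compile(
    r"^(?P<name>[^:\s]+):(?P<arch>\S+) \((?P<old>[^,]+), (?P<new>[^)]+)\)$"
)


def parse_upgrade(line: str) -> Optional[Upgrade]:
    """Parse one upgrade entry; return None for lines that do not match."""
    m = _ENTRY.match(line.strip())
    if m is None:
        return None
    return Upgrade(m["name"], m["arch"], m["old"], m["new"])


def kernel_upgrades(lines: Iterable[str]) -> List[Upgrade]:
    """Keep only entries whose package name starts with 'linux-image'."""
    parsed = (parse_upgrade(line) for line in lines)
    return [u for u in parsed if u is not None and u.name.startswith("linux-image")]


sample = [
    "openssh-client:arm64 (1:9.2p1-2+deb12u3, 1:9.2p1-2+deb12u4)",
    "linux-image-cloud-arm64:arm64 (6.1.119-1, 6.1.123-1)",
    "linux-libc-dev:arm64 (6.1.119-1, 6.1.123-1)",
]
for u in kernel_upgrades(sample):
    print(f"{u.name}: {u.old} -> {u.new}")
```

Filtering on the `linux-image` prefix is a simplification; it deliberately skips `linux-libc-dev`, which ships headers rather than a bootable kernel.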
Issue Links
- is duplicated by
  - MDEV-35892 Server is not responding anymore (Closed)
  - MDEV-35923 buf_load() hangs during shutdown (rpl.rpl_from_mysql80 sporadic failure) (Closed)
  - MDEV-36023 [ERROR] [FATAL] InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch (Closed)
- relates to
  - MDEV-35923 buf_load() hangs during shutdown (rpl.rpl_from_mysql80 sporadic failure) (Closed)
  - MDEV-35334 Incorrect page checksum at the start of an .ibd file (Needs Feedback)
  - MDEV-36182 revert incorrect 5.14 kernel warnings and correct liburing interface usage (Closed)
stephen.hames, the upstream and Debian version numbers map roughly like this:
Linux 6.1.119 – Debian version 6.1.0-28
Linux 6.1.123 – Debian version 6.1.0-29
Linux 6.1.124 – Debian version 6.1.0-30
Six different io_uring fixes were backported into the 6.1 branch between 6.1.119 and 6.1.123. One of those backports was buggy and is causing the issue here in MDEV-35886. marko thinks it's possible that one of the other fixes addressed MDEV-35334.
I would recommend trying the kernel packages I produced, which fix the buggy backport by adding the call to smp_mb(). Except you're on ARM, so those won't work. The best I can think of is to do on your platform what I did, following this guide:
https://www.dwarmstrong.org/kernel/
To your question about the timing of an updated kernel: the last several 6.1 releases from the kernel project were on Jan 2, Jan 9, Jan 17, Jan 19, and Jan 23. At that pace, it seems like it won't be long before 6.1.128 is released with this fix.
As for when Debian might release a package containing that new kernel, it's harder to say. They don't release one for every version. But since there's a critical bug open for this issue, they might jump on it quickly.
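To check whether a host is running the kernel the reporter upgraded to, the version mapping quoted in this comment can be turned into a small lookup. This is only a sketch under assumptions: the table contains just the three pairs listed above, the release string is assumed to begin with the Debian ABI name (for example `6.1.0-29-cloud-arm64`), and `upstream_for_release` and `matches_reported_kernel` are hypothetical names. It does not determine which kernels actually carry the buggy backport; it only flags the 6.1.123 build named in the report.

```python
import platform
import re
from typing import Optional

# Mapping taken from the comment above: Debian ABI name -> upstream version.
ABI_TO_UPSTREAM = {
    "6.1.0-28": "6.1.119",
    "6.1.0-29": "6.1.123",
    "6.1.0-30": "6.1.124",
}

# The upstream version the reporter's systems were upgraded to.
REPORTED_UPSTREAM = "6.1.123"


def upstream_for_release(release: str) -> Optional[str]:
    """Map a release string such as '6.1.0-29-cloud-arm64' to an upstream version.

    Returns None if the string does not start with a known ABI name.
    """
    m = re.match(r"^(\d+\.\d+\.\d+-\d+)", release)
    if m is None:
        return None
    return ABI_TO_UPSTREAM.get(m.group(1))


def matches_reported_kernel(release: str) -> bool:
    """True if the release maps to the upstream version named in the report."""
    return upstream_for_release(release) == REPORTED_UPSTREAM


# platform.release() returns the same string as `uname -r`.
current = platform.release()
print(current, "->", upstream_for_release(current), matches_reported_kernel(current))
```

A host that prints `None` for the upstream version is simply outside the three-entry table, which says nothing about whether it is affected.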