[MDEV-27196]  guest-agent fs-freeze not working on VM with Debian 11 + MariaDB 10.6 and 10.7 Created: 2021-12-08  Updated: 2022-11-30

Status: Open
Project: MariaDB Server
Component/s: Packaging, Server
Affects Version/s: 10.6.5, 10.7.1
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Paweł Kośka Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: None
Environment:

Virtual Machine (Debian 11.1 x86_64) on Proxmox VE 7.1



 Description   

There is a problematic interaction between MariaDB and qemu-ga on Debian 11 (in a virtual machine).

While MariaDB is running, a snapshot backup of the running virtual machine cannot be taken.
The host sends the "fs-freeze" command to qemu-ga, but the command does not work properly:
it hangs the entire VM.
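
For anyone trying to narrow this down, the freeze can be triggered by hand from the Proxmox host instead of through a full backup run. The following is a minimal sketch and not part of the original report. It assumes the `qm guest cmd` subcommands `fsfreeze-freeze`, `fsfreeze-status` and `fsfreeze-thaw` of Proxmox VE; the VM ID, the 60-second timeout and the `RUN` variable are placeholders chosen for illustration. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: one freeze/status/thaw cycle through the guest agent.
# RUN defaults to "echo", so the commands are only printed (dry run).
# Set RUN to an empty string to actually execute them on a Proxmox host.
RUN="${RUN-echo}"

freeze_cycle() {
    vmid="$1"   # hypothetical VM ID, e.g. 100
    # timeout(1) keeps the shell from blocking forever if the freeze hangs.
    $RUN timeout 60 qm guest cmd "$vmid" fsfreeze-freeze
    $RUN qm guest cmd "$vmid" fsfreeze-status
    # Thaw afterwards, otherwise the guest filesystems stay frozen.
    $RUN qm guest cmd "$vmid" fsfreeze-thaw
}
```

Calling `freeze_cycle 100` prints the three commands; with `RUN=` set to an empty string before the script is loaded, they are executed for real.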

Affected MariaDB versions: 10.6.5 and 10.7.1
Not affected: 10.5.13 (and 10.6.5 on CentOS 8 Stream)

My test procedure.



 Comments   
Comment by FingerlessGloves [ 2022-02-23 ]

I too have this issue with my install of MariaDB 10.7 on Debian 11.

I also have MariaDB 10.5 on Debian 11, which doesn't have this issue. But this 10.5 is from Debian's repository, not MariaDB's.

I've also reported the issue to QEMU, but the fix may be within MariaDB's area.
https://gitlab.com/qemu-project/qemu/-/issues/881

Comment by MBO [ 2022-11-16 ]

We have the same issue on one of our servers after upgrading to MariaDB 10.6. The OS is CentOS 7 on Proxmox.

Comment by Faustin Lammler [ 2022-11-23 ]

Hi!
I don't have a Proxmox environment to test with for now, so I tried to reproduce this on KVM/QEMU and was not able to.

Here is the test:

Some information about the test system (Debian 11):

❯ uname -r
6.0.0-0.deb11.2-amd64
❯ virsh --version
8.0.0
❯ qemu-system-x86_64 --version
QEMU emulator version 5.2.0 (Debian 1:5.2+dfsg-11+deb11u2)
Copyright (c) 2003-2020 Fabrice Bellard and the QEMU Project developers
❯ dpkg -l | grep libvirt-daemon | awk '{print $1" "$2" "$3}'
ii libvirt-daemon 8.0.0-1~bpo11+1
ii libvirt-daemon-config-network 8.0.0-1~bpo11+1
ii libvirt-daemon-config-nwfilter 8.0.0-1~bpo11+1
ii libvirt-daemon-driver-lxc 8.0.0-1~bpo11+1
ii libvirt-daemon-driver-qemu 8.0.0-1~bpo11+1
ii libvirt-daemon-driver-vbox 8.0.0-1~bpo11+1
ii libvirt-daemon-driver-xen 8.0.0-1~bpo11+1
ii libvirt-daemon-system 8.0.0-1~bpo11+1
ii libvirt-daemon-system-systemd 8.0.0-1~bpo11+1

Comment by FingerlessGloves [ 2022-11-23 ]

@Faustin
Your QEMU version is older than the one I saw the issue with: at the time of my testing I was using 6.1.1, while you're using 5.2.0 with a newer 6.x kernel. It could be that the issue doesn't happen on older QEMU versions, or the newer kernel could make the difference.

Proxmox ships some newer packages than its Debian base.

@MBO
Can you give your kernel and QEMU versions from when you had the issue? I assume you're on kernel 5.15 on PVE now; I had the issue on 5.13 when I reported it.

Comment by MBO [ 2022-11-23 ]

@FingerlessGloves

The VM with the issue was running CentOS 7 with kernel 3.10.0-1160.80.1.el7.x86_64.
The Proxmox host was running kernel 5.15.39-3 with Proxmox version 7.2-7 and QEMU 6.2.0.

Hope this information is useful.

Comment by Daniel Black [ 2022-11-23 ]

A notable difference between 10.5 and 10.6 with regard to storage is that innodb_flush_method now defaults to O_DIRECT; it previously defaulted to fsync. If you test with the old default, the result might be useful for the QEMU folks.

Also, 10.6 added liburing as the implementation behind innodb_use_native_aio=1 (where liburing is available as a distro package). I've yet to look at how CentOS 8 Stream packages theirs, but there might be a difference there.

Since tmpfs was mentioned a few times in the forum, I'll mention https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1020831, but that pretty much only concerns the Debian 5.10 kernel, so it won't be the entire story.
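
To try the old flush method and rule out the native AIO path in one go, a configuration drop-in along the following lines could be used. This is only a sketch under assumptions: the file name `zz-fsfreeze-test.cnf` and the helper `write_test_conf` are made up for illustration, `/etc/mysql/mariadb.conf.d` is assumed to be the drop-in directory on Debian, and `innodb_use_native_aio = 0` is one assumed way of taking liburing out of the picture.

```shell
#!/bin/sh
# Sketch: write a drop-in that restores the pre-10.6 flush method and
# disables native AIO. The file name is hypothetical.
write_test_conf() {
    dir="$1"   # target directory, e.g. /etc/mysql/mariadb.conf.d (assumed)
    mkdir -p "$dir" || return 1
    printf '%s\n' \
        '[mariadb]' \
        'innodb_flush_method = fsync' \
        'innodb_use_native_aio = 0' \
        > "$dir/zz-fsfreeze-test.cnf"
}
```

For example, `write_test_conf /etc/mysql/mariadb.conf.d` run as root, followed by a restart of the MariaDB service, would apply both settings; deleting the file and restarting reverts them.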

Comment by Linus [ 2022-11-30 ]

After setting innodb_flush_method=fsync and innodb_use_native_aio=0, the error still occurs.

(QEMU 7.0 with i440fx; tested on Debian 11, MariaDB 10.8)

Backup log:

INFO: starting new backup job: vzdump --mode snapshot
INFO: Starting Backup of VM XXXXXX (qemu)
INFO: Backup started at 2022-11-30 09:43:43
INFO: status = running
INFO: include disk 'scsi0' 'local-zfs:vm-XXXXXX-disk-0' 300G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: snapshots found (not included into backup)
INFO: creating Proxmox Backup Server archive 'vm/XXXXXX/2022-11-30T08:43:43Z'
INFO: issuing guest-agent 'fs-freeze' command
 
[TIMEOUT]

Generated at Thu Feb 08 09:51:04 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.