[MDBF-436] LibVirt VMs remain running on master shutdown Created: 2022-06-20  Updated: 2023-11-28  Resolved: 2023-11-28

Status: Closed
Project: MariaDB Foundation Development
Component/s: Buildbot
Affects Version/s: N/A
Fix Version/s: N/A

Type: Task Priority: Major
Reporter: Vlad Bogolin Assignee: Faustin Lammler
Resolution: Fixed Votes: 0
Labels: buildbot
Remaining Estimate: 0d
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

When the master-libvirt is shut down, the VM remains on. We need to identify this case and shut the VM down.

Potential solution 1:
Monitor the buildbot-worker process and shut down the machine if it is unable to connect to the master.

Potential solution 2:
Alert if the VM is running for more than X minutes
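
A minimal sketch of the check behind potential solution 2, assuming the start time of each VM is already known (how it is obtained is not covered here). The function name `overdue_vms` and the VM names in the usage example are hypothetical and only serve as illustration; this is not the check that was actually deployed.

```python
from datetime import datetime, timedelta, timezone


def overdue_vms(started_at, now, max_runtime):
    """Return the sorted names of VMs running longer than max_runtime.

    started_at  -- dict mapping VM name -> timezone-aware start datetime
    now         -- timezone-aware datetime used as the reference point
    max_runtime -- timedelta, the "X minutes" threshold from the ticket
    """
    return sorted(
        name for name, start in started_at.items()
        if now - start > max_runtime
    )


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Hypothetical VM names, for illustration only.
    vms = {
        "vm-a": now - timedelta(minutes=90),
        "vm-b": now - timedelta(minutes=10),
    }
    for name in overdue_vms(vms, now, timedelta(minutes=60)):
        print(f"ALERT: {name} has been running for more than 60 minutes")
```

Keeping the threshold comparison in a pure function makes it easy to test independently of libvirt and of whatever alerting system consumes the result.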



 Comments   
Comment by Faustin Lammler [ 2022-08-09 ]

We now have an alert if the VM remains running for more than 1 hour. We still need to investigate why this happens.

Comment by Faustin Lammler [ 2022-09-19 ]

A probably better, long-term solution would be to auto-shutdown VMs if they stay running for more than a certain amount of time, or if:

  • buildbot process not active;
  • connection to bb master is cut.
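
The three conditions above could be combined in a small watchdog running inside the VM. The following is only a sketch under stated assumptions: the helper names `master_reachable` and `should_shut_down` are hypothetical, a plain TCP connect is used as a stand-in for "connection to bb master is cut", and the caller is assumed to supply the VM uptime and the state of the buildbot process. The actual shutdown command is deliberately left out.

```python
import socket


def master_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to the master can be opened.

    This only checks that the port accepts connections; it does not
    verify that the buildbot protocol itself is healthy.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def should_shut_down(uptime_s, worker_active, master_ok, max_uptime_s):
    """Decide whether the VM should power itself off.

    Shut down if ANY of the conditions from the comment holds:
    the VM has been up too long, the buildbot process is not
    active, or the connection to the master is cut.
    """
    return uptime_s > max_uptime_s or not worker_active or not master_ok
```

A real watchdog would probably want to require several consecutive failed checks before shutting down, so that a brief network hiccup does not kill a running build.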

Comment by Faustin Lammler [ 2023-11-28 ]

Alerts are in place (and working) in case this happens. The problem comes from a flaky network between the BB master and the historical PPC libvirt worker (db-p9-bbw1). It has been much more stable since the libvirt workers are exclusively on Hetzner.

Generated at Thu Feb 08 03:37:52 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.