Debian, QEMU, libvirt, qcow2 and fstrim

After some discussion with colleagues on how to best approach fstrim for qcow2 on libvirt in Debian 10, I sat down one Sunday afternoon to research and apply fstrim to my libvirt VMs.

My hypervisors and VMs are mostly running vanilla Debian stable, which is why this post is not necessarily applicable to other distributions – but perhaps somewhat helpful nonetheless.

Directive

The goal was to have my libvirt VMs (around two dozen across two hypervisors) automatically discard unused space from their underlying qcow2 image files. Apart from saving space, I was hoping to shave some time off my online backup mechanism, which can take up to four hours for seven VMs on spinning disks. The two main approaches – as far as I can see – are either to add the discard option to a VM's fstab, or to use the fstrim timer provided by Debian. Some more explanation here. I'll be using a custom cronjob to invoke the fstrim command manually every few days, more on that later.
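
For reference, the two standard approaches inside a guest look roughly like this (a sketch – the fstab line is a placeholder, adjust UUID and filesystem to your setup):

# Option 1: continuous discard via the mount option in /etc/fstab
# UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /  ext4  errors=remount-ro,discard  0  1

# Option 2: periodic trim via the timer shipped with util-linux
systemctl enable --now fstrim.timer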

State of things

All of my VMs' root file systems are hosted inside qcow2 images, which I find to be more flexible than using LVM volumes. Some of these VMs have extra data partitions (e.g. blockchain data, apt-mirrors) which don't need backups and are therefore arranged as LVM volume groups. That's why I'll only be looking at setting up fstrim for root partitions (but extending its functionality across all partitions is trivial). Debian 10 ships with QEMU 3.1. Additionally, there's one Windows 10 VM.
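
(The trivial extension is a single flag – fstrim can walk all mounted filesystems that support discard:)

fstrim --all --verbose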

Research

There’s a really helpful post regarding fstrim and KVM by Chris Irwin, covering (what I’m guessing is) a non-Debian hypervisor. I recommend reading it, but here’s a summary:

  • starting with QEMU 4.0, virtio supports the discard option natively
  • no need to add an additional virtio-scsi controller anymore
  • the VM machine type has to be pc-q35-4.0 or newer

Executing kvm -machine help on my hypervisor shows support only up to pc-q35-3.1, which is expected with QEMU 3.1:

root@atlas:~# kvm -machine help
Supported machines are:
pc                   Standard PC (i440FX + PIIX, 1996) (alias of pc-i440fx-3.1)
pc-i440fx-3.1        Standard PC (i440FX + PIIX, 1996) (default)
pc-i440fx-3.0        Standard PC (i440FX + PIIX, 1996)
pc-i440fx-2.9        Standard PC (i440FX + PIIX, 1996)
[...]
q35                  Standard PC (Q35 + ICH9, 2009) (alias of pc-q35-3.1)
pc-q35-3.1           Standard PC (Q35 + ICH9, 2009)
pc-q35-3.0           Standard PC (Q35 + ICH9, 2009)
pc-q35-2.9           Standard PC (Q35 + ICH9, 2009)
[...]

Setup part 1

Luckily, Debian offers QEMU 5.0 through buster-backports as of this writing (November 2020). After manually upgrading the respective packages, I’m now able to use pc-q35-5.0.

Note: at this point I recommend shutting down all VMs on the hypervisor that’s being worked on.
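
In case buster-backports isn’t configured yet, adding it is a one-liner (a sketch, using the default Debian mirror):

echo "deb http://deb.debian.org/debian buster-backports main" > /etc/apt/sources.list.d/buster-backports.list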

apt update; apt install qemu qemu-block-extra qemu-system-common qemu-system-data qemu-system-gui qemu-system-x86 -t buster-backports
root@atlas:~# kvm -machine help
Supported machines are:
[...]
pc                   Standard PC (i440FX + PIIX, 1996) (alias of pc-i440fx-5.0)
pc-i440fx-5.0        Standard PC (i440FX + PIIX, 1996) (default)
pc-i440fx-4.2        Standard PC (i440FX + PIIX, 1996)
pc-i440fx-4.1        Standard PC (i440FX + PIIX, 1996)
[...]
q35                  Standard PC (Q35 + ICH9, 2009) (alias of pc-q35-5.0)
pc-q35-5.0           Standard PC (Q35 + ICH9, 2009)
pc-q35-4.2           Standard PC (Q35 + ICH9, 2009)
pc-q35-4.1           Standard PC (Q35 + ICH9, 2009)
[...]

Note: depending on your setup and configuration management, it might be advisable to set up some kind of apt pinning for the qemu packages from buster-backports, so as not to miss any updates.
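
A minimal sketch of such a pin, placed in /etc/apt/preferences.d/ (the package glob is an assumption – adjust it to what you actually installed):

# /etc/apt/preferences.d/qemu-buster-backports
Package: qemu*
Pin: release a=buster-backports
Pin-Priority: 500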

Now, using virt-manager and enabling its XML editing setting, several things need to be taken care of:

  • for machine type, I’m using q35, which libvirt automatically extends to pc-q35-5.0
<type arch="x86_64" machine="pc-q35-5.0">hvm</type>
  • the discard option needs to be added to the qcow2 driver
<driver name="qemu" type="qcow2" discard="unmap"/>

By the way, WordPress won’t let me add angle brackets without selecting the ugly default code type, otherwise it thinks it’s HTML code. Can’t really be bothered to explore it further, but makes me think about converting my entire page to static content some time…

Little detour

At this point I had to apply a few more changes to VMs that were apparently created a while ago, with machine types like pc-i440fx-2.8. In order to apply q35 to their configs, libvirt wanted me to change the PCI controller type from pci-root to pcie-root.

<controller type="pci" index="0" model="pcie-root"/>

After booting any of these VMs, their network did not come up again. With pcie-root in place, Debian cheerfully renamed the network interfaces according to the new PCIe bus on the virtualized systems, breaking their network settings. The name always changed from the original ens0 to enp0s3. Predictable interface names, anyone?

It was quickly rectified by logging into each machine’s local root console through virt-manager’s VNC connection and editing the network config.
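
On a Debian guest using ifupdown, the fix boils down to replacing the old interface name in /etc/network/interfaces – a sketch with placeholder addressing:

# /etc/network/interfaces – ens0 became enp0s3 after the switch to pcie-root
allow-hotplug enp0s3
iface enp0s3 inet static
    address 192.0.2.10/24
    gateway 192.0.2.1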

Note: the Windows 10 VM was one of the Old Ones, but bravely handled the PCIe bus change by informing me that it’s now connected to “Network 2”. Whatever that means.

Setup part 2

With my VMs up and running again, fstrim should now be available:

root@proxy:~# fstrim -v /
/: 56.6 GiB (60766765056 bytes) trimmed

Success!
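
For a quick sanity check inside a guest, lsblk can show whether the virtual disk advertises discard support at all – non-zero DISC-GRAN and DISC-MAX values are what you want to see:

lsblk --discard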

As mentioned earlier, I’ve opted for my own custom cronjob, with a small puppet module wrapped around it.

class js::module::fstrim_kvm {
 
  package { 'virt-what': ensure => installed }
 
  if $facts['virtual'] == 'kvm' {
 
    cron { 'fstrim-root':
      ensure  => present,
      command => '/sbin/fstrim -v / >> /var/log/fstrim.log',
      user    => 'root',
      minute  => [fqdn_rand(30)],
      hour    => '23',
      weekday => [3,7],
      require => Package['virt-what'],
    }
  }
}

The cronjob requires the package virt-what, which Puppet uses via its built-in virtual fact to determine whether the host is a KVM (QEMU) VM. The cronjob executes at a random minute (so they don’t all run at the same time) during the 23rd hour twice a week, shortly before my VM backups run. Also, if there’s a log server to pick up log data, having fstrim stats might be (mildly) interesting.
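
On a managed VM, the resulting root crontab entry ends up looking something like this (the minute below is just one possible fqdn_rand outcome):

# Puppet Name: fstrim-root
17 23 * * 3,7 /sbin/fstrim -v / >> /var/log/fstrim.log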

Results

Comparing qcow2 images on the hypervisor before and after fstrim, the images are now taking up almost 70% less space. Very nice.
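
An allocation-aware listing like the ones below can be produced with ls – the first size column is the space actually allocated on disk, while the size in the -l output is the apparent file size (the image path is an assumption):

ls -lsh /var/lib/libvirt/images/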

total 148G
 28G -rw-r--r--  1 libvirt-qemu libvirt-qemu  28G Nov 21 16:49 vm01.qcow2
 21G -rw-r--r--  1 libvirt-qemu libvirt-qemu 101G Nov 21 16:49 vm02.qcow2
 14G -rw-r--r--  1 libvirt-qemu libvirt-qemu  14G Nov 21 16:49 vm03.qcow2
 53G -rw-r--r--  1 libvirt-qemu libvirt-qemu  53G Nov 21 16:49 vm04.qcow2
 11G -rw-r--r--  1 libvirt-qemu libvirt-qemu  11G Nov 21 16:49 vm05.qcow2
6,6G -rw-r--r--  1 libvirt-qemu libvirt-qemu 6,7G Nov 21 16:49 vm06.qcow2
 17G -rw-r--r--  1 libvirt-qemu libvirt-qemu  17G Nov 21 16:49 vm07.qcow2
total 43G
9,4G -rw-r--r--  1 libvirt-qemu libvirt-qemu  28G Nov 22 13:10 vm01.qcow2
7,1G -rw-r--r--  1 libvirt-qemu libvirt-qemu 101G Nov 22 13:10 vm02.qcow2
5,2G -rw-r--r--  1 libvirt-qemu libvirt-qemu  14G Nov 22 13:10 vm03.qcow2
5,0G -rw-r--r--  1 libvirt-qemu libvirt-qemu  53G Nov 22 13:10 vm04.qcow2
6,0G -rw-r--r--  1 libvirt-qemu libvirt-qemu  11G Nov 22 13:10 vm05.qcow2
2,8G -rw-r--r--  1 libvirt-qemu libvirt-qemu 6,8G Nov 22 13:10 vm06.qcow2
6,5G -rw-r--r--  1 libvirt-qemu libvirt-qemu  17G Nov 22 13:10 vm07.qcow2

To do

I have yet to implement fstrim on my Windows VM (if that’s possible), mostly because it’s only one VM with maybe a couple of gigabytes to reclaim. Also, I’m too lazy to look into it. If you have a working solution, please drop a comment.

Virtual machine backup without downtime (update)

QCOW2 file based VMs on Linux come with lots of neat features. One of my favourites is the virsh blockcopy operation (assuming you are using libvirt, or are familiar with it – please read the blockcopy section in virsh’s man page before proceeding).

With libvirt, it’s possible to make use of a powerful snapshot toolkit. For now, I only want to copy an image for backup purposes without having to shut the virtualized guest down. This is where the blockcopy command comes into play. It’s simple enough; the only requirement is to temporarily undefine the guest during the blockcopy operation.

You can test it with a few commands – but be careful. Clone your VM and its configuration manually (shutdown & cp to somewhere else) beforehand, as files are easily overwritten by accident. The name of the guest and of the image are identical in my example (guest123). The target device is sda – yours might be vda or hda, take a look at the guest configuration, namely the disk section. Depending on your hardware and the size of the guest, the blockcopy process might take some time. I’m using htop / iotop to monitor activity during the operation.

virsh dumpxml --security-info guest123 > guest123.xml
virsh undefine guest123
virsh -q blockcopy guest123 sda guest123-backup.qcow2 --wait --finish
virsh define guest123.xml

That’s it. You now have a backup image of your running guest, without any downtime. Libvirt does not sparse the copied image, meaning it’s as large as the original image at the moment the operation finishes.
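
If the full-size copy bothers you, it can be re-sparsed after the fact, for example by rewriting it with qemu-img (the sparse file name here is made up) – this is also what the updated script at the end of this post does:

qemu-img convert -O qcow2 guest123-backup.qcow2 guest123-backup-sparse.qcow2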

I’m using cron & a simple script to periodically pull backups of my VMs. It assumes the name of the guest and the image file are the same, as in the example above. It can be used as follows:

$ ./libvirt-backup.sh guest123

With several VMs, each one gets its own cronjob. For the moment, my crontab looks similar to this:

# m h  dom mon dow   command
 05 00 * * 1 /vm/backup/libvirt-backup.sh guest1
 05 01 * * 1 /vm/backup/libvirt-backup.sh guest2
 05 02 * * 1 /vm/backup/libvirt-backup.sh guest3

I keep libvirt-backup.sh in the same directory as the images (/vm/backup). It works for now, but I might change that setup in the future. Don’t forget to set the executable flag.

$ chmod +x libvirt-backup.sh

And the script itself. It checks that the source image really is a qcow2 file and that the guest is running, logs the time & size of the VM, dumps the XML, undefines the VM, blockcopies the guest to /vm/backup with the current date added to the file name, defines the VM again, transfers the copy to a “target-host” using rsync and deletes the local copy. Important note – the -S flag tells rsync to handle sparse files efficiently, saving space & bandwidth.

#!/bin/bash

GUEST=$1

BACKUP_LOCATION=/vm/backup
XML_DUMP="${BACKUP_LOCATION}/xml"
GUEST_LOCATION=`virsh domstats $GUEST | grep block.0.path | cut -d = -f2-`
BLOCKDEVICE=`virsh domstats $GUEST | grep block.0.name | cut -d = -f2-`
DATE=`date +%F_%H-%M`
GUEST_SIZE=`du -sh $GUEST_LOCATION | awk '{ print $1 }'`

# only back up qcow2-backed guests
if [ `qemu-img info $GUEST_LOCATION | grep --count "file format: qcow2"` -eq 0 ]; then
        echo "Image file for $GUEST not in qcow2 format."
        exit 0;
fi

# only back up running guests
if [ `virsh list | grep running | awk '{print $2}' | grep --count $GUEST` -eq 0 ]; then
        echo "$GUEST not active, skipping.."
        exit 0;
fi

logger "Guest backup for $GUEST starting - current image size at $GUEST_SIZE"

# save the guest definition, then make the guest transient for blockcopy
virsh dumpxml --security-info $GUEST > $XML_DUMP/$GUEST-$DATE.xml

virsh undefine $GUEST > /dev/null 2>&1

virsh -q blockcopy $GUEST $BLOCKDEVICE $BACKUP_LOCATION/$GUEST-$DATE.qcow2 --wait --finish

# make the guest persistent again
virsh define $XML_DUMP/$GUEST-$DATE.xml > /dev/null 2>&1

# ship image and XML to the backup host, then remove the local image copy
rsync -S $BACKUP_LOCATION/$GUEST-$DATE.qcow2 target-host:/libvirt_daily_backups/$GUEST/

rsync -S $XML_DUMP/$GUEST-$DATE.xml target-host:/libvirt_daily_backups/$GUEST/

rm $BACKUP_LOCATION/$GUEST-$DATE.qcow2

logger "Guest backup for $GUEST done"

exit 0;

The blockcopy and rsync operations are rather I/O heavy. If you are scheduling VM backups, it’s always good practice to leave enough time between cronjobs and to avoid other processes on your system that might be triggered at similar times, such as smartd scans.
Also, as mentioned before – development on KVM/QEMU and libvirt is ongoing & very active. For Debian based systems, it might be worth considering upgrading to unstable APT sources for at least these packages.

Update November 2020

I’ve since revisited the backup script a few times, the biggest change being that it now automatically backs up all running VMs instead of taking guest names as arguments. There’s also an rsync subroutine to move the backups, including the XMLs, to a remote host.

#!/bin/bash

BACKUP_LOCATION=/virt/backup
XML_DUMP="${BACKUP_LOCATION}/xml"

if [ ! -d "$XML_DUMP" ]; then
        mkdir -p $XML_DUMP
fi

# collect all currently running guests
mapfile -t active_guests < <( virsh list | grep running | awk '{print $2}' )

for GUEST in "${active_guests[@]}"; do

        GUEST_LOCATION=`virsh domstats $GUEST | grep block.0.path | cut -d = -f2-`
        BLOCKDEVICE=`virsh domstats $GUEST | grep block.0.name | cut -d = -f2-`
        DATE=`date +%F_%H-%M`
        GUEST_SIZE=`du -sh $GUEST_LOCATION | awk '{ print $1 }'`

        # only back up qcow2-backed guests
        if [ `qemu-img info $GUEST_LOCATION --force-share | grep --count "file format: qcow2"` -eq 0 ]; then
                echo "Image file for $GUEST not in qcow2 format."
                continue
        fi

        # only back up running guests
        if [ `virsh list | grep running | awk '{print $2}' | grep --count $GUEST` -eq 0 ]; then
                echo "$GUEST not active, skipping.."
                continue
        fi

        logger "Guest backup for $GUEST starting - current image size at $GUEST_SIZE"

        # save the guest definition, then make the guest transient for blockcopy
        virsh dumpxml --security-info $GUEST > $XML_DUMP/$GUEST-$DATE.xml

        virsh undefine $GUEST

        virsh -q blockcopy $GUEST $BLOCKDEVICE $BACKUP_LOCATION/$GUEST-$DATE-temp.qcow2 --wait --finish

        # make the guest persistent again
        virsh define $XML_DUMP/$GUEST-$DATE.xml

        /usr/bin/logger "$GUEST defined, sparsing backup image .."

        # re-sparse the copy, then remove the unsparsed temporary image
        /usr/bin/qemu-img convert -O qcow2 $BACKUP_LOCATION/$GUEST-$DATE-temp.qcow2 $BACKUP_LOCATION/$GUEST-$DATE-sparsed.qcow2

        rm $BACKUP_LOCATION/$GUEST-$DATE-temp.qcow2

        # ship image and XML to the backup host, then remove the local copy
        rsync -avSe ssh $BACKUP_LOCATION/$GUEST-$DATE-sparsed.qcow2 backuphost:$GUEST-$DATE-sparsed.qcow2
        rsync -ave ssh $XML_DUMP/$GUEST-$DATE.xml backuphost:xml/$GUEST-$DATE.xml

        rm $BACKUP_LOCATION/$GUEST-$DATE-sparsed.qcow2

        logger "Guest backup for $GUEST done"
done
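
Since this version loops over all running guests by itself, a single cron entry is enough now – a sketch, assuming the script lives next to the backups as before:

# m h  dom mon dow   command
05 00 * * 1 /virt/backup/libvirt-backup.sh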