How to recover space from thin-provisioned backing storage (like qcow2) by using DISCARD/TRIM/UNMAP on Proxmox 5

I enjoy managing virtual machines, which usually run on Proxmox, my preferred virtualization solution. Some of them use qcow2 files (or thin-lvm) as backing storage,
which is - at least in the beginning - nice and space efficient.

Over time, files are created and deleted within the file system of the virtual machine. Updates are installed and data is moved around. As a consequence, the qcow2 files slowly expand and are no longer as compact as they were in the beginning.

Much more annoying: the vzdump backup needs more time and space, because previously deleted blocks are backed up, too.

In the past, to get around this issue, I simply created a big empty file within the virtual machine, deleted it, and then shrunk the qcow2 file on the host to punch out the detected holes. This process is described in the Proxmox Wiki.

However, this solution is sluggish, causes fairly heavy I/O, and is really hard to automate.

Fortunately there is a smarter method, called TRIM.

TRIM (called UNMAP in the SCSI command set) was introduced soon after the rise of SSDs: a command that allows the filesystem to notify the underlying block device of blocks that are no longer in use. Have a look at the great Wikipedia article on TRIM if you want to know more.

TRIM also brings many benefits when used with thin-provisioned block devices.
If the filesystem can notify a thin LVM volume or a sparse disk image (like qcow2) of blocks that are no longer used, the blocks can be released back to the pool of available space.

In short, with TRIM you are able to recover space from your guest disk images.

That sounds great, right?


Using TRIM with Proxmox

To use TRIM in a virtual machine together with Proxmox, the following conditions must be met:

  • A thin-provisioned backing storage (qcow2, thin-lvm, zfs, ...)
  • Virtio-SCSI disks with the discard option enabled

To achieve this goal with Proxmox, you simply have to change the virtual machine settings:

  • Hardware: --> Hard Disk --> Bus/Device type: "SCSI"; also tick the "Discard" checkbox
  • Options: --> SCSI Controller Type --> "VirtIO SCSI"

Have a look at the pictures in the great Proxmox wiki, if you need more help setting up the virtual machine configuration. If you prefer to modify settings via command line, here is an example of the relevant config options:

[root@pve ~]# grep scsi /etc/pve/qemu-server/100.conf
bootdisk: scsi0
scsi0: local:100/vm-100-disk-1.qcow2,discard=on,size=10G
scsihw: virtio-scsi-pci
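
Instead of editing the config file directly, the same settings can also be applied with the qm tool (a sketch, assuming VM ID 100 and the disk specification shown above):

[root@pve ~]# qm set 100 --scsihw virtio-scsi-pci
[root@pve ~]# qm set 100 --scsi0 local:100/vm-100-disk-1.qcow2,discard=on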


Notice: make sure not to use plain device entries (like /dev/sda1 or /dev/vda1) in /etc/fstab inside the virtual machine. If the virtual machine already exists, changing hard disk settings in Proxmox may renumber or rename the hard disks inside your virtual machine.


So simply switch the device entries in /etc/fstab to "LABEL" or "UUID" entries, in order to avoid boot problems later on.
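
The matching UUIDs can be looked up with blkid (a sketch; the device names are examples, the UUIDs correspond to the fstab below):

root@guest:~# blkid /dev/sda1 /dev/sda5
/dev/sda1: UUID="8bc11e91-d416-4ae1-b95b-a4ab88d3ea5c" TYPE="ext4"
/dev/sda5: UUID="1a30b565-89dc-4ee9-83f6-b3732ea38875" TYPE="swap"

The resulting fstab then looks like this: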

root@guest:~# egrep -v '^$|^#' /etc/fstab
UUID=8bc11e91-d416-4ae1-b95b-a4ab88d3ea5c / ext4 errors=remount-ro 0 1
UUID=1a30b565-89dc-4ee9-83f6-b3732ea38875 none swap sw 0 0

Another example, using LVM, which uses "/dev/mapper" entries:

root@guest:~# egrep -v '^$|^#' /etc/fstab
/dev/mapper/guest--vg-root / ext4 errors=remount-ro 0 1
UUID=999000c4-7efb-488f-a4aa-f7ad325e2451 /boot ext2 defaults 0 2
/dev/mapper/guest--vg-swap_1 none swap sw 0 0


Conditions inside the guest operating system

On the guest side, you need a reasonably recent operating system.

I did my first tests using "Univention UCS 4.1", which is based on Debian 7. Some tests failed: I was able to recover space when plain partitioning was used, but not with LVM, which I normally prefer over plain partitioning. Later I repeated my tests with plain Debian 7.0, which is too old, because virtio-scsi was not in wheezy (see bug #686636). My final tests with a freshly installed Debian 7.11, and also with UCS 4.1 with the latest patches installed, were successful.

I repeated the LVM tests with "Univention UCS 4.2", which is already based on Debian 8. As it turned out, TRIM together with LVM works very well if UCS 4.2 is used.
Fine!

Finally, I tested encryption, raid and LVM with a pre-release of Debian 9 "Stretch", which will become stable this month. Every single test worked just perfectly with Debian 9!


Testing TRIM

The example below was tested four times with Debian 9 as the guest operating system, to cover different hard disk installation scenarios such as software RAID, LVM, and encryption. All tests worked fine with TRIM, but read on for some specific notes, especially if you plan to use encryption.


My tests covered the following installation scenarios:

  • Guided - use entire disk
  • Guided - use entire disk and set up LVM
  • Guided - use entire disk and set up encrypted LVM
  • Manual --> configure software RAID

After starting your virtual machine, you should be able to check if TRIM support is enabled:

root@guest:~# lsblk -o MOUNTPOINT,DISC-MAX,FSTYPE

If you see 0B under DISC-MAX, then something did not work:

MOUNTPOINT DISC-MAX FSTYPE
/ 0B ext4

If you see an actual size, then TRIM is supported:

MOUNTPOINT DISC-MAX FSTYPE
/ 1G ext4


Another way to check whether TRIM operations are supported in the guest is simply to look at the /sys/block/sd*/queue/discard_* files:

root@guest:~# cat /sys/block/sda/queue/discard_*
4096
1073741824
1073741824
0
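
The four values are, in alphabetical order, discard_granularity, discard_max_bytes, discard_max_hw_bytes and discard_zeroes_data; as long as discard_max_bytes is non-zero, the device accepts discard requests. With grep -H the file names are printed alongside the values:

root@guest:~# grep -H . /sys/block/sda/queue/discard_*
/sys/block/sda/queue/discard_granularity:4096
/sys/block/sda/queue/discard_max_bytes:1073741824
/sys/block/sda/queue/discard_max_hw_bytes:1073741824
/sys/block/sda/queue/discard_zeroes_data:0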


Testing the whole stack

I strongly recommend that you repeat the following simple test on your own machines, to make sure TRIM works as expected in your setup:

  • First check the qcow2 disk image size, and then create or copy a big file into the guest:
[root@host ~]# du -hs vm-105-disk-1.qcow2
1.4G vm-105-disk-1.qcow2
[root@host ~]# scp big_file.iso root@guest:

root@guest:~# du -hs big_file.iso
1,1G big_file.iso
  • Afterwards, check whether the sparse disk image has grown on the host:
[root@host ~]# du -hs vm-105-disk-1.qcow2
2.3G vm-105-disk-1.qcow2
  • Within the guest, delete the file and run the fstrim command, to notify the block device that the blocks of that file (and of any other file that has been deleted) are no longer used by the file system:
root@guest:~# rm -f big_file.iso
root@guest:~# fstrim -v /
/: 28,6 GiB (30650617856 bytes) trimmed
  • On the host side, the qcow2 file should have shrunk:
[root@host ~]# du -hs vm-105-disk-1.qcow2
1.1G vm-105-disk-1.qcow2

Hooray, TRIM works fine!
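
If you want a second opinion from the host side, qemu-img also reports the allocated size of the image (a sketch; the output is shortened and the sizes are examples):

[root@host ~]# qemu-img info vm-105-disk-1.qcow2
image: vm-105-disk-1.qcow2
file format: qcow2
virtual size: 32G (34359738368 bytes)
disk size: 1.1G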


Some installation-specific notes

Disk encryption

The discard option allows discard requests to be passed through the encrypted block device, which has some security implications. If you are planning to use disk encryption together with TRIM support, you MUST add the "discard" option to the file /etc/crypttab:

root@guest:~# cat /etc/crypttab
sda5_crypt UUID=2efed39d-8367-4de1-8878-a03bdcbc3905 none luks
root@guest:~# vim.tiny /etc/crypttab
root@guest:~# cat /etc/crypttab
sda5_crypt UUID=2efed39d-8367-4de1-8878-a03bdcbc3905 none luks,discard
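
Instead of editing the file by hand, a sed one-liner does the same job (a sketch; it assumes the line ends in "luks" exactly as shown above, so check the result afterwards):

root@guest:~# sed -i 's/luks$/luks,discard/' /etc/crypttab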

LVM

If you are planning to use LVM together with TRIM support, you SHOULD change the issue_discards parameter in the file /etc/lvm/lvm.conf:

root@guest:~# grep 'issue_discards =' /etc/lvm/lvm.conf
issue_discards = 0

root@guest:~# vim.tiny /etc/lvm/lvm.conf

root@guest:~# grep 'issue_discards =' /etc/lvm/lvm.conf
issue_discards = 1
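
Again, this can be scripted with sed (a sketch; it assumes the setting is present and uncommented, as in the grep output above):

root@guest:~# sed -i 's/issue_discards = 0/issue_discards = 1/' /etc/lvm/lvm.conf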

MDRAID

Recent kernels have native support for fstrim on mdraid devices (at least tested with raid1 and raid10).
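
You can verify that an md device passes discards down by checking its queue limits, just like for plain disks (assuming md0; the value shown is an example, zero would mean no discard support):

root@guest:~# cat /sys/block/md0/queue/discard_max_bytes
1073741824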

INITRD

On most setups, you will have to rebuild your initramfs with "update-initramfs -u" (Debian and derivatives) or "dracut -f" (Red Hat and derivatives).
Reboot the machine after changing the LVM or dm-crypt configuration and rebuilding your initramfs.

root@guest:~# update-initramfs -u
update-initramfs: Generating /boot/initrd.img-4.9.0-3-amd64

root@guest:~# reboot
Connection to guest.local closed by remote host.

NFS

Please note that the TRIM option does not work on an NFS-backed virtual disk, because NFS cannot pass the discard requests down to the filesystem of the NFS server.


Guest Settings

You need to adjust some settings inside your virtual machine so that the TRIM command will be used automatically in the future. There are two options: the first is to adapt the mount options, the second is to run TRIM manually on a schedule. Both possibilities are described below. I personally prefer the manual variant, simply using a cronjob.

Adapting mount options

Using the discard option for a mount in /etc/fstab enables continuous TRIM in device operations:

/dev/mapper/guest--vg-root / ext4 errors=remount-ro,discard 0 1
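
The new option can be activated without a reboot and verified against the mounted filesystem (a sketch; the /proc/mounts line is an example for the LVM setup above):

root@guest:~# mount -o remount /
root@guest:~# grep ' / ' /proc/mounts
/dev/mapper/guest--vg-root / ext4 rw,relatime,errors=remount-ro,discard,data=ordered 0 0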

Note that on the ext4 filesystem, the discard flag can also be set as a default mount option using tune2fs:

root@guest:~# tune2fs -o discard /dev/sdXY
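
Whether the flag is set can then be checked with tune2fs as well (sdXY is a placeholder, as above; the user_xattr/acl defaults are examples):

root@guest:~# tune2fs -l /dev/sdXY | grep 'Default mount options'
Default mount options:    user_xattr acl discard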

Use fstrim periodically

I prefer a small shell script, that runs fstrim as a cronjob.

root@guest:~# cat /usr/local/sbin/fstrim.sh
#!/bin/sh
#
# To find which filesystems support trim, we check that DISC-MAX (discard max bytes)
# is greater than zero. See the discard_max_bytes documentation at
# https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt
#
for fs in $(lsblk -o MOUNTPOINT,DISC-MAX,FSTYPE | grep -E '^/.* [1-9]+.* ' | awk '{print $1}'); do
        fstrim "$fs"
done
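
To run the script weekly, make it executable and call it from a cron.d entry (a sketch):

root@guest:~# chmod +x /usr/local/sbin/fstrim.sh
root@guest:~# cat /etc/cron.d/fstrim
@weekly root /usr/local/sbin/fstrim.sh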

If you like shell one-liners, the whole job can also be squeezed into a single cron entry instead:

root@guest:~# cat /etc/cron.d/fstrim
@weekly root for i in $(lsblk -o MOUNTPOINT,DISC-MAX|awk '/^\/.* [1-9]+.*/{print$1}');do /sbin/fstrim "$i";done

Note that the util-linux package provides fstrim.service and fstrim.timer systemd unit files. The service executes fstrim on all mounted filesystems on devices that support the discard operation.

root@guest:~# systemctl enable fstrim.timer
root@guest:~# systemctl start fstrim.timer
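
Whether the timer is scheduled can be checked with list-timers (output shortened; the dates are examples):

root@guest:~# systemctl list-timers fstrim.timer
NEXT                         LEFT   LAST PASSED UNIT         ACTIVATES
Mon 2017-06-12 00:00:00 CEST 4 days n/a  n/a    fstrim.timer fstrim.service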


Complex partitioning scenarios

I successfully tested TRIM in a fairly complex installation scenario: installing Debian 9 onto LVM, which sits on an encrypted block device, which in turn sits on software RAID.

Used partition scenario at a glance

sda1+sdb1 -> md0 (raid) -> /boot (ext2)
sda2+sdb2 -> md1 (raid) -> /opt (ext4)
sda3+sdb3 -> md2 (raid) -> md2_crypt (dmcrypt) -> guest-vg (LVM) -> / (ext4)

Used partition scenario in detail

root@guest:~# lsblk -i -o NAME,MOUNTPOINT,DISC-MAX,FSTYPE,UUID
NAME                   MOUNTPOINT DISC-MAX FSTYPE            UUID
sda                                     1G
|-sda1                                  1G linux_raid_member e802b6e6-2065-9ae7-13df-2d80cfc49583
| `-md0                /boot            1G ext2              a75c6a6d-0a7e-4121-8284-d54ee8791d7c
|-sda2                                  1G linux_raid_member 7475850d-0005-c98a-50ed-9d962d71f796
| `-md1                /opt             1G ext4              62b36509-c9bc-4a1b-9261-ddf3348876cb
`-sda3                                  1G linux_raid_member 714b7e28-d743-01fb-d13e-1a06e8fa17c6
  `-md2                                 1G crypto_LUKS       63159682-5951-498b-894f-ed3748a7a710
    `-md2_crypt                         1G LVM2_member       SjKT7u-Ked6-pTa3-unaj-7H8J-PHca-EFIVFY
      `-guest--vg-root /                1G ext4              1d9e2614-7538-45c9-a5a4-35c2ecc626ae
sdb                                     1G
|-sdb1                                  1G linux_raid_member e802b6e6-2065-9ae7-13df-2d80cfc49583
| `-md0                /boot            1G ext2              a75c6a6d-0a7e-4121-8284-d54ee8791d7c
|-sdb2                                  1G linux_raid_member 7475850d-0005-c98a-50ed-9d962d71f796
| `-md1                /opt             1G ext4              62b36509-c9bc-4a1b-9261-ddf3348876cb
`-sdb3                                  1G linux_raid_member 714b7e28-d743-01fb-d13e-1a06e8fa17c6
  `-md2                                 1G crypto_LUKS       63159682-5951-498b-894f-ed3748a7a710
    `-md2_crypt                         1G LVM2_member       SjKT7u-Ked6-pTa3-unaj-7H8J-PHca-EFIVFY
      `-guest--vg-root /                1G ext4              1d9e2614-7538-45c9-a5a4-35c2ecc626ae
sr0                                     0B iso9660           2017-06-05-05-57-12-00

root@guest:~# cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sdb1[1] sda1[0]
      975296 blocks super 1.2 [2/2] [UU]

md2 : active raid1 sda3[0] sdb3[1]
      30607360 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sda2[0] sdb2[1]
      1951744 blocks super 1.2 [2/2] [UU]

unused devices: <none>

root@guest:~# cat /etc/crypttab
md2_crypt UUID=63159682-5951-498b-894f-ed3748a7a710 none luks,discard

root@guest:~# egrep -v '^#|^$' /etc/fstab
/dev/mapper/guest--vg-root /               ext4    errors=remount-ro 0       1
UUID=a75c6a6d-0a7e-4121-8284-d54ee8791d7c /boot           ext2    defaults        0       2
UUID=62b36509-c9bc-4a1b-9261-ddf3348876cb /opt            ext4    defaults        0       2
/dev/sr0        /media/cdrom0   udf,iso9660 user,noauto     0       0

After adapting /etc/lvm/lvm.conf and /etc/crypttab (as shown in the easy scenarios above), TRIM worked just fine in this complex installation scenario:

[root@host ~]# du -hs vm-105*
2.8G vm-105-disk-1.qcow2
2.9G vm-105-disk-2.qcow2

root@guest:~# fstrim -v /
/: 27.8 GiB (29854625792 bytes) trimmed

[root@host ~]# du -hs vm-105*
1.6G vm-105-disk-1.qcow2
1.6G vm-105-disk-2.qcow2

Copyright and feedback to this article

This article may be freely used, copied or quoted on other websites, as long as a link is set to this page and this note is not removed. For publication in print media my consent is necessary before printing. Do not hesitate to send questions, impressions, comments, ideas, and constructive criticism concerning these topics.
Author: Lutz Willek <>

Date of this article: 06/2017
Last update of this article: 07.11.2018 20:56
Systems used for this test: Proxmox 5 (beta), Debian 7.0, Debian 7.11, Debian 9 (release candidate), Univention UCS 4.1, Univention UCS 4.1 with latest patches applied, UCS 4.2