• Re: Why is my VM image so large?!

    From Michael Paoli@21:1/5 to celejar@gmail.com on Tue May 20 04:30:01 2025
    Not a Debian-specific question. You may want to ask on, or check the
    archives of, a relevant QEMU list or the like.

    Though qcow2 is quite flexible and can be quite efficient, depending
    on what data is written there, what snapshots are or may have been
    there, etc., it can also be anywhere from rather to quite inefficient,
    even less efficient than raw. E.g. if compression is used and the
    data can't be compressed, it will take at least slightly more space
    than the data itself. Likewise, if there are no unallocated blocks,
    not only are there no space savings, but there's the additional
    overhead of tracking where the blocks are, as they may be added in
    almost any order.

    So, let's see how grossly inefficient I can be, and if I can recover
    some of that.
    # qemu-img create -f qcow2 -o compression_type=zlib,preallocation=off,size=2G /var/local/vtest/2GiB.qcow2
    Formatting '/var/local/vtest/2GiB.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off preallocation=off compression_type=zlib size=2147483648 lazy_refcounts=off refcount_bits=16
    # stat -c '%s' /var/local/vtest/2GiB.qcow2
    196640
    #
    $ virsh attach-disk balug /var/local/vtest/2GiB.qcow2 vdc --live --subdriver qcow2
    Disk attached successfully

    $
    // from the guest VM:
    # bs=65536
    # (seek=0; while dd if=/dev/random of=/dev/vdc bs="$bs" count=1 seek="$seek" status=none; do seek=$(expr "$seek" + 2); done)
    dd: error writing '/dev/vdc': No space left on device
    # (seek=1; while dd if=/dev/random of=/dev/vdc bs="$bs" count=1 seek="$seek" status=none; do seek=$(expr "$seek" + 2); done)
    dd: /dev/vdc: cannot seek: Invalid argument
    #
    // I used random data so it (generally) wouldn't compress,
    // and filled clusters in alternating order to avoid possible
    // contiguous mapping efficiencies
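
The alternating fill can be tried without a spare block device. The following is a minimal sketch against a scratch file: the 8-cluster size, the temporary file, the use of /dev/urandom with iflag=fullblock, and the bounded loop are assumptions for illustration and are not part of the session above. Only the 64 KiB block size and the even-then-odd order match it.

```shell
#!/bin/sh
# Bounded version of the alternating fill, run against a scratch file
# (a hypothetical stand-in for /dev/vdc) so it is safe to try anywhere.
bs=65536        # same 64 KiB block size as the qcow2 cluster size
clusters=8      # assumption: tiny target, just to show the pattern
img=$(mktemp)

for start in 0 1; do            # pass 0: even clusters, pass 1: odd clusters
    seek=$start
    while [ "$seek" -lt "$clusters" ]; do
        dd if=/dev/urandom of="$img" bs="$bs" count=1 seek="$seek" \
           iflag=fullblock conv=notrunc status=none
        seek=$((seek + 2))
    done
done

filled=$(stat -c '%s' "$img")
echo "wrote $filled bytes across $clusters clusters"
rm -f "$img"
```

Unlike the original loops, this one stops at a known cluster count instead of relying on dd failing at the end of the device.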
    // Back on the physical host:
    # stat -c '%s' /var/local/vtest/2GiB.qcow2
    2148073472
    # expr 512 \* 2 \* 1024 \* 1024 \* 2
    2147483648
    #
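
As a quick check on those two numbers, the difference between the file size and the virtual size can be expressed in clusters. This is only a worked calculation on the values shown above; the 8-byte L2 entry size is an assumption based on extended_l2=off, and how the remaining clusters divide between header, L1 table and refcount structures is not derived here.

```shell
#!/bin/sh
file_bytes=2148073472      # stat output after the fill
virtual_bytes=2147483648   # 2 GiB virtual size
cluster=65536              # cluster_size reported by qemu-img create

overhead=$((file_bytes - virtual_bytes))
overhead_clusters=$((overhead / cluster))
echo "overhead: $overhead bytes = $overhead_clusters clusters"

# Assumption: one 8-byte L2 entry per data cluster (extended_l2=off).
data_clusters=$((virtual_bytes / cluster))
l2_bytes=$((data_clusters * 8))
l2_clusters=$((l2_bytes / cluster))
echo "L2 tables: $l2_bytes bytes = $l2_clusters clusters"
```

So for a fully written image the mapping overhead is real but small, well under 0.1% here.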
    // Let's detach it, add a snapshot, reattach, and likewise fill again
    $ virsh detach-disk balug vdc
    Disk detached successfully

    $
    # qemu-img snapshot -l /var/local/vtest/2GiB.qcow2
    # qemu-img snapshot -c snap01 /var/local/vtest/2GiB.qcow2
    # qemu-img snapshot -l /var/local/vtest/2GiB.qcow2
    Snapshot list:
    ID  TAG     VM SIZE  DATE                 VM CLOCK      ICOUNT
    1   snap01      0 B  2025-05-20 01:42:06  00:00:00.000       0
    #
    $ virsh attach-disk balug /var/local/vtest/2GiB.qcow2 vdc --live --subdriver qcow2
    Disk attached successfully

    $
    // back on the VM guest:
    # (seek=0; while dd if=/dev/random of=/dev/vdc bs="$bs" count=1 seek="$seek" status=none; do seek=$(expr "$seek" + 2); done; seek=1; while dd if=/dev/random of=/dev/vdc bs="$bs" count=1 seek="$seek" status=none; do seek=$(expr "$seek" + 2); done)
    dd: error writing '/dev/vdc': No space left on device
    dd: /dev/vdc: cannot seek: Invalid argument
    #
    // So, back to host, let's detach and see what we can free up.
    $ virsh detach-disk balug vdc
    Disk detached successfully

    $
    # ls -ons /var/local/vtest/2GiB.qcow2
    4199252 -rw------- 1 0 4296015872 May 20 02:08 /var/local/vtest/2GiB.qcow2
    # qemu-img snapshot -d snap01 /var/local/vtest/2GiB.qcow2
    # ls -ons /var/local/vtest/2GiB.qcow2
    2099792 -rw------- 1 0 4296015872 May 20 02:09 /var/local/vtest/2GiB.qcow2
    #
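
The two `ls -ons` listings above differ only in the first column (allocated blocks), while the apparent size stays at 4296015872 bytes. A small sketch of that comparison follows. It assumes GNU ls reports blocks in 1 KiB units (its default) and GNU stat is available; the `report` helper and the scratch file are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Block counts copied from the two `ls -ons` listings above.
# Assumption: GNU ls default of 1 KiB per block.
before_kib=4199252
after_kib=2099792
freed_kib=$((before_kib - after_kib))
echo "freed by deleting the snapshot: $freed_kib KiB (~$((freed_kib / 1024)) MiB)"

# The same comparison for any file: apparent size vs. allocated blocks.
# %s = apparent size in bytes, %b = allocated blocks, %B = bytes per block.
report() {
    stat -c '%n: apparent=%s allocated_blocks=%b block_bytes=%B' "$1"
}

scratch=$(mktemp)
truncate -s 1M "$scratch"          # a 1 MiB hole, nothing actually written
apparent=$(stat -c '%s' "$scratch")
report "$scratch"
rm -f "$scratch"
```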
    // It's a sparse file, and we got almost all of that spare space back.
    // Now let's see if we can do likewise for data in the image, if we
    // replace it with something that compresses very well.
    $ virsh attach-disk balug /var/local/vtest/2GiB.qcow2 vdc --live --subdriver qcow2
    Disk attached successfully

    $
    // and back to the VM guest:
    # dd if=/dev/zero of=/dev/vdc bs="$bs" status=none; unset bs
    dd: error writing '/dev/vdc': No space left on device
    #
    // And back to host:
    $ virsh detach-disk balug vdc
    Disk detached successfully

    $
    # ls -ons /var/local/vtest/2GiB.qcow2
    2099792 -rw------- 1 0 4296015872 May 20 02:13 /var/local/vtest/2GiB.qcow2
    # fallocate -d /var/local/vtest/2GiB.qcow2; ls -ons /var/local/vtest/2GiB.qcow2
    368 -rw------- 1 0 4296015872 May 20 02:16 /var/local/vtest/2GiB.qcow2
    #
    Well, that nicely and radically shrunk it - not the logical size, but
    it freed huge numbers of null blocks to make the file very sparse.
    So, you might try something like that on the filesystem in the VM:
    fill the unallocated space with large file(s) containing nothing but
    ASCII NUL characters, then remove those files from the VM's
    filesystem. Then, with the qcow2 file inactive, see what you can do
    with fallocate -d (don't do that to the file while it's in use by the
    VM). I not uncommonly do this on VMs to save space on their
    filesystem images.
    There may be more efficient ways if discard/trim is in use all the way
    down and through, but often that's not the case (one may even
    specifically not want that, for certain reasons).
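
The host-side half of that procedure can be rehearsed on a scratch file before touching a real image. This is a minimal sketch, assuming util-linux fallocate and GNU stat; the temporary file is a hypothetical stand-in for an image that is not attached to any VM, and the 16 MiB size is arbitrary. Whether any blocks are actually freed depends on the filesystem supporting hole punching.

```shell
#!/bin/sh
# Scratch-file demonstration of `fallocate -d` (dig holes).
img=$(mktemp)
dd if=/dev/zero of="$img" bs=1M count=16 status=none   # zero bytes, really written

size_before=$(stat -c '%s' "$img")
blocks_before=$(stat -c '%b' "$img")

# Deallocate runs of zero bytes; the apparent size must not change.
if fallocate -d "$img" 2>/dev/null; then
    echo "holes punched"
else
    echo "fallocate -d unavailable or unsupported on this filesystem"
fi

size_after=$(stat -c '%s' "$img")
blocks_after=$(stat -c '%b' "$img")
echo "apparent size:    $size_before -> $size_after bytes"
echo "allocated blocks: $blocks_before -> $blocks_after"
rm -f "$img"
```

On a real image the zero-filled regions come from the guest-side step described above (fill free space with NUL-only files, then delete them).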

    On Fri, May 16, 2025 at 5:21 AM Celejar <celejar@gmail.com> wrote:

    Hi,

    I have a QEMU / KVM VM running Windows that has been running as a guest
    on various Debian hosts for about a decade. The Windows OS has
    undergone various repairs and reinstalls over the years. I was recently
    quite surprised to discover that the VM image size (actual size on
    disk, not apparent size) has somehow grown to about 4x the allocated
    size of the disk:

    ~# ls -alsh /var/lib/libvirt/images/win10.qcow2
    314G -rw------- 1 root root 352G May 15 12:40 /var/lib/libvirt/images/win10.qcow2

    ~# qemu-img info /var/lib/libvirt/images/win10.qcow2
    image: /var/lib/libvirt/images/win10.qcow2
    file format: qcow2
    virtual size: 80 GiB (85899345920 bytes)
    disk size: 314 GiB
    cluster_size: 65536
    Format specific information:
        compat: 1.1
        compression type: zlib
        lazy refcounts: true
        refcount bits: 16
        corrupt: false
        extended l2: false
    Child node '/file':
        filename: /var/lib/libvirt/images/win10.qcow2
        protocol type: file
        file length: 352 GiB (377549750272 bytes)
        disk size: 314 GiB
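
To put numbers on "about 4x", the figures printed by qemu-img info can be compared with plain shell arithmetic. This is only a worked calculation on the values shown above; 314 GiB is the rounded disk size as printed, so the second percentage is approximate.

```shell
#!/bin/sh
virtual_bytes=85899345920     # virtual size: 80 GiB
file_bytes=377549750272       # file length (apparent size)
disk_gib=314                  # disk size as printed (rounded)
virtual_gib=80

# Integer percentages: 100 means "same size as the virtual disk".
apparent_pct=$((file_bytes * 100 / virtual_bytes))
disk_pct=$((disk_gib * 100 / virtual_gib))
echo "apparent size: ${apparent_pct}% of virtual size"
echo "space on disk: ${disk_pct}% of virtual size"
```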

    I've found all kinds of discussions of this type of thing online, but
    no explanation / solution that seems applicable to my situation.

    The image contains no snapshots:

    ~# qemu-img snapshot -l /var/lib/libvirt/images/win10.qcow2
    ~#

    I think TRIM / DISCARD is properly configured. From the VM XML:

    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' discard='unmap'/>
      <source file='/var/lib/libvirt/images/win10.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </disk>

    TRIM / DISCARD is enabled in the Windows guest, and I've issued manual
    TRIM commands in the guest several times, like so:

    https://winaero.com/trim-ssd-windows-10/

    I think this did claw back some space, but only on the order of tens of GB.

    Can anyone explain what's going on here, and how I can fix this?

    --
    Celejar


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Jonathan Dowland@21:1/5 to Celejar on Fri May 30 10:30:02 2025
    On Thu May 29, 2025 at 3:20 PM BST, Celejar wrote:
    > I think I've successfully enabled TRIM/DISCARD in both the guest OS as
    > well as the host libvirt configuration, and as I mentioned, I think I
    > did claw back some space by doing so, but not nearly enough.

    Is 80G too big? If not, stop the VM and run
    `qemu-img convert -O raw in.img out.img`; then either do something like
    `mv in.img old.img; mv out.img in.img` or edit your VM configuration to
    point at the new disk image. Remove the old one once you are satisfied
    the new one is working.
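
Those steps could be wrapped in a small script along the following lines. This is a sketch, not a tested procedure: the image path, the `.new`/`.old` suffixes, the `run` and `compact_image` helpers, and the extra `qemu-img compare` check are all assumptions added for illustration. With DRY_RUN=1 the commands are only printed; the VM must be stopped before running it for real.

```shell
#!/bin/sh
# Convert-and-swap sketch. With DRY_RUN=1 commands are printed, not executed.
DRY_RUN=1

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# compact_image IMAGE [FORMAT]: rewrite IMAGE into a fresh file, verify
# the contents match, then swap the files, keeping the original as .old.
compact_image() {
    src=$1
    fmt=${2:-raw}
    dst="$src.new"
    run qemu-img convert -O "$fmt" "$src" "$dst" &&
    run qemu-img compare "$src" "$dst" &&
    run mv "$src" "$src.old" &&
    run mv "$dst" "$src"
}

# Hypothetical image path, for illustration only.
plan=$(compact_image /var/lib/libvirt/images/example.img raw)
printf '%s\n' "$plan"
```

Keeping the `.old` file until the VM has booted from the new image follows the advice above to remove the old one only once the new one is known to work.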




    --
    Please do not CC me for listmail.

    👱🏻 Jonathan Dowland
    jmtd@debian.org
    🔗 https://jmtd.net

  • From Jonathan Dowland@21:1/5 to Celejar on Tue Jun 10 10:00:01 2025
    On Mon Jun 9, 2025 at 9:45 PM BST, Celejar wrote:
    > Can you elaborate, please? Are you recommending that I just stop using
    > qcow2 going forward and stick to raw?

    Yes, if 80GiB is sufficient within the VM.

    > My understanding is that the former will generally be *more* space
    > efficient, rather than *less* (in addition to providing additional
    > functionality, such as snapshotting)?

    It can be more efficient, and it can be less, as you've discovered.

    If you need functionality that qcow2 provides and raw doesn't, then this
    isn't going to be a solution for you. However, converting the disk image
    (even from qcow2 to qcow2) *might* result in a smaller second disk image.


  • From Jonathan Dowland@21:1/5 to Celejar on Wed Jun 11 12:20:01 2025
    On Tue Jun 10, 2025 at 9:36 PM BST, Celejar wrote:
    > Okay - I tried running "qemu-img convert" with both "-O raw" and "-O
    > qcow2", and I ended up with similarly sized files, about 57GB, which is
    > the size of the actual data on the VM disk. So I'm going to continue
    > with qcow2, and just note for future reference that I may need to do a
    > conversion every now and then. I see now that the "qemu-img" man page
    > actually acknowledges this:
    >
    > "Image conversion is also useful to get smaller image when using a
    > growable format such as qcow: the empty sectors are detected and
    > suppressed from the destination image."
    >
    > Thanks for the help!

    No problem, glad it worked!

