• Bug#1106535: ovmf: Potential regression: ovmf package update from Debia

    From Stephane Poignant@21:1/5 to All on Sun May 25 19:00:01 2025
    Subject: ovmf: Potential regression: ovmf package update from Debian bookworm to trixie breaks AMD-SEV
    Package: ovmf
    X-Debbugs-Cc: stephane.poignant@protonmail.com
    Version: 2025.02-6
    Severity: normal

    After upgrading from bookworm to trixie (ovmf package upgraded from 2022.11-6+deb12u2 to 2025.02-6), my SEV encrypted VMs became unable to boot.
    The OVMF bootloader hangs, with the kvm process at 100% CPU and nothing printed on the console. Nothing changes until the VM is destroyed manually.

    Other VMs are not affected. If I disable SEV by just removing the launchSecurity section from the VM config, it boots successfully.

    After more testing I could narrow down the root cause to the ovmf package, and more precisely determine that the regression was introduced between 2024.02-2 (last working version) and 2024.05-1 (first broken one).
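
    To check where a given ovmf version falls relative to these two data points, a minimal sketch using `dpkg --compare-versions` follows. The function name and the labels are hypothetical, and the classification assumes that versions at or below 2024.02-2 behave like 2024.02-2 and versions at or above 2024.05-1 behave like 2024.05-1, which this report only confirms for 2022.11-6+deb12u2 and 2025.02-6.

    ```shell
    # Classify an ovmf version against the two data points from this report.
    # Assumption: the labels extrapolate from the tested versions only.
    classify_ovmf_version() {
        if dpkg --compare-versions "$1" le 2024.02-2; then
            echo "presumed-good"
        elif dpkg --compare-versions "$1" ge 2024.05-1; then
            echo "presumed-broken"
        else
            echo "untested"
        fi
    }

    # Usage on a Debian host (shown as a comment, not executed here):
    #   classify_ovmf_version "$(dpkg-query -W -f='${Version}' ovmf)"
    ```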

    Steps to reproduce:

    - Start with a bookworm or trixie system
    Initially, version [2024.02-2](https://snapshot.debian.org/archive/debian/20240604T203040Z/pool/main/e/edk2/ovmf_2024.02-2_all.deb) of the `ovmf` package is installed on the system.
    ```
    # dpkg -i ovmf_2024.02-2_all.deb
    ```

    - Create a SEV encrypted VM. The following is a minimal reproducing config, based on [this example](https://github.com/AMDESE/AMDSEV/blob/master/xmls/sample-sev.xml):
    ```
    # cat v-testsev.xml
    <domain type='kvm'>
    <name>v-testsev</name>
    <memory unit='KiB'>2097152</memory>
    <currentMemory unit='KiB'>2097152</currentMemory>
    <memoryBacking>
    <locked/>
    </memoryBacking>
    <vcpu placement='static'>1</vcpu>
    <os>
    <type arch='x86_64' machine='pc-q35-9.2'>hvm</type>
    <loader readonly='yes' secure='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE_4M.ms.fd</loader>
    <nvram template='/usr/share/OVMF/OVMF_VARS_4M.ms.fd'>/var/lib/libvirt/qemu/nvram/v-testsev_VARS.fd</nvram>
    <boot dev='hd'/>
    </os>
    <features>
    <acpi/>
    <apic/>
    <vmport state='off'/>
    </features>
    <cpu mode='host-passthrough' check='none' migratable='on'>
    <cache mode='passthrough'/>
    </cpu>
    <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
    </clock>
    <on_poweroff>destroy</on_poweroff>
    <on_reboot>restart</on_reboot>
    <on_crash>destroy</on_crash>
    <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
    </pm>
    <devices>
    <emulator>/usr/bin/kvm</emulator>
    <controller type='usb' index='0' model='none'/>
    <serial type='pty'>
    <target type='isa-serial' port='0'>
    <model name='isa-serial'/>
    </target>
    </serial>
    <console type='pty'>
    <target type='serial' port='0'/>
    </console>
    <channel type='unix'>
    <target type='virtio' name='org.qemu.guest_agent.0'/>
    </channel>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <audio id='1' type='none'/>
    <memballoon model='virtio'/>
    <rng model='virtio'>
    <backend model='random'>/dev/urandom</backend>
    </rng>
    </devices>
    <launchSecurity type='sev'>
    <policy>0x0003</policy>
    <cbitpos>47</cbitpos>
    <reducedPhysBits>1</reducedPhysBits>
    </launchSecurity>
    </domain>

    # virsh define v-testsev.xml
    ```
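
    For readers unsure what the `<policy>` value in the launchSecurity section selects, here is a small sketch that decodes it. The bit names follow the guest policy layout in AMD's SEV API specification as best recalled (bit 0 NODBG, bit 1 NOKS, bit 2 ES, bit 3 NOSEND, bit 4 DOMAIN, bit 5 SEV), so treat the table as an assumption to verify against the spec. The function name is hypothetical.

    ```shell
    # Decode the low bits of a SEV guest policy into flag names.
    # Assumption: bit layout as in the AMD SEV API guest policy table.
    decode_sev_policy() {
        policy=$(( $1 ))
        flags=""
        for entry in 1:NODBG 2:NOKS 4:ES 8:NOSEND 16:DOMAIN 32:SEV; do
            bit=${entry%%:*}
            name=${entry#*:}
            if [ $(( $policy & $bit )) -ne 0 ]; then
                flags="$flags $name"
            fi
        done
        # Strip the leading space left by the first append.
        echo "${flags# }"
    }

    decode_sev_policy 0x0003   # prints: NODBG NOKS
    ```

    Under that layout, 0x0003 leaves the ES bit clear, so the config above requests plain SEV rather than SEV-ES.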

    - Start the VM:
    ```
    # virsh start --console v-testsev
    Domain 'v-testsev' started
    Connected to domain 'v-testsev'
    Escape character is ^] (Ctrl + ])
    BdsDxe: No bootable option or device was found.
    BdsDxe: Press any key to enter the Boot Manager Menu.
    ...

    Standard PC (Q35 + ICH9, 2009)
    pc-q35-9.2 2.00 GHz
    2024.02-2 2048 MB RAM
    ...
    ```

    The loader starts and successfully boots into the setup utility as expected.

    - Destroy the VM, delete the NVRAM varfile, and upgrade ovmf to [2024.05-1](https://snapshot.debian.org/archive/debian/20240604T203040Z/pool/main/e/edk2/ovmf_2024.05-1_all.deb):
    ```
    # rm /var/lib/libvirt/qemu/nvram/v-testsev_VARS.fd
    # dpkg -i ovmf_2024.05-1_all.deb
    ```

    - Try starting the VM again:
    ```
    # virsh start --console v-testsev
    Domain 'v-testsev' started
    Connected to domain 'v-testsev'
    Escape character is ^] (Ctrl + ])
    <hangs>

    # top
    ...
    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    21343 libvirt+ 20 0 2497576 2.1g 60444 S 100.3 1.7 0:28.85 kvm
    ...

    # strace -p 21343,21351,21354,21355
    strace: Process 21343 attached
    strace: Process 21351 attached
    strace: Process 21354 attached
    strace: Process 21355 attached
    [pid 21355] ioctl(18, KVM_RUN <unfinished ...>
    [pid 21354] ppoll([{fd=15, events=POLLIN}, {fd=16, events=POLLIN}, {fd=17, events=POLLIN}], 3, NULL, NULL, 8 <unfinished ...>
    [pid 21351] futex(0x5557bd5c84a8, FUTEX_WAIT, 4294967295, NULL <unfinished ...>
    [pid 21343] ppoll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=78, events=POLLIN}], 5, {tv_sec=27702, tv_nsec=581671146}, NULL, 8
    ```

    The VM remains unresponsive, nothing on the console, kvm process at 100% CPU. The issue is reproduced.
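
    The "100% CPU" symptom can also be checked without watching `top`. The sketch below samples the process's CPU time from `/proc/<pid>/stat` over a short interval. The function names and the 90% threshold are hypothetical choices, and the approach assumes a Linux `/proc` filesystem.

    ```shell
    # Print total CPU time (user + system, in clock ticks) used so far by a PID.
    cpu_ticks() {
        stat=$(cat "/proc/$1/stat") || return 1
        # Drop "pid (comm) " first; comm may itself contain spaces.
        rest=${stat##*\) }
        # The remainder starts at field 3 (state), so utime/stime
        # (fields 14/15 of the full line) are words 12 and 13 here.
        set -- $rest
        echo $(( ${12} + ${13} ))
    }

    # Return 0 if the PID used at least ~90% of one CPU during the interval.
    is_spinning() {
        pid=$1
        interval=${2:-5}
        hz=$(getconf CLK_TCK)
        before=$(cpu_ticks "$pid") || return 2
        sleep "$interval"
        after=$(cpu_ticks "$pid") || return 2
        [ $(( ($after - $before) * 100 / ($hz * $interval) )) -ge 90 ]
    }

    # Usage with the PID from the top output above (not executed here):
    #   is_spinning 21343 5 && echo "kvm is busy-looping"
    ```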

    - Destroy the VM, remove the launchSecurity section from the config, and restart it:
    ```
    # virsh destroy v-testsev
    Domain 'v-testsev' destroyed

    # virsh edit v-testsev
    <remove launchSecurity section>

    # virsh start --console v-testsev
    Domain 'v-testsev' started
    Connected to domain 'v-testsev'
    Escape character is ^] (Ctrl + ])
    BdsDxe: No bootable option or device was found.
    BdsDxe: Press any key to enter the Boot Manager Menu.
    ...
    Standard PC (Q35 + ICH9, 2009)
    pc-q35-9.2 2.00 GHz
    2025.02-6 2048 MB RAM
    ...
    ```

    The loader starts successfully and boots into the menu. This shows that the issue only happens when SEV is configured.
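
    Removing the launchSecurity section can be scripted instead of done by hand in `virsh edit`. The sketch below is a line-based `sed` filter, so it assumes the opening and closing `launchSecurity` tags each sit on their own line, as in the config above. The function name and the output file name are hypothetical.

    ```shell
    # Delete the <launchSecurity> element from a libvirt domain XML on stdin.
    # Assumption: the opening and closing tags are on separate lines.
    strip_launch_security() {
        sed '/<launchSecurity[ >]/,/<\/launchSecurity>/d'
    }

    # Usage (not executed here):
    #   virsh dumpxml v-testsev | strip_launch_security > v-testsev-nosev.xml
    #   virsh define v-testsev-nosev.xml
    ```

    Re-defining the domain from the original v-testsev.xml restores the SEV configuration.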


    -- System Information:
    Debian Release: 13.0
    APT prefers testing
    APT policy: (500, 'testing')
    Architecture: amd64 (x86_64)

    Kernel: Linux 6.1.0-37-amd64 (SMP w/16 CPU threads; PREEMPT)
    Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
    Shell: /bin/sh linked to /usr/bin/dash
    Init: systemd (via /run/systemd/system)
    LSM: AppArmor: enabled

    ovmf depends on no packages.

    ovmf recommends no packages.

    Versions of packages ovmf suggests:
    ii qemu-system-x86 1:10.0.0+ds-2

    -- no debconf information
