Subject: ovmf: Potential regression: ovmf package update from Debian bookworm to trixie breaks AMD-SEV
Package: ovmf
X-Debbugs-Cc: stephane.poignant@protonmail.com
Version: 2025.02-6
Severity: normal
After upgrading from bookworm to trixie (ovmf package upgraded from 2022.11-6+deb12u2 to 2025.02-6), my SEV-encrypted VMs no longer boot.
The OVMF bootloader hangs with the kvm process at 100% CPU and nothing printed on the console. Nothing changes until the VM is destroyed manually.
Other VMs are not affected. If I disable SEV by removing the launchSecurity section from the VM config, the VM boots successfully.
After more testing I narrowed the root cause down to the ovmf package: the regression was introduced between 2024.02-2 (last working version) and 2024.05-1 (first broken one).
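The narrowing-down can be expressed as a simple linear scan over package versions. The following is only a sketch: `first_bad_version` and the check command passed to it are hypothetical helpers, not part of any Debian tooling, and the check is assumed to install the given ovmf version, start a SEV guest, and return 0 if it boots.

```shell
# Print the first version for which the check command fails.
# Usage: first_bad_version CHECK VERSION...   (versions ordered oldest to newest)
# CHECK is a hypothetical command or function that takes one version string
# and returns 0 when a SEV guest boots with that ovmf version installed.
first_bad_version() {
    check=$1
    shift
    for version in "$@"; do
        if ! "$check" "$version"; then
            printf '%s\n' "$version"
            return 0
        fi
    done
    # Every version passed the check.
    return 1
}
```

Called with the versions mentioned in this report (2022.11-6+deb12u2, 2024.02-2, 2024.05-1, 2025.02-6) and a check that matches the behaviour described above, it would print 2024.05-1.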
Steps to reproduce:
- Start with a bookworm or trixie system and install version [2024.02-2](https://snapshot.debian.org/archive/debian/20240604T203040Z/pool/main/e/edk2/ovmf_2024.02-2_all.deb) of the `ovmf` package:
```
# dpkg -i ovmf_2024.02-2_all.deb
```
- Create a SEV-encrypted VM. The following minimal reproducer config is adapted from [this example](https://github.com/AMDESE/AMDSEV/blob/master/xmls/sample-sev.xml):
```
# cat v-testsev.xml
<domain type='kvm'>
  <name>v-testsev</name>
  <memory unit='KiB'>2097152</memory>
  <currentMemory unit='KiB'>2097152</currentMemory>
  <memoryBacking>
    <locked/>
  </memoryBacking>
  <vcpu placement='static'>1</vcpu>
  <os>
    <type arch='x86_64' machine='pc-q35-9.2'>hvm</type>
    <loader readonly='yes' secure='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE_4M.ms.fd</loader>
    <nvram template='/usr/share/OVMF/OVMF_VARS_4M.ms.fd'>/var/lib/libvirt/qemu/nvram/v-testsev_VARS.fd</nvram>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <vmport state='off'/>
  </features>
  <cpu mode='host-passthrough' check='none' migratable='on'>
    <cache mode='passthrough'/>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/bin/kvm</emulator>
    <controller type='usb' index='0' model='none'/>
    <serial type='pty'>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='unix'>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
    </channel>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <audio id='1' type='none'/>
    <memballoon model='virtio'/>
    <rng model='virtio'>
      <backend model='random'>/dev/urandom</backend>
    </rng>
  </devices>
  <launchSecurity type='sev'>
    <policy>0x0003</policy>
    <cbitpos>47</cbitpos>
    <reducedPhysBits>1</reducedPhysBits>
  </launchSecurity>
</domain>
# virsh define v-testsev.xml
```
- Start the VM:
```
# virsh start --console v-testsev
Domain 'v-testsev' started
Connected to domain 'v-testsev'
Escape character is ^] (Ctrl + ])
BdsDxe: No bootable option or device was found.
BdsDxe: Press any key to enter the Boot Manager Menu.
...
Standard PC (Q35 + ICH9, 2009)
pc-q35-9.2 2.00 GHz
2024.02-2 2048 MB RAM
...
```
The loader starts and successfully boots into the setup utility as expected.
- Destroy the VM, delete the NVRAM vars file, and upgrade ovmf to [2024.05-1](https://snapshot.debian.org/archive/debian/20240604T203040Z/pool/main/e/edk2/ovmf_2024.05-1_all.deb):
```
# rm /var/lib/libvirt/qemu/nvram/v-testsev_VARS.fd
# dpkg -i ovmf_2024.05-1_all.deb
```
- Try starting the VM again:
```
# virsh start --console v-testsev
Domain 'v-testsev' started
Connected to domain 'v-testsev'
Escape character is ^] (Ctrl + ])
<hangs>
# top
...
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
21343 libvirt+ 20 0 2497576 2.1g 60444 S 100.3 1.7 0:28.85 kvm
...
# strace -p 21343,21351,21354,21355
strace: Process 21343 attached
strace: Process 21351 attached
strace: Process 21354 attached
strace: Process 21355 attached
[pid 21355] ioctl(18, KVM_RUN <unfinished ...>
[pid 21354] ppoll([{fd=15, events=POLLIN}, {fd=16, events=POLLIN}, {fd=17, events=POLLIN}], 3, NULL, NULL, 8 <unfinished ...>
[pid 21351] futex(0x5557bd5c84a8, FUTEX_WAIT, 4294967295, NULL <unfinished ...>
[pid 21343] ppoll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=78, events=POLLIN}], 5, {tv_sec=27702, tv_nsec=581671146}, NULL, 8
```
The VM remains unresponsive: nothing appears on the console and the kvm process stays at 100% CPU. The issue is reproduced.
- Destroy the VM, remove the launchSecurity section from the config, and restart it:
```
# virsh destroy v-testsev
Domain 'v-testsev' destroyed
# virsh edit v-testsev
<remove launchSecurity section>
# virsh start --console v-testsev
Domain 'v-testsev' started
Connected to domain 'v-testsev'
Escape character is ^] (Ctrl + ])
BdsDxe: No bootable option or device was found.
BdsDxe: Press any key to enter the Boot Manager Menu.
...
Standard PC (Q35 + ICH9, 2009)
pc-q35-9.2 2.00 GHz
2025.02-6 2048 MB RAM
...
```
The loader starts successfully and boots into the menu. This shows that the issue only happens when SEV is configured.
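The SEV versus non-SEV comparison in the steps above could also be scripted. This is a rough sketch under stated assumptions: `strip_launch_security` and `console_shows_firmware` are hypothetical helper names, the `sed` filter is line-based and relies on the launchSecurity tags sitting on their own lines as in the config above, and the command that streams the guest console is left as a parameter because `virsh console` may require an interactive terminal.

```shell
# Drop the <launchSecurity> element from a domain XML read on stdin.
# Line-based: assumes the opening and closing tags are on their own lines;
# this is not a general XML parser.
strip_launch_security() {
    sed '/<launchSecurity/,/<\/launchSecurity>/d'
}

# Return 0 if the given command prints a BdsDxe line within LIMIT seconds.
# Usage: console_shows_firmware LIMIT COMMAND [ARG...]
# COMMAND is whatever streams the guest serial console in your setup
# (an assumption of this sketch). If COMMAND does not exit on its own,
# the function waits the full LIMIT before returning.
console_shows_firmware() {
    limit=$1
    shift
    output=$(timeout "$limit" "$@" 2>&1) || true
    case $output in
        *BdsDxe*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `virsh dumpxml v-testsev | strip_launch_security > v-testsev-nosev.xml` would produce a copy of the domain without SEV that can be defined and started for comparison.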
-- System Information:
Debian Release: 13.0
APT prefers testing
APT policy: (500, 'testing')
Architecture: amd64 (x86_64)
Kernel: Linux 6.1.0-37-amd64 (SMP w/16 CPU threads; PREEMPT)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled
ovmf depends on no packages.
ovmf recommends no packages.
Versions of packages ovmf suggests:
ii qemu-system-x86 1:10.0.0+ds-2
-- no debconf information