Update to QEMU 9.0 including IGVM v6 patch series + direct VMSA #16

roy-hopkins · 2024-10-16T15:22:45Z

Overview

This branch is based on QEMU master commit a5dd9ee with the IGVM patch series v6 applied. Also, the top commit provides a modification to the handling of IGVM VMSA directives to directly set the VMSA in KVM via the KVM_SEV_SNP_LAUNCH_UPDATE ioctl.

Compatibility

This version of QEMU works with the 6.11 kernel with SVSM kernel patches applied: https://github.com/roy-hopkins/linux/tree/svsm_vmpl_vmsa_restinj. See PR [TODO].

Reason for direct VMSA update

The patch that updates VMSA handling can be dropped from this branch and SVSM will still work correctly. However, due to the way the VMSA is handled for each vCPU in KVM this will result in the launch measurement never matching the pre-calculated launch measurement of the IGVM file. The previous SVSM kernel included a sev_feature flag that indicated use of an SVSM which then changed this behaviour to get the measurement to match but this cannot be supported anymore.

New Command line

The QEMU command line to launch a guest with SVSM has changed since the previous version. In particular init-flags is not supported anymore (which is how we used to indicate an SVSM was present) and IGVM now has its own object:

...
-machine q35,confidential-guest-support=sev0,memory-backend=mem0,igvm-cfg=igvm0 \
-object memory-backend-memfd,size=8G,id=mem0,share=true,prealloc=false,reserve=false \
-object sev-snp-guest,id=sev0,cbitpos=51,reduced-phys-bits=1 \
-object igvm-cfg,id=igvm0,file=coconut-qemu.igvm \
...

Signed-off-by: Ricardo Ribalda <[email protected]> Message-Id: <[email protected]> Acked-by: Igor Mammedov <[email protected]> Reviewed-by: Michael S. Tsirkin <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>

This allows the vhost_net device which has multiple virtqueues to batch the setup of all its host notifiers. This significantly reduces the vhost_net device starting and stoping time, e.g. the time spend on enabling notifiers reduce from 630ms to 75ms and the time spend on disabling notifiers reduce from 441ms to 45ms for a VM with 192 vCPUs and 15 vhost-user-net devices (64vq per device) in our case. Signed-off-by: zuoboqun <[email protected]> Message-Id: <[email protected]> Reviewed-by: Michael S. Tsirkin <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>

Now virtio_address_space_lookup only lookup common/isr/device/notify MR and exclude their subregions. When VHOST_USER_PROTOCOL_F_HOST_NOTIFIER enable, the notify MR has host-notifier subregions and we need use host-notifier MR to notify the hardware accelerator directly instead of eventfd notify. Further more, maybe common/isr/device MR also has subregions in the future, so need memory_region_find for each MR incluing their subregions. Add lookup subregion of VirtIOPCIRegion MR instead of only lookup container MR. Fixes: a93c8d8 ("virtio-pci: Replace modern_as with direct access to modern_bar") Co-developed-by: Zuo Boqun <[email protected]> Signed-off-by: Gao Shiyuan <[email protected]> Signed-off-by: Zuo Boqun <[email protected]> Message-Id: <[email protected]> Reviewed-by: Michael S. Tsirkin <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>

When using the mailbox command get scan media results, the scan media restart physical address field in the ouput palyload is not 64-byte aligned. This patch removed the error source of the restart physical address. The Scan Media Restart Physical Address is the location from which the host should restart the Scan Media operation. [5:0] bits are reserved. Refer to CXL spec r3.1 Table 8-146 Fixes: 89b5cfc ("hw/cxl: Add get scan media results cmd support") Reviewed-by: Jonathan Cameron <[email protected]> Link: https://lore.kernel.org/linux-cxl/[email protected]/ Signed-off-by: peng guo <[email protected]> Message-Id: <[email protected]> Reviewed-by: Michael S. Tsirkin <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>

Currently, the guest may write to the device configuration space, whereas the virtio sound device specification in chapter 5.14.4 clearly states that the fields in the device configuration space are driver-read-only. Remove the set_config function from the virtio_snd class. This also prevents a heap buffer overflow. See QEMU issue #2296. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2296 Signed-off-by: Volker Rümelin <[email protected]> Message-Id: <[email protected]> Reviewed-by: Michael S. Tsirkin <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>

If the config directory in sysfs does not exist at all, we are dealing with a system that does not support THPs. Simply use 1 MiB block size then, instead of warning "Could not detect THP size, falling back to ..." and falling back to the default THP size. Cc: "Michael S. Tsirkin" <[email protected]> Cc: Gavin Shan <[email protected]> Cc: Juraj Marcin <[email protected]> Signed-off-by: David Hildenbrand <[email protected]> Message-Id: <[email protected]> Reviewed-by: Michael S. Tsirkin <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>

This patch implements the periodic and the swsmi ICH9 chipset timers. They are especially useful when prototyping UEFI firmware (e.g. with EDK2's OVMF) using QEMU. For backwards compatibility, the compat properties "x-smi-swsmi-timer", and "x-smi-periodic-timer" are introduced. Additionally, writes to the SMI_STS register are enabled for the corresponding two bits using a write mask to make future work easier. Signed-off-by: Dominic Prinz <[email protected]> Message-Id: <1d90ea69e01ab71a0f2ced116801dc78e04f4448.1725991505.git.git@dprinz.de> Reviewed-by: Michael S. Tsirkin <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>

…into staging * Split --enable-sanitizers to --enable-{asan, ubsan} * Build MSYS2 job using multiple CPUs * Fix "make distclean" wrt contrib/plugins/ * Convert more Avocado tests to plain standalone functional tests * Fix bug that breaks "make check-functional" when tesseract is missing * Use builtin hashlib of Python in the functional tests * Update the FreeBSD CI jobs to 14.1 # -----BEGIN PGP SIGNATURE----- # # iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmbhY4YRHHRodXRoQHJl # ZGhhdC5jb20ACgkQLtnXdP5wLbU/aw/9HXl9H8BUDn8lnoEmxuuQSk8F19n/l5pt # en3L8pMBt4dGFe/9KaGes2GFfid+cp2zlx+qQhA4HW35ntMJorF/qinOH/JGDtoM # 3O6RGZrQPn60zD9P2EbFVCrVYysVYCEu0U3Uglj6tf33bE0L7SJsQxqcbIciyIj5 # aq3Te0yMM2lqzCdMqNpWHGn3VMZRvbRaGBPDU4RLP8V2Bpz1iiRE+6HCH9Kg7HzS # OmleeXtvcyInG+54onjfTcn4/XA27pl1UU04KFv5PrRPB3M2FspHn7oOT2yyQ+ls # 79mqIcd8PvycCT+3ch9p8KhVtbVBgZGmeemALLvk5FxysaWnl4KtSqmQNdqSvvpV # waDDKlLaSnjEHDUse3bCJX0m4d7/vTBY5fOYxqZ4z5dl63csDtgPY4/VF4XR08sP # tR1mW+2qEH9eygsxuKcBjx/j7Etpy+jL9pX2ii1V3ElhjjYuEnpEiURa+TaqPjpZ # jmPtBEszzUdPbrD707tDkW3/ezT7VAnASQeYneJXB/JQG6K6Z//05iX6oCzCbRm3 # ceW/fem3UaeGYpzbMdoZToTuNlXEyS7NDcr39xJjH4LyRTPJAX4zeqUEdzces9g/ # u4Dw6rJ0Yhj4rscKxRvGl3/BH6CTI+8IAsbju2B/CnVLTqaABB0q/MDB90aB44xX # bAVsl4P03Uk= # =5TR0 # -----END PGP SIGNATURE----- # gpg: Signature made Wed 11 Sep 2024 10:31:50 BST # gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5 # gpg: issuer "[email protected]" # gpg: Good signature from "Thomas Huth <[email protected]>" [full] # gpg: aka "Thomas Huth <[email protected]>" [full] # gpg: aka "Thomas Huth <[email protected]>" [full] # gpg: aka "Thomas Huth <[email protected]>" [unknown] # Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5 * tag 'pull-request-2024-09-11' of https://gitlab.com/thuth/qemu: (24 commits) Update FreeBSD CI jobs FreeBSD 14.1 tests/functional/qemu_test: Use Python hashlib instead of external programs tests/functional: Fix bad usage of has_cmd tests/functional: Convert the multiprocess avocado test into a standalone test tests/functional: Convert the or1k-sim Avocado test tests/functional: Convert the m68k MCF5208EVB Avocado test tests/functional: Convert the Alpha Clipper Avocado test tests/functional: Convert Aarch64 Raspi4 avocado tests tests/functional: Convert Aarch64 Raspi3 avocado tests tests/functional: Convert ARM Raspi2 avocado tests tests/functional: Convert mips32eb 4Kc Malta avocado tests tests/functional: Convert nanomips Malta avocado tests tests/functional: Convert mips32el Malta YAMON avocado test tests/functional: Convert mips64el 5KEc Malta avocado tests tests/functional: Convert mips64el I6400 Malta avocado tests tests/functional: Convert mips64el Fuloong2e avocado test (2/2) tests/functional: Convert the m68k Q800 Avocado test into a functional test tests/functional: Add the LinuxKernelTest for testing the Linux boot process MAINTAINERS: Remove myself from the Meson section MAINTAINERS: Remove myself as reviewer ... Signed-off-by: Peter Maydell <[email protected]>

Add support for, and migrate, a single-entry fp instruction queue for sparc32. Signed-off-by: Carl Hauser <[email protected]> [rth: Split from a larger patch; adjust representation with union; add migration state] Signed-off-by: Richard Henderson <[email protected]> Reviewed-by: Philippe Mathieu-Daudé <[email protected]> Acked-by: Mark Cave-Ayland <[email protected]> Tested-by: Carl Hauser <[email protected]>

Implement a single instruction floating point queue, populated while delivering an fp exception. Signed-off-by: Carl Hauser <[email protected]> [rth: Split from a larger patch] Signed-off-by: Richard Henderson <[email protected]> Acked-by: Mark Cave-Ayland <[email protected]> Tested-by: Carl Hauser <[email protected]>

Signed-off-by: Richard Henderson <[email protected]> Reviewed-by: Philippe Mathieu-Daudé <[email protected]> Acked-by: Mark Cave-Ayland <[email protected]> Tested-by: Carl Hauser <[email protected]>

Invalid encoding of addr should raise TT_ILL_INSN, so check before supervisor, which might raise TT_PRIV_INSN. Clear QNE after execution. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2340 Signed-off-by: Richard Henderson <[email protected]> Acked-by: Mark Cave-Ayland <[email protected]> Tested-by: Carl Hauser <[email protected]>

Model fp_exception state, in which only fp stores are allowed until such time as the FQ has been flushed. Signed-off-by: Richard Henderson <[email protected]> Acked-by: Mark Cave-Ayland <[email protected]> Tested-by: Carl Hauser <[email protected]>

With edk2-stable202408 LoongArch UEFI bios, CSR PGD register is set only if its value is equal to zero for boot cpu, it causes reboot issue. Since CSR PGD register is changed with linux kernel, UEFI BIOS cannot use it. Add workaround to clear CSR registers relative with TLB in function loongarch_cpu_reset_hold(), so that VM can reboot with edk2-stable202408 UEFI bios. Signed-off-by: Bibo Mao <[email protected]> Reviewed-by: Song Gao <[email protected]> Message-Id: <[email protected]> Signed-off-by: Song Gao <[email protected]>

For virtio VGA deivce libvirt will select VIRTIO_VGA firstly rather than VIRTIO_GPU, VIRTIO_VGA device supports frame buffer however it requires legacy VGA compatible support. Frame buffer area 0xa0000 -- 0xc0000 conflicts with low memory area 0 -- 0x10000000. Here remove default support for VIRTIO_VGA device, VIRTIO_GPU is prefered on LoongArch system. For frame buffer video card support, standard VGA can be used. Signed-off-by: Bibo Mao <[email protected]> Reviewed-by: Song Gao <[email protected]> Message-Id: <[email protected]> Signed-off-by: Song Gao <[email protected]>

KVM provides interface KVM_REG_LOONGARCH_VCPU_RESET to reset vCPU, it can be used to clear internal state about kvm kernel. vCPU reset function is added here for kvm mode. Signed-off-by: Bibo Mao <[email protected]> Reviewed-by: Song Gao <[email protected]> Message-Id: <[email protected]> Signed-off-by: Song Gao <[email protected]>

Add the support needed for creating prstatus elf notes. This allows us to use QMP dump-guest-memory. Now ELF notes of LoongArch only supports general elf notes, LSX and LASX is not supported, since it is mainly used to dump guest memory. Signed-off-by: Bibo Mao <[email protected]> Reviewed-by: Song Gao <[email protected]> Tested-by: Song Gao <[email protected]> Message-Id: <[email protected]> Signed-off-by: Song Gao <[email protected]>

In order to support additional channels of communication using `-serial`, add several serial ports, up to the standard 4 generally supported by the 8250 driver. Fixed: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Jason A. Donenfeld <[email protected]> Tested-by: Bibo Mao <[email protected]> [gaosong: ACPI uart need't reverse order] Signed-off-by: Song Gao <[email protected]> Message-Id: <[email protected]>

If the FDT contains /chosen/rng-seed, then the Linux RNG will use it to initialize early. Set this using the usual guest random number generation function. This is the same procedure that's done in b91b6b5 ("hw/microblaze: pass random seed to fdt"), e4b4f0b ("hw/riscv: virt: pass random seed to fdt"), c6fe3e6 ("hw/openrisc: virt: pass random seed to fdt"), 67f7e42 ("hw/i386: pass RNG seed via setup_data entry"), c287941 ("hw/rx: pass random seed to fdt"), 5e19cc6 ("hw/mips: boston: pass random seed to fdt"), 6b23a67 ("hw/nios2: virt: pass random seed to fdt") c4b0753 ("hw/ppc: pass random seed to fdt"), and 5242876 ("hw/arm/virt: dt: add rng-seed property"). These earlier commits later were amended to rerandomize the RNG seed on snapshot load, but the LoongArch code somehow already does that, despite not having this patch here, presumably due to some lucky copy and pasting. Signed-off-by: Jason A. Donenfeld <[email protected]> Reviewed-by: Song Gao <[email protected]> Message-Id: <[email protected]> Signed-off-by: Song Gao <[email protected]>

Serial port console redirection table can be used for default serial port selection, like chosen stdout-path selection with FDT method. With acpi SPCR table added, early debug console can be parsed from SPCR table with simple kernel parameter earlycon rather than earlycon=uart,mmio,0x1fe001e0 Signed-off-by: Bibo Mao <[email protected]> Reviewed-by: Song Gao <[email protected]> Message-Id: <[email protected]> Signed-off-by: Song Gao <[email protected]>

…st/qemu into staging virtio,pc,pci: features, fixes, cleanups i286 acpi speedup by precomputing _PRT by Ricardo Ribalda vhost_net speedup by using MR transactions by Zuo Boqun ich9 gained support for periodic and swsmi timer by Dominic Prinz Fixes, cleanups all over the place. Signed-off-by: Michael S. Tsirkin <[email protected]> # -----BEGIN PGP SIGNATURE----- # # iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmbhoCUPHG1zdEByZWRo # YXQuY29tAAoJECgfDbjSjVRptpUH/iR5AmJFpvAItqlPOvJiYDEch46C73tyrSws # Kk/1EbGSL7mFFD5sfdSSV4Rw8CQBsmM/Dt5VDkJKsWnOLjkBQ2CYH0MYHktnrKcJ # LlSk32HnY5p1DsXnJhgm5M7St8T3mV/oFdJCJAFgCmpx5uT8IRLrKETN8+30OaiY # xo35xAKOAS296+xsWeVubKkMq7H4y2tdZLE/22gb8rlA8d96BJIeVLQ3y3IjeUPR # 24q6c7zpObzGhYNZ/PzAKOn+YcVsV/lLAzKRZJTzTUPyG24BcjJTyyr/zNSYAgfk # lLXzIZID3GThBmrCAiDZ1z6sfo3MRg2wNS/FBXtK6fPIuFxed+8= # =ySRy # -----END PGP SIGNATURE----- # gpg: Signature made Wed 11 Sep 2024 14:50:29 BST # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "[email protected]" # gpg: Good signature from "Michael S. Tsirkin <[email protected]>" [full] # gpg: aka "Michael S. Tsirkin <[email protected]>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: hw/acpi/ich9: Add periodic and swsmi timer virtio-mem: don't warn about THP sizes on a kernel without THP support hw/audio/virtio-sound: fix heap buffer overflow hw/cxl: fix physical address field in get scan media results output virtio-pci: Add lookup subregion of VirtIOPCIRegion MR vhost_net: configure all host notifiers in a single MR transaction tests/acpi: pc: update golden masters for DSDT hw/i386/acpi-build: Return a pre-computed _PRT table tests/acpi: pc: allow DSDT acpi table changes intel_iommu: Make PASID-cache and PIOTLB type invalid in legacy mode intel_iommu: Fix invalidation descriptor type field virtio: rename virtio_split_packed_update_used_idx hw/pci/pci-hmp-cmds: Avoid displaying bogus size in 'info pci' pci: don't skip function 0 occupancy verification for devfn auto assign hw/isa/vt82c686.c: Embed i8259 irq in device state instead of allocating hw: Move declaration of IRQState to header and add init function virtio: Always reset vhost devices virtio: Allow .get_vhost() without vhost_started Signed-off-by: Peter Maydell <[email protected]>

…cross-i686-tci The cross-i686-tci CI job is persistently flaky with various tests hitting timeouts. One theory for why this is happening is that we're running too many tests in parallel and so sometimes a test gets starved of CPU and isn't able to complete within the timeout. (The environment this CI job runs in seems to cause us to default to a parallelism of 9 in the main CI.) Signed-off-by: Peter Maydell <[email protected]> Reviewed-by: Thomas Huth <[email protected]> Message-id: [email protected]

…to staging target/sparc: Implement single entry FP Queue # -----BEGIN PGP SIGNATURE----- # # iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmbifAAdHHJpY2hhcmQu # aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+XAwgAlj//8JuNoRB/2hi0 # gU3Ifjrs+r+AZrcsG7pTOmYTZa6cYqJX4XsYoNq1S4FHky239vNKPQOQEadkmLGv # wKH0fBjzvydOKRfrhEK2VLlhMyhGyuv59psfCCUB5HZEiueSHFFAvfjUtKNpjzRT # KE2fwL6iKK3IXeKC6ynq0bkC/OymnLUYSgSslA6C1x1sReNz5Y6ZsGUEZRwODY4f # q6s6JS2aBn1L9nJTzwXH/J5Ue8iix53d6EZ42QHqqwzRvAWHtfFqoMLc9P6Dg8P7 # FmiwHAErwr7Pj5cqcnl2C0zTp3LXg5xXpTJysi8CFJvCsObNRh9gL15W3xy9qBFX # 2WfqWQ== # =kxM7 # -----END PGP SIGNATURE----- # gpg: Signature made Thu 12 Sep 2024 06:28:32 BST # gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F # gpg: issuer "[email protected]" # gpg: Good signature from "Richard Henderson <[email protected]>" [full] # Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F * tag 'pull-sparc-20240911' of https://gitlab.com/rth7680/qemu: target/sparc: Add gen_trap_if_nofpu_fpexception target/sparc: Implement STDFQ target/sparc: Add FSR_QNE to tb_flags target/sparc: Populate sparc32 FQ when raising fp exception target/sparc: Add FQ and FSR.QNE Signed-off-by: Peter Maydell <[email protected]>

…into staging pull-loongarch-20240912 # -----BEGIN PGP SIGNATURE----- # # iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZuLmLgAKCRBAov/yOSY+ # 38JNA/9UdorT4a7H+H5PhNeEu2EHDgMPb7+gxyYKw03mOG2MB3KFzkK0LRQShaPt # ADJmIqAFlc9SJLkbo6ELMDl+ZnUU9OdC/P6YU5iBG71zx1PonMwuyJTWhlBwxWcG # +OB8aDBUALoe/Gb4za152I84cR08g58TgLnXNfEkCM8lnPfAug== # =Plwu # -----END PGP SIGNATURE----- # gpg: Signature made Thu 12 Sep 2024 14:01:34 BST # gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF # gpg: Good signature from "Song Gao <[email protected]>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF * tag 'pull-loongarch-20240912' of https://gitlab.com/gaosong/qemu: hw/loongarch: Add acpi SPCR table support hw/loongarch: virt: pass random seed to fdt hw/loongarch: virt: support up to 4 serial ports target/loongarch: Support QMP dump-guest-memory target/loongarch/kvm: Add vCPU reset function hw/loongarch: Remove default enable with VIRTIO_VGA device target/loongarch: Add compatible support about VM reboot Signed-off-by: Peter Maydell <[email protected]>

Convert the TYPE_CCW_DEVICE to three-phase reset. This is a device class which is subclassed, so it needs to be three-phase before we can convert the subclass. Signed-off-by: Peter Maydell <[email protected]> Reviewed-by: Nina Schoetterl-Glausch <[email protected]> Reviewed-by: Philippe Mathieu-Daudé <[email protected]> Acked-by: Thomas Huth <[email protected]> Message-id: [email protected]

Convert the virtio-ccw code to three-phase reset. This allows us to remove a call to device_class_set_parent_reset(), replacing it with the three-phase equivalent resettable_class_set_parent_phases(). Removing all the device_class_set_parent_reset() uses will allow us to remove some of the glue code that interworks between three-phase and legacy reset. This is a simple conversion, with no behavioural changes. Signed-off-by: Peter Maydell <[email protected]> Reviewed-by: Philippe Mathieu-Daudé <[email protected]> Reviewed-by: Nina Schoetterl-Glausch <[email protected]> Acked-by: Thomas Huth <[email protected]> Reviewed-by: Richard Henderson <[email protected]> Message-id: [email protected]

Convert the s390 CPU to the Resettable interface. This is slightly more involved than the other CPU types were (see commits 9130cad..d66e64d) because S390 has its own set of different kinds of reset with different behaviours that it needs to trigger. We handle this by adding these reset types to the Resettable ResetType enum. Now instead of having an underlying implementation of reset that is s390-specific and which might be called either directly or via the DeviceClass::reset method, we can implement only the Resettable hold phase method, and have the places that need to trigger an s390-specific reset type do so by calling resettable_reset(). The other option would have been to smuggle in the s390 reset type via, for instance, a field in the CPU state that we set in s390_do_cpu_initial_reset() etc and then examined in the reset method, but doing it this way seems cleaner. The motivation for this change is that this is the last caller of the legacy device_class_set_parent_reset() function, and removing that will let us clean up some glue code that we added for the transition to three-phase reset. Signed-off-by: Peter Maydell <[email protected]> Reviewed-by: Nina Schoetterl-Glausch <[email protected]> Reviewed-by: Richard Henderson <[email protected]> Acked-by: Thomas Huth <[email protected]> Message-id: [email protected]

There are no callers of device_class_set_parent_reset() left in the tree, as they've all been converted to use three-phase reset and the corresponding resettable_class_set_parent_phases() function. Remove device_class_set_parent_reset(). Signed-off-by: Peter Maydell <[email protected]> Reviewed-by: Philippe Mathieu-Daudé <[email protected]> Reviewed-by: Richard Henderson <[email protected]> Message-id: [email protected]

The Alpha and HPPA CPU class structs include a 'parent_reset' field which is never used; delete them. (These targets don't seem to implement reset at all; if they did they should do it using the three-phase reset mechanism, which uses a 'ResettablePhases parent_phases' field instead of the old 'DeviceReset parent_reset' field.) Signed-off-by: Peter Maydell <[email protected]> Reviewed-by: Richard Henderson <[email protected]> Reviewed-by: Philippe Mathieu-Daudé <[email protected]> Message-id: [email protected]

Define a device_class_set_legacy_reset() function which sets the DeviceClass::reset field. This serves two purposes: * it makes it clearer to the person writing code that DeviceClass::reset is now legacy and they should look for the new alternative (which is Resettable) * it makes it easier to rename the reset field (which in turn makes it easier to find places that call it) The Coccinelle script can be used to automatically convert code that was doing an open-coded assignment to DeviceClass::reset to call device_class_set_legacy_reset() instead. Signed-off-by: Peter Maydell <[email protected]> Reviewed-by: Philippe Mathieu-Daudé <[email protected]> Reviewed-by: Richard Henderson <[email protected]> Message-id: [email protected]

This is useful information when debugging memory issues so lets improve by: - include the ptr address for u8 fills (like the others) - indicate the number of operations for reads and writes - explicitly note when we are flushing - move the fill printf to after the reset Reviewed-by: Pierrick Bouvier <[email protected]> Signed-off-by: Alex Bennée <[email protected]> Message-Id: <[email protected]>

While the compilers will generally happily synthesise a 64 bit value for you on 32 bit systems it doesn't exercise anything on QEMU. It also makes it hard to accurately compare the accesses to test_data when instrumenting. Reviewed-by: Pierrick Bouvier <[email protected]> Signed-off-by: Alex Bennée <[email protected]> Message-Id: <[email protected]>

The multiarch system tests output serial data which should be redirected to the "output" chardev rather than echoed to the console. Comment the use of EXTFLAGS variable while we are at it. Acked-by: Ilya Leoshkevich <[email protected]> Reviewed-by: Thomas Huth <[email protected]> Signed-off-by: Alex Bennée <[email protected]> Message-Id: <[email protected]>

At first I thought I could compile the user-mode test for system mode however we already have a fairly comprehensive test case for system mode in "memory" so lets use that. As tracking every access will quickly build up with "print-access" we add a new mode to track groups of reads and writes to regions. Because the test_data is 16k aligned we can be sure all accesses to it are ones we can count. First we extend the test to report where the test_data region is. Then we expand the pdot() function to track the total number of reads and writes to the region. We have to add some addition pdot() calls to take into account multiple reads/writes in the test loops. Finally we add a python script to integrate the data from the plugin and the output of the test and validate they both agree on the total counts. As some boot codes clear the bss we also add a flag to add a regions worth of writes to the expected total. Signed-off-by: Alex Bennée <[email protected]> Reviewed-by: Pierrick Bouvier <[email protected]> Message-Id: <[email protected]>

When we shut down a guest we disable the timers. However this can cause deadlock if the guest has queued some async work that is trying to advance system time and spins forever trying to wind time forward. Pay attention to the return code and bail early if we can't wind time forward. Reported-by: Elisha Hollander <[email protected]> Signed-off-by: Alex Bennée <[email protected]> Reviewed-by: Pierrick Bouvier <[email protected]> Message-Id: <[email protected]>

SimPoint is a widely used tool to find the ideal microarchitecture simulation points so Valgrind[2] and Pin[3] support generating basic block vectors for use with them. Let's add a corresponding plugin to QEMU too. Note that this plugin has a different goal with tests/plugin/bb.c. This plugin creates a vector for each constant interval instead of counting the execution of basic blocks for the entire run and able to describe the change of execution behavior. Its output is also syntactically simple and better suited for parsing, while the output of tests/plugin/bb.c is more human-readable. [1] https://cseweb.ucsd.edu/~calder/simpoint/ [2] https://valgrind.org/docs/manual/bbv-manual.html [3] https://www.intel.com/content/www/us/en/developer/articles/tool/pin-a-dynamic-binary-instrumentation-tool.html Signed-off-by: Yotaro Nada <[email protected]> Signed-off-by: Akihiko Odaki <[email protected]> Reviewed-by: Pierrick Bouvier <[email protected]> Message-Id: <[email protected]> Signed-off-by: Alex Bennée <[email protected]> Message-Id: <[email protected]>

Signed-off-by: Rowan Hart <[email protected]> Reviewed-by: Pierrick Bouvier <[email protected]> Message-Id: <[email protected]> [AJB: tweaked cpu_memory_rw_debug call] Signed-off-by: Alex Bennée <[email protected]> Message-Id: <[email protected]>

Signed-off-by: Rowan Hart <[email protected]> Reviewed-by: Pierrick Bouvier <[email protected]> Tested-by: Pierrick Bouvier <[email protected]> Message-Id: <[email protected]> [AJB: tweak fmt string for vaddr] Signed-off-by: Alex Bennée <[email protected]> Message-Id: <[email protected]>

Although we asks for instructions per second we work in quanta and that cannot be 0. Fail to load the plugin instead and report the minimum IPS we can handle. Reported-by: Elisha Hollander <[email protected]> Reviewed-by: Richard Henderson <[email protected]> Signed-off-by: Alex Bennée <[email protected]> Reviewed-by: Pierrick Bouvier <[email protected]> Message-Id: <[email protected]>

…quad/qemu into staging TCG plugin memory instrumentation updates - deprecate plugins on 32 bit hosts - deprecate plugins with TCI - extend memory API to save value - add check-tcg tests to exercise new memory API - fix timer deadlock with non-changing timer - add basic block vector plugin to contrib - add cflow plugin to contrib - extend syscall plugin to dump write memory - validate ips plugin arguments meet minimum slice value # -----BEGIN PGP SIGNATURE----- # # iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmbsPCUACgkQ+9DbCVqe # KkTm1gf9Hs5Zfdng0E+7sr5Dpa5F+cJOXU9QJhoTWJ4XC16CygWByqMXbyeX/kvm # HXJEm6OnkADJhikIUCoBko8uK4/96iWSrDL0sEdzASX4SM/tXu684KeL+j9G/Ql8 # iqxm6tIjaJqmbSZRMp0l5jD+ZBltRMCzBNdK1suJR2ppQgqfKj3qMLVLtq2hhqPH # qPgwKm44hk9BEpHYqXaivzSWN5GKCgvp5ECcFXCBhDcM+8W7Dl3Mv6X0pWOpYcKZ # d2a5KUt+Xp7WB2jkOgJYr0zKCOQCiCjGSfm/30qRDOUnwiLRWbfamRI9jUDNUtfy # RYR+GaspurGCwSkwICdlvj+vFp/16Q== # =5wfo # -----END PGP SIGNATURE----- # gpg: Signature made Thu 19 Sep 2024 15:58:45 BST # gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44 # gpg: Good signature from "Alex Bennée (Master Work Key) <[email protected]>" [full] # Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44 * tag 'pull-tcg-plugin-memory-190924-1' of https://gitlab.com/stsquad/qemu: contrib/plugins: avoid hanging program plugins: add option to dump write argument to syscall plugin plugins: add plugin API to read guest memory contrib/plugins: Add a plugin to generate basic block vectors util/timer: avoid deadlock when shutting down tests/tcg: add a system test to check memory instrumentation tests/tcg: ensure s390x-softmmu output redirected tests/tcg: only read/write 64 bit words on 64 bit systems tests/tcg: clean up output of memory system test tests/tcg/multiarch: add test for plugin memory access tests/tcg/plugins/mem: add option to print memory accesses tests/tcg: allow to check output of plugins tests/tcg: add mechanism to run specific tests with plugins plugins: extend API to get latest memory value accessed plugins: save value during memory accesses contrib/plugins: control flow plugin deprecation: don't enable TCG plugins by default with TCI deprecation: don't enable TCG plugins by default on 32 bit hosts Signed-off-by: Peter Maydell <[email protected]>

The IGVM library allows Independent Guest Virtual Machine files to be parsed and processed. IGVM files are used to configure guest memory layout, initial processor state and other configuration pertaining to secure virtual machines. This adds the --enable-igvm configure option, enabled by default, which attempts to locate and link against the IGVM library via pkgconfig and sets CONFIG_IGVM if found. The library is added to the system_ss target in backends/meson.build where the IGVM parsing will be performed by the ConfidentialGuestSupport object. Signed-off-by: Roy Hopkins <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Daniel P. Berrangé <[email protected]> Reviewed-by: Stefano Garzarella <[email protected]>

In preparation for supporting the processing of IGVM files to configure guests, this adds a set of functions to ConfidentialGuestSupport allowing configuration of secure virtual machines that can be implemented for each supported isolation platform type such as Intel TDX or AMD SEV-SNP. These functions will be called by IGVM processing code in subsequent patches. This commit provides a default implementation of the functions that either perform no action or generate an error when they are called. Targets that support ConfidentalGuestSupport should override these implementations. Signed-off-by: Roy Hopkins <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Stefano Garzarella <[email protected]>

Adds an IGVM loader to QEMU which processes a given IGVM file and applies the directives within the file to the current guest configuration. The IGVM loader can be used to configure both confidential and non-confidential guests. For confidential guests, the ConfidentialGuestSupport object for the system is used to encrypt memory, apply the initial CPU state and perform other confidential guest operations. The loader is configured via a new IgvmCfg QOM object which allows the user to provide a path to the IGVM file to process. Signed-off-by: Roy Hopkins <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Stefano Garzarella <[email protected]>

An IGVM file contains configuration of guest state that should be applied during configuration of the guest, before the guest is started. This patch allows the user to add an igvm-cfg object to an X86 machine configuration that allows an IGVM file to be configured that will be applied to the guest before it is started. If an IGVM configuration is provided then the IGVM file is processed at the end of the board initialization, before the state transition to PHASE_MACHINE_INITIALIZED. Signed-off-by: Roy Hopkins <[email protected]> Reviewed-by: Michael S. Tsirkin <[email protected]>

…h IGVM When using an IGVM file the configuration of the system firmware is defined by IGVM directives contained in the file. In this case the user should not configure any pflash devices. This commit skips initialization of the ROM mode when pflash0 is not set then checks to ensure no pflash devices have been configured when using IGVM, exiting with an error message if this is not the case. Signed-off-by: Roy Hopkins <[email protected]> Reviewed-by: Daniel P. Berrangé <[email protected]> Reviewed-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Stefano Garzarella <[email protected]>

The class function and implementations for updating launch data return a code in case of error. In some cases an error message is generated and in other cases, just the error return value is used. This small refactor adds an 'Error **errp' parameter to all functions which consistently set an error condition if a non-zero value is returned. Signed-off-by: Roy Hopkins <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Pankaj Gupta <[email protected]> Reviewed-by: Stefano Garzarella <[email protected]>

…ache() The x86 segment registers are identified by the X86Seg enumeration which includes LDTR and TR as well as the normal segment registers. The function 'cpu_x86_load_seg_cache()' uses the enum to determine which segment to set. However, specifying R_LDTR or R_TR results in an out-of-bounds access of the segment array. Possibly by coincidence, the function does correctly set LDTR or TR in this case as the structures for these registers immediately follow the array which is accessed out of bounds. This patch adds correct handling for R_LDTR and R_TR in the function. Signed-off-by: Roy Hopkins <[email protected]> Reviewed-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Stefano Garzarella <[email protected]>

When an SEV guest is started, the reset vector and state are extracted from metadata that is contained in the firmware volume. In preparation for using IGVM to setup the initial CPU state, the code has been refactored to populate vmcb_save_area for each CPU which is then applied during guest startup and CPU reset. Signed-off-by: Roy Hopkins <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Acked-by: Stefano Garzarella <[email protected]>

The ConfidentialGuestSupport object defines a number of virtual functions that are called during processing of IGVM directives to query or configure initial guest state. In order to support processing of IGVM files, these functions need to be implemented by relevant isolation hardware support code such as SEV. This commit implements the required functions for SEV-ES and adds support for processing IGVM files for configuring the guest. Signed-off-by: Roy Hopkins <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Acked-by: Stefano Garzarella <[email protected]>

IGVM support has been implemented for Confidential Guests that support AMD SEV and AMD SEV-ES. Add some documentation that gives some background on the IGVM format and how to use it to configure a confidential guest. Signed-off-by: Roy Hopkins <[email protected]> Reviewed-by: Daniel P. Berrangé <[email protected]> Reviewed-by: Stefano Garzarella <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]>

Create an enum entry within FirmwareDevice for 'igvm' to describe that an IGVM file can be used to map firmware into memory as an alternative to pre-existing firmware devices. Signed-off-by: Roy Hopkins <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Stefano Garzarella <[email protected]>

For confidential guests a policy can be provided that defines the security level, debug status, expected launch measurement and other parameters that define the configuration of the confidential platform. This commit adds a new function named set_guest_policy() that can be implemented by each confidential platform, such as AMD SEV to set the policy. This will allow configuration of the policy from a multi-platform resource such as an IGVM file without the IGVM processor requiring specific implementation details for each platform. Signed-off-by: Roy Hopkins <[email protected]> Reviewed-by: Daniel P. Berrangé <[email protected]> Reviewed-by: Stefano Garzarella <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]>

The initialization sections in IGVM files contain configuration that should be applied to the guest platform before it is started. This includes guest policy and other information that can affect the security level and the startup measurement of a guest. This commit introduces handling of the initialization sections during processing of the IGVM file. Signed-off-by: Roy Hopkins <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Stefano Garzarella <[email protected]>

Adds a handler for the guest policy initialization IGVM section and builds an SEV policy based on this information and the ID block directive if present. The policy is applied using by calling 'set_guest_policy()' on the ConfidentialGuestSupport object. Signed-off-by: Roy Hopkins <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Acked-by: Stefano Garzarella <[email protected]>

The new cgs_set_guest_policy() function is provided to receive the guest policy flags, SNP ID block and SNP ID authentication from guest configuration such as an IGVM file and apply it to the platform prior to launching the guest. The policy is used to populate values for the existing 'policy', 'id_block' and 'id_auth' parameters. When provided, the guest policy is applied and the ID block configuration is used to verify the launch measurement and signatures. The guest is only successfully started if the expected launch measurements match the actual measurements and the signatures are valid. Signed-off-by: Roy Hopkins <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Acked-by: Stefano Garzarella <[email protected]>

IGVM files can contain an initial VMSA that should be applied to each vcpu as part of the initial guest state. The sev_features flags are provided as part of the VMSA structure. However, KVM only allows sev_features to be set during initialization and not as the guest is being prepared for launch. This patch queries KVM for the supported set of sev_features flags and processes the IGVM file during kvm_init to determine any sev_features flags set in the IGVM file. These are then provided in the call to KVM_SEV_INIT2 to ensure the guest state matches that specified in the IGVM file. This does cause the IGVM file to be processed twice. Firstly to extract the sev_features then secondly to actually configure the guest. However, the first pass is largely ignored meaning the overhead is minimal. Signed-off-by: Roy Hopkins <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Acked-by: Stefano Garzarella <[email protected]>

Signed-off-by: Roy Hopkins <[email protected]>

Previously the VMSA could not be set directly. Instead the current CPU state was automatically populated into a VMSA within kvm as part of KVM_SEV_SNP_LAUNCH_FINISH. This meant that it was hard to ensure the VMSA provided by IGVM matched the resulting one in kvm. KVM has been updated to allow the VMSA to be provided via KVM_SEV_SNP_LAUNCH_UPDATE. In this case, kvm does not perform any specific synchronisation during FINISH and the VMSA is guaranteed to match that provided by QEMU. Signed-off-by: Roy Hopkins <[email protected]>

ribalda and others added 30 commits September 11, 2024 09:46

tests/acpi: pc: update golden masters for DSDT

a6896eb

Signed-off-by: Ricardo Ribalda <[email protected]> Message-Id: <[email protected]> Acked-by: Igor Mammedov <[email protected]> Reviewed-by: Michael S. Tsirkin <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>

target/sparc: Add FSR_QNE to tb_flags

5a165e2

Signed-off-by: Richard Henderson <[email protected]> Reviewed-by: Philippe Mathieu-Daudé <[email protected]> Acked-by: Mark Cave-Ayland <[email protected]> Tested-by: Carl Hauser <[email protected]>

stsquad and others added 28 commits September 19, 2024 15:58

linux-headers: update to 6.11

33cbd8b

Signed-off-by: Roy Hopkins <[email protected]>

This was referenced Oct 16, 2024

Update to QEMU 9.0 including IGVM v4 patch series + direct VMSA #15

Closed

Update to linux-6.11 with SVSM support coconut-svsm/linux#7

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update to QEMU 9.0 including IGVM v6 patch series + direct VMSA #16

Update to QEMU 9.0 including IGVM v6 patch series + direct VMSA #16

roy-hopkins commented Oct 16, 2024

Update to QEMU 9.0 including IGVM v6 patch series + direct VMSA #16

Are you sure you want to change the base?

Update to QEMU 9.0 including IGVM v6 patch series + direct VMSA #16

Conversation

roy-hopkins commented Oct 16, 2024

Overview

Compatibility

Reason for direct VMSA update

New Command line