UEFI Based KVM Virtualization
Important
This feature might not be applicable to all platforms. Please check the individual Platform pages, section Supported Features, to confirm whether this feature is listed as supported.
Overview of Virtualization support
Neoverse reference platforms support virtualization by providing architectural support for the AArch64 Virtualization Host Extensions (VHE). The reference platform software stack uses the Linux Kernel-based Virtual Machine (KVM) as the hypervisor and the userspace program kvmtool as the virtual machine manager (VMM) to leverage this hardware feature. The Virtualization document describes how to validate virtualization on Neoverse reference platforms using a buildroot filesystem with Linux as the guest operating system. That setup helps in validating the architectural features, but lacks firmware support for booting the platform. Booting a full-fledged Linux distribution operating system (OS) such as Fedora or Ubuntu with UEFI firmware and the grub bootloader as the guest OS helps validate more realistic virtualization use cases. This setup also supports ACPI table based platform resource control.
Objective
The purpose of validating virtualization with a Linux distribution is to prepare virtual machines (VMs) on a host system that can boot multiple guest operating systems running Linux distributions such as Ubuntu or Fedora with UEFI firmware support. The virtualized platform is prepared and launched using the KVM module of the host Linux kernel and kvmtool, a standalone userspace tool. kvmtool allows booting either directly from a kernel or from a firmware, where the firmware initiates the bootloader for the Linux distro OS boot. Firmware based booting allows the inclusion of ACPI tables to communicate the hardware information to the OS and perform resource control. The firmware is built with the UEFI EDK2 ArmVirtKvmTool platform descriptor from the ArmVirtPkg EDK2 package. ArmVirtKvmTool uses the DynamicTablesPkg EDK2 package to dynamically produce ACPI tables from the device tree blob (dtb). DynamicTablesPkg parses the hardware information from the dtb that kvmtool prepares for the spawned VMs.
The spawned virtual machine emulates the hardware necessary for the guest to run. This hardware support includes, but is not limited to:
Processor (vCPUs)
Interrupt controller (e.g. gic-v3, gic-v3-its)
Main memory or RAM
Timer (e.g. armv8/7-timer)
Flash memory (e.g. cfi-flash) required by UEFI firmware
UART controller (e.g. uart-16550) to set up console devices
Real time clock (e.g. motorola,mc146818)
Block and net devices for disk access and network support, both of which are realised using virtio devices.
It is important to note that, for this validation, all the virtio devices (block and net devices) use PCI as their underlying transport mechanism and are thus enumerated as PCI endpoint devices.
Overview of ArmVirtKvmTool
The ArmVirtKvmTool firmware is specifically designed to initialize the hardware (h/w) that kvmtool describes using a device tree during the VM launch. ArmVirtKvmTool supports multiple libraries corresponding to the hardware devices emulated by kvmtool, e.g. flash memory, UART, RTC, timer, PCI and virtio devices. A few common devices that require initialization by the firmware are parsed through the flattened device tree (fdt) library. The firmware also makes use of the KvmtoolVirtMemInfoLib library to create a system memory map before doing the h/w initialization. The ArmVirtKvmTool platform descriptor is originally based on ArmVirtPkg and borrows various base libraries to implement the pre-PI and DXE stage drivers.
EDK2 supports handling ACPI tables, which are passed to the OS after the firmware exits from the BDS stage. But as kvmtool provides the h/w information as a dtb and not as ACPI tables, another EDK2 package, DynamicTablesPkg, is used to dynamically parse the dtb and generate the appropriate ACPI tables. ArmVirtKvmTool implements a configuration manager protocol that holds a platform information repository. The fdt hardware parser from DynamicTablesPkg is used to collect all the platform details as Arm CmObjects and to communicate these objects to the table factory of DynamicTablesPkg. The table factory obtains a rich set of ACPI table generators from the main table manager and sequentially invokes each generator to create a table. The supported list of generators includes DBG2, FADT, GTDT, IORT, MADT, MCFG, PPTT, SPCR and many more.
It is equally important to align the firmware input with the environment created by kvmtool with the help of KVM. Refer to the Virtualization document for more details on configuring kvmtool for the required virtual platform.
Build the platform software
Note
This section assumes the user has completed the chapter Getting Started and has a functional working environment.
This section describes the procedure to prepare the setup necessary to validate UEFI firmware based booting of Linux distributions on virtual machines. The following software packages from the Neoverse reference platform software stack are needed for the validation:
ArmVirtKvmTool based firmware (built as part of UEFI build)
Kvmtool VMM
Skip this section if a Buildroot or Busybox build has already been performed for the platform software stack, as the ArmVirtKvmTool UEFI firmware and kvmtool binaries are already built.
Build the UEFI firmware for the host and for the guest OS (ArmVirtKvmTool) by running the appropriate script from the software stack:

./build-scripts/build-test-uefi.sh -p <platform name> <command>
Supported command line options are listed below
<platform name>
Look up the platform name in Platform Names.
<command>
Supported commands are
clean
build
package
all
(runs all three of the above)
For example, to clean, build and package the software stack needed for the UEFI firmware on the RD-N2-Cfg1 platform:
./build-scripts/build-test-uefi.sh -p rdn2cfg1 all
Lastly, build the userspace hypervisor program kvmtool:

./build-scripts/build-kvmtool.sh -p <platform name> clean
./build-scripts/build-kvmtool.sh -p <platform name> build
./build-scripts/build-kvmtool.sh -p <platform name> package
<platform name>
Look up the platform name in Platform Names.
For example, to build kvmtool for the rdn2cfg1 platform, use the commands below:
./build-scripts/build-kvmtool.sh -p rdn2cfg1 clean
./build-scripts/build-kvmtool.sh -p rdn2cfg1 build
./build-scripts/build-kvmtool.sh -p rdn2cfg1 package
Set up Satadisk Images
To use Linux distributions as the host and guest OS, create disk images by following the guidelines in the Distro Boot document. The host OS can be Ubuntu or Fedora, with multiple distributions as guests. It is important to remember, however, that the host disk image should be large enough to hold multiple guest disk images, e.g. a host image of ~32 GiB and multiple Ubuntu/Fedora guest images of ~6 GiB each. The guest disk images are used later to run the KVM session.
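As a rough sizing sketch, the number of guest images a host disk can hold follows from simple arithmetic on the example figures above (~32 GiB host, ~6 GiB per guest). The 10 GiB reserved for the host install itself is an assumption for illustration, not a figure from this document.

```shell
# Back-of-the-envelope sizing for the host satadisk image.
host_gib=32       # example host image size from this document
host_os_gib=10    # assumption: space consumed by the host install itself
guest_gib=6       # example per-guest image size from this document
max_guests=$(( (host_gib - host_os_gib) / guest_gib ))
echo "a ${host_gib} GiB host image holds about ${max_guests} guest images"
```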
Note
For simplicity, where the setup instructions are distro-specific, they are given for the Ubuntu 22.04 host OS.
Booting the platform for validation
Boot Host OS
Boot the host satadisk image on the FVP with networking enabled, as described in Distro Boot. For example, to boot Ubuntu as the host OS, give the following command to begin the distro boot from the ubuntu.satadisk image:

./distro.sh -p rdn2cfg1 -d /absolute/path/to/ubuntu.satadisk -n true
After booting the host OS, verify that KVM and virtualization support is enabled. Each Linux distro has a different way to verify this, but it is also possible to confirm it by looking at the kernel boot logs.
dmesg | grep -i "kvm"
The above command prints the KVM related boot logs, which should be similar to those shown below:
kvm [1]: IPA Size Limit: 48 bits
kvm [1]: GICv4 support disabled
kvm [1]: GICv3: no GICV resource entry
kvm [1]: disabling GICv2 emulation
kvm [1]: GIC system register CPU interface enabled
kvm [1]: vgic interrupt IRQ1
kvm [1]: VHE mode initialized successfully
Also make sure /dev/kvm exists. If any of these conditions is not met, follow the solutions mentioned in the sections below.
Network Support
Check if the host OS has network access by running ping -c 5 8.8.8.8. If the ping does not work because the network is unreachable, enable it using the dhclient utility for DHCP discovery on the host OS:

sudo dhclient -v
Check the available network interfaces on the host with the command below:
ip link show
Check if the above command shows a virtual bridge virbr# already configured and running on the host. This virtual bridge will help in giving network access to the guest OS.

If the KVM support or the virtual bridge cannot be found, try the below commands. For more details, refer to the instructions in the Ubuntu KVM Installation guide to resolve any issues.
sudo apt update
sudo apt install qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils libfdt-dev -y
Now start the libvirtd service to initiate the communication between KVM and the libvirt APIs. Use the below commands to configure the system to start the service at every boot:

sudo systemctl start libvirtd
sudo systemctl enable libvirtd
Network access to the guest OS can be given by creating a bridge and a tap interface. Follow the commands shown below to create the tap interface and add it to the virtual bridge virbr# as listed in the output of ip link show:

sudo ip tuntap add dev tap0 mode tap user $(whoami)
sudo ip link set tap0 master virbr# up
Now create a workspace to begin the virtualization test example.
mkdir -p ~/kvm-test/
cd ~/kvm-test/
Emulate Flash Memory
The ArmVirtKvmTool UEFI firmware needs a flash memory while booting to store various objects. Create an empty, zero-filled flash memory file which kvmtool will present as a flash device to the UEFI firmware and guest OS:
dd if=/dev/zero of=efivar.img bs=128M count=1
Enable PCIe pass-through based device virtualization
As mentioned in the Virtualization document, PCIe pass-through (also called direct device assignment) allows a device to be assigned to a guest such that the guest runs the driver for the device without intervention from the hypervisor/host. This is a device virtualization technique that provides near-host device performance. It is achieved with the help of the VFIO driver framework and IOMMU support. More about this can be read in Linux vfio.
Neoverse reference platforms have a few smmu-test-engine devices, which are PCIe endpoint devices that can be used to demonstrate this feature. Use the verbose lspci command to check the status of these devices, for example with PCI BDF ids 08:00.0 and 08:00.1:

sudo lspci -v
sudo lspci -v -s 0000:08:00.1
Check if the vfio_pci kernel module is already loaded:

lsmod | grep -i "vfio"
If not, then manually probe the kernel driver module:
sudo modprobe vfio-pci
Unbind the PCI endpoint device from its current driver if the device is attached to its class driver. If the driver does not exist, ignore the error produced on running the below command:
echo "0000:08:00.1" | sudo tee /sys/bus/pci/devices/0000\:08\:00.1/driver/unbind
Bind the device to the vfio-pci driver:

echo "vfio-pci" | sudo tee /sys/bus/pci/devices/0000\:08\:00.1/driver_override
echo "0000:08:00.1" | sudo tee /sys/bus/pci/drivers_probe
Confirm that the device has been attached to the vfio-pci driver:

sudo lspci -v -s 0000:08:00.1 | grep -i "Kernel driver"
In order to use the device for direct assignment, all the devices sharing the IOMMU group with this particular device must be attached to the vfio-pci driver. So perform the above-mentioned unbinding and binding for all the endpoint devices that share the common IOMMU group. List all the devices that are under that specific IOMMU group:

ls /sys/bus/pci/drivers/vfio-pci/0000\:08\:00.1/iommu_group/devices/
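The per-device unbind and bind steps above can be applied to every member of an IOMMU group with a small loop. The sketch below is illustrative and not part of the platform scripts: the function name is made up for this example, and the sysfs root is passed as a parameter so the flow can be dry-run against a mock directory tree; on a real host it would be /sys/bus/pci and the writes would require root.

```shell
# Sketch (illustrative helper, assumed names): bind every PCI device in
# an IOMMU group to vfio-pci by writing driver_override and re-probing.
bind_group_to_vfio() {
    sysfs_root=$1      # /sys/bus/pci on a real host
    group_devices=$2   # .../<BDF>/iommu_group/devices directory
    for dev in "$group_devices"/*; do
        bdf=$(basename "$dev")
        # Detach the device from its current class driver, if one is bound
        if [ -e "$sysfs_root/devices/$bdf/driver/unbind" ]; then
            echo "$bdf" > "$sysfs_root/devices/$bdf/driver/unbind"
        fi
        # Route the device to vfio-pci, then ask the kernel to re-probe it
        echo "vfio-pci" > "$sysfs_root/devices/$bdf/driver_override"
        echo "$bdf" > "$sysfs_root/drivers_probe"
    done
}
```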
Obtain the built binaries
Running the KVM session requires the ArmVirtKvmTool UEFI firmware, a guest disk image with a pre-installed Linux distro OS, and the kvmtool binary, all of which were obtained in section Build the platform software. Copy these to the host OS over the network using the below commands in the workspace directory kvm-test:

rsync -Wa --progress user@server:absolute/path/to/guest-ubuntu.satadisk .
rsync -Wa --progress user@server:TOP_DIR/output/<platform name>/components/css-common/KVMTOOL_EFI.bin .
rsync -Wa --progress user@server:TOP_DIR/output/<platform name>/components/kvmtool/lkvm .
Launch VMs with multiple Linux distributions
Finally, launch the virtual machine with a Linux distribution image as the guest OS. As mentioned in the Virtualization document, the ‘screen’ utility can be used to multiplex console outputs.
Note
To switch back to the host session, detach from the screen by pressing ctrl+a d.
Run the below command from the kvm-test workspace directory to start a KVM session with the ArmVirtKvmTool binary KVMTOOL_EFI.bin, the kvmtool binary lkvm, the flash image efivar.img, the distribution disk image for the guest guest-ubuntu.satadisk, the tap0 tap interface and the PCI device with requester ID (BDF) 0000:08:00.1 used for direct device assignment:

screen -md -S "virt0" sudo ./lkvm run -m 2048 -f KVMTOOL_EFI.bin -F efivar.img -d guest-ubuntu.satadisk -n tapif=tap0 --console serial --force-pci --vfio-pci 0000:08:00.1 --disable-mte
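The launch command packs many flags into one line. As a readability aid, the same invocation can be assembled from named variables, one per flag; the values are the ones used in this document, and lkvm itself is not executed in this sketch.

```shell
# Each flag of the lkvm invocation, bound to a named variable.
FW=KVMTOOL_EFI.bin            # -f : ArmVirtKvmTool UEFI firmware image
FLASH=efivar.img              # -F : flash backing file for EFI variables
DISK=guest-ubuntu.satadisk    # -d : guest distro disk image
TAP=tap0                      # -n tapif= : host tap interface for guest networking
BDF=0000:08:00.1              # --vfio-pci : PCI device for direct assignment
CMD="./lkvm run -m 2048 -f $FW -F $FLASH -d $DISK -n tapif=$TAP --console serial --force-pci --vfio-pci $BDF --disable-mte"
echo "$CMD"
```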
The launched screens can be viewed from the target by using the following command:
screen -ls
Jump to the screen using:
screen -r virt0
The guest can be seen booting with logs as shown below:
# lkvm run --firmware ./KVMTOOL_EFI.bin -m 2048 -c 4 --name guest-3882
Info: Using IOMMU type 3 for VFIO container
Info: 0000:08:00.1: assigned to device number 0x0 in group 3
Info: flash file size (134217728 bytes) is not a power of two
Info: only using first 16777216 bytes
UEFI firmware (version  built at 14:51:31 on Apr 4 2022)
Notice the logs about the PCIe device being set up using the Linux VFIO driver:
Info: Using IOMMU type 3 for VFIO container
Info: 0000:08:00.1: assigned to device number 0x0 in group 9
Once the guest has booted, check if the network is accessible and the assigned PCI device is listed in lspci:

# If network is unreachable use dhclient:
sudo dhclient -v
ping -c 2 8.8.8.8

# Check the listed PCI devices
lspci

# Output of lspci
00:00.0 Unassigned class [ff00]: ARM Device ff80
To shut down the guest, execute the following command:
sudo poweroff
On completion of the guest shutdown, kvmtool prints a message denoting error-free closing of the KVM session:

# KVM session ended normally.