.. _UEFI_supported_virtualization_label:

UEFI Based KVM Virtualization
=============================

.. important::
   This feature might not be applicable to all Platforms. Please check the
   individual Platform pages, section **Supported Features**, to confirm
   whether this feature is listed as supported.

Overview of Virtualization support
----------------------------------

Neoverse reference platforms support virtualization by providing
architectural support for the AArch64 Virtualization Host Extensions (VHE).
The reference platform software stack uses the Linux Kernel-based Virtual
Machine (KVM) as the hypervisor and the userspace program kvmtool as the
Virtual Machine Manager (VMM) to leverage this hardware feature.

The :ref:`Virtualization document ` describes how to validate
virtualization on Neoverse reference platforms using a buildroot filesystem
with Linux as the guest operating system. That setup helps in validating
the architectural features, but it lacks firmware support for booting the
platform. Booting a full-fledged Linux distribution operating system (OS)
such as Fedora or Ubuntu with UEFI firmware and the GRUB bootloader as the
guest OS helps in validating more realistic virtualization use-cases. This
setup also provides support for ACPI table based platform resource control.

Objective
---------

The purpose of validating virtualization with a Linux distribution is to
prepare virtual machines (VMs) on a host system that allow booting multiple
guest operating systems running Linux distributions such as Ubuntu or
Fedora with UEFI firmware support. The virtualized platform is prepared and
launched using the KVM module of the host Linux kernel and *kvmtool*, a
standalone userspace tool. *kvmtool* allows booting either directly from a
kernel or from a firmware, where the firmware initiates the bootloader for
the Linux distro OS boot. Firmware based booting allows the inclusion of
ACPI tables to communicate the hardware information to the OS and perform
resource control.

The firmware is built with the UEFI EDK2 *ArmVirtKvmTool* platform
descriptor from the *ArmVirtPkg* EDK2 package. ArmVirtKvmTool relies on the
*DynamicTablesPkg* EDK2 package to dynamically produce ACPI tables from the
device tree blob (dtb). The *DynamicTablesPkg* parses the hardware
information from the dtb that kvmtool prepares for the spawned VMs.

The spawned virtual machine emulates the hardware necessary for the guest
to run. This hardware support includes, but is not limited to:

- Processor (vCPUs)
- Interrupt controller (e.g. gic-v3, gic-v3-its)
- Main memory or RAM
- Timer (e.g. armv8/7-timer)
- Flash memory (e.g. cfi-flash) required by the UEFI firmware
- UART controller (e.g. uart-16550) to set up console devices
- Real time clock (e.g. motorola,mc146818)
- Block and net devices for disk access and network support, both of which
  are realised using virtio devices

It is important to note that for this validation all the virtio devices
(block and net devices) use PCI as their underlying transport mechanism and
are thus enumerated as PCI endpoint devices.

Overview of ArmVirtKvmTool
--------------------------

The ArmVirtKvmTool firmware is specifically designed to initialize the
hardware that kvmtool describes through a device tree during the VM launch.
ArmVirtKvmTool supports multiple libraries corresponding to the hardware
devices emulated by kvmtool, e.g. flash memory, UART, RTC, timer, PCI and
virtio devices.
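As an illustration of this device tree driven flow, kvmtool can dump the
dtb it generates for a VM (via the ``--dump-dtb`` option, where the kvmtool
build supports it), which can then be decompiled for inspection. This is a
hedged sketch: the file names assume the firmware and guest disk image
prepared later in this document, and the VM starts booting alongside the
dump:

::

  # Dump the generated dtb while launching a VM (option availability
  # depends on the kvmtool build), then decompile it with the device
  # tree compiler to see the cpus, gic, flash, uart, rtc and pci nodes
  # that the firmware parses. The VM also starts booting; stop it once
  # the dtb file has been written.
  ./lkvm run --dump-dtb vm.dtb -f KVMTOOL_EFI.bin -d guest-ubuntu.satadisk
  dtc -I dtb -O dts -o vm.dts vm.dtb
  grep -E "uart|rtc|pci|flash" vm.dts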
A few common devices that require initialization by the firmware are parsed
through the flattened device tree (fdt) library. The firmware also makes
use of the *KvmtoolVirtMemInfoLib* library to create a system memory map
before performing the hardware initialization. The ArmVirtKvmTool platform
descriptor is originally based on *ArmVirtPkg* and borrows various base
libraries to implement the pre-PI and DXE stage drivers.

EDK2 supports handling ACPI tables, which are passed to the OS after the
firmware exits the BDS stage. But as kvmtool provides the hardware
information as a dtb and not as ACPI tables, another EDK2 package,
*DynamicTablesPkg*, is used to dynamically parse the dtb and generate the
appropriate ACPI tables. *ArmVirtKvmTool* implements a configuration
manager protocol that holds a platform info repository. The fdt hardware
parser from *DynamicTablesPkg* is used to collect all the platform details
as Arm CmObjects and then to communicate these objects to the table factory
of *DynamicTablesPkg*. The table factory obtains a rich set of ACPI table
generators from the main table manager and sequentially invokes each
generator to create a table. The supported list of generators includes
DBG2, FADT, GTDT, IORT, MADT, MCFG, PPTT, SPCR and many more.

It is equally important to align the firmware input with the environment
created by *kvmtool* with the help of KVM. Refer to the
:ref:`Virtualization document ` for more details on configuring kvmtool for
the required virtual platform.

Build the platform software
---------------------------

.. note::
   This section assumes the user has completed the chapter
   :doc:`Getting Started ` and has a functional working environment.

This section describes the procedure to prepare the necessary setup to
validate UEFI firmware based booting of Linux distributions on the virtual
machines. The following software packages from the Neoverse reference
platform software stack are needed for the validation:

- ArmVirtKvmTool based firmware (built as part of the UEFI build)
- Kvmtool VMM

Skip this section if a :ref:`Buildroot ` or :ref:`Busybox ` build has
already been performed for the platform software stack, as the
``ArmVirtKvmTool`` UEFI firmware and ``kvmtool`` binaries are already
built.

- Build the UEFI firmware for the host and for the guest OS
  (``ArmVirtKvmTool``) by running the appropriate script from the software
  stack:

  ::

    ./build-scripts/build-test-uefi.sh -p <platform> <command>

  Supported command line options are listed below.

  .. list-table::

     * - ``<platform>``
       - Lookup for a platform name in :ref:`Platform Names `.
     * - ``<command>``
       - Supported commands are

         - ``clean``
         - ``build``
         - ``package``
         - ``all`` (all of the three above)

  Examples of the build command are:

  - Command to clean, build and package the software stack needed for the
    UEFI firmware on the RD-N2-Cfg1 platform:

    ::

      ./build-scripts/build-test-uefi.sh -p rdn2cfg1 all

- Lastly, build the userspace hypervisor program ``kvmtool``:

  ::

    ./build-scripts/build-kvmtool.sh -p <platform> clean
    ./build-scripts/build-kvmtool.sh -p <platform> build
    ./build-scripts/build-kvmtool.sh -p <platform> package

  .. list-table::

     * - ``<platform>``
       - Lookup for a platform name in :ref:`Platform Names `.

  For example, to build kvmtool for the rdn2cfg1 platform, use the below
  commands:

  ::

    ./build-scripts/build-kvmtool.sh -p rdn2cfg1 clean
    ./build-scripts/build-kvmtool.sh -p rdn2cfg1 build
    ./build-scripts/build-kvmtool.sh -p rdn2cfg1 package

Setup Satadisk Images
---------------------

To use Linux distributions as the host and guest OS, create disk images by
following the guidelines in the :ref:`Distro Boot ` document. Either Ubuntu
or Fedora can serve as the host OS, with multiple distributions as guests.
It is important to remember, however, that the host disk image should be
large enough to hold multiple guest disk images, e.g. a host image of
~32GiB holding multiple Ubuntu/Fedora guest images of ~6GiB each. The guest
disk images are used later to run the KVM session.
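As a minimal sketch of the image preparation, sparse files of the
recommended sizes can be pre-allocated with ``truncate`` before the
distributions are installed onto them; the :ref:`Distro Boot ` document
remains the authoritative flow, and the file names below simply match the
examples used later in this document:

::

  # ~32GiB sparse file for the host install, ~6GiB for each guest install.
  truncate -s 32G ubuntu.satadisk
  truncate -s 6G  guest-ubuntu.satadisk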
.. note::
   For simplicity, where the setup instructions are distro-specific, they
   are given for the Ubuntu 22.04 host OS.

Booting the platform for validation
-----------------------------------

Boot Host OS
^^^^^^^^^^^^

- Boot the host satadisk image on the FVP with network enabled, as
  described in :ref:`Distro Boot `. For example, to boot Ubuntu as the host
  OS, give the following command to begin the distro boot from the
  ``ubuntu.satadisk`` image:

  ::

    ./distro.sh -p rdn2cfg1 -d /absolute/path/to/ubuntu.satadisk -n true

- After booting the host OS, verify that KVM and virtualization support are
  enabled. Each Linux distro has different ways to verify this, but it can
  also be confirmed by looking at the kernel boot logs:

  ::

    dmesg | grep -i "kvm"

  The above command prints the KVM related boot logs, which should be
  similar to the logs shown below:

  ::

    kvm [1]: IPA Size Limit: 48 bits
    kvm [1]: GICv4 support disabled
    kvm [1]: GICv3: no GICV resource entry
    kvm [1]: disabling GICv2 emulation
    kvm [1]: GIC system register CPU interface enabled
    kvm [1]: vgic interrupt IRQ1
    kvm [1]: VHE mode initialized successfully

  Also make sure ``/dev/kvm`` exists. If any of these checks fail, follow
  the solutions mentioned in the sections below.

Network Support
^^^^^^^^^^^^^^^

- Check if the host OS has network access by running ``ping -c 5 8.8.8.8``.
  If the ping does not work because the network is unreachable, enable it
  using the ``dhclient`` utility for DHCP discovery on the host OS:

  ::

    sudo dhclient -v

- Check the available network interfaces on the host with the below
  command:

  ::

    ip link show

  Check if the above command shows a virtual bridge ``virbr#`` already
  configured and running on the host. This virtual bridge will help in
  giving network access to the guest OS.

- If the KVM support or the virtual bridge could not be found, try the
  below commands. For more details, refer to the instructions in the
  `Ubuntu KVM Installation guide`_ to resolve any issues.

  ::

    sudo apt update
    sudo apt install qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils libfdt-dev -y

- Now start the ``libvirtd`` service to initiate the communication between
  KVM and the libvirt APIs. Use the below commands to configure the system
  to start the service at every boot:

  ::

    sudo systemctl start libvirtd
    sudo systemctl enable libvirtd

- Network access to the guest OS can be provided by creating a bridge and a
  tap interface. Follow the commands shown below to create the tap
  interface and add it to the virtual bridge ``virbr#`` as listed by
  ``ip link show``:

  ::

    sudo ip tuntap add dev tap0 mode tap user $(whoami)
    sudo ip link set tap0 master virbr# up
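To confirm that the tap interface was created and enslaved to the bridge,
the link state can be inspected as a quick sanity check (replace ``virbr0``
with the actual bridge name reported by ``ip link show``):

::

  ip link show tap0             # the output should contain "master virbr0"
  ip link show master virbr0    # lists the interfaces attached to the bridge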
Now create a workspace to begin with the virtualization test example:

::

  mkdir -p ~/kvm-test/
  cd ~/kvm-test/

Emulate Flash Memory
^^^^^^^^^^^^^^^^^^^^

The ``ArmVirtKvmTool`` UEFI firmware needs a flash memory during boot to
store various objects. Create an empty, zero-filled flash memory file which
kvmtool will present as a flash device to the UEFI firmware and guest OS:

::

  dd if=/dev/zero of=efivar.img bs=128M count=1

Enable PCIe pass-through based device virtualization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

As mentioned in the :ref:`Virtualization document `, PCIe pass-through
(also called direct device assignment) allows a device to be assigned to a
guest such that the guest runs the driver for the device without
intervention from the hypervisor/host. This is one of the device
virtualization techniques and provides near-host device performance. It is
achieved with the help of the VFIO driver framework and IOMMU support. More
about this can be read in `Linux vfio`_.

- Neoverse reference platforms have a few smmu-test-engine devices, which
  are PCIe endpoint devices that can be used to demonstrate this feature.
  Use the verbose ``lspci`` command to check the status of these devices,
  for example, with PCI BDF ids 08:00.0 and 08:00.1:

  ::

    sudo lspci -v
    sudo lspci -v -s 0000:08:00.1

- Check whether the ``vfio_pci`` kernel module is already loaded:

  ::

    lsmod | grep -i "vfio"

  If not, manually probe the kernel driver module:

  ::

    sudo modprobe vfio-pci

- Unbind the PCI endpoint device from its current driver if the device is
  attached to its class driver. If no such driver exists, ignore the error
  produced on running the below command:

  ::

    echo "0000:08:00.1" | sudo tee /sys/bus/pci/devices/0000\:08\:00.1/driver/unbind

- Bind the device to the ``vfio-pci`` driver:

  ::

    echo "vfio-pci" | sudo tee /sys/bus/pci/devices/0000\:08\:00.1/driver_override
    echo "0000:08:00.1" | sudo tee /sys/bus/pci/drivers_probe

- Confirm that the device has been attached to the ``vfio-pci`` driver:

  ::

    sudo lspci -v -s 0000:08:00.1 | grep -i "Kernel driver"

- To use the device for direct assignment, all the devices sharing the
  IOMMU group with this particular device must be attached to the
  ``vfio-pci`` driver. So perform the above unbinding and binding steps for
  all the endpoint devices that share the common IOMMU group. List all the
  devices that are under that specific IOMMU group:

  ::

    ls /sys/bus/pci/drivers/vfio-pci/0000\:08\:00.1/iommu_group/devices/

Obtain the built binaries
^^^^^^^^^^^^^^^^^^^^^^^^^

- Running the KVM session requires the ``ArmVirtKvmTool`` UEFI firmware, a
  guest disk image with a pre-installed Linux distro OS, and the
  ``kvmtool`` binary, all obtained in section `Build the platform
  software`_. Copy these to the host OS over the network using the below
  commands in the workspace directory ``kvm-test``:

  ::

    rsync -Wa --progress user@server:absolute/path/to/guest-ubuntu.satadisk .
    rsync -Wa --progress user@server:TOP_DIR/output//components/css-common/KVMTOOL_EFI.bin .
    rsync -Wa --progress user@server:TOP_DIR/output//components/kvmtool/lkvm .

Launch VMs with multiple Linux distributions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Finally, launch the virtual machine with a Linux distribution image as the
guest OS. As mentioned in the :ref:`Virtualization document `, the
``screen`` utility can be used to multiplex console outputs.

.. note::
   To switch back to the host session, detach from the screen by pressing
   ``ctrl+a d``.
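Before launching the guest, a quick optional check confirms that the copied
``lkvm`` binary is executable on the host and that ``/dev/kvm`` is
accessible (the availability of the ``version`` subcommand may vary with
the kvmtool build):

::

  chmod +x ./lkvm
  ./lkvm version      # prints the kvmtool version string, if supported
  ls -l /dev/kvm      # the device node must exist for a KVM session to start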
Run the below command from the ``kvm-test`` workspace directory to start a
KVM session with the ArmVirtKvmTool binary ``KVMTOOL_EFI.bin``, the kvmtool
binary ``lkvm``, the flash image ``efivar.img``, the distribution disk
image for the guest ``guest-ubuntu.satadisk``, the ``tap0`` tap interface
and the PCI device with requester-ID (BDF) ``0000:08:00.1`` used for direct
device assignment:

::

  screen -md -S "virt0" sudo ./lkvm run -m 2048 -f KVMTOOL_EFI.bin -F efivar.img -d guest-ubuntu.satadisk -n tapif=tap0 --console serial --force-pci --vfio-pci 0000:08:00.1 --disable-mte

- The launched screens can be viewed from the target by using the following
  command:

  ::

    screen -ls

- Jump to the screen using:

  ::

    screen -r virt0

- The guest can be seen booting with logs as shown below:

  ::

    # lkvm run --firmware ./KVMTOOL_EFI.bin -m 2048 -c 4 --name guest-3882
      Info: Using IOMMU type 3 for VFIO container
      Info: 0000:08:00.1: assigned to device number 0x0 in group 3
      Info: flash file size (134217728 bytes) is not a power of two
      Info: only using first 16777216 bytes
    UEFI firmware (version built at 14:51:31 on Apr 4 2022)

- Notice the logs about the PCIe device being set up using the Linux VFIO
  driver:

  ::

    Info: Using IOMMU type 3 for VFIO container
    Info: 0000:08:00.1: assigned to device number 0x0 in group 9

- Once the guest has booted, check that the network is accessible and that
  the assigned PCI device is listed in ``lspci``:

  ::

    # If network is unreachable use dhclient:
    sudo dhclient -v
    ping -c 2 8.8.8.8

    # Check the listed PCI devices
    lspci

    # Output of lspci
    00:00.0 Unassigned class [ff00]: ARM Device ff80

- To shut down the guest, execute the following command:

  ::

    sudo poweroff

  On completion of the guest shutdown, ``kvmtool`` prints a message
  denoting an error-free closing of the KVM session:

  ::

    # KVM session ended normally.

.. _Linux vfio: https://www.kernel.org/doc/Documentation/driver-api/vfio.rst
.. _Ubuntu KVM Installation guide: https://help.ubuntu.com/community/KVM/Installation
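Once the KVM session has ended, the tap interface created earlier can
optionally be removed from the host as a cleanup step (adjust the interface
name if a different one was used):

::

  sudo ip link del tap0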