.. SPDX-License-Identifier: GPL-2.0

Protected virtual machines (pKVM)
=================================

Introduction
------------

Protected KVM (pKVM) is a KVM/arm64 extension which uses the two-stage
translation capability of the Armv8 MMU to isolate guest memory from the host
system. This allows for the creation of a confidential computing environment
without relying on whizz-bang features in hardware, while still leaving room
for complementary technologies such as memory encryption and hardware-backed
attestation.

The major implementation change brought about by pKVM is that the hypervisor
code running at EL2 is now largely independent of (and isolated from) the rest
of the host kernel running at EL1. Additional hypercalls are therefore
introduced to manage manipulation of guest stage-2 page tables, creation of VM
data structures and reclamation of memory on teardown. An immediate consequence
of this change is that the host itself runs with an identity mapping enabled
at stage-2, providing the hypervisor code with a mechanism to restrict host
access to an arbitrary physical page.

Enabling pKVM
-------------

The pKVM hypervisor is enabled by booting the host kernel at EL2 with
"``kvm-arm.mode=protected``" on the command line. Once enabled, VMs can be
spawned in either protected or non-protected state, although the hypervisor is
still responsible for managing most of the VM metadata in either case.
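
As a rough illustration, a VMM or test harness might want to check whether the
host was booted with the option above before trying to create protected VMs.
The sketch below only looks for the ``kvm-arm.mode=protected`` token in a
command-line string such as the contents of ``/proc/cmdline``. The helper name
``cmdline_requests_pkvm()`` is hypothetical, and finding the token does not
prove that the hypervisor initialised successfully.

.. code-block:: c

    #include <stdbool.h>
    #include <string.h>

    /*
     * Hypothetical helper: return true if the space-separated command line
     * contains the exact token "kvm-arm.mode=protected".
     */
    bool cmdline_requests_pkvm(const char *cmdline)
    {
        static const char token[] = "kvm-arm.mode=protected";
        const size_t len = sizeof(token) - 1;
        const char *p = cmdline;

        while ((p = strstr(p, token)) != NULL) {
            /* Reject matches that are only part of a longer token. */
            bool starts_token = (p == cmdline) || p[-1] == ' ';
            bool ends_token = p[len] == '\0' || p[len] == ' ' ||
                              p[len] == '\n';

            if (starts_token && ends_token)
                return true;
            p += len;
        }
        return false;
    }

This is deliberately simplistic: it does not account for a later occurrence of
the parameter overriding an earlier one, nor for alternative spellings of the
parameter name that the kernel's own parser may accept.
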

Limitations
-----------

Enabling pKVM places some significant limitations on KVM guests, regardless of
whether they are spawned in protected state. It is therefore recommended only
to enable pKVM if protected VMs are required, with non-protected state acting
primarily as a debug and development aid.

If you're still keen, then here is an incomplete list of caveats that apply
to all VMs running under pKVM:

- Guest memory cannot be file-backed (with the exception of shmem/memfd) and is
  pinned as it is mapped into the guest. This prevents the host from
  swapping out, migrating, merging or generally doing anything useful with the
  guest pages. It also requires that the VMM has either ``CAP_IPC_LOCK`` or
  sufficient ``RLIMIT_MEMLOCK`` to account for this pinned memory.

- GICv2 is not supported and therefore GICv3 hardware is required in order
  to expose a virtual GICv3 to the guest.

- Read-only memslots are unsupported and therefore dirty logging cannot be
  enabled.

- Memslot configuration is fixed once a VM has started running, with subsequent
  move or deletion requests being rejected with ``-EPERM``.

- There are probably many others.
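
To illustrate the pinned-memory caveat in the list above, a VMM could compare
the amount of guest memory it intends to map against its ``RLIMIT_MEMLOCK``
soft limit before starting the guest. This is a minimal sketch: the helper name
``memlock_limit_covers()`` is hypothetical, it ignores ``CAP_IPC_LOCK`` and any
memory the process has already locked, and the exact way the kernel accounts
pinned guest pages is not specified here.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>
    #include <sys/resource.h>

    /*
     * Hypothetical helper: return true if the RLIMIT_MEMLOCK soft limit
     * alone is at least guest_bytes. A process holding CAP_IPC_LOCK may
     * succeed even when this returns false.
     */
    bool memlock_limit_covers(uint64_t guest_bytes)
    {
        struct rlimit rl;

        if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
            return false;

        return rl.rlim_cur == RLIM_INFINITY ||
               (uint64_t)rl.rlim_cur >= guest_bytes;
    }

A VMM might call this with the total size of its memslots and, on failure,
report that the limit needs raising rather than letting guest memory faults
fail later.
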

Since the host is unable to tear down the hypervisor when pKVM is enabled,
hibernation (``CONFIG_HIBERNATION``) and kexec (``CONFIG_KEXEC``) will fail
with ``-EBUSY``.

If you are not happy with these limitations, then please don't enable pKVM :)

VM creation
-----------

When pKVM is enabled, protected VMs can be created by specifying the
``KVM_VM_TYPE_ARM_PROTECTED`` flag in the machine type identifier parameter
passed to ``KVM_CREATE_VM``.
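
A minimal sketch of that call from a VMM is shown below. It assumes that
``KVM_VM_TYPE_ARM_PROTECTED`` is provided by the installed ``<linux/kvm.h>``
and falls back to an error if it is not, rather than guessing at its value.
The helper name ``create_protected_vm()`` is hypothetical, and the default IPA
size is assumed to be acceptable.

.. code-block:: c

    #include <errno.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    /*
     * Hypothetical helper: create a protected VM on an open /dev/kvm file
     * descriptor. Returns the new VM fd, or a negative errno value.
     */
    int create_protected_vm(int kvm_fd)
    {
    #ifdef KVM_VM_TYPE_ARM_PROTECTED
        /* No IPA size is encoded in the type, so the default is used. */
        unsigned long type = KVM_VM_TYPE_ARM_PROTECTED;
        int vm_fd = ioctl(kvm_fd, KVM_CREATE_VM, type);

        return vm_fd < 0 ? -errno : vm_fd;
    #else
        /* The installed uapi headers do not define the flag. */
        (void)kvm_fd;
        return -ENOTSUP;
    #endif
    }

The caller would typically obtain ``kvm_fd`` with
``open("/dev/kvm", O_RDWR | O_CLOEXEC)`` and treat a negative return value as
"protected VMs are unavailable on this host".
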

Protected VMs are instantiated according to a fixed vCPU configuration
described by the ID register definitions in
``arch/arm64/include/asm/kvm_pkvm.h``. Only a subset of the architectural
features that may be available to the host are exposed to the guest and the
capabilities advertised by ``KVM_CHECK_EXTENSION`` are limited accordingly,
with the vCPU registers being initialised to their architecturally-defined
values.

Where not defined by the architecture, the registers of a protected vCPU
are reset to zero with the exception of the PC and X0 which can be set
either by the ``KVM_SET_ONE_REG`` interface or by a call to PSCI ``CPU_ON``.
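
For the ``KVM_SET_ONE_REG`` route, a VMM could set the initial PC and X0 along
the following lines. This is a sketch rather than a reference implementation:
the helper names ``set_u64_reg()`` and ``set_entry_point()`` and the
``CORE_REG_U64()`` macro are hypothetical, and the register IDs are built from
the arm64 uapi core-register macros, so that part only compiles on arm64.

.. code-block:: c

    #include <errno.h>
    #include <stddef.h>     /* offsetof(), used by KVM_REG_ARM_CORE_REG() */
    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    /* Write one 64-bit register; returns 0 or a negative errno value. */
    int set_u64_reg(int vcpu_fd, uint64_t reg_id, uint64_t value)
    {
        struct kvm_one_reg reg = {
            .id = reg_id,
            .addr = (uint64_t)(uintptr_t)&value,
        };

        return ioctl(vcpu_fd, KVM_SET_ONE_REG, &reg) < 0 ? -errno : 0;
    }

    #ifdef __aarch64__
    /* Build the ID of a 64-bit core register from its struct kvm_regs field. */
    #define CORE_REG_U64(field) \
        (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | KVM_REG_ARM_CORE | \
         KVM_REG_ARM_CORE_REG(field))

    /* Hypothetical helper: set the boot PC and X0 of a vCPU. */
    int set_entry_point(int vcpu_fd, uint64_t pc, uint64_t x0)
    {
        int ret = set_u64_reg(vcpu_fd, CORE_REG_U64(regs.pc), pc);

        return ret ? ret : set_u64_reg(vcpu_fd, CORE_REG_U64(regs.regs[0]), x0);
    }
    #endif

Here X0 would usually carry whatever boot argument the guest expects, for
example the address of a device tree blob; that convention belongs to the
guest image and is not something pKVM defines.
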

VM runtime
----------

By default, memory pages mapped into a protected guest are inaccessible to the
host and any attempt by the host to access such a page will result in the
injection of an abort at EL1 by the hypervisor. For accesses originating from
EL0, the host will then terminate the current task with a ``SIGSEGV``.

pKVM exposes additional hypercalls to protected guests, primarily for the
purpose of establishing shared-memory regions with the host for communication
and I/O. These hypercalls are documented in hypercalls.rst.