Merge tag 'kvm-arm-for-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/ARM Changes for v4.12.

Changes include:
 - Using the common sysreg definitions between KVM and arm64
 - Improved hyp-stub implementation with support for kexec and kdump on the 32-bit side
 - Proper PMU exception handling
 - Performance improvements of our GIC handling
 - Support for irqchip in userspace with in-kernel arch-timers and PMU support
 - A fix for a race condition in our PSCI code

Conflicts:
	Documentation/virtual/kvm/api.txt
	include/uapi/linux/kvm.h
This commit is contained in:
Paolo Bonzini
2017-04-27 17:33:14 +02:00
50 changed files with 1163 additions and 927 deletions

View File

@@ -4120,3 +4120,44 @@ This capability indicates that guest using memory monotoring instructions
(MWAIT/MWAITX) to stop the virtual CPU will not cause a VM exit. As such time
spent while virtual CPU is halted in this way will then be accounted for as
guest running time on the host (as opposed to e.g. HLT).
8.9 KVM_CAP_ARM_USER_IRQ
Architectures: arm, arm64
This capability, if KVM_CHECK_EXTENSION indicates that it is available, means
that if userspace creates a VM without an in-kernel interrupt controller, it
will be notified of changes to the output level of in-kernel emulated devices,
which can generate virtual interrupts, presented to the VM.
For such VMs, on every return to userspace, the kernel
updates the vcpu's run->s.regs.device_irq_level field to represent the actual
output level of the device.
Whenever kvm detects a change in the device output level, kvm guarantees at
least one return to userspace before running the VM. This exit could either
be a KVM_EXIT_INTR or any other exit event, like KVM_EXIT_MMIO. This way,
userspace can always sample the device output level and re-compute the state of
the userspace interrupt controller. Userspace should always check the state
of run->s.regs.device_irq_level on every kvm exit.
The value in run->s.regs.device_irq_level can represent both level and edge
triggered interrupt signals, depending on the device. Edge triggered interrupt
signals will exit to userspace with the bit in run->s.regs.device_irq_level
set exactly once per edge signal.
The field run->s.regs.device_irq_level is available independent of
run->kvm_valid_regs or run->kvm_dirty_regs bits.
If KVM_CAP_ARM_USER_IRQ is supported, the KVM_CHECK_EXTENSION ioctl returns a
number larger than 0 indicating the version of this capability is implemented
and thereby which bits in in run->s.regs.device_irq_level can signal values.
Currently the following bits are defined for the device_irq_level bitmap:
KVM_CAP_ARM_USER_IRQ >= 1:
KVM_ARM_DEV_EL1_VTIMER - EL1 virtual timer
KVM_ARM_DEV_EL1_PTIMER - EL1 physical timer
KVM_ARM_DEV_PMU - ARM PMU overflow interrupt signal
Future versions of kvm may implement additional events. These will get
indicated by returning a higher number from KVM_CHECK_EXTENSION and will be
listed above.

View File

@@ -0,0 +1,53 @@
* Internal ABI between the kernel and HYP
This file documents the interaction between the Linux kernel and the
hypervisor layer when running Linux as a hypervisor (for example
KVM). It doesn't cover the interaction of the kernel with the
hypervisor when running as a guest (under Xen, KVM or any other
hypervisor), or any hypervisor-specific interaction when the kernel is
used as a host.
On arm and arm64 (without VHE), the kernel doesn't run in hypervisor
mode, but still needs to interact with it, allowing a built-in
hypervisor to be either installed or torn down.
In order to achieve this, the kernel must be booted at HYP (arm) or
EL2 (arm64), allowing it to install a set of stubs before dropping to
SVC/EL1. These stubs are accessible by using a 'hvc #0' instruction,
and only act on individual CPUs.
Unless specified otherwise, any built-in hypervisor must implement
these functions (see arch/arm{,64}/include/asm/virt.h):
* r0/x0 = HVC_SET_VECTORS
r1/x1 = vectors
Set HVBAR/VBAR_EL2 to 'vectors' to enable a hypervisor. 'vectors'
must be a physical address, and respect the alignment requirements
of the architecture. Only implemented by the initial stubs, not by
Linux hypervisors.
* r0/x0 = HVC_RESET_VECTORS
Turn HYP/EL2 MMU off, and reset HVBAR/VBAR_EL2 to the initials
stubs' exception vector value. This effectively disables an existing
hypervisor.
* r0/x0 = HVC_SOFT_RESTART
r1/x1 = restart address
x2 = x0's value when entering the next payload (arm64)
x3 = x1's value when entering the next payload (arm64)
x4 = x2's value when entering the next payload (arm64)
Mask all exceptions, disable the MMU, move the arguments into place
(arm64 only), and jump to the restart address while at HYP/EL2. This
hypercall is not expected to return to its caller.
Any other value of r0/x0 triggers a hypervisor-specific handling,
which is not documented here.
The return value of a stub hypercall is held by r0/x0, and is 0 on
success, and HVC_STUB_ERR on error. A stub hypercall is allowed to
clobber any of the caller-saved registers (x0-x18 on arm64, r0-r3 and
ip on arm). It is thus recommended to use a function call to perform
the hypercall.