kselftests exposed a problem in the s390 handling for memory slots.
Right now we only do proper memory slot handling for creation of new
memory slots. Neither MOVE, nor DELETION are handled properly. Let us
implement those.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull KVM updates from Paolo Bonzini:
"ARM:
- support for SVE and Pointer Authentication in guests
- PMU improvements
POWER:
- support for direct access to the POWER9 XIVE interrupt controller
- memory and performance optimizations
x86:
- support for accessing memory not backed by struct page
- fixes and refactoring
Generic:
- dirty page tracking improvements"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (155 commits)
kvm: fix compilation on aarch64
Revert "KVM: nVMX: Expose RDPMC-exiting only when guest supports PMU"
kvm: x86: Fix L1TF mitigation for shadow MMU
KVM: nVMX: Disable intercept for FS/GS base MSRs in vmcs02 when possible
KVM: PPC: Book3S: Remove useless checks in 'release' method of KVM device
KVM: PPC: Book3S HV: XIVE: Fix spelling mistake "acessing" -> "accessing"
KVM: PPC: Book3S HV: Make sure to load LPID for radix VCPUs
kvm: nVMX: Set nested_run_pending in vmx_set_nested_state after checks complete
tests: kvm: Add tests for KVM_SET_NESTED_STATE
KVM: nVMX: KVM_SET_NESTED_STATE - Tear down old EVMCS state before setting new state
tests: kvm: Add tests for KVM_CAP_MAX_VCPUS and KVM_CAP_MAX_CPU_ID
tests: kvm: Add tests to .gitignore
KVM: Introduce KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2
KVM: Fix kvm_clear_dirty_log_protect off-by-(minus-)one
KVM: Fix the bitmap range to copy during clear dirty
KVM: arm64: Fix ptrauth ID register masking logic
KVM: x86: use direct accessors for RIP and RSP
KVM: VMX: Use accessors for GPRs outside of dedicated caching logic
KVM: x86: Omit caching logic for always-available GPRs
kvm, x86: Properly check whether a pfn is an MMIO or not
...
Pull pidfd updates from Christian Brauner:
"This patchset makes it possible to retrieve pidfds at process creation
time by introducing the new flag CLONE_PIDFD to the clone() system
call. Linus originally suggested to implement this as a new flag to
clone() instead of making it a separate system call.
After a thorough review from Oleg CLONE_PIDFD returns pidfds in the
parent_tidptr argument. This means we can give back the associated pid
and the pidfd at the same time. Access to process metadata information
thus becomes rather trivial.
As has been agreed, CLONE_PIDFD creates file descriptors based on
anonymous inodes similar to the new mount api. They are made
unconditional by this patchset as they are now needed by core kernel
code (vfs, pidfd) even more than they already were before (timerfd,
signalfd, io_uring, epoll etc.). The core patchset is rather small.
The bulky looking changelist is caused by David's very simple changes
to Kconfig to make anon inodes unconditional.
A pidfd comes with additional information in fdinfo if the kernel
supports procfs. The fdinfo file contains the pid of the process in
the callers pid namespace in the same format as the procfs status
file, i.e. "Pid:\t%d".
To remove worries about missing metadata access this patchset comes
with a sample/test program that illustrates how a combination of
CLONE_PIDFD and pidfd_send_signal() can be used to gain race-free
access to process metadata through /proc/<pid>.
Further work based on this patchset has been done by Joel. His work
makes pidfds pollable. It finished too late for this merge window. I
would prefer to have it sitting in linux-next for a while and send it
for inclusion during the 5.3 merge window"
* tag 'pidfd-v5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
samples: show race-free pidfd metadata access
signal: support CLONE_PIDFD with pidfd_send_signal
clone: add CLONE_PIDFD
Make anon_inodes unconditional
KVM: s390: Features and fixes for 5.2
- VSIE crypto fixes
- new guest features for gen15
- disable halt polling for nested virtualization with overcommit
Add an extra parameter for airq handlers to recognize
floating vs. directed interrupts.
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We do track the current steal time of the host CPUs. Let us use
this value to disable halt polling if the steal time goes beyond
a configured value.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Instead of adding a new machine option to disable/enable the keywrapping
options of pckmo (like for AES and DEA) we can now use the CPU model to
decide. As ECC is also wrapped with the AES key we need that to be
enabled.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
This enables stfle.151 and adds the subfunctions for DFLTCC. Bit 151 is
added to the list of facilities that will be enabled when there is no
cpu model involved as DFLTCC requires no additional handling from
userspace, e.g. for migration.
Please note that a cpu model enabled user space can and will have the
final decision on the facility bits for a guests.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Make the anon_inodes facility unconditional so that it can be used by core
VFS code and pidfd code.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[christian@brauner.io: adapt commit message to mention pidfds]
Signed-off-by: Christian Brauner <christian@brauner.io>
This enables stfle.150 and adds the subfunctions for SORTL. Bit 150 is
added to the list of facilities that will be enabled when there is no
cpu model involved as sortl requires no additional handling from
userspace, e.g. for migration.
Please note that a cpu model enabled user space can and will have the
final decision on the facility bits for a guests.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Some of the new features have a 32byte response for the query function.
Provide a new wrapper similar to __cpacf_query. We might want to factor
this out if other users come up, as of today there is none. So let us
keep the function within KVM.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
This enables stfle.155 and adds the subfunctions for KDSA. Bit 155 is
added to the list of facilities that will be enabled when there is no
cpu model involved as MSA9 requires no additional handling from
userspace, e.g. for migration.
Please note that a cpu model enabled user space can and will have the
final decision on the facility bits for a guests.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
If vector support is enabled, the vector BCD enhancements facility
might also be enabled.
We can directly forward this facility to the guest if available
and VX is requested by user space.
Please note that user space can and will have the final decision
on the facility bits for a guests.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
If vector support is enabled, the vector enhancements facility 2
might also be enabled.
We can directly forward this facility to the guest if available
and VX is requested by user space.
Please note that user space can and will have the final decision
on the facility bits for a guests.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
All architectures except MIPS were defining it in the same way,
and memory slots are handled entirely by common code so there
is no point in keeping the definition per-architecture.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull KVM updates from Paolo Bonzini:
"ARM:
- some cleanups
- direct physical timer assignment
- cache sanitization for 32-bit guests
s390:
- interrupt cleanup
- introduction of the Guest Information Block
- preparation for processor subfunctions in cpu models
PPC:
- bug fixes and improvements, especially related to machine checks
and protection keys
x86:
- many, many cleanups, including removing a bunch of MMU code for
unnecessary optimizations
- AVIC fixes
Generic:
- memcg accounting"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (147 commits)
kvm: vmx: fix formatting of a comment
KVM: doc: Document the life cycle of a VM and its resources
MAINTAINERS: Add KVM selftests to existing KVM entry
Revert "KVM/MMU: Flush tlb directly in the kvm_zap_gfn_range()"
KVM: PPC: Book3S: Add count cache flush parameters to kvmppc_get_cpu_char()
KVM: PPC: Fix compilation when KVM is not enabled
KVM: Minor cleanups for kvm_main.c
KVM: s390: add debug logging for cpu model subfunctions
KVM: s390: implement subfunction processor calls
arm64: KVM: Fix architecturally invalid reset value for FPEXC32_EL2
KVM: arm/arm64: Remove unused timer variable
KVM: PPC: Book3S: Improve KVM reference counting
KVM: PPC: Book3S HV: Fix build failure without IOMMU support
Revert "KVM: Eliminate extra function calls in kvm_get_dirty_log_protect()"
x86: kvmguest: use TSC clocksource if invariant TSC is exposed
KVM: Never start grow vCPU halt_poll_ns from value below halt_poll_ns_grow_start
KVM: Expose the initial start value in grow_halt_poll_ns() as a module parameter
KVM: grow_halt_poll_ns() should never shrink vCPU halt_poll_ns
KVM: x86/mmu: Consolidate kvm_mmu_zap_all() and kvm_mmu_zap_mmio_sptes()
KVM: x86/mmu: WARN if zapping a MMIO spte results in zapping children
...
As userspace can now get/set the subfunctions we want to trace those.
This will allow to also check QEMUs cpu model vs. what the real
hardware provides.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
While we will not implement interception for query functions yet, we can
and should disable functions that have a control bit based on the given
CPU model.
Let us start with enabling the subfunction interface.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
When facility.76 MSAX3 is present for the guest we must issue a validity
interception if the CRYCBD is not valid.
The bit CRYCBD.31 is an effective field and tested at each guest level
and has for effect to mask the facility.76
It follows that if CRYCBD.31 is clear and AP is not in use we do not
have to test the CRYCBD validatity even if facility.76 is present in the
host.
Fixes: 6ee7409820 ("KVM: s390: vsie: allow CRYCB FORMAT-0")
Cc: stable@vger.kernel.org
Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Reported-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <1549876849-32680-1-git-send-email-pmorel@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
The patch implements a handler for GIB alert interruptions
on the host. Its task is to alert guests that interrupts are
pending for them.
A GIB alert interrupt statistic counter is added as well:
$ cat /proc/interrupts
CPU0 CPU1
...
GAL: 23 37 [I/O] GIB Alert
...
Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Message-Id: <20190131085247.13826-14-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Add the Interruption Alert Mask (IAM) to the architecture specific
kvm struct. This mask in the GISA is used to define for which ISC
a GIB alert will be issued.
The functions kvm_s390_gisc_register() and kvm_s390_gisc_unregister()
are used to (un)register a GISC (guest ISC) with a virtual machine and
its GISA.
Upon successful completion, kvm_s390_gisc_register() returns the
ISC to be used for GIB alert interruptions. A negative return code
indicates an error during registration.
Theses functions will be used by other adapter types like AP and PCI to
request pass-through interruption support.
Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Acked-by: Pierre Morel <pmorel@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190131085247.13826-12-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
The Guest Information Block (GIB) links the GISA of all guests
that have adapter interrupts pending. These interrupts cannot be
delivered because all vcpus of these guests are currently in WAIT
state or have masked the respective Interruption Sub Class (ISC).
If enabled, a GIB alert is issued on the host to schedule these
guests to run suitable vcpus to consume the pending interruptions.
This mechanism allows to process adapter interrupts for currently
not running guests.
The GIB is created during host initialization and associated with
the Adapter Interruption Facility in case an Adapter Interruption
Virtualization Facility is available.
The GIB initialization and thus the activation of the related code
will be done in an upcoming patch of this series.
Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Message-Id: <20190131085247.13826-10-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Pull Kconfig updates from Masahiro Yamada:
- support -y option for merge_config.sh to avoid downgrading =y to =m
- remove S_OTHER symbol type, and touch include/config/*.h files correctly
- fix file name and line number in lexer warnings
- fix memory leak when EOF is encountered in quotation
- resolve all shift/reduce conflicts of the parser
- warn no new line at end of file
- make 'source' statement more strict to take only string literal
- rewrite the lexer and remove the keyword lookup table
- convert to SPDX License Identifier
- compile C files independently instead of including them from zconf.y
- fix various warnings of gconfig
- misc cleanups
* tag 'kconfig-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (39 commits)
kconfig: surround dbg_sym_flags with #ifdef DEBUG to fix gconf warning
kconfig: split images.c out of qconf.cc/gconf.c to fix gconf warnings
kconfig: add static qualifiers to fix gconf warnings
kconfig: split the lexer out of zconf.y
kconfig: split some C files out of zconf.y
kconfig: convert to SPDX License Identifier
kconfig: remove keyword lookup table entirely
kconfig: update current_pos in the second lexer
kconfig: switch to ASSIGN_VAL state in the second lexer
kconfig: stop associating kconf_id with yylval
kconfig: refactor end token rules
kconfig: stop supporting '.' and '/' in unquoted words
treewide: surround Kconfig file paths with double quotes
microblaze: surround string default in Kconfig with double quotes
kconfig: use T_WORD instead of T_VARIABLE for variables
kconfig: use specific tokens instead of T_ASSIGN for assignments
kconfig: refactor scanning and parsing "option" properties
kconfig: use distinct tokens for type and default properties
kconfig: remove redundant token defines
kconfig: rename depends_list to comment_option_list
...
The Kconfig lexer supports special characters such as '.' and '/' in
the parameter context. In my understanding, the reason is just to
support bare file paths in the source statement.
I do not see a good reason to complicate Kconfig for the room of
ambiguity.
The majority of code already surrounds file paths with double quotes,
and it makes sense since file paths are constant string literals.
Make it treewide consistent now.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Wolfram Sang <wsa@the-dreams.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Relocate #define statement for kvm related kernel messages
before the include of printk to become effective.
Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
kvm_arch_crypto_set_masks is a new function to centralize
the setup the APCB masks inside the CRYCB SIE satellite.
To trace APCB mask changes, we add KVM_EVENT() tracing to
both kvm_arch_crypto_set_masks and kvm_arch_crypto_clear_masks.
Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Message-Id: <1538728270-10340-2-git-send-email-pmorel@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
KVM: s390: Features for 4.20
- Initial version of AP crypto virtualization via vfio-mdev
- Set the host program identifier
- Optimize page table locking