KVM: race-free exit from KVM_RUN without POSIX signals
The purpose of the KVM_SET_SIGNAL_MASK API is to let userspace "kick"
a VCPU out of KVM_RUN through a POSIX signal. A signal is attached
to a dummy signal handler; by blocking the signal outside KVM_RUN and
unblocking it inside, this possible race is closed:

          VCPU thread                     service thread
   --------------------------------------------------------------
        check flag
                                          set flag
                                          raise signal
        (signal handler does nothing)
        KVM_RUN

However, one issue with KVM_SET_SIGNAL_MASK is that it has to take
tsk->sighand->siglock on every KVM_RUN. This lock is often on a
remote NUMA node, because it is on the node of a thread's creator.
Taking this lock can be very expensive if there are many userspace
exits (as is the case for SMP Windows VMs without Hyper-V reference
time counter).

As an alternative, we can put the flag directly in kvm_run so that
KVM can see it:

          VCPU thread                     service thread
   --------------------------------------------------------------
                                          raise signal
        signal handler
          set run->immediate_exit
        KVM_RUN
          check run->immediate_exit

Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
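The "block outside, unblock inside" pattern that the message attributes to KVM_SET_SIGNAL_MASK can be illustrated without KVM. The sketch below is only an analogy: it uses sigsuspend() as a stand-in for a KVM_RUN that runs with the kick signal unblocked, and the choice of SIGUSR1 and all function names are assumptions made for illustration, not taken from the commit.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>

/* The "flag" from the first diagram (hypothetical name). */
static volatile sig_atomic_t exit_requested;

/* Dummy handler: its only job is to interrupt the blocking call. */
static void dummy_handler(int sig)
{
    (void)sig;
}

/*
 * Install the dummy handler and keep the kick signal blocked during
 * normal execution, so a kick sent between "check flag" and the
 * blocking call stays pending instead of being consumed too early.
 * A multithreaded VMM would use pthread_sigmask() here instead.
 */
static int kick_setup(void)
{
    struct sigaction sa;
    sigset_t block;

    sa.sa_handler = dummy_handler;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) < 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    return sigprocmask(SIG_BLOCK, &block, NULL);
}

/*
 * Stand-in for KVM_RUN with a signal mask installed: the kick signal
 * is unblocked only while waiting. sigsuspend() swaps the mask and
 * waits atomically, and returns -1 with errno == EINTR after a
 * handler has run. With no kick pending it would block.
 */
static int masked_run_stand_in(void)
{
    sigset_t inside;

    assert(sigprocmask(SIG_BLOCK, NULL, &inside) == 0); /* read current mask */
    sigdelset(&inside, SIGUSR1);
    return sigsuspend(&inside);
}
```

A kick raised after kick_setup() but before masked_run_stand_in() stays pending, so the stand-in returns at once with EINTR rather than sleeping through the request. That is the race the first diagram describes.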
@@ -3389,7 +3389,18 @@ struct kvm_run {
 Request that KVM_RUN return when it becomes possible to inject external
 interrupts into the guest. Useful in conjunction with KVM_INTERRUPT.

-	__u8 padding1[7];
+	__u8 immediate_exit;
+
+This field is polled once when KVM_RUN starts; if non-zero, KVM_RUN
+exits immediately, returning -EINTR. In the common scenario where a
+signal is used to "kick" a VCPU out of KVM_RUN, this field can be used
+to avoid usage of KVM_SET_SIGNAL_MASK, which has worse scalability.
+Rather than blocking the signal outside KVM_RUN, userspace can set up
+a signal handler that sets run->immediate_exit to a non-zero value.
+
+This field is ignored if KVM_CAP_IMMEDIATE_EXIT is not available.
+
+	__u8 padding1[6];

 	/* out */
 	__u32 exit_reason;
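The handler-based scheme in the documentation text above could look roughly like the following from userspace. This is a minimal sketch that runs without /dev/kvm: struct mock_kvm_run and mock_kvm_run_ioctl() are hypothetical stand-ins that model only the documented entry check (non-zero immediate_exit makes the call fail with EINTR), and SIGUSR1 as the kick signal is an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared struct kvm_run; only the new
 * field is modeled. */
struct mock_kvm_run {
    volatile uint8_t immediate_exit;
};

/* Run area of the VCPU owned by this thread (one VCPU assumed). */
static struct mock_kvm_run *kick_target;

/* Kick handler: instead of doing nothing, it records the request in
 * the run structure. A single byte store is assumed to be safe here. */
static void kick_handler(int sig)
{
    (void)sig;
    kick_target->immediate_exit = 1;
}

/* Register the handler; the signal is NOT blocked outside the run call. */
static int install_kick_handler(struct mock_kvm_run *run)
{
    struct sigaction sa;

    assert(run != NULL);
    kick_target = run;
    sa.sa_handler = kick_handler;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Mock of ioctl(vcpu_fd, KVM_RUN, 0): the field is polled once on
 * entry; a non-zero value fails the call with errno == EINTR, which
 * is how a kernel-side -EINTR appears to userspace. */
static int mock_kvm_run_ioctl(struct mock_kvm_run *run)
{
    if (run->immediate_exit) {
        errno = EINTR;
        return -1;
    }
    return 0; /* pretend the guest ran and exited normally */
}
```

In this sketch the caller resets run->immediate_exit to 0 after handling the request, since nothing in the mock clears it; real code would presumably do the same before re-entering KVM_RUN, and would first confirm that KVM_CAP_IMMEDIATE_EXIT is available.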