drm/amdkfd: Provide SMI events watch

When the compute is malfunctioning or performance drops, the system admin
will use SMI (System Management Interface) tool to monitor/diagnostic what
went wrong. This patch provides an event watch interface for the user
space to register devices and subscribe events they are interested. After
registered, the user can use annoymous file descriptor's poll function
with wait-time specified and wait for events to happen. Once an event
happens, the user can use read() to retrieve information related to the
event.

VM fault event is done in this patch.

v2: - remove UNREGISTER and add event ENABLE/DISABLE
    - correct kfifo usage
    - move event message API to kfd_ioctl.h
v3: send the event msg in text than in binary
v4: support multiple clients
v5: move events enablement from ioctl to fd write
v6: sparse fix

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This commit is contained in:
Amber Lin
2020-05-13 08:19:29 -04:00
committed by Alex Deucher
parent 85e7151baa
commit 938a0650aa
9 changed files with 293 additions and 1 deletions

View File

@@ -442,6 +442,17 @@ struct kfd_ioctl_import_dmabuf_args {
__u32 dmabuf_fd; /* to KFD */
};
/*
* KFD SMI(System Management Interface) events
*/
/* Event type (defined by bitmask) */
#define KFD_SMI_EVENT_VMFAULT 0x0000000000000001
struct kfd_ioctl_smi_events_args {
__u32 gpuid; /* to KFD */
__u32 anon_fd; /* from KFD */
};
/* Register offset inside the remapped mmio page
*/
enum kfd_mmio_remap {
@@ -546,7 +557,10 @@ enum kfd_mmio_remap {
#define AMDKFD_IOC_ALLOC_QUEUE_GWS \
AMDKFD_IOWR(0x1E, struct kfd_ioctl_alloc_queue_gws_args)
#define AMDKFD_IOC_SMI_EVENTS \
AMDKFD_IOWR(0x1F, struct kfd_ioctl_smi_events_args)
#define AMDKFD_COMMAND_START 0x01
#define AMDKFD_COMMAND_END 0x1F
#define AMDKFD_COMMAND_END 0x20
#endif