cxl: Adapter failure handling

Check the AFU state whenever an API is called. The hypervisor may issue a reset of the adapter when it detects a fault. When it happens, it launches an error recovery which will either move the AFU to a permanent failure state, or in the disabled state. If the AFU is found to be disabled, detach all existing contexts from it before issuing a AFU reset to re-enable it. Before detaching contexts, notify any kernel driver through the EEH callbacks of the AFU pci device. Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-04 12:26:41 +01:00
parent d601ea918b
commit 0d400f77c1
7 changed files with 198 additions and 35 deletions
--- a/drivers/misc/cxl/context.c
+++ b/drivers/misc/cxl/context.c
@@ -220,7 +220,7 @@ int __detach_context(struct cxl_context *ctx)
 	 * If detach fails when hw is down, we don't care.
 	 */
 	WARN_ON(cxl_ops->detach_process(ctx) &&
-		cxl_ops->link_ok(ctx->afu->adapter));
+		cxl_ops->link_ok(ctx->afu->adapter, ctx->afu));
 	flush_work(&ctx->fault_work); /* Only needed for dedicated process */

 	/* release the reference to the group leader and mm handling pid */