scsi: lpfc: Fix random heartbeat timeouts during heavy IO


NVME targets appear to randomly disconnect from the initiator when
running heavy IO.

The error occurs because the host's aggregate IO load (across all
controllers) exceeds the maximum exchange count for NVME on the adapter.
The driver was properly returning a resource busy status, but the IO
load was so great that heartbeat commands were bounced and did not get
a successful retry within the fuzz amount for the NVME heartbeat (yes, a
very high IO load!). The target therefore terminated the controller due
to a keep alive failure.

Resolve by reserving a few exchanges (by counters) which can be used
when the adapter is out of normal exchanges and the command is an NVME
heartbeat command. Because the reservation is tracked with counters,
the reserved count is replenished as soon as any other exchange
completes, even while the reserved command is still outstanding. The
heartbeat completes execution in a normal fashion.
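
The counter-based reservation described above can be sketched roughly as
follows. This is a minimal user-space illustration, not the driver code:
the names xchg_pool, pool_get, pool_put and the POOL_TOTAL and
POOL_RESERVED values are hypothetical and chosen only for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sizes, for illustration only. */
#define POOL_TOTAL    8u  /* exchanges the adapter can have outstanding */
#define POOL_RESERVED 2u  /* held back for heartbeat commands */

struct xchg_pool {
	uint32_t free_cnt;  /* exchanges currently available */
};

static void pool_init(struct xchg_pool *p)
{
	p->free_cnt = POOL_TOTAL;
}

/*
 * Try to take one exchange. Returns false ("resource busy") when none
 * may be used. Normal IO is refused once only the reserved exchanges
 * remain; a heartbeat may dip into the reserve.
 */
static bool pool_get(struct xchg_pool *p, bool is_heartbeat)
{
	if (p->free_cnt == 0)
		return false;
	if (p->free_cnt <= POOL_RESERVED && !is_heartbeat)
		return false;
	p->free_cnt--;
	return true;
}

/*
 * Return one exchange. No exchange is tagged as "the reserved one":
 * any completion raises the free count, which refills the reserve
 * before normal IO is admitted again.
 */
static void pool_put(struct xchg_pool *p)
{
	assert(p->free_cnt < POOL_TOTAL);  /* catch a double release */
	p->free_cnt++;
}
```

In this sketch, once normal IO has consumed POOL_TOTAL - POOL_RESERVED
exchanges, further normal requests get a busy status while a heartbeat
request still succeeds. The first completion of any command brings the
free count back up to the reserved level.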

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

James Smart, 2017-12-08 17:18:03 -08:00, committed by Martin K. Petersen
parent 4d0951ee70
commit cf1a1d3e2d

4 changed files with 63 additions and 22 deletions

									
drivers/scsi/lpfc/lpfc.h (2 lines changed)

@@ -945,6 +945,8 @@ struct lpfc_hba {
 	struct list_head lpfc_nvme_buf_list_get;
 	struct list_head lpfc_nvme_buf_list_put;
 	uint32_t total_nvme_bufs;
+	uint32_t get_nvme_bufs;
+	uint32_t put_nvme_bufs;
 	struct list_head lpfc_iocb_list;
 	uint32_t total_iocbq_bufs;
 	struct list_head active_rrq_list;
