Multipath errors were seen during failback due to login timeout. The
remote device sent LOGO, the local host tore down the session and did
relogin. The RSCN arrived indicates remote device is going through failover
after which the relogin is in a 20s timeout phase. At this point the
driver is stuck in the relogin process. Add a fix to delete the session as
part of abort/flush the login.
Link: https://lore.kernel.org/r/20200806111014.28434-5-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull SCSI updates from James Bottomley:
"This consists of the usual driver updates (ufs, qla2xxx, tcmu, lpfc,
hpsa, zfcp, scsi_debug) and minor bug fixes.
We also have a huge docbook fix update like most other subsystems and
no major update to the core (the few non trivial updates are either
minor fixes or removing an unused feature [scsi_sdb_cache])"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (307 commits)
scsi: scsi_transport_srp: Sanitize scsi_target_block/unblock sequences
scsi: ufs-mediatek: Apply DELAY_AFTER_LPM quirk to Micron devices
scsi: ufs: Introduce device quirk "DELAY_AFTER_LPM"
scsi: virtio-scsi: Correctly handle the case where all LUNs are unplugged
scsi: scsi_debug: Implement tur_ms_to_ready parameter
scsi: scsi_debug: Fix request sense
scsi: lpfc: Fix typo in comment for ULP
scsi: ufs-mediatek: Prevent LPM operation on undeclared VCC
scsi: iscsi: Do not put host in iscsi_set_flashnode_param()
scsi: hpsa: Correct ctrl queue depth
scsi: target: tcmu: Make TMR notification optional
scsi: target: tcmu: Implement tmr_notify callback
scsi: target: tcmu: Fix and simplify timeout handling
scsi: target: tcmu: Factor out new helper ring_insert_padding
scsi: target: tcmu: Do not queue aborted commands
scsi: target: tcmu: Use priv pointer in se_cmd
scsi: target: Add tmr_notify backend function
scsi: target: Modify core_tmr_abort_task()
scsi: target: iscsi: Fix inconsistent debug message
scsi: target: iscsi: Fix login error when receiving
...
The request_t 'handle' member is 32-bits wide, hence use wrt_reg_dword().
Change the cast in the wrt_reg_byte() call to make it clear that a regular
pointer is casted to an __iomem pointer.
Note: 'pkt' points to I/O memory for the qlafx00 adapter family and to
coherent memory for all other adapter families.
This patch fixes the following Coverity complaint:
CID 358864 (#1 of 1): Reliance on integer endianness (INCOMPATIBLE_CAST)
incompatible_cast: Pointer &pkt->handle points to an object whose effective
type is unsigned int (32 bits, unsigned) but is dereferenced as a narrower
unsigned short (16 bits, unsigned). This may lead to unexpected results
depending on machine endianness.
Link: https://lore.kernel.org/r/20200629225454.22863-7-bvanassche@acm.org
Fixes: 8ae6d9c7eb ("[SCSI] qla2xxx: Enhancements to support ISPFx00.")
Cc: Nilesh Javali <njavali@marvell.com>
Cc: Quinn Tran <qutran@marvell.com>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Cc: Martin Wilck <mwilck@suse.com>
Cc: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver performs SCR (state change registration) in all modes including
pure target mode.
For each RSCN, scan_needed flag is set in qla2x00_handle_rscn() for the
port mentioned in the RSCN and fabric rescan is scheduled. During the
rescan, GNN_FT handler, qla24xx_async_gnnft_done() deletes session of the
port that caused the RSCN.
In target mode, the session deletion has an impact on ATIO handler,
qlt_24xx_atio_pkt(). Target responds with SAM STATUS BUSY to I/O incoming
from the deleted session. qlt_handle_cmd_for_atio() and
qlt_handle_task_mgmt() return -EFAULT if they are not able to find session
of the command/TMF, and that results in invocation of qlt_send_busy():
qlt_24xx_atio_pkt_all_vps: qla_target(0): type 6 ox_id 0014
qla_target(0): Unable to send command to target, sending BUSY status
Such response causes command timeout on the initiator. Error handler thread
on the initiator will be spawned to abort the commands:
scsi 23:0:0:0: tag#0 abort scheduled
scsi 23:0:0:0: tag#0 aborting command
qla2xxx [0000:af:00.0]-188c:23: Entered qla24xx_abort_command.
qla2xxx [0000:af:00.0]-801c:23: Abort command issued nexus=23:0:0 -- 0 2003.
Command abort is rejected by target and fails (2003), error handler then
tries to perform DEVICE RESET and TARGET RESET but they're also doomed to
fail because TMFs are ignored for the deleted sessions.
Then initiator makes BUS RESET that resets the link via
qla2x00_full_login_lip(). BUS RESET succeeds and brings initiator port up,
SAN switch detects that and sends RSCN to the target port and it fails
again the same way as described above. It never goes out of the loop.
The change breaks the RSCN loop by keeping initiator sessions mentioned in
RSCN payload in all modes, including dual and pure target mode.
Link: https://lore.kernel.org/r/20200605144435.27023-1-r.bolshakov@yadro.com
Fixes: 2037ce49d3 ("scsi: qla2xxx: Fix stale session")
Cc: Quinn Tran <qutran@marvell.com>
Cc: Arun Easi <aeasi@marvell.com>
Cc: Nilesh Javali <njavali@marvell.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Daniel Wagner <dwagner@suse.de>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Cc: Martin Wilck <mwilck@suse.com>
Cc: stable@vger.kernel.org # v5.4+
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Shyam Sundar <ssundar@marvell.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The qla2xxx driver knows when request was processed successfully or
not. But it always sets the NVMe status code to 0/NVME_SC_SUCCESS. The
upper layer needs to figure out from the rcv_rsplen and transferred_length
variables if the request was transferred successfully. This is not always
possible, e.g. when the request data length is 0, the transferred_length is
also set 0 which is interpreted as success in nvme_fc_fcpio_done(). Let's
inform the upper layer (nvme_fc_fcpio_done()) when something went wrong.
nvme_fc_fcpio_done() maps all non-NVME_SC_SUCCESS status codes to
NVME_SC_HOST_PATH_ERROR. There isn't any benefit to map the QLA status code
to the NVMe status code. Therefore, use NVME_SC_INTERNAL to indicate an
error which aligns it with the lpfc driver.
Link: https://lore.kernel.org/r/20200604100745.89250-1-dwagner@suse.de
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Since commit 84af7a6194 ("checkpatch: kconfig: prefer 'help' over
'---help---'"), the number of '---help---' has been gradually
decreasing, but there are still more than 2400 instances.
This commit finishes the conversion. While I touched the lines,
I also fixed the indentation.
There are a variety of indentation styles found.
a) 4 spaces + '---help---'
b) 7 spaces + '---help---'
c) 8 spaces + '---help---'
d) 1 space + 1 tab + '---help---'
e) 1 tab + '---help---' (correct indentation)
f) 1 tab + 1 space + '---help---'
g) 1 tab + 2 spaces + '---help---'
In order to convert all of them to 1 tab + 'help', I ran the
following commend:
$ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/'
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Pull SCSI updates from James Bottomley:
:This series consists of the usual driver updates (qla2xxx, ufs, zfcp,
target, scsi_debug, lpfc, qedi, qedf, hisi_sas, mpt3sas) plus a host
of other minor updates.
There are no major core changes in this series apart from a
refactoring in scsi_lib.c"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (207 commits)
scsi: ufs: ti-j721e-ufs: Fix unwinding of pm_runtime changes
scsi: cxgb3i: Fix some leaks in init_act_open()
scsi: ibmvscsi: Make some functions static
scsi: iscsi: Fix deadlock on recovery path during GFP_IO reclaim
scsi: ufs: Fix WriteBooster flush during runtime suspend
scsi: ufs: Fix index of attributes query for WriteBooster feature
scsi: ufs: Allow WriteBooster on UFS 2.2 devices
scsi: ufs: Remove unnecessary memset for dev_info
scsi: ufs-qcom: Fix scheduling while atomic issue
scsi: mpt3sas: Fix reply queue count in non RDPQ mode
scsi: lpfc: Fix lpfc_nodelist leak when processing unsolicited event
scsi: target: tcmu: Fix a use after free in tcmu_check_expired_queue_cmd()
scsi: vhost: Notify TCM about the maximum sg entries supported per command
scsi: qla2xxx: Remove return value from qla_nvme_ls()
scsi: qla2xxx: Remove an unused function
scsi: iscsi: Register sysfs for iscsi workqueue
scsi: scsi_debug: Parser tables and code interaction
scsi: core: Refactor scsi_mq_setup_tags function
scsi: core: Fix incorrect usage of shost_for_each_device
scsi: qla2xxx: Fix endianness annotations in source files
...
Annotate members of FC protocol and firmware dump data structures as big
endian. Annotate members of RISC control structures as little endian.
Annotate mailbox registers as little endian. Annotate the mb[] arrays as
CPU-endian because communication of the mb[] values with the hardware
happens through the readw() and writew() functions. readw() converts from
__le16 to u16 and writew() converts from u16 to __le16. Annotate 'handles'
as CPU-endian because for the firmware these are opaque values.
Link: https://lore.kernel.org/r/20200518211712.11395-15-bvanassche@acm.org
CC: Hannes Reinecke <hare@suse.de>
Cc: Nilesh Javali <njavali@marvell.com>
Cc: Quinn Tran <qutran@marvell.com>
Cc: Martin Wilck <mwilck@suse.com>
Cc: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Instead of passing an argument to the firmware dumping functions that tells
these functions whether or not to obtain the hardware lock, obtain that
lock before calling these functions. This patch fixes the following
recently introduced C=2 build error:
CHECK drivers/scsi/qla2xxx/qla_tmpl.c
drivers/scsi/qla2xxx/qla_tmpl.c:1133:1: error: Expected ; at end of statement
drivers/scsi/qla2xxx/qla_tmpl.c:1133:1: error: got }
drivers/scsi/qla2xxx/qla_tmpl.h:247:0: error: Expected } at end of function
drivers/scsi/qla2xxx/qla_tmpl.h:247:0: error: got end-of-input
Link: https://lore.kernel.org/r/20200518211712.11395-4-bvanassche@acm.org
Fixes: cbb01c2f2f ("scsi: qla2xxx: Fix MPI failure AEN (8200) handling")
Cc: Arun Easi <aeasi@marvell.com>
Cc: Nilesh Javali <njavali@marvell.com>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Cc: Martin Wilck <mwilck@suse.com>
Cc: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>