Commit Graph

19387 Commits

Author SHA1 Message Date
Xiang Chen
547fde8b5a scsi: hisi_sas: Return directly if init hardware failed
Need to return directly if init hardware failed.

Fixes: 73a4925d15 ("scsi: hisi_sas: Update all the registers after suspend and resume")
Link: https://lore.kernel.org/r/1573551059-107873-3-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-12 22:21:34 -05:00
Xiang Chen
8c39673d54 scsi: hisi_sas: Check sas_port before using it
Need to check the structure sas_port before using it.

Link: https://lore.kernel.org/r/1573551059-107873-2-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-12 22:21:33 -05:00
James Smart
3b294c0fb9 scsi: lpfc: Update lpfc version to 12.6.0.2
Update lpfc version to 12.6.0.2

Link: https://lore.kernel.org/r/20191111230401.12958-7-jsmart2021@gmail.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-12 22:21:33 -05:00
James Smart
542ddc9b34 scsi: lpfc: revise nvme max queues to be hdwq count
Driver is setting the initiator nvme template with a max hw queues value of
the present cpu count which is odd. It should be registering the number of
hdwq queues (queues created on the adapter).

Change to set nvme tempate, in all cases, to the number of hardware queues.

Link: https://lore.kernel.org/r/20191111230401.12958-6-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-12 22:21:33 -05:00
James Smart
bc227dde0d scsi: lpfc: Initialize cpu_map for not present cpus
Currently, cpu_map[cpu#]->hdwq is left to equal LPFC_VECTOR_MAP_EMPTY for
not present CPUs.  If a CPU is dynamically hot-added, it is possible we may
crash due to not assigning an allocated hdwq.

Correct by assigning a hdwq at initialization for all not-present CPUs.

Fixes: dcaa213679 ("scsi: lpfc: Change default IRQ model on AMD architectures")
Link: https://lore.kernel.org/r/20191111230401.12958-5-jsmart2021@gmail.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-12 22:21:33 -05:00
James Smart
d480e57809 scsi: lpfc: fix inlining of lpfc_sli4_cleanup_poll_list()
Compilation can fail due to having an inline function reference where the
function body is not present.

Fix by removing the inline tag.

Fixes: 93a4d6f401 ("scsi: lpfc: Add registration for CPU Offline/Online events")

Link: https://lore.kernel.org/r/20191111230401.12958-4-jsmart2021@gmail.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-12 22:21:33 -05:00
James Smart
6c6d59e0fe scsi: lpfc: fix: Coverity: lpfc_cmpl_els_rsp(): Null pointer dereferences
Coverity reported the following:

*** CID 101747:  Null pointer dereferences  (FORWARD_NULL)
/drivers/scsi/lpfc/lpfc_els.c: 4439 in lpfc_cmpl_els_rsp()
4433     			kfree(mp);
4434     		}
4435     		mempool_free(mbox, phba->mbox_mem_pool);
4436     	}
4437     out:
4438     	if (ndlp && NLP_CHK_NODE_ACT(ndlp)) {
vvv     CID 101747:  Null pointer dereferences  (FORWARD_NULL)
vvv     Dereferencing null pointer "shost".
4439     		spin_lock_irq(shost->host_lock);
4440     		ndlp->nlp_flag &= ~(NLP_ACC_REGLOGIN | NLP_RM_DFLT_RPI);
4441     		spin_unlock_irq(shost->host_lock);
4442
4443     		/* If the node is not being used by another discovery thread,
4444     		 * and we are sending a reject, we are done with it.

Fix by adding a check for non-null shost in line 4438.
The scenario when shost is set to null is when ndlp is null.
As such, the ndlp check present was sufficient. But better safe
than sorry so add the shost check.

Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Addresses-Coverity-ID: 101747 ("Null pointer dereferences")
Fixes: 2e0fef85e0 ("[SCSI] lpfc: NPIV: split ports")

CC: James Bottomley <James.Bottomley@SteelEye.com>
CC: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
CC: linux-next@vger.kernel.org
Link: https://lore.kernel.org/r/20191111230401.12958-3-jsmart2021@gmail.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-12 22:21:33 -05:00
James Smart
6f23f8c5c9 scsi: lpfc: fix: Coverity: lpfc_get_scsi_buf_s3(): Null pointer dereferences
Coverity reported the following:

*** CID 1487391:  Null pointer dereferences  (FORWARD_NULL)
/drivers/scsi/lpfc/lpfc_scsi.c: 614 in lpfc_get_scsi_buf_s3()
608     		spin_unlock(&phba->scsi_buf_list_put_lock);
609     	}
610     	spin_unlock_irqrestore(&phba->scsi_buf_list_get_lock, iflag);
611
612     	if (lpfc_ndlp_check_qdepth(phba, ndlp)) {
613     		atomic_inc(&ndlp->cmd_pending);
vvv     CID 1487391:  Null pointer dereferences  (FORWARD_NULL)
vvv     Dereferencing null pointer "lpfc_cmd".
614     		lpfc_cmd->flags |= LPFC_SBUF_BUMP_QDEPTH;
615     	}
616     	return  lpfc_cmd;
617     }
618     /**
619      * lpfc_get_scsi_buf_s4 - Get a scsi buffer from io_buf_list of the HBA

Fix by checking lpfc_cmd to be non-NULL as part of line 612

Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Addresses-Coverity-ID: 1487391 ("Null pointer dereferences")
Fixes: 2a5b7d626e ("scsi: lpfc: Limit tracking of tgt queue depth in fast path")

CC: "Martin K. Petersen" <martin.petersen@oracle.com>
CC: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
CC: linux-next@vger.kernel.org
Link: https://lore.kernel.org/r/20191111230401.12958-2-jsmart2021@gmail.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-12 22:21:33 -05:00
Vignesh Raghavendra
6979e56cec scsi: ufs: Add driver for TI wrapper for Cadence UFS IP
TI's J721e SoC has a Cadence UFS IP with a TI specific wrapper. This is a
minimal driver to configure the wrapper. It releases the UFS slave device
out of reset and sets up registers to indicate PHY reference clock input
frequency before probing child Cadence UFS driver.

Link: https://lore.kernel.org/r/20191108164857.11466-3-vigneshr@ti.com
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-12 22:21:33 -05:00
Christoph Hellwig
d41003513e block: rework zone reporting
Avoid the need to allocate a potentially large array of struct blk_zone
in the block layer by switching the ->report_zones method interface to
a callback model. Now the caller simply supplies a callback that is
executed on each reported zone, and private data for it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-12 19:12:07 -07:00
Damien Le Moal
23a50861ad scsi: sd_zbc: Cleanup sd_zbc_alloc_report_buffer()
There is no need to arbitrarily limit the size of a report zone to the
number of zones defined by SD_ZBC_REPORT_MAX_ZONES. Rather, simply
calculate the report buffer size needed for the requested number of
zones without exceeding the device total number of zones. This buffer
size limitation to the hardware maximum transfer size and page mapping
capabilities is kept unchanged. Starting with this initial buffer size,
the allocation is optimized by iterating over decreasing buffer size
until the allocation succeeds (each iteration is allowed to fail fast
using the __GFP_NORETRY flag). This ensures forward progress for zone
reports and avoids failures of zones revalidation under memory pressure.

While at it, also replace the hard coded 512 B sector size with the
SECTOR_SIZE macro.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-12 19:12:04 -07:00
Damien Le Moal
d9dd73087a block: Enhance blk_revalidate_disk_zones()
For ZBC and ZAC zoned devices, the scsi driver revalidation processing
implemented by sd_revalidate_disk() includes a call to
sd_zbc_read_zones() which executes a full disk zone report used to
check that all zones of the disk are the same size. This processing is
followed by a call to blk_revalidate_disk_zones(), used to initialize
the device request queue zone bitmaps (zone type and zone write lock
bitmaps). To do so, blk_revalidate_disk_zones() also executes a full
device zone report to obtain zone types. As a result, the entire
zoned block device revalidation process includes two full device zone
report.

By moving the zone size checks into blk_revalidate_disk_zones(), this
process can be optimized to a single full device zone report, leading to
shorter device scan and revalidation times. This patch implements this
optimization, reducing the original full device zone report implemented
in sd_zbc_check_zones() to a single, small, report zones command
execution to obtain the size of the first zone of the device. Checks
whether all zones of the device are the same size as the first zone
size are moved to the generic blk_check_zone() function called from
blk_revalidate_disk_zones().

This optimization also has the following benefits:
1) fewer memory allocations in the scsi layer during disk revalidation
   as the potentailly large buffer for zone report execution is not
   needed.
2) Implement zone checks in a generic manner, reducing the burden on
   device driver which only need to obtain the zone size and check that
   this size is a power of 2 number of LBAs. Any new type of zoned
   block device will benefit from this.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-12 19:11:52 -07:00
Jens Axboe
0788c4eda0 Merge branch 'for-5.5/drivers-post' into for-5.5/zoned
* for-5.5/drivers-post:
  scsi: sd_zbc: add zone open, close, and finish support
  scsi: core: Handle drivers which set sg_tablesize to zero
  scsi: qla2xxx: fix NPIV tear down process
  scsi: sd_zbc: Fix sd_zbc_complete()
  scsi: qla2xxx: stop timer in shutdown path
  scsi: sd: define variable dif as unsigned int instead of bool
  scsi: target: cxgbit: Fix cxgbit_fw4_ack()
  scsi: qla2xxx: Fix partial flash write of MBI
  scsi: qla2xxx: Initialized mailbox to prevent driver load failure
  scsi: lpfc: Honor module parameter lpfc_use_adisc
  scsi: ufs-bsg: Wake the device before sending raw upiu commands
  scsi: lpfc: Check queue pointer before use
  scsi: qla2xxx: fixup incorrect usage of host_byte
2019-11-12 19:11:33 -07:00
Damien Le Moal
9237f04e12 scsi: core: Fix scsi_get/set_resid() interface
struct scsi_cmnd cmd->req.resid_len which is returned and set respectively
by the helper functions scsi_get_resid() and scsi_set_resid() is an
unsigned int. Reflect this fact in the interface of these helper functions.

Also fix compilation errors due to min() and max() type mismatch introduced
by this change in scsi debug code, usb transport code and in the USB ENE
card reader driver.

Link: https://lore.kernel.org/r/20191030090847.25650-1-damien.lemoal@wdc.com
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-08 21:34:49 -05:00
Bart Van Assche
61951a6d31 scsi: lpfc: Fix lpfc_cpumask_of_node_init()
Fix the following kernel warning:

cpumask_of_node(-1): (unsigned)node >= nr_node_ids(1)

Fixes: dcaa213679 ("scsi: lpfc: Change default IRQ model on AMD architectures")
Link: https://lore.kernel.org/r/20191108225947.1395-1-jsmart2021@gmail.com
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-08 21:29:42 -05:00
Bart Van Assche
eea2d396aa scsi: lpfc: Fix a kernel warning triggered by lpfc_sli4_enable_intr()
Fix the following lockdep warning:

============================================
WARNING: possible recursive locking detected
5.4.0-rc6-dbg+ #2 Not tainted
--------------------------------------------
systemd-udevd/130 is trying to acquire lock:
ffffffff826b05d0 (cpu_hotplug_lock.rw_sem){++++}, at: irq_calc_affinity_vectors+0x63/0x90

but task is already holding lock:

ffffffff826b05d0 (cpu_hotplug_lock.rw_sem){++++}, at: lpfc_sli4_enable_intr+0x422/0xd50 [lpfc]

other info that might help us debug this:

 Possible unsafe locking scenario:
       CPU0
       ----
  lock(cpu_hotplug_lock.rw_sem);
  lock(cpu_hotplug_lock.rw_sem);

*** DEADLOCK ***
 May be due to missing lock nesting notation
2 locks held by systemd-udevd/130:
 #0: ffff8880d53fe210 (&dev->mutex){....}, at: __device_driver_lock+0x4a/0x70
 #1: ffffffff826b05d0 (cpu_hotplug_lock.rw_sem){++++}, at: lpfc_sli4_enable_intr+0x422/0xd50 [lpfc]

stack backtrace:
CPU: 1 PID: 130 Comm: systemd-udevd Not tainted 5.4.0-rc6-dbg+ #2
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Call Trace:
 dump_stack+0xa5/0xe6
 __lock_acquire.cold+0xf7/0x23a
 lock_acquire+0x106/0x240
 cpus_read_lock+0x41/0xe0
 irq_calc_affinity_vectors+0x63/0x90
 __pci_enable_msix_range+0x10a/0x950
 pci_alloc_irq_vectors_affinity+0x144/0x210
 lpfc_sli4_enable_intr+0x4b2/0xd50 [lpfc]
 lpfc_pci_probe_one+0x1411/0x22b0 [lpfc]
 local_pci_probe+0x7c/0xc0
 pci_device_probe+0x25d/0x390
 really_probe+0x170/0x510
 driver_probe_device+0x127/0x190
 device_driver_attach+0x98/0xa0
 __driver_attach+0xb6/0x1a0
 bus_for_each_dev+0x100/0x150
 driver_attach+0x31/0x40
 bus_add_driver+0x246/0x300
 driver_register+0xe0/0x170
 __pci_register_driver+0xde/0xf0
 lpfc_init+0x134/0x1000 [lpfc]
 do_one_initcall+0xda/0x47e
 do_init_module+0x10a/0x3b0
 load_module+0x4318/0x47c0
 __do_sys_finit_module+0x134/0x1d0
 __x64_sys_finit_module+0x47/0x50
 do_syscall_64+0x6f/0x2e0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: dcaa213679 ("scsi: lpfc: Change default IRQ model on AMD architectures")
Link: https://lore.kernel.org/r/20191107052158.25788-4-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-08 21:28:53 -05:00
Bart Van Assche
765ab6cdac scsi: lpfc: Fix a kernel warning triggered by lpfc_get_sgl_per_hdwq()
Fix the following kernel bug report:

BUG: using smp_processor_id() in preemptible [00000000] code: systemd-udevd/954

Fixes: d79c9e9d4b ("scsi: lpfc: Support dynamic unbounded SGL lists on G7 hardware.")
Link: https://lore.kernel.org/r/20191107052158.25788-2-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-08 21:28:23 -05:00
Martin Wilck
a10c8803d0 scsi: qla2xxx: don't use zero for FC4_PRIORITY_NVME
Avoid an uninitialized value (0) for ha->fc4_type_priority being falsely
interpreted as NVMe priority. Not strictly needed any more after the
previous patch, but makes the fc4_type_priority handling more explicit.

Link: https://lore.kernel.org/r/20191107224839.32417-3-martin.wilck@suse.com
Tested-by: David Bond <dbond@suse.com>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-08 21:23:23 -05:00
Martin Wilck
f5a2b219a7 scsi: qla2xxx: initialize fc4_type_priority
ha->fc4_type_priority is currently initialized only in
qla81xx_nvram_config(). That makes it default to NVMe for other adapters.
Fix it.

Fixes: 84ed362ac4 ("scsi: qla2xxx: Dual FCP-NVMe target port support")
Link: https://lore.kernel.org/r/20191107224839.32417-2-martin.wilck@suse.com
Tested-by: David Bond <dbond@suse.com>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-08 21:23:11 -05:00
Bart Van Assche
162b805e38 scsi: qla2xxx: Fix a dma_pool_free() call
This patch fixes the following kernel warning:

DMA-API: qla2xxx 0000:00:0a.0: device driver frees DMA memory with different size [device address=0x00000000c7b60000] [map size=4088 bytes] [unmap size=512 bytes]
WARNING: CPU: 3 PID: 1122 at kernel/dma/debug.c:1021 check_unmap+0x4d0/0xbd0
CPU: 3 PID: 1122 Comm: rmmod Tainted: G           O      5.4.0-rc1-dbg+ #1
RIP: 0010:check_unmap+0x4d0/0xbd0
Call Trace:
 debug_dma_free_coherent+0x123/0x173
 dma_free_attrs+0x76/0xe0
 qla2x00_mem_free+0x329/0xc40 [qla2xxx_scst]
 qla2x00_free_device+0x170/0x1c0 [qla2xxx_scst]
 qla2x00_remove_one+0x4f0/0x6d0 [qla2xxx_scst]
 pci_device_remove+0xd5/0x1f0
 device_release_driver_internal+0x159/0x280
 driver_detach+0x8b/0xf2
 bus_remove_driver+0x9a/0x15a
 driver_unregister+0x51/0x70
 pci_unregister_driver+0x2d/0x130
 qla2x00_module_exit+0x1c/0xbc [qla2xxx_scst]
 __x64_sys_delete_module+0x22a/0x300
 do_syscall_64+0x6f/0x2e0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: 3f006ac342 ("scsi: qla2xxx: Secure flash update support for ISP28XX") # v5.2-rc1~130^2~270.
Cc: Michael Hernandez <mhernandez@marvell.com>
Cc: Himanshu Madhani <hmadhani@marvell.com>
Link: https://lore.kernel.org/r/20191106044226.5207-3-bvanassche@acm.org
Reviewed-by: Martin Wilck <mwilck@suse.com>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-08 21:21:31 -05:00
Bart Van Assche
47140a20a8 scsi: qla2xxx: Remove an include directive
Since the code in qla_init.c is initiator code, remove the SCSI target core
include directive.

Cc: Himanshu Madhani <hmadhani@marvell.com>
Link: https://lore.kernel.org/r/20191106044226.5207-2-bvanassche@acm.org
Reviewed-by: Martin Wilck <mwilck@suse.com>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-08 21:21:18 -05:00
Himanshu Madhani
b3f7456841 scsi: qla2xxx: Update driver version to 10.01.00.21-k
Link: https://lore.kernel.org/r/20191105150657.8092-9-hmadhani@marvell.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-08 21:15:42 -05:00
Arun Easi
65e9200938 scsi: qla2xxx: Fix device connect issues in P2P configuration
P2P needs to take the alternate plogi route.

Link: https://lore.kernel.org/r/20191105150657.8092-8-hmadhani@marvell.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-08 21:15:42 -05:00
Arun Easi
2f856d4e8c scsi: qla2xxx: Fix memory leak when sending I/O fails
On heavy loads, a memory leak of the srb_t structure is observed.  This
would make the qla2xxx_srbs cache gobble up memory.

Fixes: 219d27d714 ("scsi: qla2xxx: Fix race conditions in the code for aborting SCSI commands")
Cc: stable@vger.kernel.org # 5.2
Link: https://lore.kernel.org/r/20191105150657.8092-7-hmadhani@marvell.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-08 21:15:42 -05:00
Quinn Tran
f45bca8c50 scsi: qla2xxx: Fix double scsi_done for abort path
Current code assumes abort will remove the original command from the active
list where scsi_done will not be called. Instead, the eh_abort thread will
do the scsi_done. That is not the case.  Instead, we have a double
scsi_done calls triggering use after free.

Abort will tell FW to release the command from FW possesion. The original
command will return to ULP with error in its normal fashion via scsi_done.
eh_abort path would wait for the original command completion before
returning.  eh_abort path will not perform the scsi_done call.

Fixes: 219d27d714 ("scsi: qla2xxx: Fix race conditions in the code for aborting SCSI commands")
Cc: stable@vger.kernel.org # 5.2
Link: https://lore.kernel.org/r/20191105150657.8092-6-hmadhani@marvell.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-08 21:15:42 -05:00
Quinn Tran
dd322b7f3e scsi: qla2xxx: Fix driver unload hang
This patch fixes driver unload hang by removing msleep()

Fixes: d74595278f ("scsi: qla2xxx: Add multiple queue pair functionality.")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20191105150657.8092-5-hmadhani@marvell.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-08 21:15:42 -05:00
Quinn Tran
af2a0c51b1 scsi: qla2xxx: Fix SRB leak on switch command timeout
when GPSC/GPDB switch command fails, driver just returns without doing a
proper cleanup. This patch fixes this memory leak by calling sp->free() in
the error path.

Link: https://lore.kernel.org/r/20191105150657.8092-4-hmadhani@marvell.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-08 21:15:41 -05:00
Quinn Tran
71c80b75ce scsi: qla2xxx: Do command completion on abort timeout
On switch, fabric and mgt command timeout, driver send Abort to tell FW to
return the original command.  If abort is timeout, then return both Abort
and original command for cleanup.

Fixes: 219d27d714 ("scsi: qla2xxx: Fix race conditions in the code for aborting SCSI commands")
Cc: stable@vger.kernel.org # 5.2
Link: https://lore.kernel.org/r/20191105150657.8092-3-hmadhani@marvell.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-08 21:15:41 -05:00
Quinn Tran
983f127603 scsi: qla2xxx: Retry PLOGI on FC-NVMe PRLI failure
Current code will send PRLI with FC-NVMe bit set for the targets which
support only FCP. This may result into issue with targets which do not
understand NVMe and will go into a strange state. This patch would restart
the login process by going back to PLOGI state. The PLOGI state will force
the target to respond to correct PRLI request.

Fixes: c76ae845ea ("scsi: qla2xxx: Add error handling for PLOGI ELS passthrough")
Cc: stable@vger.kernel.org # 5.4
Link: https://lore.kernel.org/r/20191105150657.8092-2-hmadhani@marvell.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-08 21:15:41 -05:00
Ajay Joshi
ad512f2023 scsi: sd_zbc: add zone open, close, and finish support
Implement REQ_OP_ZONE_OPEN, REQ_OP_ZONE_CLOSE and REQ_OP_ZONE_FINISH
support to allow explicit control of zone states.

Contains contributions from Matias Bjorling, Hans Holmberg,
Keith Busch and Damien Le Moal.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ajay Joshi <ajay.joshi@wdc.com>
Signed-off-by: Matias Bjorling <matias.bjorling@wdc.com>
Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-07 06:46:02 -07:00
Jens Axboe
6d1ec7814d Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi into for-5.5/drivers-post
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi:
  scsi: core: Handle drivers which set sg_tablesize to zero
  scsi: qla2xxx: fix NPIV tear down process
  scsi: sd_zbc: Fix sd_zbc_complete()
2019-11-07 06:45:53 -07:00
Jens Axboe
e16381720a Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi into for-5.5/drivers-post
SCSI fixes on 20191101

Nine changes, eight in drivers [ufs, target, lpfc x 2, qla2xxx x 4]
and one core change in sd that fixes an I/O failure on DIF type 3
devices.

Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (23 commits)
  scsi: qla2xxx: stop timer in shutdown path
  scsi: sd: define variable dif as unsigned int instead of bool
  scsi: target: cxgbit: Fix cxgbit_fw4_ack()
  scsi: qla2xxx: Fix partial flash write of MBI
  scsi: qla2xxx: Initialized mailbox to prevent driver load failure
  scsi: lpfc: Honor module parameter lpfc_use_adisc
  scsi: ufs-bsg: Wake the device before sending raw upiu commands
  scsi: lpfc: Check queue pointer before use
  scsi: qla2xxx: fixup incorrect usage of host_byte
  scsi: lpfc: remove left-over BUILD_NVME defines
  scsi: core: try to get module before removing device
  scsi: hpsa: add missing hunks in reset-patch
  scsi: target: core: Do not overwrite CDB byte 1
  scsi: ch: Make it possible to open a ch device multiple times again
  scsi: fix kconfig dependency warning related to 53C700_LE_ON_BE
  scsi: sni_53c710: fix compilation error
  scsi: scsi_dh_alua: handle RTPG sense code correctly during state transitions
  scsi: qla2xxx: fix a potential NULL pointer dereference
  scsi: MAINTAINERS: Update qla2xxx driver
  scsi: zfcp: fix reaction on bit error threshold notification
  ...
2019-11-07 06:43:18 -07:00
Michael Schmitz
9393c8de62 scsi: core: Handle drivers which set sg_tablesize to zero
In scsi_mq_setup_tags(), cmd_size is calculated based on zero size for the
scatter-gather list in case the low level driver uses SG_NONE in its host
template.

cmd_size is passed on to the block layer for calculation of the request
size, and we've seen NULL pointer dereference errors from the block layer
in drivers where SG_NONE is used and a mq IO scheduler is active,
apparently as a consequence of this (see commit 68ab2d76e4 ("scsi:
cxlflash: Set sg_tablesize to 1 instead of SG_NONE"), and a recent patch by
Finn Thain converting the three m68k NFR5380 drivers to avoid setting
SG_NONE).

Try to avoid these errors by accounting for at least one sg list entry when
calculating cmd_size, regardless of whether the low level driver set a zero
sg_tablesize.

Tested on 030 m68k with the atari_scsi driver - setting sg_tablesize to
SG_NONE no longer results in a crash when loading this driver.

CC: Finn Thain <fthain@telegraphics.com.au>
Link: https://lore.kernel.org/r/1572922150-4358-1-git-send-email-schmitzmic@gmail.com
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:44:34 -05:00
Bart Van Assche
f6b8540f40 scsi: tracing: Fix handling of TRANSFER LENGTH == 0 for READ(6) and WRITE(6)
According to SBC-2 a TRANSFER LENGTH field of zero means that 256 logical
blocks must be transferred. Make the SCSI tracing code follow SBC-2.

Fixes: bf81623542 ("[SCSI] add scsi trace core functions and put trace points")
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Link: https://lore.kernel.org/r/20191105215553.185018-1-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:06:03 -05:00
James Smart
aff6ab9e72 scsi: lpfc: Update lpfc version to 12.6.0.1
Update lpfc version to 12.6.0.1

Link: https://lore.kernel.org/r/20191105005708.7399-12-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:04 -05:00
James Smart
171f6c4194 scsi: lpfc: Add enablement of multiple adapter dumps
Some adapters support the ability to hold multiple adapter dumps on the
adapter flash. Some adapters default to enabling this feature while others
default to single-dump.

Make support uniform by enabling dual dump by default.

Link: https://lore.kernel.org/r/20191105005708.7399-11-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:04 -05:00
James Smart
dcaa213679 scsi: lpfc: Change default IRQ model on AMD architectures
The current driver attempts to allocate an interrupt vector per cpu using
the systems managed IRQ allocator (flag PCI_IRQ_AFFINITY). The system IRQ
allocator will either provide the per-cpu vector, or return fewer
vectors. When fewer vectors, they are evenly spread between the numa nodes
on the system.  When run on an AMD architecture, if interrupts occur to a
cpu that is not in the same numa node as the adapter generating the
interrupt, there are extreme costs and overheads in performance.  Thus, if
1:1 vector allocation is used, or the "balanced" vectors in the other numa
nodes, performance can be hit significantly.

A much more performant model is to allocate interrupts only on the cpus
that are in the numa node where the adapter resides.  I/O completion is
still performed by the cpu where the I/O was generated. Unfortunately,
there is no flag to request the managed IRQ subsystem allocate vectors only
for the CPUs in the numa node as the adapter.

On AMD architecture, revert the irq allocation to the normal style
(non-managed) and then use irq_set_affinity_hint() to set the cpu
affinity and disable user-space rebalancing.

Tie the support into CPU offline/online. If the cpu being offlined owns a
vector, the vector is re-affinitized to one of the other CPUs on the same
numa node. If there are no more CPUs on the numa node, the vector has all
affinity removed and lets the system determine where it's serviced.
Similarly, when the cpu that owned a vector comes online, the vector is
reaffinitized to the cpu.

Link: https://lore.kernel.org/r/20191105005708.7399-10-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:04 -05:00
James Smart
93a4d6f401 scsi: lpfc: Add registration for CPU Offline/Online events
The recent affinitization didn't address cpu offlining/onlining.  If an
interrupt vector is shared and the low order cpu owning the vector is
offlined, as interrupts are managed, the vector is taken offline. This
causes the other CPUs sharing the vector will hang as they can't get io
completions.

Correct by registering callbacks with the system for Offline/Online
events. When a cpu is taken offline, its eq, which is tied to an interrupt
vector is found. If the cpu is the "owner" of the vector and if the
eq/vector is shared by other CPUs, the eq is placed into a polled mode.
Additionally, code paths that perform io submission on the "sharing CPUs"
will check the eq state and poll for completion after submission of new io
to a wq that uses the eq.

Similarly, when a cpu comes back online and owns an offlined vector, the eq
is taken out of polled mode and rearmed to start driving interrupts for eq.

Link: https://lore.kernel.org/r/20191105005708.7399-9-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:04 -05:00
James Smart
b9da814cd5 scsi: lpfc: Clarify FAWNN error message
Current message on FAWWN events is rather cryptic.

Expand the message to clarify its meaning.

Link: https://lore.kernel.org/r/20191105005708.7399-8-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:04 -05:00
James Smart
69641627c6 scsi: lpfc: Sync with FC-NVMe-2 SLER change to require Conf with SLER
Prior to the last FC-NVME-2 draft, SLER and CONF were independent.  SLER
now requires CONF to be set.

Revise the NVME PRLI checking to look for both inorder to enable SLER.

Link: https://lore.kernel.org/r/20191105005708.7399-7-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:04 -05:00
James Smart
dda5bdf074 scsi: lpfc: Fix dynamic fw log enablement check
The recently posted patch had a typo that incorrectly tested the receiving
function.

Fix the typo (change == to !=)

Fixes: 95bfc6d8ad ("scsi: lpfc: Make FW logging dynamically configurable")
Link: https://lore.kernel.org/r/20191105005708.7399-6-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:04 -05:00
James Smart
2332e6e475 scsi: lpfc: Fix unexpected error messages during RSCN handling
During heavy RCN activity and log_verbose = 0 we see these messages:

  2754 PRLI failure DID:521245 Status:x9/xb2c00, data: x0
  0231 RSCN timeout Data: x0 x3
  0230 Unexpected timeout, hba link state x5

This is due to delayed RSCN activity.

Correct by avoiding the timeout thus the messages by restarting the
discovery timeout whenever an rscn is received.

Filter PRLI responses such that severity depends on whether expected for
the configuration or not. For example, PRLI errors on a fabric will be
informational (they are expected), but Point-to-Point errors are not
necessarily expected so they are raised to an error level.

Link: https://lore.kernel.org/r/20191105005708.7399-5-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:04 -05:00
James Smart
6c1e803eac scsi: lpfc: Fix kernel crash at lpfc_nvme_info_show during remote port bounce
When reading sysfs nvme_info file while a remote port leaves and comes
back, a NULL pointer is encountered. The issue is due to ndlp list
corruption as the the nvme_info_show does not use the same lock as the rest
of the code.

Correct by removing the rcu_xxx_lock calls and replace by the host_lock and
phba->hbaLock spinlocks that are used by the rest of the driver.  Given
we're called from sysfs, we are safe to use _irq rather than _irqsave.

Link: https://lore.kernel.org/r/20191105005708.7399-4-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:03 -05:00
James Smart
6bfb162082 scsi: lpfc: Fix configuration of BB credit recovery in service parameters
The driver today is reading service parameters from the firmware and then
overwriting the firmware-provided values with values of its own.  There are
some switch features that require preliminary FLOGI's that are
switch-specific and done prior to the actual fabric FLOGI for traffic.  The
fw will perform those FLOGIs and will revise the service parameters for the
features configured. As the driver later overwrites those values with its
own values, it misconfigures things like BBSCN use by doing so.

Correct by eliminating the driver-overwrite of firmware values. The driver
correctly re-reads the service parameters after each link up to obtain the
latest values from firmware.

Link: https://lore.kernel.org/r/20191105005708.7399-3-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:03 -05:00
James Smart
7cfd5639d9 scsi: lpfc: Fix duplicate unreg_rpi error in port offline flow
If the driver receives a login that is later then LOGO'd by the remote port
(aka ndlp), the driver, upon the completion of the LOGO ACC transmission,
will logout the node and unregister the rpi that is being used for the
node.  As part of the unreg, the node's rpi value is replaced by the
LPFC_RPI_ALLOC_ERROR value.  If the port is subsequently offlined, the
offline walks the nodes and ensures they are logged out, which possibly
entails unreg'ing their rpi values.  This path does not validate the node's
rpi value, thus doesn't detect that it has been unreg'd already.  The
replaced rpi value is then used when accessing the rpi bitmask array which
tracks active rpi values.  As the LPFC_RPI_ALLOC_ERROR value is not a valid
index for the bitmask, it may fault the system.

Revise the rpi release code to detect when the rpi value is the replaced
RPI_ALLOC_ERROR value and ignore further release steps.

Link: https://lore.kernel.org/r/20191105005708.7399-2-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:03 -05:00
Al Viro
1feefb7ec2 scsi: sg: sg_ioctl(): get rid of access_ok()
simply not needed there - neither sg_new_read() nor sg_new_write() need
it.

Link: https://lore.kernel.org/r/20191017193925.25539-8-viro@ZenIV.linux.org.uk
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:03 -05:00
Al Viro
a64e5a8685 scsi: sg: sg_write(): get rid of access_ok()/__copy_from_user()/__get_user()
Just use plain copy_from_user() and get_user().  Note that while a
buf-derived pointer gets stored into ->dxferp, all places that actually use
the resulting value feed it either to import_iovec() or to
import_single_range(), and both will do validation.

Link: https://lore.kernel.org/r/20191017193925.25539-7-viro@ZenIV.linux.org.uk
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:03 -05:00
Al Viro
c8c12792d5 scsi: sg: sg_read(): get rid of access_ok()/__copy_..._user()
Use copy_..._user() instead, both in sg_read() and in sg_read_oxfer().  And
don't open-code memdup_user()...

Link: https://lore.kernel.org/r/20191017193925.25539-6-viro@ZenIV.linux.org.uk
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:03 -05:00
Al Viro
d9fc5617bc scsi: sg: sg_new_write(): don't bother with access_ok
... just use copy_from_user().  We copy only SZ_SG_IO_HDR bytes, so that
would, strictly speaking, loosen the check.  However, for call chains via
->write() the caller has actually checked the entire range and SG_IO passes
exactly SZ_SG_IO_HDR for count.  So no visible behaviour changes happen if
we check only what we really need for copyin.

Link: https://lore.kernel.org/r/20191017193925.25539-5-viro@ZenIV.linux.org.uk
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:03 -05:00
Al Viro
c35a5cfb41 scsi: sg: sg_read(): simplify reading ->pack_id of userland sg_io_hdr_t
We don't need to allocate a temporary buffer and read the entire structure
in it, only to fetch a single field and free what we'd allocated.  Just use
get_user() and be done with it...

Link: https://lore.kernel.org/r/20191017193925.25539-4-viro@ZenIV.linux.org.uk
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:03 -05:00