Several SCSI transport and LLD drivers surround code that does not
tolerate concurrent calls of .queuecommand() with scsi_target_block() /
scsi_target_unblock(). These last two functions use
blk_mq_quiesce_queue() / blk_mq_unquiesce_queue() for scsi-mq request
queues to prevent concurrent .queuecommand() calls. However, that is
not sufficient to prevent .queuecommand() calls from scsi_send_eh_cmnd().
Hence surround the .queuecommand() call from the SCSI error handler with
code that avoids that .queuecommand() gets called in the blocked state.
Note: converting the .queuecommand() call in scsi_send_eh_cmnd() into
code that calls blk_get_request() + blk_execute_rq() is not an option
since scsi_send_eh_cmnd() must be able to make forward progress even
if all requests have been allocated.
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The ability to modify the SCSI device state was introduced by commit
638127e579a4 ("[PATCH] Fix error handler offline behaviour"; v2.6.12). That
same commit introduced the following device states:
{ SDEV_CREATED, "created" },
{ SDEV_RUNNING, "running" },
{ SDEV_CANCEL, "cancel" },
{ SDEV_DEL, "deleted" },
{ SDEV_QUIESCE, "quiesce" },
{ SDEV_OFFLINE, "offline" },
The SDEV_BLOCK state was introduced later to avoid that an FC cable pull
would immediately result in an I/O error (commit 1094e682310e; "[PATCH]
suspending I/Os to a device"; v2.6.12). That same patch introduced the
ability to set the SDEV_BLOCK state from user space. I'm not sure whether
that ability was introduced on purpose or accidentally.
Since there is agreement that only writing "running" or "offline" into
the SCSI sysfs device state attribute makes sense, restrict sysfs writes
to these values.
This patch makes sure that SDEV_BLOCK is only used for its original
purpose, namely to allow transport drivers and LLDs to block further
.queuecommand() calls while transport layer or adapter recovery is in
progress.
Note: a web search for "/sys/class/scsi_device" AND "device/state"
revealed several storage configuration guides. The instructions I found
in these guides tell users to write the value "running" or "offline" in
the SCSI device state sysfs attribute and no other values.
[mkp: typo]
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: James Smart <james.smart@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
IEEE_8021QAZ_APP_SEL_STREAM is a valid selector for iSCSI connections, so
add code to use IEEE_8021QAZ_APP_SEL_STREAM selector to get priority mask.
Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where
we are expecting to fall through.
This patch fixes the following warning:
drivers/scsi/mpt3sas/mpt3sas_base.c: In function _base_update_ioc_page1_inlinewith_perf_mode :
drivers/scsi/mpt3sas/mpt3sas_base.c:4510:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
if (ioc->high_iops_queues) {
^
drivers/scsi/mpt3sas/mpt3sas_base.c:4530:2: note: here
case MPT_PERF_MODE_LATENCY:
^~~~
Warning level 3 was used: -Wimplicit-fallthrough=3
This patch is part of the ongoing efforts to enable -Wimplicit-fallthrough.
Fixes: 30cb97023f38 ("scsi: mpt3sas: Introduce perf_mode module parameter")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Many times in libsas, and in LLDDs which use libsas, the check for an
expander device is re-implemented or open coded.
Use dev_is_expander() instead. We rename this from
sas_dev_type_is_expander() to not spill so many lines in referencing.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi_mq_setup_tags() preallocates a big buffer for the IO SGL. The size is
based on scsi_mq_sgl_size() which is determined based on
shost->sg_tablesize and SG_CHUNK_SIZE.
Modern DMA engines are often capable of dealing with very big segments so
the resulting scsi_mq_sgl_size() is often too big. SG_CHUNK_SIZE results in
a static 4KB SGL allocation per command.
If an HBA has lots of deep queues, preallocation for the sg list can
consume substantial amounts of memory. For lpfc, nr_hw_queues can be 70
and each queue's depth 3781. This means the resulting preallocation for
the data SGL is 70*3781*2K = 517MB.
Switch to runtime allocation for SGL for lists longer than 2 entries. This
is the approach used by NVMe PCI so it should be reasonable for SCSI as
well. Runtime SGL allocation has always been the case for the legacy I/O
path so this is nothing new.
[mkp: attempted to clarify commit desc]
Cc: Christoph Hellwig <hch@lst.de>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi_mq_setup_tags() currently preallocates a big buffer for protection
SGL entries. scsi_mq_sgl_size() is used to determine the size for both data
and protection information scatterlists but the protection buffer is
usually much smaller. For example, one 512-byte sector needs 8 bytes of
protection information. Given that the maximum number of sectors for one
request is 2560 (BLK_DEF_MAX_SECTORS) sectors, the max protection
information buffer size is just 20K.
The protection information segment count generally matches the number of
bios in the request. As a result, the typical actual number of segments
won't be very big. And should the need arise, allocating a bigger SGL from
slab is fast enough.
Pre-allocate only one SGL entry for protection information and switch to
runtime allocation in case that the protection information segment number
is bigger than 1. This reduces memory tied up by static command
allocations. For example, 500+ MB is saved on single lpfc HBA.
[mkp: attempted to clarify commit desc]
Cc: Christoph Hellwig <hch@lst.de>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
sg_alloc_table_chained() currently allows the caller to provide one
preallocated SGL and returns if the requested number isn't bigger than
size of that SGL. This is used to inline an SGL for an IO request.
However, scattergather code only allows that size of the 1st preallocated
SGL to be SG_CHUNK_SIZE(128). This means a substantial amount of memory
(4KB) is claimed for the SGL for each IO request. If the I/O is small, it
would be prudent to allocate a smaller SGL.
Introduce an extra parameter to sg_alloc_table_chained() and
sg_free_table_chained() for specifying size of the preallocated SGL.
Both __sg_free_table() and __sg_alloc_table() assume that each SGL has the
same size except for the last one. Change the code to allow both functions
to accept a variable size for the 1st preallocated SGL.
[mkp: attempted to clarify commit desc]
Cc: Christoph Hellwig <hch@lst.de>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: netdev@vger.kernel.org
Cc: linux-nvme@lists.infradead.org
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
the scatterlist for each request. This static allocation can consume
substantial amounts of memory on modern controllers which support a
large number of concurrently outstanding requests.
To facilitate a switch to a smaller static allocation combined with a
dynamic allocation for requests that need it, we need to make sure all
SCSI drivers handle chained scatterlists correctly.
Convert remaining drivers that directly dereference the scatterlist
array to using the iterator functions.
[mkp: clarified commit message]
Cc: Christoph Hellwig <hch@lst.de>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Finn Thain <fthain@telegraphics.com.au>
Cc: Guenter Roeck <linux@roeck-us.net>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
the scatterlist for each request. This static allocation can consume
substantial amounts of memory on modern controllers which support a
large number of concurrently outstanding requests.
To facilitate a switch to a smaller static allocation combined with a
dynamic allocation for requests that need it, we need to make sure all
SCSI drivers handle chained scatterlists correctly.
Convert remaining drivers that directly dereference the scatterlist
array to using the iterator functions.
[mkp: clarified commit message]
Cc: Michael Schmitz <schmitzmic@gmail.com>
Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
the scatterlist for each request. This static allocation can consume
substantial amounts of memory on modern controllers which support a
large number of concurrently outstanding requests.
To facilitate a switch to a smaller static allocation combined with a
dynamic allocation for requests that need it, we need to make sure all
SCSI drivers handle chained scatterlists correctly.
Convert remaining drivers that directly dereference the scatterlist
array to using the iterator functions.
[mkp: clarified commit message]
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
the scatterlist for each request. This static allocation can consume
substantial amounts of memory on modern controllers which support a
large number of concurrently outstanding requests.
To facilitate a switch to a smaller static allocation combined with a
dynamic allocation for requests that need it, we need to make sure all
SCSI drivers handle chained scatterlists correctly.
Convert remaining drivers that directly dereference the scatterlist
array to using the iterator functions.
[mkp: clarified commit message]
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
the scatterlist for each request. This static allocation can consume
substantial amounts of memory on modern controllers which support a
large number of concurrently outstanding requests.
To facilitate a switch to a smaller static allocation combined with a
dynamic allocation for requests that need it, we need to make sure all
SCSI drivers handle chained scatterlists correctly.
Convert remaining drivers that directly dereference the scatterlist
array to using the iterator functions.
[mkp: clarified commit message]
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
the scatterlist for each request. This static allocation can consume
substantial amounts of memory on modern controllers which support a
large number of concurrently outstanding requests.
To facilitate a switch to a smaller static allocation combined with a
dynamic allocation for requests that need it, we need to make sure all
SCSI drivers handle chained scatterlists correctly.
Convert remaining drivers that directly dereference the scatterlist
array to using the iterator functions.
[mkp: clarified commit message]
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
the scatterlist for each request. This static allocation can consume
substantial amounts of memory on modern controllers which support a
large number of concurrently outstanding requests.
To facilitate a switch to a smaller static allocation combined with a
dynamic allocation for requests that need it, we need to make sure all
SCSI drivers handle chained scatterlists correctly.
Convert remaining drivers that directly dereference the scatterlist
array to using the iterator functions.
Finn added the change to replace SCp.buffers_residual with
sg_is_last() for fixing updating it, and the similar change has been
applied on NCR5380.c
[mkp: clarified commit message]
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
the scatterlist for each request. This static allocation can consume
substantial amounts of memory on modern controllers which support a
large number of concurrently outstanding requests.
To facilitate a switch to a smaller static allocation combined with a
dynamic allocation for requests that need it, we need to make sure all
SCSI drivers handle chained scatterlists correctly.
Convert remaining drivers that directly dereference the scatterlist
array to using the iterator functions.
[mkp: clarified commit message]
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
the scatterlist for each request. This static allocation can consume
substantial amounts of memory on modern controllers which support a
large number of concurrently outstanding requests.
To facilitate a switch to a smaller static allocation combined with a
dynamic allocation for requests that need it, we need to make sure all
SCSI drivers handle chained scatterlists correctly.
Convert remaining drivers that directly dereference the scatterlist
array to using the iterator functions.
[mkp: clarified commit message]
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
the scatterlist for each request. This static allocation can consume
substantial amounts of memory on modern controllers which support a
large number of concurrently outstanding requests.
To facilitate a switch to a smaller static allocation combined with a
dynamic allocation for requests that need it, we need to make sure all
SCSI drivers handle chained scatterlists correctly.
Convert remaining drivers that directly dereference the scatterlist
array to using the iterator functions.
[mkp: clarified commit message and folded in build fix reported by zeroday]
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
the scatterlist for each request. This static allocation can consume
substantial amounts of memory on modern controllers which support a
large number of concurrently outstanding requests.
To facilitate a switch to a smaller static allocation combined with a
dynamic allocation for requests that need it, we need to make sure all
SCSI drivers handle chained scatterlists correctly.
Convert remaining drivers that directly dereference the scatterlist
array to using the iterator functions.
[mkp: clarified commit message]
Reviewed by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
the scatterlist for each request. This static allocation can consume
substantial amounts of memory on modern controllers which support a
large number of concurrently outstanding requests.
To facilitate a switch to a smaller static allocation combined with a
dynamic allocation for requests that need it, we need to make sure all
SCSI drivers handle chained scatterlists correctly.
Convert remaining drivers that directly dereference the scatterlist
array to using the iterator functions.
[mkp: clarified commit message]
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
the scatterlist for each request. This static allocation can consume
substantial amounts of memory on modern controllers which support a
large number of concurrently outstanding requests.
To facilitate a switch to a smaller static allocation combined with a
dynamic allocation for requests that need it, we need to make sure all
SCSI drivers handle chained scatterlists correctly.
Convert remaining drivers that directly dereference the scatterlist
array to using the iterator functions.
[mkp: clarified commit message]
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation version 2 of the license this program
is distributed in the hope that it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details you should have received a copy of the gnu general
public license along with this program if not write to the free
software foundation inc 51 franklin st fifth floor boston ma 02110
1301 usa
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 1 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081207.195075312@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Based on 2 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation #
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 4122 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Based on 2 normalized pattern(s):
this file is licensed under gplv2 this file is part of the [aic94xx]
driver the [aic94xx] driver is free software you can redistribute it
and or modify it under the terms of the gnu general public license
as published by the free software foundation version 2 of the
license the [aic94xx] driver is distributed in the hope that it will
be useful but without any warranty without even the implied warranty
of merchantability or fitness for a particular purpose see the gnu
general public license for more details you should have received a
copy of the gnu general public license along with [aic94xx] driver
if not write to the free software foundation inc 51 franklin st
fifth floor boston ma 02110 1301 usa
this file is licensed under gplv2 this file is part of the
[88se64xx] [88se94xx] driver the [88se64xx] [88se94xx] driver is
free software you can redistribute it and or modify it under the
terms of the gnu general public license as published by the free
software foundation version 2 of the license the [88se64xx]
[88se94xx] driver is distributed in the hope that it will be useful
but without any warranty without even the implied warranty of
merchantability or fitness for a particular purpose see the gnu
general public license for more details you should have received a
copy of the gnu general public license along with [88se64xx]
[88se94xx] driver if not write to the free software foundation inc
51 franklin st fifth floor boston ma 02110 1301 usa
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 2 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Enrico Weigelt <info@metux.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081201.638289549@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
UFS runtime suspend can be triggered after pm_runtime_enable() is invoked
in ufshcd_pltfrm_init(). However if the first runtime suspend is triggered
before binding ufs_hba structure to ufs device structure via
platform_set_drvdata(), then UFS runtime suspend will be no longer
triggered in the future because its dev->power.runtime_error was set in the
first triggering and does not have any chance to be cleared.
To be more clear, dev->power.runtime_error is set if hba is NULL in
ufshcd_runtime_suspend() which returns -EINVAL to rpm_callback() where
dev->power.runtime_error is set as -EINVAL. In this case, any future
rpm_suspend() for UFS device fails because rpm_check_suspend_allowed()
fails due to non-zero
dev->power.runtime_error.
To resolve this issue, make sure the first UFS runtime suspend get valid
"hba" in ufshcd_runtime_suspend(): Enable UFS runtime PM only after hba is
successfully bound to UFS device structure.
Fixes: 62694735ca ([SCSI] ufs: Add runtime PM support for UFS host controller driver)
Cc: stable@vger.kernel.org
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
1. Introduce module parameter perf_mode for only Aero/Sea generation HBAs.
2. Update IOC page1 fields according to performance mode.
Below are the performance modes that can be enabled with module parameter
perf_mode:
0: Balanced - Few high iops reply queues will be enabled. Interrupt
coalescing will be enabled only for these high iops reply descriptor
queues.
1: Iops - Interrupt coalescing will be enabled on all reply queues.
Coalescing timeout is set to 0x20.This is default value for Aero.
2: Latency - Interrupt coalescing will be enabled on all reply queues.
Coalescing timeout is set to 0xA. This is a legacy behavior similar to
Ventura & Invader HBA series.
Default perf mode set by driver will be balanced mode if the following
conditions are met:
- CPU vendor = Intel;
- Aero controller working in 16GT/s pcie speed
Performance mode will be set to latency mode for all other cases.
4k Random Read IO performance numbers on 24 SAS SSD drives for above three
permormance modes. Performance data is from Intel Skylake and HGST SS300
(drive model SDLL1DLR400GCCA1).
IOPs:
-----------------------------------------------------------------------
|perf_mode | qd = 1 | qd = 64 | note |
|-------------|--------|---------|-------------------------------------
|balanced | 259K | 3061k | Provides max performance numbers |
| | | | both on lower QD workload & |
| | | | also on higher QD workload |
|-------------|--------|---------|-------------------------------------
|iops | 220K | 3100k | Provides max performance numbers |
| | | | only on higher QD workload. |
|-------------|--------|---------|-------------------------------------
|latency | 246k | 2226k | Provides good performance numbers |
| | | | only on lower QD worklaod. |
-----------------------------------------------------------------------
Avarage Latency:
-----------------------------------------------------
|perf_mode | qd = 1 | qd = 64 |
|-------------|--------------|----------------------|
|balanced | 92.05 usec | 501.12 usec |
|-------------|--------------|----------------------|
|iops | 108.40 usec | 498.10 usec |
|-------------|--------------|----------------------|
|latency | 97.10 usec | 689.26 usec |
-----------------------------------------------------
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Enable interrupt coalescing only on high iops queues.
In ioc config page 1, offset 0x14 (ProductSpecific field) is used to
determine interrupt coalescing enabled/disabled on per reply descriptor
post queue group(8) basis. If 31st bit is zero, then interrupt coalescing
is enabled for all reply descriptor post queues. If 31st bit is set to one,
then user can enable/disable interrupt coalescing on per reply descriptor
post queue group(8) basis. So to enable interrupt coalescing only on first
reply descriptor post queue group (i.e. on high iops queues), set bit 0 and
31.
This configuration should reset during driver unload or shutdown to the
default settings. For this, the driver takes copy of default ioc page 1 and
copies back the default or unmodified ioc page1 during unload and
shutdown. This means that on next driver load (e.g. if older version driver
is loaded by user), current modified changes on ioc page1 won't take
effect.
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
High iops queues are mapped to non-managed irqs. Set affinity of
non-managed irqs to local numa node. Low latency queues are mapped to
managed irqs.
Driver reserves some reply queues for max iops (through
pci_alloc_irq_vectors_affinity and .pre_vectors interface). The rest of
queues are for low latency.
Based on io workload in io submission path, driver will decide which group
of reply queues (either high iops queues or low latency queues) to be
used. High iops queues will be mapped to local numa node of controller and
low latency queues will be mapped to cpus across numa nodes. In general,
high iops and low latency queues should fit into 128 reply queues
which is the max number of reply queues supported by Aero/Sea.
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In the IO submission path _base_get_msix_index is called twice. Initially
while getting the smid and subsequently while posting the request
descriptor (RD).
Refactor code to query msix index only while posting the request
descriptor. Save determined msix index in msix_io field.
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver will use round-robin method for io submission in batches within
the high iops queues when the number of in-flight ios on the target device
is larger than 8. Otherwise the driver will use low latency reply queues.
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Code refactoring.
In function _base_get_msix_index, add scmd as second argument. This change
is made in preparation for the next patch where we introduce a new function
to get the MSI-X index for high iops queues.
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Aero controllers support balanced performance mode through the ability to
configure queues with different properties.
Reply queues with interrupt coalescing enabled are called "high iops reply
queues" and reply queues with interrupt coalescing disabled are called "low
latency reply queues".
The driver configures a combination of high iops and low latency reply
queues if:
- HBA is an AERO controller;
- MSI-X vectors supported by the HBA is 128;
- Total CPU count in the system more than high iops queue count;
- Driver is loaded with default max_msix_vectors module parameter; and
- System booted in non-kdump mode.
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If the Aero HBA supports Atomic Request Descriptors, it sets the Atomic
Request Descriptor Capable bit in the IOCCapabilities field of the IOCFacts
Reply message. Driver uses an Atomic Request Descriptor as an alternative
method for posting an entry onto a request queue.
The posting of an Atomic Request Descriptor is an atomic operation,
providing a safe mechanism for multiple processors on the host to post
requests without synchronization. This Atomic Request Descriptor format is
identical to first 32 bits of Default Request Descriptor and uses only 32
bits.
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This code refactoring introduces function pointers.
Host uses Request Descriptors of different types for posting an entry onto
a request queue. Based on controller type and capabilities, host can also
use atomic descriptors other than normal descriptors. Using function
pointer will avoid if-else statements
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Checkpatch emits a warning when using symbolic permissions. Use octal
permissions instead. No functional change.
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fixes gcc '-Wunused-but-set-variable' warnings:
drivers/scsi/megaraid/megaraid_sas_base.c: In function megasas_fw_crash_buffer_show:
drivers/scsi/megaraid/megaraid_sas_base.c:3138:16: warning: variable buff_addr set but not used [-Wunused-but-set-variable]
drivers/scsi/megaraid/megaraid_sas_base.c: In function megasas_get_pd_list:
drivers/scsi/megaraid/megaraid_sas_base.c:4426:13: warning: variable ci_h set but not used [-Wunused-but-set-variable]
'buff_addr' is never used since inroduction in commit fc62b3fc90
("megaraid_sas : Firmware crash dump feature support")
'ci_h' is not used since commit 9b3d028f34 ("scsi: megaraid_sas:
Pre-allocate frequently used DMA buffers")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/scsi/megaraid/megaraid_sas_base.c: In function megasas_create_frame_pool:
drivers/scsi/megaraid/megaraid_sas_base.c:4124:6: warning: variable sge_sz set but not used [-Wunused-but-set-variable]
It's not used any more since commit 200aed582d ("megaraid_sas: endianness
related bug fixes and code optimization")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When building powerpc pseries_defconfig or powernv_defconfig:
drivers/scsi/lpfc/lpfc_nvmet.c:224:1: error: unused function
'lpfc_nvmet_get_ctx_for_xri' [-Werror,-Wunused-function]
drivers/scsi/lpfc/lpfc_nvmet.c:246:1: error: unused function
'lpfc_nvmet_get_ctx_for_oxid' [-Werror,-Wunused-function]
These functions are only compiled when CONFIG_NVME_TARGET_FC is enabled.
Use that same condition so there is no more warning. While the fixes commit
did not introduce these functions, it caused these warnings.
Fixes: 4064b27417a7 ("scsi: lpfc: Make some symbols static")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>