Add NCQ encapsulation for ZAC MANAGEMENT OUT and evaluate
NCQ Non-Data log pages to figure out if NCQ encapsulation
is supported.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
ZAC drives implement a 'ZAC Management Out' command template,
which maps onto the ZBC OUT command.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
ZAC drives implement a 'ZAC Management In' command template,
which maps onto the ZBC IN command.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
libata device disabling is ... curious. So add the correct
definitions that we can disable ZAC devices properly.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Some commands like FPDMA RECEIVE or NCQ NON DATA can encapsulate
other commands to NCQ transport. So decode the subcmds, too.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
When reading the NCQ Send/Recv log it might actually not
supported, thereby causing irritating messages
'READ LOG DMA EXT failed'.
Instead we should be reading the log directory first to
figure out if the log is actually supported before trying
to access it.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Define the NCQ NON DATA command and update libsas to handle it
correctly.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Some code waits for a metadata update by:
1. flagging that it is needed (MD_CHANGE_DEVS or MD_CHANGE_CLEAN)
2. setting MD_CHANGE_PENDING and waking the management thread
3. waiting for MD_CHANGE_PENDING to be cleared
If the first two are done without locking, the code in md_update_sb()
which checks if it needs to repeat might test if an update is needed
before step 1, then clear MD_CHANGE_PENDING after step 2, resulting
in the wait returning early.
So make sure all places that set MD_CHANGE_PENDING are atomicial, and
bit_clear_unless (suggested by Neil) is introduced for the purpose.
Cc: Martin Kepplinger <martink@posteo.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: <linux-kernel@vger.kernel.org>
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Patch summary:
When showing a cgroupfs entry in mountinfo, show the path of the mount
root dentry relative to the reader's cgroup namespace root.
Short explanation (courtesy of mkerrisk):
If we create a new cgroup namespace, then we want both /proc/self/cgroup
and /proc/self/mountinfo to show cgroup paths that are correctly
virtualized with respect to the cgroup mount point. Previous to this
patch, /proc/self/cgroup shows the right info, but /proc/self/mountinfo
does not.
Long version:
When a uid 0 task which is in freezer cgroup /a/b, unshares a new cgroup
namespace, and then mounts a new instance of the freezer cgroup, the new
mount will be rooted at /a/b. The root dentry field of the mountinfo
entry will show '/a/b'.
cat > /tmp/do1 << EOF
mount -t cgroup -o freezer freezer /mnt
grep freezer /proc/self/mountinfo
EOF
unshare -Gm bash /tmp/do1
> 330 160 0:34 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,freezer
> 355 133 0:34 /a/b /mnt rw,relatime - cgroup freezer rw,freezer
The task's freezer cgroup entry in /proc/self/cgroup will simply show
'/':
grep freezer /proc/self/cgroup
9:freezer:/
If instead the same task simply bind mounts the /a/b cgroup directory,
the resulting mountinfo entry will again show /a/b for the dentry root.
However in this case the task will find its own cgroup at /mnt/a/b,
not at /mnt:
mount --bind /sys/fs/cgroup/freezer/a/b /mnt
130 25 0:34 /a/b /mnt rw,nosuid,nodev,noexec,relatime shared:21 - cgroup cgroup rw,freezer
In other words, there is no way for the task to know, based on what is
in mountinfo, which cgroup directory is its own.
Example (by mkerrisk):
First, a little script to save some typing and verbiage:
echo -e "\t/proc/self/cgroup:\t$(cat /proc/self/cgroup | grep freezer)"
cat /proc/self/mountinfo | grep freezer |
awk '{print "\tmountinfo:\t\t" $4 "\t" $5}'
Create cgroup, place this shell into the cgroup, and look at the state
of the /proc files:
2653
2653 # Our shell
14254 # cat(1)
/proc/self/cgroup: 10:freezer:/a/b
mountinfo: / /sys/fs/cgroup/freezer
Create a shell in new cgroup and mount namespaces. The act of creating
a new cgroup namespace causes the process's current cgroups directories
to become its cgroup root directories. (Here, I'm using my own version
of the "unshare" utility, which takes the same options as the util-linux
version):
Look at the state of the /proc files:
/proc/self/cgroup: 10:freezer:/
mountinfo: / /sys/fs/cgroup/freezer
The third entry in /proc/self/cgroup (the pathname of the cgroup inside
the hierarchy) is correctly virtualized w.r.t. the cgroup namespace, which
is rooted at /a/b in the outer namespace.
However, the info in /proc/self/mountinfo is not for this cgroup
namespace, since we are seeing a duplicate of the mount from the
old mount namespace, and the info there does not correspond to the
new cgroup namespace. However, trying to create a new mount still
doesn't show us the right information in mountinfo:
# propagating to other mountns
/proc/self/cgroup: 7:freezer:/
mountinfo: /a/b /mnt/freezer
The act of creating a new cgroup namespace caused the process's
current freezer directory, "/a/b", to become its cgroup freezer root
directory. In other words, the pathname directory of the directory
within the newly mounted cgroup filesystem should be "/",
but mountinfo wrongly shows us "/a/b". The consequence of this is
that the process in the cgroup namespace cannot correctly construct
the pathname of its cgroup root directory from the information in
/proc/PID/mountinfo.
With this patch, the dentry root field in mountinfo is shown relative
to the reader's cgroup namespace. So the same steps as above:
/proc/self/cgroup: 10:freezer:/a/b
mountinfo: / /sys/fs/cgroup/freezer
/proc/self/cgroup: 10:freezer:/
mountinfo: /../.. /sys/fs/cgroup/freezer
/proc/self/cgroup: 10:freezer:/
mountinfo: / /mnt/freezer
cgroup.clone_children freezer.parent_freezing freezer.state tasks
cgroup.procs freezer.self_freezing notify_on_release
3164
2653 # First shell that placed in this cgroup
3164 # Shell started by 'unshare'
14197 # cat(1)
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Tested-by: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
use d_alloc_parallel() for sillyunlink/lookup exclusion and
explicit rwsem (nfs_rmdir() being a writer and nfs_call_unlink() -
a reader) for rmdir/sillyunlink one.
That ought to make lookup/readdir/!O_CREAT atomic_open really
parallel on NFS.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
MAX77620/MAX20024 are Power Management IC from the MAXIM.
It supports RTC, multiple GPIOs, multiple DCDC and LDOs,
watchdog, clock etc.
Add MFD drier to provides common support for accessing the
device; additional drivers is developed on respected subsystem
in order to use the functionality of the device.
Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Signed-off-by: Mallikarjun Kasoju <mkasoju@nvidia.com>
Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Merge "omap legacy boot vs dt boot fixes for v4.7 merge window"
from Tony Lindgren:
Legacy booting vs device tree booting fixes for omaps for
v4.7 merge window. These are not considered urgent fixes enough
for the v4.6-rc cycle, but we need them in v4.7 in order to drop
the last remaining board-*.c files for omap3 for v4.8 merge window.
On Nokia N900, we need to pass the MMC slot names for the legacy
user space to work. Let's do that using auxdata as the driver is
setting up things already with the pdata for legacy booting. Then
we can later on discuss if we may want to have some generic binding
describing where the MMC slots are on the device.
N900 also has had the ir-rx51 device driver unusable with multiarch
for a long time. Let's pass the dmtimer data in pdata for the driver
to get it going again. Then once things are working, we can eventually
change the driver to use just hrtimer and PWM framework. The driver
changes will be queued separately.
* tag 'omap-for-v4.7/legacy-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP2+: n900 needs MMC slot names for legacy user space
ARM: OMAP2+: Add more functions to pwm pdata for ir-rx51
Merge "Reset controller changes for v4.7" from Philipp Zabel:
- add missing stub for device_reset
- add support for OXNAS SoCs
* tag 'reset-for-4.7-2' of git://git.pengutronix.de/git/pza/linux:
reset: Add missing function stub for device_reset
dt-bindings: Add Oxford Semiconductor Reset Controller bindings
reset: Add Oxford Semiconductor Reset Controller driver
Merge "phy: tegra: Changes for v4.7-rc1" from Thierry Reding:
This set of patches adds support for the Tegra XUSB pad controller. The
controller provides a set of pads (lanes) that are used for I/O by other
IP blocks within Tegra SoCs (PCIe, SATA and XUSB).
* tag 'tegra-for-4.7-phy' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux:
phy: tegra: Add Tegra210 support
phy: Add Tegra XUSB pad controller support
dt-bindings: phy: tegra-xusb-padctl: Add Tegra210 support
dt-bindings: phy: Add NVIDIA Tegra XUSB pad controller binding
phy: core: Allow children node to be overridden
clk: tegra: Add interface to enable hardware control of SATA/XUSB PLLs
Now that we know exactly which page sizes our caller wants to use in the
given domain, we can restrict higher-order allocation attempts to just
those sizes, if any, and avoid wasting any time or effort on other sizes
which offer no benefit. In the same vein, this also lets us accommodate
a minimum order greater than 0 for special cases.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Yong Wu <yong.wu@mediatek.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Many IOMMUs support multiple page table formats, meaning that any given
domain may only support a subset of the hardware page sizes presented in
iommu_ops->pgsize_bitmap. There are also certain use-cases where the
creator of a domain may want to control which page sizes are used, for
example to force the use of hugepage mappings to reduce pagetable walk
depth.
To this end, add a per-domain pgsize_bitmap to represent the subset of
page sizes actually in use, to make it possible for domains with
different requirements to coexist.
Signed-off-by: Will Deacon <will.deacon@arm.com>
[rm: hijacked and rebased original patch with new commit message]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
As a set of driver-provided callbacks and static data, there is no
compelling reason for struct iommu_ops to be mutable in core code, so
enforce const-ness throughout.
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The priv field from iommu_ops is a hangover from the of_dma_configure
series and isn't actually used. Remove it before it has chance to
spread.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Sometimes we might have a RCU managed credential pointer and don't want
to use locking to handle it. Add a function that will take a reference
to the cred iff the refcount is not already zero. Callers can dereference
the pointer under the rcu_read_lock and use that function to take a
reference only if the cred is not on its way to destruction.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Add function rpc_lookup_generic_cred, which allows lookups of a generic
credential that's not current_cred().
[jlayton: add gfp_t parm]
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
We need to be able to call the generic_cred creator from different
contexts. Add a gfp_t parm to the crcreate operation and to
rpcauth_lookup_credcache. For now, we just push the gfp_t parms up
one level to the *_lookup_cred functions.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Commit ea2cf22 created nfs_commit_info and saved &inode->i_lock inside
this NFS specific structure. This obscures the usage of i_lock.
Instead, save struct inode * so later it's clear the spinlock taken is
i_lock.
Should be no functional change.
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
As described in 'can: m_can: tag current CAN FD controllers as non-ISO'
(6cfda7fbeb) it is possible to define fixed configuration options by
setting the according bit in 'ctrlmode' and clear it in 'ctrlmode_supported'.
This leads to the incovenience that the fixed configuration bits can not be
passed by netlink even when they have the correct values (e.g. non-ISO, FD).
This patch fixes that issue and not only allows fixed set bit values to be set
again but now requires(!) to provide these fixed values at configuration time.
A valid CAN FD configuration consists of a nominal/arbitration bittiming, a
data bittiming and a control mode with CAN_CTRLMODE_FD set - which is now
enforced by a new can_validate() function. This fix additionally removed the
inconsistency that was prohibiting the support of 'CANFD-only' controller
drivers, like the RCar CAN FD.
For this reason a new helper can_set_static_ctrlmode() has been introduced to
provide a proper interface to handle static enabled CAN controller options.
Reported-by: Ramesh Shanmugasundaram <ramesh.shanmugasundaram@bp.renesas.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Reviewed-by: Ramesh Shanmugasundaram <ramesh.shanmugasundaram@bp.renesas.com>
Cc: <stable@vger.kernel.org> # >= 3.18
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
There was a static checker warning in wm8400_reg_read() because we were
returning u16 and that can't hold the negative error codes. The
function isn't used, so let's just delete it.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Add an implementation of Qualcomm's IPC router protocol, used to
communicate with service providing remote processors.
Signed-off-by: Courtney Cavin <courtney.cavin@sonymobile.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
[bjorn: Cope with 0 being a valid node id and implement RTM_NEWADDR]
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the atomic modesetting path, each driver simply wants to grab a ref
to the exclusive fence from a reservation object to store in the incoming
drm_plane_state, without doing the whole RCU dance. Since each driver
will need to do this, lets make a helper.
v2: rename to _rcu instead of _unlocked to be more consistent
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
This patch allows fscrypto to handle a second key prefix given by filesystem.
The main reason is to provide backward compatibility, since previously f2fs
used "f2fs:" as a crypto prefix instead of "fscrypt:".
Later, ext4 should also provide key_prefix() to give "ext4:".
One concern decribed by Ted would be kinda double check overhead of prefixes.
In x86, for example, validate_user_key consumes 8 ms after boot-up, which turns
out derive_key_aes() consumed most of the time to load specific crypto module.
After such the cold miss, it shows almost zero latencies, which treats as a
negligible overhead.
Note that request_key() detects wrong prefix in prior to derive_key_aes() even.
Cc: Ted Tso <tytso@mit.edu>
Cc: stable@vger.kernel.org # v4.6
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
UDP tunnel segmentation code relies on the inner offsets being set for
an UDP tunnel GSO packet, but the inner *_complete() functions will
set the inner offsets only if 'encapsulation' is set before calling
them. Currently, udp_gro_complete() sets 'encapsulation' only after
the inner *_complete() functions are done. This causes the inner
offsets having invalid values after udp_gro_complete() returns, which
in turn will make it impossible to properly segment the packet in case
it needs to be forwarded, which would be visible to the user either as
invalid packets being sent or as packet loss.
This patch fixes this by setting skb's 'encapsulation' in
udp_gro_complete() before calling into the inner complete functions,
and by making each possible UDP tunnel gro_complete() callback set the
inner_mac_header to the beginning of the tunnel payload.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Reviewed-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
allow cls_bpf and act_bpf programs access skb->data and skb->data_end pointers.
The bpf helpers that change skb->data need to update data_end pointer as well.
The verifier checks that programs always reload data, data_end pointers
after calls to such bpf helpers.
We cannot add 'data_end' pointer to struct qdisc_skb_cb directly,
since it's embedded as-is by infiniband ipoib, so wrapper struct is needed.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull power management and ACPI fixes from Rafael Wysocki:
"Fixes for problems introduced or discovered recently (intel_pstate,
sti-cpufreq, ARM64 cpuidle, Operating Performance Points framework,
generic device properties framework) and one fix for a hotplug-related
deadlock in ACPICA that's been there forever, but is nasty enough.
Specifics:
- Fix for a recent regression in the intel_pstate driver causing it
to fail to restore the HWP (HW-managed P-states) configuration of
the boot CPU after suspend-to-RAM (Rafael Wysocki).
- Fix for two recent regressions in the intel_pstate driver, one that
can trigger a divide by zero if the driver is accessed via sysfs
before it manages to take the first sample and one causing it to
fail to update a structure field used in a trace point, so the
information coming from it is less useful (Rafael Wysocki).
- Fix for a problem in the sti-cpufreq driver introduced during the
4.5 cycle that causes it to break CPU PM in multi-platform kernels
by registering cpufreq-dt (which subsequently doesn't work)
unconditionally and preventing the driver that would actually work
from registering (Sudeep Holla).
- Stable-candidate fix for an ARM64 cpuidle issue causing idle state
usage counters to be incorrectly updated for idle states that were
not entered due to errors (James Morse).
- Fix for a recently introduced issue in the OPP (Operating
Performance Points) framework causing it to print bogus error
messages for missing optional regulators (Viresh Kumar).
- Fix for a recently introduced issue in the generic device
properties framework that may cause it to attempt to dereferece and
invalid pointer in some cases (Heikki Krogerus).
- Fix for a deadlock in the ACPICA core that may be triggered by
device (eg Thunderbolt) hotplug (Prarit Bhargava)"
* tag 'pm+acpi-4.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / OPP: Remove useless check
ACPICA: Dispatcher: Update thread ID for recursive method calls
intel_pstate: Fix intel_pstate_get()
cpufreq: intel_pstate: Fix HWP on boot CPU after system resume
cpufreq: st: enable selective initialization based on the platform
ARM: cpuidle: Pass on arm_cpuidle_suspend()'s return value
device property: Avoid potential dereferences of invalid pointers
The nvm_dev->max_pages_per_blk variable was removed in favor of the new
nvm->sec_per_blk variable. The ->max_pages_per_blk variable was still
used in rrpc_capacity, reporting the reserved capacity to zero. Replace
with ->sec_per_blk to calculate the reserved area again.
Signed-off-by: Javier González <javier@cnexlabs.com>
Updated patch description. Was "lightnvm: eliminate redundant variable"
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The number of ppas contained on a request is not necessarily the number
of pages that it maps to neither on the target nor on the device side.
In order to avoid confusion, rename nr_pages to nr_ppas since it is what
the variable actually contains.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
A target requires a method to identify PPAs that are either cached in
memory or on disk. This can efficiently be maintained within the PPA.
The target host-side translation table can then lookup a PPA and know
from the PPA if it is cached or on disk. In the case it is cached, it is
the responsibility of the target to maintain this cache.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Targets can update a block state when having a reference to an
in-memory virtual block. In the case that a target does not keep the
block metadata in memory, it does not have a way to update this
structure.
Therefore, expose gennvm_mark_blk() through the media managers
->mark_blk() callback and let targets update the state structure through
this callback.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Targets associated with a device manager are not freed on device
removal. They have to be manually removed before shutdown. Make sure
any outstanding targets are freed upon shutdown.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Until now, the dma pool have been exclusively used to allocate the ppa
list being sent to the device. In pblk (upcoming), we use these pools to
allocate metadata too. Thus, we generalize the names of some variables
on the dma helper functions to make the code more readable.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>