Remove the V4L2 AS3645A sub-device driver in favour of the LED flash class
driver for the same hardware, drivers/leds/leds-as3645a.c. The latter uses
the V4L2 flash LED class framework to provide V4L2 sub-device interface.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Pull networking fixes from David Miller:
1) IPv6 gre tunnels end up with different default features enabled
depending upon whether netlink or ioctls are used to bring them up.
Fix from Alexey Kodanev.
2) Fix read past end of user control message in RDS< from Avinash
Repaka.
3) Missing RCU barrier in mini qdisc code, from Cong Wang.
4) Missing policy put when reusing per-cpu route entries, from Florian
Westphal.
5) Handle nested PCI errors properly in bnx2x driver, from Guilherme G.
Piccoli.
6) Run nested transport mode IPSEC packets via tasklet, from Herbert
Xu.
7) Fix handling poll() for stream sockets in tipc, from Parthasarathy
Bhuvaragan.
8) Fix two stack-out-of-bounds issues in IPSEC, from Steffen Klassert.
9) Another zerocopy ubuf handling fix, from Willem de Bruijn.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits)
strparser: Call sock_owned_by_user_nocheck
sock: Add sock_owned_by_user_nocheck
skbuff: in skb_copy_ubufs unclone before releasing zerocopy
tipc: fix hanging poll() for stream sockets
sctp: Replace use of sockets_allocated with specified macro.
bnx2x: Improve reliability in case of nested PCI errors
tg3: Enable PHY reset in MTU change path for 5720
tg3: Add workaround to restrict 5762 MRRS to 2048
tg3: Update copyright
net: fec: unmap the xmit buffer that are not transferred by DMA
tipc: fix tipc_mon_delete() oops in tipc_enable_bearer() error path
tipc: error path leak fixes in tipc_enable_bearer()
RDS: Check cmsg_len before dereferencing CMSG_DATA
tcp: Avoid preprocessor directives in tracepoint macro args
tipc: fix memory leak of group member when peer node is lost
net: sched: fix possible null pointer deref in tcf_block_put
tipc: base group replicast ack counter on number of actual receivers
net_sched: fix a missing rcu barrier in mini_qdisc_pair_swap()
net: phy: micrel: ksz9031: reconfigure autoneg after phy autoneg workaround
ip6_gre: fix device features for ioctl setup
...
Pull rdma fixes from Jason Gunthorpe:
"This is the next batch of for-rc patches from RDMA. It includes the
fix for the ipoib regression I mentioned last time, and the result of
a fairly major debugging effort to get iser working reliably on cxgb4
hardware - it turns out the cxgb4 driver was not handling QP error
flushing properly causing iser to fail.
- cxgb4 fix for an iser testing failure as debugged by Steve and
Sagi. The problem was a driver bug in the handling of shutting down
a QP.
- Various vmw_pvrdma fixes for bogus WARN_ON, missed resource free on
error unwind and a use after free bug
- Improper congestion counter values on mlx5 when link aggregation is
enabled
- ipoib lockdep regression introduced in this merge window
- hfi1 regression supporting the device in a VM introduced in a
recent patch
- Typo that breaks future uAPI compatibility in the verbs core
- More SELinux related oops fixing
- Fix an oops during error unwind in mlx5"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
IB/mlx5: Fix mlx5_ib_alloc_mr error flow
IB/core: Verify that QP is security enabled in create and destroy
IB/uverbs: Fix command checking as part of ib_uverbs_ex_modify_qp()
IB/mlx5: Serialize access to the VMA list
IB/hfi: Only read capability registers if the capability exists
IB/ipoib: Fix lockdep issue found on ipoib_ib_dev_heavy_flush
IB/mlx5: Fix congestion counters in LAG mode
RDMA/vmw_pvrdma: Avoid use after free due to QP/CQ/SRQ destroy
RDMA/vmw_pvrdma: Use refcount_dec_and_test to avoid warning
RDMA/vmw_pvrdma: Call ib_umem_release on destroy QP path
iw_cxgb4: when flushing, complete all wrs in a chain
iw_cxgb4: reflect the original WR opcode in drain cqes
iw_cxgb4: Only validate the MSN for successful completions
Saeed Mahameed says:
====================
Mellanox, mlx5 E-Switch updates 2017-12-19
This series includes updates for mlx5 E-Switch infrastructures,
to be merged into net-next and rdma-next trees.
Mark's patches provide E-Switch refactoring that generalize the mlx5
E-Switch vf representors interfaces and data structures. The serious is
mainly focused on moving ethernet (netdev) specific representors logic out
of E-Switch (eswitch.c) into mlx5e representor module (en_rep.c), which
provides better separation and allows future support for other types of vf
representors (e.g. RDMA).
Gal's patches at the end of this serious, provide a simple syntax fix and
two other patches that handles vport ingress/egress ACL steering name
spaces to be aligned with the Firmware/Hardware specs.
V1->V2:
- Addressed coding style comments in patches #1 and #7
- The series is still based on rc4, as now I see net-next is also @rc4.
V2->V3:
- Fixed compilation warning, reported by Dave.
Please pull and let me know if there's any problem.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
divider_recalc_rate() is an helper function used by clock divider of
different types, so the structure containing the 'hw' pointer is not
always a 'struct clk_divider'
At the following line:
> div = _get_div(table, val, flags, divider->width);
in several cases, the value of 'divider->width' is garbage as the actual
structure behind this memory is not a 'struct clk_divider'
Fortunately, this width value is used by _get_val() only when
CLK_DIVIDER_MAX_AT_ZERO flag is set. This has never been the case so
far when the structure is not a 'struct clk_divider'. This is probably
why we did not notice this bug before
Fixes: afe76c8fd0 ("clk: allow a clk divider with max divisor when zero")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Sylvain Lemieux <slemieux.tyco@gmail.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Each vport has its own root flow table for the ACL flow tables and root
flow table is per namespace, therefore we should create a namespace for
each vport.
Fixes: efdc810ba3 ("net/mlx5: Flow steering, Add vport ACL support")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
This allows checking socket lock ownership with producing lockdep
warnings.
Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch enables QP creation with a given BF index, this allows the
user space driver to share same BF between few QPs or alternatively have
a dedicated BF per QP.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This patch extends the alloc context flow to be prepared for working
with dynamic UAR allocations.
Currently upon alloc context there is some fix size of UARs that are
allocated (named 'static allocation') and there is no option to user
application to ask for more or control which UAR will be used by which
QP.
In this patch the driver prepares its data structures to manage both the
static and the dynamic allocations and let the user driver knows about
the max value of dynamic blue-flame registers that are allowed.
Downstream patches from this series will enable the dynamic allocation
and the association as part of QP creation.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add missing inner RSS support capability as part of
the RSS supported fields.
In addition change MLX5_RX_HASH_INNER to 1UL << 31 in
order to define it as unsigned.
Fixes: 309fa3470f ("IB/mlx5: Add support for RSS on the inner packet")
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Support RSS hash for inner headers according to a new flag,
MLX4_IB_RX_HASH_INNER provided by the vendor channel.
In case the flag is set, RSS hash will be done on the inner headers of
VXLAN packets (which are encapsulated).
Non-encapsulated packets will be hashed according to the outer headers.
Signed-off-by: Guy Levi <guyle@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The "reserved" field was a way, used at V4L2 API, to add new
data to existing structs without breaking userspace. However,
there are now clever ways of doing that, without needing to add
an uneeded overhead. So, get rid of them.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Adds a new uAPI for DVB to use streaming I/O which is implemented
based on videobuf2, using those new ioctls:
- DMX_REQBUFS: Request kernel to allocate buffers which count and size
are dedicated by user.
- DMX_QUERYBUF: Get the buffer information like a memory offset which
will mmap() and be shared with user-space.
- DMX_EXPBUF: Just for testing whether buffer-exporting success or not.
- DMX_QBUF: Pass the buffer to kernel-space.
- DMX_DQBUF: Get back the buffer which may contain TS data.
Originally developed by: Junghak Sung <jh1009.sung@samsung.com>, as
seen at:
https://patchwork.linuxtv.org/patch/31613/https://patchwork.kernel.org/patch/7334301/
The original patch was written before merging VB2-core functionalities
upstream. When such series was added, several adjustments were made,
fixing some issues with V4L2, causing the original patch to be
non-trivially rebased.
After rebased, a few bugs in the patch were fixed. The patch was
also enhanced it and polling functionality got added.
The main changes over the original patch are:
dvb_vb2_fill_buffer():
- Set the size of the outgoing buffer after while loop using
vb2_set_plane_payload;
- Added NULL check for source buffer as per normal convention
of demux driver, this is called twice, first time with valid
buffer second time with NULL pointer, if its not handled,
it will result in crash
- Restricted spinlock for only list_* operations
dvb_vb2_init():
- Restricted q->io_modes to only VB2_MMAP as its the only
supported mode
dvb_vb2_release():
- Replaced the && in if condiion with &, because otherwise
it was always getting satisfied.
dvb_vb2_stream_off():
- Added list_del code for enqueud buffers upon stream off
dvb_vb2_poll():
- Added this new function in order to support polling
dvb_demux_poll() and dvb_dvr_poll()
- dvb_vb2_poll() is now called from these functions
- Ported this patch and latest videobuf2 to lower kernel versions and
tested auto scan.
Co-developed-by: Junghak Sung <jh1009.sung@samsung.com>
[mchehab@s-opensource.com: checkpatch fixes]
Signed-off-by: Junghak Sung <jh1009.sung@samsung.com>
Signed-off-by: Geunyoung Kim <nenggun.kim@samsung.com>
Acked-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Acked-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Satendra Singh Thakur <satendra.t@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
forward the operation context to ttm_tt_populate as well,
and the ultimate goal is swapout enablement for reserved BOs.
v2: squash in fix for vboxvideo
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Since the recent remote cpufreq callback work, its possible that a cpufreq
update is triggered from a remote CPU. For single policies however, the current
code uses the local CPU when trying to determine if the remote sg_cpu entered
idle or is busy. This is incorrect. To remedy this, compare with the nohz tick
idle_calls counter of the remote CPU.
Fixes: 674e75411f (sched: cpufreq: Allow remote cpufreq callbacks)
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Joel Fernandes <joelaf@google.com>
Cc: 4.14+ <stable@vger.kernel.org> # 4.14+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Adds a start argument to the shm_register callback to allow the callback
to check memory type of the passed pages.
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Patches for 4.16 that are dependent on patches sent to 4.15-rc.
These are small clean ups for the vmw_pvrdma and i40iw drivers.
* 'from-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git:
RDMA/vmw_pvrdma: Remove usage of BIT() from UAPI header
RDMA/vmw_pvrdma: Use refcount_t instead of atomic_t
RDMA/vmw_pvrdma: Use more specific sizeof in kcalloc
RDMA/vmw_pvrdma: Clarify QP and CQ is_kernel logic
RDMA/vmw_pvrdma: Add UAR SRQ macros in ABI header file
i40iw: Change accelerated flag to bool
BIT() should not be used in the UAPI header. Remove it.
Signed-off-by: Bryan Tan <bryantan@vmware.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Support for SRQs were added in the vmw_pvrdma userlevel library
before two necessary macros were added into the kernel ABI header
file. Add the two UAR SRQ macros that are required by the userlevel
library so that the library can rely on the kernel ABI header file
for these SRQ macro definitions.
Fixes: 8b10ba783c ("RDMA/vmw_pvrdma: Add shared receive queue support")
Reviewed-by: Adit Ranadive <aditr@vmware.com>
Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: Bryan Tan <bryantan@vmware.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Daniel Borkmann says:
====================
pull-request: bpf-next 2017-12-28
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Fix incorrect state pruning related to recognition of zero initialized
stack slots, where stacksafe exploration would mistakenly return a
positive pruning verdict too early ignoring other slots, from Gianluca.
2) Various BPF to BPF calls related follow-up fixes. Fix an off-by-one
in maximum call depth check, and rework maximum stack depth tracking
logic to fix a bypass of the total stack size check reported by Jann.
Also fix a bug in arm64 JIT where prog->jited_len was uninitialized.
Addition of various test cases to BPF selftests, from Alexei.
3) Addition of a BPF selftest to test_verifier that is related to BPF to
BPF calls which demonstrates a late caller stack size increase and
thus out of bounds access. Fixed above in 2). Test case from Jann.
4) Addition of correlating BPF helper calls, BPF to BPF calls as well
as BPF maps to bpftool xlated dump in order to allow for better
BPF program introspection and debugging, from Daniel.
5) Fixing several bugs in BPF to BPF calls kallsyms handling in order
to get it actually to work for subprogs, from Daniel.
6) Extending sparc64 JIT support for BPF to BPF calls and fix a couple
of build errors for libbpf on sparc64, from David.
7) Allow narrower context access for BPF dev cgroup typed programs in
order to adapt to LLVM code generation. Also adjust memlock rlimit
in the test_dev_cgroup BPF selftest, from Yonghong.
8) Add netdevsim Kconfig entry to BPF selftests since test_offload.py
relies on netdevsim device being available, from Jakub.
9) Reduce scope of xdp_do_generic_redirect_map() to being static,
from Xiongwei.
10) Minor cleanups and spelling fixes in BPF verifier, from Colin.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
There's a space character missed in the printk messages.
Put the message into one line could simplify searching for
the messages in the kernel source.
Fixes: 563e0bb0dc74("net: tracepoint: replace tcp_set_state tracepoint with inet_sock_set_state tracepoint")
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Allow internal page allocation to fail (Chris)
- More improvements on logs, dumps, and trace (Chris, Michal)
- Coffee Lake important fix for stolen memory (Lucas)
- Continue to make GPU reset more robust as well
improving selftest coverage for it (Chris)
- Unifying debugfs return codes (Michal)
- Using existing helper for testing obj pages (Matthew)
- Organize and improve gem_request tracepoints (Lionel)
- Protect DDI port to DPLL map from theoretical race (Rodrigo)
- ... and consequently fixing the indentation on this DDI clk selection function (Chris)
- ... and consequently properly serializing non-blocking modesets (Ville)
- Add support for horizontal plane flipping on Cannonlake (Joonas)
- Two Cannonlake Workarounds for better stability (Rafael)
- Fix mess around PSR registers (DK)
- More Coffee Lake PCI IDs (Rodrigo)
- Remove CSS modifiers on pipe C of Geminilake (Krisman)
- Disable all planes for load detection (Ville)
- Reorg on i915 display headers (Michal)
- Avoid enabling movntdqa optimization on hypervisor guest (Changbin)
GVT:
- more mmio switch optimization (Weinan)
- cleanup i915_reg_t vs. offset usage (Zhenyu)
- move write protect handler out of mmio handler (Zhenyu)
* tag 'drm-intel-next-2017-12-22' of git://anongit.freedesktop.org/drm/drm-intel: (55 commits)
drm/i915: Update DRIVER_DATE to 20171222
drm/i915: Show HWSP in intel_engine_dump()
drm/i915: Assert that the request is on the execution queue before being removed
drm/i915/execlists: Show preemption progress in GEM_TRACE
drm/i915: Put all non-blocking modesets onto an ordered wq
drm/i915: Disable GMBUS clock gating around GMBUS transfers on gen9+
drm/i915: Clean up the PNV bit banging vs. GMBUS clock gating w/a
drm/i915: No need to power up PG2 for GMBUS on BXT
drm/i915: Disable DC states around GMBUS on GLK
drm/i915: Do not enable movntdqa optimization in hypervisor guest
drm/i915: Dump device info at once
drm/i915: Add pretty printer for runtime part of intel_device_info
drm/i915: Update intel_device_info_runtime_init() parameter
drm/i915: Move intel_device_info definitions to its own header
drm/i915: Move opregion definitions to dedicated intel_opregion.h
drm/i915: Move display related definitions to dedicated header
drm/i915: Move some utility functions to i915_util.h
drm/i915/gvt: move write protect handler out of mmio emulation function
drm/i915/gvt: cleanup usage for typed mmio reg vs. offset
drm/i915/gvt: Fix pipe A enable as default for vgpu
...
The build of mips bcm47xx_defconfig is failing with the error:
net/sched/sch_fq_codel.c: In function 'fq_codel_init':
net/sched/sch_fq_codel.c:487:8: error:
too many arguments to function 'tcf_block_get'
While adding the extack support, the commit missed adding it in the
headers when CONFIG_NET_CLS is not defined.
Fixes: 8d1a77f974 ("net: sch: api: add extack support in tcf_block_get")
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of computing max stack depth for current call chain
during the main verifier pass track stack depth of each
function independently and after do_check() is done do
another pass over all instructions analyzing depth
of all possible call stacks.
Fixes: f4d7e40a5b ("bpf: introduce function calls (verification)")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Introduce peer_offset parameter in order to add the capability
to specify two different values for payload offset on tx/rx side.
If just offset is provided by userspace use it for rx side as well
in order to maintain compatibility with older l2tp versions
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
forward the operation context to ttm_mem_global_alloc_page as well,
and the ultimate goal is swapout enablement for reserved BOs.
Here reserved BOs refer to all the BOs which share same reservation object
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Steffen Klassert says:
====================
pull request (net-next): ipsec-next 2017-12-22
1) Separate ESP handling from segmentation for GRO packets.
This unifies the IPsec GSO and non GSO codepath.
2) Add asynchronous callbacks for xfrm on layer 2. This
adds the necessary infrastructure to core networking.
3) Allow to use the layer2 IPsec GSO codepath for software
crypto, all infrastructure is there now.
4) Also allow IPsec GSO with software crypto for local sockets.
5) Don't require synchronous crypto fallback on IPsec offloading,
it is not needed anymore.
6) Check for xdo_dev_state_free and only call it if implemented.
From Shannon Nelson.
7) Check for the required add and delete functions when a driver
registers xdo_dev_ops. From Shannon Nelson.
8) Define xfrmdev_ops only with offload config.
From Shannon Nelson.
9) Update the xfrm stats documentation.
From Shannon Nelson.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert says:
====================
pull request (net): ipsec 2017-12-22
1) Check for valid id proto in validate_tmpl(), otherwise
we may trigger a warning in xfrm_state_fini().
From Cong Wang.
2) Fix a typo on XFRMA_OUTPUT_MARK policy attribute.
From Michal Kubecek.
3) Verify the state is valid when encap_type < 0,
otherwise we may crash on IPsec GRO .
From Aviv Heller.
4) Fix stack-out-of-bounds read on socket policy lookup.
We access the flowi of the wrong address family in the
IPv4 mapped IPv6 case, fix this by catching address
family missmatches before we do the lookup.
5) fix xfrm_do_migrate() with AEAD to copy the geniv
field too. Otherwise the state is not fully initialized
and migration fails. From Antony Antony.
6) Fix stack-out-of-bounds with misconfigured transport
mode policies. Our policy template validation is not
strict enough. It is possible to configure policies
with transport mode template where the address family
of the template does not match the selectors address
family. Fix this by refusing such a configuration,
address family can not change on transport mode.
7) Fix a policy reference leak when reusing pcpu xdst
entry. From Florian Westphal.
8) Reinject transport-mode packets through tasklet,
otherwise it is possible to reate a recursion
loop. From Herbert Xu.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the hwlock id 0 is valid for hardware spinlock core, but now id 0
is treated as one invalid value for regmap. Thus we should add one extra
flag for regmap config to indicate if a hardware spinlock should be used,
then id 0 can be valid for regmap to request.
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Using a preprocessor directive to check for CONFIG_IPV6 in the middle of
a DECLARE_EVENT_CLASS macro's arg list causes sparse to report a series
of errors:
./include/trace/events/tcp.h:68:1: error: directive in argument list
./include/trace/events/tcp.h:75:1: error: directive in argument list
./include/trace/events/tcp.h:144:1: error: directive in argument list
./include/trace/events/tcp.h:151:1: error: directive in argument list
./include/trace/events/tcp.h:216:1: error: directive in argument list
./include/trace/events/tcp.h:223:1: error: directive in argument list
./include/trace/events/tcp.h:274:1: error: directive in argument list
./include/trace/events/tcp.h:281:1: error: directive in argument list
Once sparse finds an error, it stops printing warnings for the file it
is checking. This masks any sparse warnings that would normally be
reported for the core TCP code.
Instead, handle the preprocessor conditionals in a couple of auxiliary
macros. This also has the benefit of reducing duplicate code.
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ASSERT_RTNL() macro is actual open-coded variant of WARN_ONCE() with
two exceptions. First, it prints stack for multiple hits and not only
once as WARN_ONCE() does. Second, the user can disable prints of
WARN_ONCE by setting CONFIG_BUG to N.
The multiple prints of dump stack are actually not needed, because calls
without rtnl lock are programming errors and user can't do anything
about them except to complain to the mailing list after first occurrence
of such failure.
The user who disabled BUG/WARN prints did it explicitly because by default
in upstream kernel and distributions this option is enabled. It means
that user doesn't want to see prints about missing locks too.
This patch replaces open-coded variant in favor of already existing
macro and change error prints to be once only.
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The original purpose of the per-superblock d_anon list was to
keep disconnected dentries in the cache between consecutive
requests to the NFS server. Dentries can be disconnected if
a client holds a file open and repeatedly performs IO on it,
and if the server drops the dentry, whether due to memory
pressure, server restart, or "echo 3 > /proc/sys/vm/drop_caches".
This purpose was thwarted by commit 75a6f82a0d ("freeing unlinked
file indefinitely delayed") which caused disconnected dentries
to be freed as soon as their refcount reached zero.
This means that, when a dentry being used by nfsd gets disconnected, a
new one needs to be allocated for every request (unless requests
overlap). As the dentry has no name, no parent, and no children,
there is little of value to cache. As small memory allocations are
typically fast (from per-cpu free lists) this likely has little cost.
This means that the original purpose of s_anon is no longer relevant:
there is no longer any need to keep disconnected dentries on a list so
they appear to be hashed.
However, s_anon now has a new use. When you mount an NFS filesystem,
the dentry stored in s_root is just a placebo. The "real" root dentry
is allocated using d_obtain_root() and so it kept on the s_anon list.
I don't know the reason for this, but suspect it related to NFSv4
where a mount of "server:/some/path" require NFS to look up the root
filehandle on the server, then walk down "/some" and "/path" to get
the filehandle to mount.
Whatever the reason, NFS depends on the s_anon list and on
shrink_dcache_for_umount() pruning all dentries on this list. So we
cannot simply remove s_anon.
We could just leave the code unchanged, but apart from that being
potentially confusing, the (unfair) bit-spin-lock which protects
s_anon can become a bottle neck when lots of disconnected dentries are
being created.
So this patch renames s_anon to s_roots, and stops storing
disconnected dentries on the list. Only dentries obtained with
d_obtain_root() are now stored on this list. There are many fewer of
these (only NFS and NILFS2 use the call, and only during filesystem
mount) so contention on the bit-lock will not be a problem.
Possibly an alternate solution should be found for NFS and NILFS2, but
that would require understanding their needs first.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pull x86 PTI preparatory patches from Thomas Gleixner:
"Todays Advent calendar window contains twentyfour easy to digest
patches. The original plan was to have twenty three matching the date,
but a late fixup made that moot.
- Move the cpu_entry_area mapping out of the fixmap into a separate
address space. That's necessary because the fixmap becomes too big
with NRCPUS=8192 and this caused already subtle and hard to
diagnose failures.
The top most patch is fresh from today and cures a brain slip of
that tall grumpy german greybeard, who ignored the intricacies of
32bit wraparounds.
- Limit the number of CPUs on 32bit to 64. That's insane big already,
but at least it's small enough to prevent address space issues with
the cpu_entry_area map, which have been observed and debugged with
the fixmap code
- A few TLB flush fixes in various places plus documentation which of
the TLB functions should be used for what.
- Rename the SYSENTER stack to CPU_ENTRY_AREA stack as it is used for
more than sysenter now and keeping the name makes backtraces
confusing.
- Prevent LDT inheritance on exec() by moving it to arch_dup_mmap(),
which is only invoked on fork().
- Make vysycall more robust.
- A few fixes and cleanups of the debug_pagetables code. Check
PAGE_PRESENT instead of checking the PTE for 0 and a cleanup of the
C89 initialization of the address hint array which already was out
of sync with the index enums.
- Move the ESPFIX init to a different place to prepare for PTI.
- Several code moves with no functional change to make PTI
integration simpler and header files less convoluted.
- Documentation fixes and clarifications"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
x86/cpu_entry_area: Prevent wraparound in setup_cpu_entry_area_ptes() on 32bit
init: Invoke init_espfix_bsp() from mm_init()
x86/cpu_entry_area: Move it out of the fixmap
x86/cpu_entry_area: Move it to a separate unit
x86/mm: Create asm/invpcid.h
x86/mm: Put MMU to hardware ASID translation in one place
x86/mm: Remove hard-coded ASID limit checks
x86/mm: Move the CR3 construction functions to tlbflush.h
x86/mm: Add comments to clarify which TLB-flush functions are supposed to flush what
x86/mm: Remove superfluous barriers
x86/mm: Use __flush_tlb_one() for kernel memory
x86/microcode: Dont abuse the TLB-flush interface
x86/uv: Use the right TLB-flush API
x86/entry: Rename SYSENTER_stack to CPU_ENTRY_AREA_entry_stack
x86/doc: Remove obvious weirdnesses from the x86 MM layout documentation
x86/mm/64: Improve the memory map documentation
x86/ldt: Prevent LDT inheritance on exec
x86/ldt: Rework locking
arch, mm: Allow arch_dup_mmap() to fail
x86/vsyscall/64: Warn and fail vsyscall emulation in NATIVE mode
...
SA queries SM for class port info when there is a LID_CHANGE event.
When a base lid is configured before fm is started ie when smlid is
not yet assigned, SA handles the LID_CHANGE event and tries query SM
with lid 0. This will cause an hang.
[ 1106.958820] INFO: task kworker/2:0:23 blocked for more than 120 seconds.
[ 1106.965082] Tainted: G O 4.12.0+ #1
[ 1106.969602] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[ 1106.977227] kworker/2:0 D 0 23 2 0x00000000
[ 1106.977250] Workqueue: infiniband update_ib_cpi [ib_core]
[ 1106.977261] Call Trace:
[ 1106.977273] __schedule+0x28e/0x860
[ 1106.977285] schedule+0x36/0x80
[ 1106.977298] schedule_timeout+0x1a3/0x2e0
[ 1106.977310] ? radix_tree_iter_tag_clear+0x1b/0x20
[ 1106.977322] ? idr_alloc+0x64/0x90
[ 1106.977334] wait_for_completion+0xe3/0x140
[ 1106.977347] ? wake_up_q+0x80/0x80
[ 1106.977369] update_ib_cpi+0x163/0x210 [ib_core]
[ 1106.977381] process_one_work+0x147/0x370
[ 1106.977394] worker_thread+0x4a/0x390
[ 1106.977406] kthread+0x109/0x140
[ 1106.977418] ? process_one_work+0x370/0x370
[ 1106.977430] ? kthread_park+0x60/0x60
[ 1106.977443] ret_from_fork+0x22/0x30
Always ensure a proper smlid is assigned before querying SM for cpi.
Fixes: ee1c60b1bf ("IB/SA: Modify SA to implicitly cache Class Port info")
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Pull xen fixes from Juergen Gross:
"This contains two fixes for running under Xen:
- a fix avoiding resource conflicts between adding mmio areas and
memory hotplug
- a fix setting NX bits in page table entries copied from Xen when
running a PV guest"
* tag 'for-linus-4.15-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/balloon: Mark unallocated host memory as UNUSABLE
x86-64/Xen: eliminate W+X mappings
Pull crypto fixes from Herbert Xu:
"This fixes the following issues:
- fix chacha20 crash on zero-length input due to unset IV
- fix potential race conditions in mcryptd with spinlock
- only wait once at top of algif recvmsg to avoid inconsistencies
- fix potential use-after-free in algif_aead/algif_skcipher"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: af_alg - fix race accessing cipher request
crypto: mcryptd - protect the per-CPU queue with a lock
crypto: af_alg - wait for data at beginning of recvmsg
crypto: skcipher - set walk.iv for zero-length inputs
Pull clk fixes from Stephen Boyd:
"Here's a trio of fixes:
- The runtime PM clk patches that landed this merge window forgot to
runtime resume devices that may be off while recalculating and
setting rates of child clks of whatever clk is changing rates.
- We had a NULL pointer deref in an old clk tracepoint when
clk_set_parent() is called with a NULL parent pointer. This
shouldn't really happen, but it's best to avoid this regardless.
- The sun9i-mmc clk driver didn't provide 'reset' support, just
'assert' and 'deassert' so the MMC driver stopped probing when the
probe was changed to do a reset instead of assert/deassert pair.
This implements the reset so things work again"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: sunxi: sun9i-mmc: Implement reset callback for reset controls
clk: fix a panic error caused by accessing NULL pointer
clk: Manage proper runtime PM state in clk_change_rate()