Commit Graph

85471 Commits

Author SHA1 Message Date
Matthias Schiffer
5da0aef5e9 batman-adv: add netlink command to query generic mesh information files
BATADV_CMD_GET_MESH_INFO is used to query basic information about a
batman-adv softif (name, index and MAC address for both the softif and
the primary hardif; routing algorithm; batman-adv version).

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
[sven.eckelmann@open-mesh.com: Reduce the number of changes to
BATADV_CMD_GET_MESH_INFO, add missing kerneldoc, add policy for attributes]
Signed-off-by: Sven Eckelmann <sven.eckelmann@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-07-04 12:37:17 +02:00
Matthias Schiffer
09748a22f4 batman-adv: add generic netlink family for batman-adv
debugfs is currently severely broken virtually everywhere in the kernel
where files are dynamically added and removed (see
http://lkml.iu.edu/hypermail/linux/kernel/1506.1/02196.html for some
details). In addition to that, debugfs is not namespace-aware.

Instead of adding new debugfs entries, the whole infrastructure should be
moved to netlink. This will fix the long standing problem of large buffers
for debug tables and hard to parse text files.

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
[sven.eckelmann@open-mesh.com: Strip down patch to only add genl family,
add missing kerneldoc]
Signed-off-by: Sven Eckelmann <sven.eckelmann@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-07-04 12:37:17 +02:00
Thierry Escande
7854a44526 NFC: digital: Add a delay between poll cycles
This replaces the polling work struct with a delayed work struct and add
a 10 ms delay between 2 poll cycles. This avoids to flood the device
with 'switch off'/'switch on' commands.

Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2016-07-04 12:26:27 +02:00
Thomas Gleixner
8658be133b Merge branch 'irq/for-block' into irq/core
Pull the irq affinity managing code which is in a seperate branch for block
developers to pull.
2016-07-04 12:26:05 +02:00
Christoph Hellwig
5e385a6ef3 genirq: Add a helper to spread an affinity mask for MSI/MSI-X vectors
This is lifted from the blk-mq code and adopted to use the affinity mask
concept just introduced in the irq handling code.  It tries to keep the
algorithm the same as the one current used by blk-mq, but improvements
like assining vectors on a per-node basis instead of just per sibling
are possible with this simple move and refactoring.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: linux-block@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Cc: linux-nvme@lists.infradead.org
Cc: axboe@fb.com
Cc: agordeev@redhat.com
Link: http://lkml.kernel.org/r/1467621574-8277-7-git-send-email-hch@lst.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-07-04 12:25:14 +02:00
Thomas Gleixner
0972fa57f5 genirq/msi: Make use of affinity aware allocations
Allow the MSI code to provide affinity hints per MSI descriptor.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: linux-block@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Cc: linux-nvme@lists.infradead.org
Cc: axboe@fb.com
Cc: agordeev@redhat.com
Link: http://lkml.kernel.org/r/1467621574-8277-6-git-send-email-hch@lst.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-07-04 12:25:14 +02:00
Thomas Gleixner
06ee6d571f genirq: Add affinity hint to irq allocation
Add an extra argument to the irq(domain) allocation functions, so we can hand
down affinity hints to the allocator. Thats necessary to implement proper
support for multiqueue devices.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: linux-block@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Cc: linux-nvme@lists.infradead.org
Cc: axboe@fb.com
Cc: agordeev@redhat.com
Link: http://lkml.kernel.org/r/1467621574-8277-4-git-send-email-hch@lst.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-07-04 12:25:13 +02:00
Thomas Gleixner
9c2555835b genirq: Introduce IRQD_AFFINITY_MANAGED flag
Interupts marked with this flag are excluded from user space interrupt
affinity changes. Contrary to the IRQ_NO_BALANCING flag, the kernel internal
affinity mechanism is not blocked.

This flag will be used for multi-queue device interrupts.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: linux-block@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Cc: linux-nvme@lists.infradead.org
Cc: axboe@fb.com
Cc: agordeev@redhat.com
Link: http://lkml.kernel.org/r/1467621574-8277-3-git-send-email-hch@lst.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-07-04 12:25:13 +02:00
Thomas Gleixner
b6140914fd genirq/msi: Remove unused MSI_FLAG_IDENTITY_MAP
No user and we definitely don't want to grow one.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: linux-block@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Cc: linux-nvme@lists.infradead.org
Cc: axboe@fb.com
Cc: agordeev@redhat.com
Link: http://lkml.kernel.org/r/1467621574-8277-2-git-send-email-hch@lst.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-07-04 12:25:12 +02:00
Denys Vlasenko
f86dec94e3 NFC: hci: delete unused nfc_llc_get_rx_head_tail_room()
It used to be EXPORTed, but then EXPORT usage was cleaned up
(in 2012), without noticing that the function has no users at all
(and curiously, never had any users).

Delete it.

While at it, remove non-static "inline" hints on nearby functions:
these hints don't work across compilation units anyway,
and these functions are not used in their .c file, thus they are
never inlined. IOW: "inline" here does not help in any way.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Samuel Ortiz <sameo@linux.intel.com>
CC: Christophe Ricard <christophe.ricard@gmail.com>
CC: linux-wireless@vger.kernel.org
CC: linux-kernel@vger.kernel.org
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2016-07-04 12:14:05 +02:00
Chris Wilson
bc3d674462 drm/i915: Allow userspace to request no-error-capture upon GPU hangs
igt likes to inject GPU hangs into its command streams. However, as we
expect these hangs, we don't actually want them recorded in the dmesg
output or stored in the i915_error_state (usually). To accommodate this
allow userspace to set a flag on the context that any hang emanating
from that context will not be recorded. We still do the error capture
(otherwise how do we find the guilty context and know its intent?) as
part of the reason for random GPU hang injection is to exercise the race
conditions between the error capture and normal execution.

v2: Split out the request->ringbuf error capture changes.
v3: Move the flag defines next to the intel_context->flags definition

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-9-git-send-email-chris@chris-wilson.co.uk
2016-07-04 08:18:24 +01:00
Marc Zyngier
50926d82fa KVM: arm/arm64: The GIC is dead, long live the GIC
I don't think any single piece of the KVM/ARM code ever generated
as much hatred as the GIC emulation.

It was written by someone who had zero experience in modeling
hardware (me), was riddled with design flaws, should have been
scrapped and rewritten from scratch long before having a remote
chance of reaching mainline, and yet we supported it for a good
three years. No need to mention the names of those who suffered,
the git log is singing their praises.

Thankfully, we now have a much more maintainable implementation,
and we can safely put the grumpy old GIC to rest.

Fellow hackers, please raise your glass in memory of the GIC:

	The GIC is dead, long live the GIC!

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-07-03 23:09:37 +02:00
Linus Torvalds
0b295dd5b8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse fix from Miklos Szeredi:
 "This makes sure userspace filesystems are not broken by the parallel
  lookups and readdir feature"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: serialize dirops by default
2016-07-03 12:02:00 -07:00
Joe Perches
c37a2dfa67 netfilter: Convert FWINV<[foo]> macros and uses to NF_INVF
netfilter uses multiple FWINV #defines with identical form that hide a
specific structure variable and dereference it with a invflags member.

$ git grep "#define FWINV"
include/linux/netfilter_bridge/ebtables.h:#define FWINV(bool,invflg) ((bool) ^ !!(info->invflags & invflg))
net/bridge/netfilter/ebtables.c:#define FWINV2(bool, invflg) ((bool) ^ !!(e->invflags & invflg))
net/ipv4/netfilter/arp_tables.c:#define FWINV(bool, invflg) ((bool) ^ !!(arpinfo->invflags & (invflg)))
net/ipv4/netfilter/ip_tables.c:#define FWINV(bool, invflg) ((bool) ^ !!(ipinfo->invflags & (invflg)))
net/ipv6/netfilter/ip6_tables.c:#define FWINV(bool, invflg) ((bool) ^ !!(ip6info->invflags & (invflg)))
net/netfilter/xt_tcpudp.c:#define FWINVTCP(bool, invflg) ((bool) ^ !!(tcpinfo->invflags & (invflg)))

Consolidate these macros into a single NF_INVF macro.

Miscellanea:

o Neaten the alignment around these uses
o A few lines are > 80 columns for intelligibility

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-03 10:55:07 +02:00
Theodore Ts'o
e192be9d9a random: replace non-blocking pool with a Chacha20-based CRNG
The CRNG is faster, and we don't pretend to track entropy usage in the
CRNG any more.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-07-03 00:57:23 -04:00
Linus Walleij
90efe05562 iio: st_sensors: harden interrupt handling
Leonard Crestez observed the following phenomenon: when using
hard interrupt triggers (the DRDY line coming out of an ST
sensor) sometimes a new value would arrive while reading the
previous value, due to latencies in the system.

We discovered that the ST hardware as far as can be observed
is designed for level interrupts: the DRDY line will be held
asserted as long as there are new values coming. The interrupt
handler should be re-entered until we're out of values to
handle from the sensor.

If interrupts were handled as occurring on the edges (usually
low-to-high) new values could appear and the line be held
asserted after that, and these values would be missed, the
interrupt handler would also lock up as new data was
available, but as no new edges occurs on the DRDY signal,
nothing happens: the edge detector only detects edges.

To counter this, do the following:

- Accept interrupt lines to be flagged as level interrupts
  using IRQF_TRIGGER_HIGH and IRQF_TRIGGER_LOW. If the line
  is marked like this (in the device tree node or ACPI
  table or similar) it will be utilized as a level IRQ.
  We mark the line with IRQF_ONESHOT and mask the IRQ
  while processing a sample, then the top half will be
  entered again if new values are available.

- If we are flagged as using edge interrupts with
  IRQF_TRIGGER_RISING or IRQF_TRIGGER_FALLING: remove
  IRQF_ONESHOT so that the interrupt line is not
  masked while running the thread part of the interrupt.
  This way we will never miss an interrupt, then introduce
  a loop that polls the data ready registers repeatedly
  until no new samples are available, then exit the
  interrupt handler. This way we know no new values are
  available when the interrupt handler exits and
  new (edge) interrupts will be triggered when data arrives.
  Take some extra care to update the timestamp in the poll
  loop if this happens. The timestamp will not be 100%
  perfect, but it will at least be closer to the actual
  events. Usually the extra poll loop will handle the new
  samples, but once in a blue moon, we get a new IRQ
  while exiting the loop, before returning from the
  thread IRQ bottom half with IRQ_HANDLED. On these rare
  occasions, the removal of IRQF_ONESHOT means the
  interrupt will immediately fire again.

- If no interrupt type is indicated from the DT/ACPI,
  choose IRQF_TRIGGER_RISING as default, as this is necessary
  for legacy boards.

Tested successfully on the LIS331DL and L3G4200D by setting
sampling frequency to 400Hz/800Hz and stressing the system:
extra reads in the threaded interrupt handler occurs.

Cc: Giuseppe Barba <giuseppe.barba@st.com>
Cc: Denis Ciocca <denis.ciocca@st.com>
Tested-by: Crestez Dan Leonard <cdleonard@gmail.com>
Reported-by: Crestez Dan Leonard <cdleonard@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2016-07-02 20:40:15 +01:00
Hadar Hen Zion
b50d292b43 net/mlx5e: Create NIC global resources only once
To allow creating more than one netdev over the same PCI function, we
change the driver such that global NIC resources are created once and
later be shared amongst all the mlx5e netdevs running over that port.

Move the CQ UAR, PD (pdn), Transport Domain (tdn), MKey resources from
being kept in the mlx5e priv part to a new resources structure
(mlx5e_resources) placed under the mlx5_core device.

This patch doesn't add any new functionality.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 14:40:40 -04:00
Or Gerlitz
08f4b5918b net/devlink: Add E-Switch mode control
Add the commands to set and show the mode of SRIOV E-Switch, two modes
are supported:

* legacy: operating in the "old" L2 based mode (DMAC --> VF vport)

* switchdev: the E-Switch is referred to as whitebox switch configured
using standard tools such as tc, bridge, openvswitch etc. To allow
working with the tools, for each VF, a VF representor netdevice is
created by the E-Switch manager vendor device driver instance (e.g PF).

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 14:40:40 -04:00
Or Gerlitz
acbc2004d7 net/mlx5: Introduce offloads steering namespace
Add a new namespace (MLX5_FLOW_NAMESPACE_OFFLOADS) to be populated
with flow steering rules that deal with rules that have have to
be executed before the EN NIC steering rules are matched.

The namespace is located after the bypass name-space and before the
kernel name-space. Therefore, it precedes the HW processing done for
rules set for the kernel NIC name-space.

Under SRIOV, it would allow us to match on e-switch missed packet
and forward them to the relevant VF representor TIR.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 14:40:39 -04:00
Dave Airlie
dac2c48ca5 Merge branch 'for-next' of http://git.agner.ch/git/linux-drm-fsl-dcu into drm-next
The patchset contains a new helper in drm_fb_cma_helper.c for suspend/
resume when using cma backed framebuffers.

* 'for-next' of http://git.agner.ch/git/linux-drm-fsl-dcu:
  drm/fsl-dcu: disable vblank events on CRTC disable
  drm/fsl-dcu: implement suspend/resume using atomic helpers
  drm/fsl-dcu: use clk helpers
  drm/fsl-dcu: move layer initialization to plane file
  drm/fsl-dcu: store layer registers in soc_data
  drm/fb_cma_helper: add suspend helper
2016-07-02 16:21:35 +10:00
Dave Airlie
542d972221 Back-merge tag 'v4.7-rc5' into drm-next
Linux 4.7-rc5

The fsl-dcu pull needs -rc3 so go to -rc5 for now.
2016-07-02 15:56:01 +10:00
Dave Airlie
88c087109b Merge tag 'drm-intel-fixes-2016-06-30' of git://anongit.freedesktop.org/drm-intel into drm-fixes
here's a batch of i915 fixes for 4.7.

* tag 'drm-intel-fixes-2016-06-30' of git://anongit.freedesktop.org/drm-intel:
  drm/i915: Fix missing unlock on error in i915_ppgtt_info()
  drm/i915: Removing PCI IDs that are no longer listed as Kabylake.
  drm/i915: Add more Kabylake PCI IDs.
  drm/i915: Avoid early timeout during AUX transfers
  drm/i915/hsw: Avoid early timeout during LCPLL disable/restore
  drm/i915/lpt: Avoid early timeout during FDI PHY reset
  drm/i915/bxt: Avoid early timeout during PLL enable
  drm/i915: Refresh cached DP port register value on resume
2016-07-02 15:50:41 +10:00
Venkat Reddy Talla
1b6cf31010 extcon: adc-jack: add suspend/resume support
adding suspend and resume funtionality for extcon-adc-jack
driver to configure system wake up for extcon events,
also adding support to enable/disable system wakeup
through flag wakeup_source based on platform requirement.

Signed-off-by: Venkat Reddy Talla <vreddytalla@nvidia.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2016-07-02 14:31:34 +09:00
Dong Aisheng
a4b3518d14 clk: core: support clocks which requires parents enable (part 1)
On Freescale i.MX7D platform, all clocks operations, including
enable/disable, rate change and re-parent, requires its parent
clock enable. Current clock core can not support it well.
This patch introduce a new flag CLK_OPS_PARENT_ENABLE to handle this
special case in clock core that enable its parent clock firstly for
each operation and disable it later after operation complete.

The patch part 1 fixes the possible disabling clocks while its parent
is off during kernel booting phase in clk_disable_unused_subtree().

Before the completion of kernel booting, clock tree is still not built
completely, there may be a case that the child clock is on but its
parent is off which could be caused by either HW initial reset state
or bootloader initialization.

Taking bootloader as an example, we may enable all clocks in HW by default.
And during kernel booting time, the parent clock could be disabled in its
driver probe due to calling clk_prepare_enable and clk_disable_unprepare.
Because it's child clock is only enabled in HW while its SW usecount
in clock tree is still 0, so clk_disable of parent clock will gate
the parent clock in both HW and SW usecount ultimately. Then there will
be a child clock is still on in HW but its parent is already off.

Later in clk_disable_unused(), this clock disable accessing while its
parent off will cause system hang due to the limitation of HW which
must require its parent on.

This patch simply enables the parent clock first before disabling
if flag CLK_OPS_PARENT_ENABLE is set in clk_disable_unused_subtree().
This is a simple solution and only affects booting time.

After kernel booting up the clock tree is already created, there will
be no case that child is off but its parent is off.
So no need do this checking for normal clk_disable() later.

Cc: Michael Turquette <mturquette@baylibre.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2016-07-01 17:40:23 -07:00
Stephen Boyd
582e2405b2 Merge tag 'v4.8-rockchip-clk1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into clk-next
Pull rockchip clk driver updates from Heiko Stuebner:

Placeholder for the rk3399 watchdog pclk, some newly exported
rk3228 clockids and a small fix for the not yet used spdif to
displayport clock on the rk3399.

* tag 'v4.8-rockchip-clk1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
  clk: rockchip: fix incorrect rk3399 spdif-DPTX divider bits
  clk: rockchip: export rk3228 MAC clocks
  clk: rockchip: rename rk3228 sclk_macphy_50m to sclk_mac_extclk
  clk: rockchip: export rk3228 audio clocks
  clk: rockchip: include rk3228 downstream muxes into fractional dividers
  clk: rockchip: fix incorrect rk3228 clock registers
  clk: rockchip: add clock-ids for rk3228 MAC clocks
  clk: rockchip: add clock-ids for rk3228 audio clocks
  clk: rockchip: add a dummy clock for the watchdog pclk on rk3399
2016-07-01 17:30:42 -07:00
Stephen Boyd
345c42964c Merge tag 'tegra-for-4.8-clk' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into clk-next
Pull tegra clk driver updates from Thierry Reding:

Fixes and enhancements mostly for Tegra210 clocks that allow DSI and
HDMI to work on Tegra X1. There's also a refactoring, including fixes,
the USB PLL.

* tag 'tegra-for-4.8-clk' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux:
  clk: tegra: Initialize UTMI PLL when enabling PLLU
  clk: tegra: Micro-optimize Tegra210 clock setup
  clk: tegra: Make sor_safe the parent of dpaux and dpaux1
  clk: tegra: Mark timer clock as critical
  clk: tegra: Enable sor1 and sor1_src on Tegra210
  clk: tegra: Squash sor1 safe/brick/src into a single mux
  clk: tegra: Disable spread spectrum on pll_d2
  clk: tegra: Fixup post dividers on Tegra210
2016-07-01 17:27:14 -07:00
Sinan Kaya
487cf917ed Revert "ACPI, PCI, IRQ: remove redundant code in acpi_irq_penalty_init()"
Trying to make the ISA and PCI init functionality common turned out
to be a bad idea, because the ISA path depends on external
functionality.

Restore the previous behavior and limit the refactoring to PCI
interrupts only.

Fixes: 1fcb6a813c "ACPI,PCI,IRQ: remove redundant code in acpi_irq_penalty_init()"
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Tested-by: Wim Osterholt <wim@djo.tudelft.nl>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-02 01:38:34 +02:00
Rhyland Klein
e380538529 power_supply: fix return value of get_property
power_supply_get_property() should ideally return -EAGAIN if it is
called while the power_supply is being registered. There was no way
previously to determine if use_cnt == 0 meant that the power_supply
wasn't fully registered yet, or if it had already been unregistered.

Add a new boolean to the power_supply struct to simply show if
registration is completed. Lastly, modify the check in
power_supply_show_property() to also ignore -EAGAIN when so it
doesn't complain about not returning the property.

Signed-off-by: Rhyland Klein <rklein@nvidia.com>
Signed-off-by: Sebastian Reichel <sre@kernel.org>
2016-07-01 22:44:34 +02:00
Martin KaFai Lau
4a482f34af cgroup: bpf: Add bpf_skb_in_cgroup_proto
Adds a bpf helper, bpf_skb_in_cgroup, to decide if a skb->sk
belongs to a descendant of a cgroup2.  It is similar to the
feature added in netfilter:
commit c38c4597e4 ("netfilter: implement xt_cgroup cgroup2 path match")

The user is expected to populate a BPF_MAP_TYPE_CGROUP_ARRAY
which will be used by the bpf_skb_in_cgroup.

Modifications to the bpf verifier is to ensure BPF_MAP_TYPE_CGROUP_ARRAY
and bpf_skb_in_cgroup() are always used together.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 16:32:13 -04:00
Martin KaFai Lau
4ed8ec521e cgroup: bpf: Add BPF_MAP_TYPE_CGROUP_ARRAY
Add a BPF_MAP_TYPE_CGROUP_ARRAY and its bpf_map_ops's implementations.
To update an element, the caller is expected to obtain a cgroup2 backed
fd by open(cgroup2_dir) and then update the array with that fd.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 16:30:38 -04:00
Martin KaFai Lau
1f3fe7ebf6 cgroup: Add cgroup_get_from_fd
Add a helper function to get a cgroup2 from a fd.  It will be
stored in a bpf array (BPF_MAP_TYPE_CGROUP_ARRAY) which will
be introduced in the later patch.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 16:30:38 -04:00
WANG Cong
82a31b9231 net_sched: fix mirrored packets checksum
Similar to commit 9b368814b3 ("net: fix bridge multicast packet checksum validation")
we need to fixup the checksum for CHECKSUM_COMPLETE when
pushing skb on RX path. Otherwise we get similar splats.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 16:19:34 -04:00
David S. Miller
eb70db8756 packet: Use symmetric hash for PACKET_FANOUT_HASH.
People who use PACKET_FANOUT_HASH want a symmetric hash, meaning that
they want packets going in both directions on a flow to hash to the
same bucket.

The core kernel SKB hash became non-symmetric when the ipv6 flow label
and other entities were incorporated into the standard flow hash order
to increase entropy.

But there are no users of PACKET_FANOUT_HASH who want an assymetric
hash, they all want a symmetric one.

Therefore, use the flow dissector to compute a flat symmetric hash
over only the protocol, addresses and ports.  This hash does not get
installed into and override the normal skb hash, so this change has
no effect whatsoever on the rest of the stack.

Reported-by: Eric Leblond <eric@regit.org>
Tested-by: Eric Leblond <eric@regit.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 16:07:50 -04:00
Daniel Borkmann
113214be7f bpf: refactor bpf_prog_get and type check into helper
Since bpf_prog_get() and program type check is used in a couple of places,
refactor this into a small helper function that we can make use of. Since
the non RO prog->aux part is not used in performance critical paths and a
program destruction via RCU is rather very unlikley when doing the put, we
shouldn't have an issue just doing the bpf_prog_get() + prog->type != type
check, but actually not taking the ref at all (due to being in fdget() /
fdput() section of the bpf fd) is even cleaner and makes the diff smaller
as well, so just go for that. Callsites are changed to make use of the new
helper where possible.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 16:00:47 -04:00
Daniel Borkmann
1aacde3d22 bpf: generally move prog destruction to RCU deferral
Jann Horn reported following analysis that could potentially result
in a very hard to trigger (if not impossible) UAF race, to quote his
event timeline:

 - Set up a process with threads T1, T2 and T3
 - Let T1 set up a socket filter F1 that invokes another filter F2
   through a BPF map [tail call]
 - Let T1 trigger the socket filter via a unix domain socket write,
   don't wait for completion
 - Let T2 call PERF_EVENT_IOC_SET_BPF with F2, don't wait for completion
 - Now T2 should be behind bpf_prog_get(), but before bpf_prog_put()
 - Let T3 close the file descriptor for F2, dropping the reference
   count of F2 to 2
 - At this point, T1 should have looked up F2 from the map, but not
   finished executing it
 - Let T3 remove F2 from the BPF map, dropping the reference count of
   F2 to 1
 - Now T2 should call bpf_prog_put() (wrong BPF program type), dropping
   the reference count of F2 to 0 and scheduling bpf_prog_free_deferred()
   via schedule_work()
 - At this point, the BPF program could be freed
 - BPF execution is still running in a freed BPF program

While at PERF_EVENT_IOC_SET_BPF time it's only guaranteed that the perf
event fd we're doing the syscall on doesn't disappear from underneath us
for whole syscall time, it may not be the case for the bpf fd used as
an argument only after we did the put. It needs to be a valid fd pointing
to a BPF program at the time of the call to make the bpf_prog_get() and
while T2 gets preempted, F2 must have dropped reference to 1 on the other
CPU. The fput() from the close() in T3 should also add additionally delay
to the reference drop via exit_task_work() when bpf_prog_release() gets
called as well as scheduling bpf_prog_free_deferred().

That said, it makes nevertheless sense to move the BPF prog destruction
generally after RCU grace period to guarantee that such scenario above,
but also others as recently fixed in ceb5607035 ("bpf, perf: delay release
of BPF prog after grace period") with regards to tail calls won't happen.
Integrating bpf_prog_free_deferred() directly into the RCU callback is
not allowed since the invocation might happen from either softirq or
process context, so we're not permitted to block. Reviewing all bpf_prog_put()
invocations from eBPF side (note, cBPF -> eBPF progs don't use this for
their destruction) with call_rcu() look good to me.

Since we don't know whether at the time of attaching the program, we're
already part of a tail call map, we need to use RCU variant. However, due
to this, there won't be severely more stress on the RCU callback queue:
situations with above bpf_prog_get() and bpf_prog_put() combo in practice
normally won't lead to releases, but even if they would, enough effort/
cycles have to be put into loading a BPF program into the kernel already.

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 16:00:47 -04:00
Sinclair Yeh
94477bff39 drm/ttm: Make ttm_bo_mem_compat available
There are cases where it is desired to see if a proposed placement
is compatible with a buffer object before calling ttm_bo_validate().

Signed-off-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: <stable@vger.kernel.org>
---
This is the first of a 3-patch series to fix a black screen
issue observed on Ubuntu 16.04 server.
2016-07-01 10:47:49 -07:00
Linus Torvalds
0232b23d08 Merge tag 'usb-4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB and PHY fixes from Greg KH:
 "Here are a number of small USB and PHY driver fixes for 4.7-rc6.

  Nothing major here, all are described in the shortlog below.  All have
  been in linux-next with no reported issues"

* tag 'usb-4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: don't free bandwidth_mutex too early
  USB: EHCI: declare hostpc register as zero-length array
  phy-sun4i-usb: Fix irq free conditions to match request conditions
  phy: bcm-ns-usb2: checking the wrong variable
  phy-sun4i-usb: fix missing __iomem *
  phy: phy-sun4i-usb: Fix optional gpios failing probe
  phy: rockchip-dp: fix return value check in rockchip_dp_phy_probe()
  phy: rcar-gen3-usb2: fix unexpected repeat interrupts of VBUS change
  usb: common: otg-fsm: add license to usb-otg-fsm
2016-07-01 09:18:17 -07:00
Herbert Xu
9b45b7bba3 crypto: rsa - Generate fixed-length output
Every implementation of RSA that we have naturally generates
output with leading zeroes.  The one and only user of RSA,
pkcs1pad wants to have those leading zeroes in place, in fact
because they are currently absent it has to write those zeroes
itself.

So we shouldn't be stripping leading zeroes in the first place.
In fact this patch makes rsa-generic produce output with fixed
length so that pkcs1pad does not need to do any extra work.

This patch also changes DH to use the new interface.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-07-01 23:45:18 +08:00
Herbert Xu
32f27c745c crypto: api - Add crypto_inst_setname
This patch adds the helper crypto_inst_setname because the current
helper crypto_alloc_instance2 is no longer useful given that we
now look up the algorithm after we allocate the instance object.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-07-01 23:45:11 +08:00
Kuninori Morimoto
cecdef3656 ASoC: simple-card: use asoc_simple_card_parse_daifmt()
We can use simpel utils asoc_simple_card_parse_daifmt().
Let's use it

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
2016-07-01 17:34:02 +02:00
Joe Perches
4ae89ad924 etherdevice.h & bridge: netfilter: Add and use ether_addr_equal_masked
There are code duplications of a masked ethernet address comparison here
so make it a separate function instead.

Miscellanea:

o Neaten alignment of FWINV macro uses to make it clearer for the reader

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-01 16:37:06 +02:00
arun.siluvery@linux.intel.com
37f501afed drm/i915/bxt: Export pooled eu info to userspace
Pooled EU is a bxt only feature and kernel changes are already merged. This
feature is not yet exposed to userspace as the support was not yet
available. Beignet team expressed interest and added patches to use this.

Since we now have a user and patches to use them, expose them from the
kernel side as well.

v2: fix compile error

[1] https://lists.freedesktop.org/archives/beignet/2016-June/007698.html
[2] https://lists.freedesktop.org/archives/beignet/2016-June/007699.html

Cc: Winiarski, Michal <michal.winiarski@intel.com>
Cc: Zou, Nanhai <nanhai.zou@intel.com>
Cc: Yang, Rong R <rong.r.yang@intel.com>
Cc: Tim Gore <tim.gore@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467369782-25992-1-git-send-email-arun.siluvery@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>
2016-07-01 14:53:52 +01:00
Mohamad Haj Yahia
65ee670845 net/mlx5: Add timeout handle to commands with callback
The current implementation does not handle timeout in case of command
with callback request, and this can lead to deadlock if the command
doesn't get fw response.
Add delayed callback timeout work before posting the command to fw.
In case of real fw command completion we will cancel the delayed work.
In case of fw command timeout the callback timeout handler will be
called and it will simulate fw completion with timeout error.

Fixes: e126ba97db ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 06:12:03 -04:00
Eric Dumazet
f87fda00b6 bonding: prevent out of bound accesses
ether_addr_equal_64bits() requires some care about its arguments,
namely that 8 bytes might be read, even if last 2 byte values are not
used.

KASan detected a violation with null_mac_addr and lacpdu_mcast_addr
in bond_3ad.c

Same problem with mac_bcast[] and mac_v6_allmcast[] in bond_alb.c :
Although the 8-byte alignment was there, KASan would detect out
of bound accesses.

Fixes: 815117adaf ("bonding: use ether_addr_equal_unaligned for bond addr compare")
Fixes: bb54e58929 ("bonding: Verify RX LACPDU has proper dest mac-addr")
Fixes: 885a136c52 ("bonding: use compare_ether_addr_64bits() in ALB")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 06:06:09 -04:00
Jason Wang
1576d98605 tun: switch to use skb array for tx
We used to queue tx packets in sk_receive_queue, this is less
efficient since it requires spinlocks to synchronize between producer
and consumer.

This patch tries to address this by:

- switch from sk_receive_queue to a skb_array, and resize it when
  tx_queue_len was changed.
- introduce a new proto_ops peek_len which was used for peeking the
  skb length.
- implement a tun version of peek_len for vhost_net to use and convert
  vhost_net to use peek_len if possible.

Pktgen test shows about 15.3% improvement on guest receiving pps for small
buffers:

Before: ~1300000pps
After : ~1500000pps

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 05:32:17 -04:00
Jason Wang
08294a26e1 net: introduce NETDEV_CHANGE_TX_QUEUE_LEN
This patch introduces a new event - NETDEV_CHANGE_TX_QUEUE_LEN, this
will be triggered when tx_queue_len. It could be used by net device
who want to do some processing at that time. An example is tun who may
want to resize tx array when tx_queue_len is changed.

Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 05:32:17 -04:00
Jason Wang
bf900b3dbe skb_array: add wrappers for resizing
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 05:32:17 -04:00
Michael S. Tsirkin
59e6ae5324 ptr_ring: support resizing multiple queues
Sometimes, we need support resizing multiple queues at once. This is
because it was not easy to recover to recover from a partial failure
of multiple queues resizing.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 05:32:17 -04:00
Jason Wang
fd68adec9d skb_array: minor tweak
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 05:32:16 -04:00
Jason Wang
982fb490c2 ptr_ring: support zero length ring
Sometimes, we need zero length ring. But current code will crash since
we don't do any check before accessing the ring. This patch fixes this.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 05:32:16 -04:00