Commit Graph

649779 Commits

Author SHA1 Message Date
Dave Chinner
5f1c6d28cf Merge branch 'iomap-4.10-directio' into for-next 2016-11-30 14:39:29 +11:00
Christoph Hellwig
acdda3aae1 xfs: use iomap_dio_rw
Straight switch over to using iomap for direct I/O - we already have the
non-COW dio path in write_begin for DAX and files with extent size hints,
so nothing to add there.  The COW path is ported over from the old
get_blocks version and a bit of a mess, but I have some work in progress
to make it look more like the buffered I/O COW path.

This gets rid of xfs_get_blocks_direct and the last caller of
xfs_get_blocks with the create flag set, so all that code can be removed.

Last but not least I've removed a comment in xfs_filemap_fault that
refers to xfs_get_blocks entirely instead of updating it - while the
reference is correct, the whole DAX fault path looks different than
the non-DAX one, so it seems rather pointless.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jens Axboe <axboe@fb.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-30 14:37:15 +11:00
Christoph Hellwig
ff6a9292e6 iomap: implement direct I/O
This adds a full fledget direct I/O implementation using the iomap
interface. Full fledged in this case means all features are supported:
AIO, vectored I/O, any iov_iter type including kernel pointers, bvecs
and pipes, support for hole filling and async apending writes.  It does
not mean supporting all the warts of the old generic code.  We expect
i_rwsem to be held over the duration of the call, and we expect to
maintain i_dio_count ourselves, and we pass on any kinds of mapping
to the file system for now.

The algorithm used is very simple: We use iomap_apply to iterate over
the range of the I/O, and then we use the new bio_iov_iter_get_pages
helper to lock down the user range for the size of the extent.
bio_iov_iter_get_pages can currently lock down twice as many pages as
the old direct I/O code did, which means that we will have a better
batch factor for everything but overwrites of badly fragmented files.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kent Overstreet <kent.overstreet@gmail.com>
Tested-by: Jens Axboe <axboe@fb.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-30 14:36:01 +11:00
Christoph Hellwig
ec1b826097 fs: make sb_init_dio_done_wq available outside of direct-io.c
We want to use the per-sb completion workqueue from the new iomap
direct I/O code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jens Axboe <axboe@fb.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-30 14:33:53 +11:00
Christoph Hellwig
6552321831 xfs: remove i_iolock and use i_rwsem in the VFS inode instead
This patch drops the XFS-own i_iolock and uses the VFS i_rwsem which
recently replaced i_mutex instead.  This means we only have to take
one lock instead of two in many fast path operations, and we can
also shrink the xfs_inode structure.  Thanks to the xfs_ilock family
there is very little churn, the only thing of note is that we need
to switch to use the lock_two_directory helper for taking the i_rwsem
on two inodes in a few places to make sure our lock order matches
the one used in the VFS.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jens Axboe <axboe@fb.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-30 14:33:25 +11:00
Peter Zijlstra
f8319483f5 locking/lockdep: Provide a type check for lock_is_held
Christoph requested lockdep_assert_held() variants that distinguish
between held-for-read or held-for-write.

Provide:

  int lock_is_held_type(struct lockdep_map *lock, int read)

which takes the same argument as lock_acquire(.read) and matches it to
the held_lock instance.

Use of this function should be gated by the debug_locks variable. When
that is 0 the return value of the lock_is_held_type() function is
undefined. This is done to allow both negative and positive tests for
holding locks.

By default we provide (positive) lockdep_assert_held{,_exclusive,_read}()
macros.

Requested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Jens Axboe <axboe@fb.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-30 14:32:25 +11:00
Eugeniy Paltsev
bd2c6636cc dmaengine: DW DMAC: add multi-block property to device tree
Several versions of DW DMAC have multi block transfers hardware
support. Hardware support of multi block transfers is disabled
by default if we use DT to configure DMAC and software emulation
of multi block transfers used instead.
Add multi-block property, so it is possible to enable hardware
multi block transfers (if present) via DT.

Switch from per device is_nollp variable to multi_block array
to be able enable/disable multi block transfers separately per
channel.

Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-11-30 08:57:50 +05:30
Eugeniy Paltsev
258f2277a9 dmaengine: DW DMAC: enable memory-to-memory transfers support
All known devices, which use DT for configuration, support
memory-to-memory transfers. So enable it by default, if we read
configuration from DT.

Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-11-30 08:57:17 +05:30
Vignesh R
08c824e87e dmaengine: edma: re-initialize dummy slot during system resume
The last param set in a transfer should always be pointing to dummy
param set in non-cyclic mode. When system wakes from low power state
EDMA PARAM slots may be reset to random values. Hence, re-initialize
dummy slot to dummy param set on system resume.

Signed-off-by: Vignesh R <vigneshr@ti.com>
Acked-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-11-30 08:55:05 +05:30
Peter Ujfalusi
201ac4861c dmaengine: omap-dma: Support for slave devices with data port window
Based on the src/dst_port_window_size - if it is set - configure the DMA
channel to use double indexing in order to be able to loop within the
address window.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-11-30 08:54:04 +05:30
Peter Ujfalusi
54cd255808 dmaengine: dma_slave_config: add support for slave port window
Some slave devices uses address window instead of single register for read
and/or write of data. With the src/dst_port_window_size the address window
can be specified and the DMAengine driver should use this information to
correctly set up the transfer to loop within the provided window.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-11-30 08:54:04 +05:30
Souptick Joarder
9dcd74089a dmaengine: at_xdmac: Use dma_pool_zalloc
We should use dma_pool_zalloc instead of dma_pool_alloc/memset.

Signed-off-by: Souptick joarder <jrdr.linux@gmail.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-11-30 08:50:46 +05:30
Souptick Joarder
c2e60fc702 dmaengine: zx296702_dma: Use dma_pool_zalloc
We should use dma_pool_zalloc instead of dma_pool_alloc/memset.

Signed-off-by: Souptick joarder <jrdr.linux@gmail.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-11-30 08:50:40 +05:30
Dave Jiang
d648160863 dmaengine: dmatest: honor alignment restriction for buffers
Existing implementation does not honor the alignment restrictions imposed
by the DMA engines. Allocate buffers with built in slack for honoring
alignment restrictions. Creating new arrays to hold the aligned pointers
and use those pointers for operations.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-11-30 08:50:18 +05:30
Dave Jiang
31d182574a dmaengine: fix spacing issues for dmatest
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-11-30 08:50:18 +05:30
Brian Norris
5fed67df87 Merge tag 'spi-nor/for-4.10' of git://github.com/spi-nor/linux
From Cyrille Pitchen:

"""
This pull request contains the following notable changes:
- add support to new memory parts.
- fix of spansion_quad_enable().
- fix of the Candence QSPI driver.
- constify some structure instances of the Freescale QSPI driver.
"""
2016-11-29 18:31:14 -08:00
Linus Torvalds
ded6e842cf Merge tag 'arc-4.9-final' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:

 - fix PAE40 crash [Yuriy]

 - disable IO-Coherency by default

 - use a different inline asm constraint for Zero Overhead loops

* tag 'arc-4.9-final' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: mm: PAE40: Fix crash at munmap
  ARC: mm: IOC: Don't enable IOC by default
  ARC: Don't use "+l" inline asm constraint
2016-11-29 18:30:50 -08:00
Brian Norris
0989b0909c Merge tag 'nand/for-4.10' of github.com:linux-nand/linux
From Boris Brezillon:

"""
This pull request contains the following notable changes:
- new tango NAND controller driver
- new ox820 NAND controller driver
- addition of a new full-ID entry in the nand_ids table
- rework of the s3c240 driver to support DT
- extension of the nand_sdr_timings to expose tCCS, tPROG and tR
- addition of a new flag to ask the core to wait for tCCS when sending
  a RNDIN/RNDOUT command
- addition of a new flag to ask the core to let the controller driver
  send the READ/PROGPAGE command

This pull request also contains minor fixes/cleanup/cosmetic changes:
- properly support 512 ECC step size in the sunxi driver
- improve the error messages in the pxa probe path
- fix module autoload in the omap2 driver
- cleanup of several nand drivers to return nand_scan{_tail}() error
  code instead of returning -EIO
- various cleanups in the denali driver
- cleanups in the ooblayout handling (MTD core)
- fix an error check in nandsim
"""
2016-11-29 18:28:30 -08:00
Zhang Rui
9245ac20d8 Merge branches 'thermal-core', 'thermal-intel', 'thermal-soc-fixes' and 'thermal-reorg' into next 2016-11-30 10:26:38 +08:00
Sebastian Andrzej Siewior
7646ff2e7a thermal/x86 pkg temp: Convert to hotplug state machine
Install the callbacks via the state machine and let the core invoke
the callbacks on the already online CPUs.

Replace the wrmsr/rdmrs_on_cpu() calls in the hotplug callbacks as they are
guaranteed to be invoked on the incoming/outgoing cpu.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-30 10:25:47 +08:00
Thomas Gleixner
556238e45c thermal/x86_pkg_temp: Sanitize package management
Packages are kept in a list, which must be searched over and over.

We can be smarter than that and just store the package pointers in an array
which is allocated at init time. Sizing of the array is determined from the
topology information. That makes the package search a simple array lookup.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-30 10:25:41 +08:00
Thomas Gleixner
411bb3835f thermal/x86_pkg_temp: Move work into package struct
Delayed work structs are held in a static percpu storage, which makes no
sense at all because work is strictly per package and we never schedule
more than one work per package.

Aside of that the work cancelation in the hotplug is broken when the work
is queued on the outgoing cpu and canceled. Nothing reschedules the work on
another online cpu in the package, so the interrupts stay disabled and the
work_scheduled flag stays active.

Move the delayed work struct into the package struct, which is the only
sensible place to have it.

To simplify the cancelation logic schedule the work always on the cpu which
is the target for the sysfs files. This is required so the cancelation
logic in the cpu offline path cancels only when the outgoing cpu is the
current target and reschedule the work when there is still a online
CPU in the package.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-30 10:25:34 +08:00
Thomas Gleixner
64ca738f1f thermal/x86_pkg_temp: Move work scheduled flag into package struct
Storage for a boolean information whether work is scheduled for a package
is kept in separate allocated storage, which is resized when the number of
detected packages grows.

With the proper locking in place this is a completely pointless exercise
because we can simply stick it into the per package struct.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-30 10:25:27 +08:00
Thomas Gleixner
ab47bd964a thermal/x86_pkg_temp: Sanitize locking
The work cancellation code, the thermal zone unregistering, the work code
and the interrupt notification function are racy against each other and
against cpu hotplug and module exit. The random locking sprinkeled all
over the place does not help anything and probably exists to make people
feel good. The resulting issues (mainly use after free) are probably
hard to trigger, but they clearly exist

Protect the package list with a spinlock so it can be accessed from the
interrupt notifier and also from the work function. The add/removal code in
the hotplug callbacks take the lock for list manipulation. That makes sure
that on removal neither the interrupt notifier nor the work function can
access the about to be freed package structure anymore.

The thermal zone unregistering is another trainwreck. It's not serialized
against the work function. So unregistering the zone device can race with
the work function and cause havoc.

Protect the thermal zone with a mutex, which is held in the work
function to make sure that the zone device is not being unregistered
concurrently.

To solve the module exit issues, we simply invoke the cpu offline callback
and let it work its magic. For that it's required to keep track of the
participating cpus in a package, because topology_core_mask is not affected
by calling the offline callback for teardown of the driver, so it would
never free the package as there is always a valid target in
topology_core_mask.

Use proper names for the locks so it's clear what they are for and add a
pile of comments to explain the protection rules.

It's amazing that fixing the locking and adding 30 lines of comments
explaining it still removes more lines than it adds.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-30 10:25:19 +08:00
Thomas Gleixner
8079a4bdcb thermal/x86_pkg_temp: Cleanup code some more
Coding style fixups and replacement of overly complex constructs and random
error codes instead of returning the real ones. This mess makes the eyes bleeding.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-30 10:25:13 +08:00
Thomas Gleixner
3883a64e38 thermal/x86_pkg_temp: Cleanup namespace
Any randomly chosen struct name is more descriptive than phy_dev_entry.

Rename the whole thing to struct pkg_device, which describes the content
reasonably well and use the same variable name throughout the code so it
gets readable. Rename the msr struct members as well.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-30 10:25:06 +08:00
Thomas Gleixner
b6badbea30 thermal/x86_pkg_temp: Get rid of ref counting
There is no point in the whole package data refcounting dance because
topology_core_cpumask tells us whether this is the last cpu in the
package. If yes, then the package can go, if not it stays. It's already
serialized via the hotplug code.

While at it rename the first_cpu member of the package structure to
cpu. The first has absolutely no meaning.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-30 10:24:59 +08:00
Thomas Gleixner
09a674cd69 thermal/x86_pkg_temp: Sanitize callback (de)initialization
The threshold callbacks are installed before the initialization of the
online cpus has succeeded and removed after the teardown has been
done. That's both wrong as callbacks might be invoked into a half
initialized or torn down state.

Move them to the proper places: Last in init() and first in exit().

While at it shorten the insane long and horrible named function names.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-30 10:24:53 +08:00
Thomas Gleixner
21a3d3d4c8 thermal/x86_pkg_temp: Replace open coded cpu search
find_next_sibling() iterates over the online cpus and searches for a cpu
with the same package id as the current cpu. This is a pointless exercise
as topology_core_cpumask() allows a simple cpumask search for an online cpu
on the same package.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-30 10:24:44 +08:00
Thomas Gleixner
89baa56be7 thermal/x86_pkg_temp: Remove redundant package search
In pkg_temp_thermal_device_remove() the package device is searched at the
beginning of the function. When the device refcount becomes zero another
search for the same device is conducted. Remove the pointless loop and use
the device pointer which was retrieved at the beginning of the function.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-30 10:24:38 +08:00
Thomas Gleixner
768bd13c93 thermal/x86_pkg_temp: Cleanup thermal interrupt handling
Wenn a package is removed nothing restores the thermal interrupt MSR so
the content will be stale when a CPU of that package becomes online again.

Aside of that the work function reenables interrupts before acknowledging
the current one, which is the wrong order to begin with.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-30 10:24:20 +08:00
Krzysztof Kozlowski
f37fabb864 thermal: hwmon: Properly report critical temperature in sysfs
In the critical sysfs entry the thermal hwmon was returning wrong
temperature to the user-space.  It was reporting the temperature of the
first trip point instead of the temperature of critical trip point.

For example:
	/sys/class/hwmon/hwmon0/temp1_crit:50000
	/sys/class/thermal/thermal_zone0/trip_point_0_temp:50000
	/sys/class/thermal/thermal_zone0/trip_point_0_type:active
	/sys/class/thermal/thermal_zone0/trip_point_3_temp:120000
	/sys/class/thermal/thermal_zone0/trip_point_3_type:critical

Since commit e68b16abd9 ("thermal: add hwmon sysfs I/F") the driver
have been registering a sysfs entry if get_crit_temp() callback was
provided.  However when accessed, it was calling get_trip_temp() instead
of the get_crit_temp().

Fixes: e68b16abd9 ("thermal: add hwmon sysfs I/F")
Cc: <stable@vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-30 10:07:13 +08:00
Kent Overstreet
3816199506 block: add bio_iov_iter_get_pages()
This is a helper that pins down a range from an iov_iter and adds it to
a bio without requiring a separate memory allocation for the page array.
It will be used for upcoming direct I/O implementations for block devices
and iomap based file systems.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
[hch: ported to the iov_iter interface, renamed and added comments.
      All blame should be directed to me and all fame should go to Kent
      after this!]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

(cherry picked from commit 9cd56d916aa481ce8f56d9c5302a6ed90c2e0b5f)
2016-11-30 13:02:03 +11:00
Vitaly Kuznetsov
93ba222255 hv_netvsc: remove excessive logging on MTU change
When we change MTU or the number of channels on a netvsc device we get the
following logged:

 hv_netvsc bf5edba8...: net device safe to remove
 hv_netvsc: hv_netvsc channel opened successfully
 hv_netvsc bf5edba8...: Send section size: 6144, Section count:2560
 hv_netvsc bf5edba8...: Device MAC 00:15:5d:1e:91:12 link state up

This information is useful as debug at most.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-29 20:50:07 -05:00
Dave Chinner
e3df41f978 Merge branch 'xfs-4.10-misc-fixes-2' into iomap-4.10-directio 2016-11-30 12:49:38 +11:00
David S. Miller
736c9ba2f2 Merge branch 'mlxsw-enhancements'
Jiri Pirko says:

====================
mlxsw: couple of enhancements and fixes

Couple of enhancements and fixes from Ido.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-29 20:48:52 -05:00
Ido Schimmel
523779c734 mlxsw: core: Change order of operations in removal path
We call bus->init() before allocating 'lag.mapping'. Change the order of
operations in removal path to reflect that.

This makes the error path of mlxsw_core_bus_device_register() symmetric
with mlxsw_core_bus_device_unregister().

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-29 20:48:51 -05:00
Ido Schimmel
81d4d7289a mlxsw: core: Add missing rollback in error path
Without this rollback, the thermal zone is still registered during the
error path, whereas its private data is freed upon the destruction of
the underlying bus device due to the use of devm_kzalloc(). This results
in use after free.

Fix this by calling mlxsw_thermal_fini() from the appropriate place in
the error path.

Fixes: a50c1e3565 ("mlxsw: core: Implement thermal zone")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-29 20:48:51 -05:00
Ido Schimmel
87259f1877 mlxsw: spectrum_buffers: Limit size of pools
The shared buffer pools are containers whose size is used to calculate
the maximum usage for packets from / to a specific port / {port, PG/TC},
when dynamic threshold is employed.

While it's perfectly fine for the sum of the pools to exceed the maximum
size of the shared buffer, a single pool cannot.

Add a check when the pool size is set and forbid sizes larger than the
maximum size of the shared buffer.

Without the patch:
$ devlink sb pool set pci/0000:03:00.0 pool 0 size 999999999 thtype
dynamic
// No error is returned

With the patch:
$ devlink sb pool set pci/0000:03:00.0 pool 0 size 999999999 thtype
dynamic
devlink answers: Invalid argument

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-29 20:48:51 -05:00
Ido Schimmel
f414b48e92 mlxsw: resources: Add maximum buffer size
We need to be able to limit the size of shared buffer pools, so query
the maximum size from the device during init.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-29 20:48:51 -05:00
Arnaldo Carvalho de Melo
a510887824 GSO: Reload iph after pskb_may_pull
As it may get stale and lead to use after free.

Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Alexander Duyck <aduyck@mirantis.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Fixes: cbc53e08a7 ("GSO: Add GSO type for fixed IPv4 ID")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-29 20:45:54 -05:00
Jiri Pirko
725cbb62e7 sched: cls_flower: remove from hashtable only in case skip sw flag is not set
Be symmetric to hashtable insert and remove filter from hashtable only
in case skip sw flag is not set.

Fixes: e69985c67c ("net/sched: cls_flower: Introduce support in SKIP SW flag")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Amir Vadai <amir@vadai.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-29 20:44:38 -05:00
Eric Dumazet
648f0c28df net/dccp: fix use-after-free in dccp_invalid_packet
pskb_may_pull() can reallocate skb->head, we need to reload dh pointer
in dccp_invalid_packet() or risk use after free.

Bug found by Andrey Konovalov using syzkaller.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-29 20:37:26 -05:00
Arnd Bergmann
67ea7ef19b mlxsw: switchib: add MLXSW_PCI dependency
The newly added switchib driver fails to link if MLXSW_PCI=m:

drivers/net/ethernet/mellanox/mlxsw/mlxsw_switchib.o: In function^Cmlxsw_sib_module_exit':
switchib.c:(.exit.text+0x8): undefined reference to `mlxsw_pci_driver_unregister'
switchib.c:(.exit.text+0x10): undefined reference to `mlxsw_pci_driver_unregister'
drivers/net/ethernet/mellanox/mlxsw/mlxsw_switchib.o: In function `mlxsw_sib_module_init':
switchib.c:(.init.text+0x28): undefined reference to `mlxsw_pci_driver_register'
switchib.c:(.init.text+0x38): undefined reference to `mlxsw_pci_driver_register'
switchib.c:(.init.text+0x48): undefined reference to `mlxsw_pci_driver_unregister'

The other two such sub-drivers have a dependency, so add the same one
here. In theory we could allow this driver if MLXSW_PCI is disabled,
but it's probably not worth it.

Fixes: d1ba526384 ("mlxsw: switchib: Introduce SwitchIB and SwitchIB silicon driver")
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-29 20:36:25 -05:00
Geert Uytterhoeven
0075bd692d net: phy: Fix use after free in phy_detach()
If device_release_driver(&phydev->mdio.dev) is called, it releases all
resources belonging to the PHY device. Hence the subsequent call to
phy_led_triggers_unregister() will access already freed memory when
unregistering the LEDs.

Move the call to phy_led_triggers_unregister() before the possible call
to device_release_driver() to fix this.

Fixes: 2e0bc452f4 ("net: phy: leds: add support for led triggers on phy link state change")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Zach Brown <zach.brown@ni.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-29 20:35:24 -05:00
Zumeng Chen
ffac0e967f net: macb: ensure ordering write to re-enable RX smoothly
When a hardware issue happened as described by inline comments, the register
write pattern looks like the following:

<write ~MACB_BIT(RE)>
+ wmb();
<write MACB_BIT(RE)>

There might be a memory barrier between these two write operations, so add wmb
to ensure an flip from 0 to 1 for NCR.

Signed-off-by: Zumeng Chen <zumeng.chen@windriver.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-29 20:33:55 -05:00
Cyrille Pitchen
a0b44eea37 net: macb: fix the RX queue reset in macb_rx()
On macb only (not gem), when a RX queue corruption was detected from
macb_rx(), the RX queue was reset: during this process the RX ring
buffer descriptor was initialized by macb_init_rx_ring() but we forgot
to also set bp->rx_tail to 0.

Indeed, when processing the received frames, bp->rx_tail provides the
macb driver with the index in the RX ring buffer of the next buffer to
process. So when the whole ring buffer is reset we must also reset
bp->rx_tail so the driver is synchronized again with the hardware.

Since macb_init_rx_ring() is called from many locations, currently from
macb_rx() and macb_init_rings(), we'd rather add the "bp->rx_tail = 0;"
line inside macb_init_rx_ring() than add the very same line after each
call of this function.

Without this fix, the rx queue is not reset properly to recover from
queue corruption and connection drop may occur.

Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
Fixes: 9ba723b081 ("net: macb: remove BUG_ON() and reset the queue to handle RX errors")
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-29 20:01:31 -05:00
Pavel Machek
22d3efe5f6 stmmac: fix comments, make debug output consistent
Fix comments, add some new, and make debugfs output consistent.

Signed-off-by: Pavel Machek <pavel@denx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-29 19:53:22 -05:00
Daniel Mack
01ae87eab5 bpf: cgroup: fix documentation of __cgroup_bpf_update()
There's a 'not' missing in one paragraph. Add it.

Fixes: 3007098494 ("cgroup: add support for eBPF programs")
Signed-off-by: Daniel Mack <daniel@zonque.org>
Reported-by: Rami Rosen <roszenrami@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-29 19:50:59 -05:00
Herbert Xu
707693c8a4 netlink: Call cb->done from a worker thread
The cb->done interface expects to be called in process context.
This was broken by the netlink RCU conversion.  This patch fixes
it by adding a worker struct to make the cb->done call where
necessary.

Fixes: 21e4902aea ("netlink: Lockless lookup with RCU grace...")
Reported-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-29 19:48:38 -05:00