Commit Graph

90886 Commits

Author SHA1 Message Date
Chao Yu
4ac912427c f2fs: introduce free nid bitmap
In scenario of intensively node allocation, free nids will be ran out
soon, then it needs to stop to load free nids by traversing NAT blocks,
in worse case, if NAT blocks does not be cached in memory, it generates
IOs which slows down our foreground operations.

In order to speed up node allocation, in this patch we introduce a new
free_nid_bitmap array, so there is an bitmap table for each NAT block,
Once the NAT block is loaded, related bitmap cache will be switched on,
and bitmap will be set during traversing nat entries in NAT block, later
we can query and update nid usage status in memory completely.

With such implementation, I expect performance of node allocation can be
improved in the long-term after filesystem image is mounted.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 10:07:47 -08:00
Jaegeuk Kim
22ad0b6ab4 f2fs: add bitmaps for empty or full NAT blocks
This patches adds bitmaps to represent empty or full NAT blocks containing
free nid entries.

If we can find valid crc|cp_ver in the last block of checkpoint pack, we'll
use these bitmaps when building free nids. In order to avoid checkpointing
burden, up-to-date bitmaps will be flushed only during umount time. So,
normally we can get this gain, but when power-cut happens, we rely on fsck.f2fs
which recovers this bitmap again.

After this patch, we build free nids from nid #0 at mount time to make more
full NAT blocks, but in runtime, we check empty NAT blocks to load free nids
without loading any NAT pages from disk.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 09:59:54 -08:00
Benjamin Gaignard
38b0b219fb of: add devm_ functions for populate and depopulate
Lots of calls to of_platform_populate() are not unbalanced by a call
to of_platform_depopulate(). This create issues while drivers are
bind/unbind.

In way to solve those issues is to add devm_of_platform_populate()
which will call of_platform_depopulate() when the device is unbound
from the bus.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1487952874-23635-2-git-send-email-benjamin.gaignard@linaro.org
2017-02-27 17:20:13 +01:00
Michael S. Tsirkin
51be7a9a26 virtio_mmio: expose header to userspace
It's handy for userspace emulators like QEMU.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-27 16:31:23 +02:00
David S. Miller
4ca257eed6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains netfilter fixes for you net tree,
they are:

1) Missing ct zone size in the nft_ct initialization path, patch
   from Florian Westphal.

2) Two patches for netfilter uapi headers, one to remove unnecessary
   sysctl.h inclusion and another to fix compilation of xt_hashlimit.h
   in userspace, from Dmitry V. Levin.

3) Patch to fix a sloppy change in nf_ct_expect that incorrectly
   simplified nf_ct_expect_related_report() in the previous nf-next
   batch. This also includes another patch for __nf_ct_expect_check()
   to report success by returning 0 to keep it consistent with other
   existing functions. From Jarno Rajahalme.

4) The ->walk() iterator of the new bitmap set type goes over the real
   bitmap size, this results in incorrect dumps when NFTA_SET_USERDATA
   is used.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-27 09:17:43 -05:00
Rafael J. Wysocki
6dbf5cea05 cpuidle: menu: Avoid taking spinlock for accessing QoS values
After commit 9908859aca (cpuidle/menu: add per CPU PM QoS resume
latency consideration) the cpuidle menu governor calls
dev_pm_qos_read_value() on CPU devices to read the current resume
latency QoS constraint values for them.  That function takes a spinlock
to prevent the device's power.qos pointer from becoming NULL during
the access which is a problem for the RT patchset where spinlocks are
converted into mutexes and the idle loop stops working.

However, it is not even necessary for the menu governor to take
that spinlock, because the power.qos pointer accessed under it
cannot be modified during the access anyway.

For this reason, introduce a "raw" routine for accessing device
QoS resume latency constraints without locking and use it in the
menu governor.

Fixes: 9908859aca (cpuidle/menu: add per CPU PM QoS resume latency consideration)
Acked-by: Alex Shi <alex.shi@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-27 15:07:38 +01:00
Michał Winiarski
ca7a45ba6f drm/i915/skl: Add missing SKL ID
Used by production device:
    Intel(R) Iris(TM) Graphics P555

Cc: <stable@vger.kernel.org>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170227112256.20060-1-michal.winiarski@intel.com
2017-02-27 13:38:07 +02:00
Herbert Xu
016df0abc5 crypto: api - Add crypto_requires_off helper
This patch adds crypto_requires_off which is an extension of
crypto_requires_sync for similar bits such as NEED_FALLBACK.

Cc: stable@vger.kernel.org #4.10
Suggested-by: Marcelo Cerri <marcelo.cerri@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-02-27 18:09:39 +08:00
Manasi Navare
40ee6fbef7 drm: Add a new connector atomic property for link status
At the time userspace does setcrtc, we've already promised the mode
would work. The promise is based on the theoretical capabilities of
the link, but it's possible we can't reach this in practice. The DP
spec describes how the link should be reduced, but we can't reduce
the link below the requirements of the mode. Black screen follows.

One idea would be to have setcrtc return a failure. However, it
already should not fail as the atomic checks have passed. It would
also conflict with the idea of making setcrtc asynchronous in the
future, returning before the actual mode setting and link training.

Another idea is to train the link "upfront" at hotplug time, before
pruning the mode list, so that we can do the pruning based on
practical not theoretical capabilities. However, the changes for link
training are pretty drastic, all for the sake of error handling and
DP compliance, when the most common happy day scenario is the current
approach of link training at mode setting time, using the optimal
parameters for the mode. It is also not certain all hardware could do
this without the pipe on; not even all our hardware can do this. Some
of this can be solved, but not trivially.

Both of the above ideas also fail to address link degradation *during*
operation.

The solution is to add a new "link-status" connector property in order
to address link training failure in a way that:
a) changes the current happy day scenario as little as possible, to
avoid regressions, b) can be implemented the same way by all drm
drivers, c) is still opt-in for the drivers and userspace, and opting
out doesn't regress the user experience, d) doesn't prevent drivers
from implementing better or alternate approaches, possibly without
userspace involvement. And, of course, handles all the issues presented.
In the usual happy day scenario, this is always "good". If something
fails during or after a mode set, the kernel driver can set the link
status to "bad" and issue a hotplug uevent for userspace to have it
re-check the valid modes through GET_CONNECTOR IOCTL, and try modeset
again. If the theoretical capabilities of the link can't be reached,
the mode list is trimmed based on that.

v7 by Jani:
* Rebase, simplify set property while at it, checkpatch fix
v6:
* Fix a typo in kernel doc (Sean Paul)
v5:
* Clarify doc for silent rejection of atomic properties by driver (Daniel Vetter)
v4:
* Add comments in kernel-doc format (Daniel Vetter)
* Update the kernel-doc for link-status (Sean Paul)
v3:
* Fixed a build error (Jani Saarinen)
v2:
* Removed connector->link_status (Daniel Vetter)
* Set connector->state->link_status in drm_mode_connector_set_link_status_property
(Daniel Vetter)
* Set the connector_changed flag to true if connector->state->link_status changed.
* Reset link_status to GOOD in update_output_state (Daniel Vetter)
* Never allow userspace to set link status from Good To Bad (Daniel Vetter)

Reviewed-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Tony Cheng <tony.cheng@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Eric Anholt <eric@anholt.net> (for the -modesetting patch)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/0182487051aa9f1594820e35a4853de2f8747b4e.1481883920.git.jani.nikula@intel.com
2017-02-27 10:24:25 +01:00
Daniel Vetter
c771633daf Merge airlied/drm-next into drm-misc-next
Backmerge the main pull request to sync up with all the newly landed
drivers. Otherwise we'll have chaos even before 4.12 started in
earnest.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2017-02-27 09:30:11 +01:00
Archit Taneja
7adbd209ce drm/doc: Fix up some kms function names
A couple of the kms functions didn't have the correct/newest names.
This prevented them to be identified as refs in the html doc.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Archit Taneja <architt@codeaurora.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170222084741.8485-1-architt@codeaurora.org
2017-02-27 08:47:38 +05:30
Nicholas Bellinger
c87ba9c49c target: Add counters for ABORT_TASK success + failure
This patch introduces two counters for ABORT_TASK success +
failure under:

   /sys/kernel/config/target/core/$HBA/$DEV/statistics/scsi_tgt_dev/

that are useful for diagnosing various backend device latency
and front fabric issues.

Normally when folks see alot of aborts_complete happening,
it means the backend device I/O completion latency is high,
and not returning completions fast enough before host side
timeouts trigger.

And normally when folks see alot of aborts_no_task, it means
completions are being posted by target-core into fabric driver
code, but the responses aren't making it back to the host.

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-02-26 16:21:06 -08:00
Nicholas Bellinger
bd4e2d2907 target: Fix NULL dereference during LUN lookup + active I/O shutdown
When transport_clear_lun_ref() is shutting down a se_lun via
configfs with new I/O in-flight, it's possible to trigger a
NULL pointer dereference in transport_lookup_cmd_lun() due
to the fact percpu_ref_get() doesn't do any __PERCPU_REF_DEAD
checking before incrementing lun->lun_ref.count after
lun->lun_ref has switched to atomic_t mode.

This results in a NULL pointer dereference as LUN shutdown
code in core_tpg_remove_lun() continues running after the
existing ->release() -> core_tpg_lun_ref_release() callback
completes, and clears the RCU protected se_lun->lun_se_dev
pointer.

During the OOPs, the state of lun->lun_ref in the process
which triggered the NULL pointer dereference looks like
the following on v4.1.y stable code:

struct se_lun {
  lun_link_magic = 4294932337,
  lun_status = TRANSPORT_LUN_STATUS_FREE,

  .....

  lun_se_dev = 0x0,
  lun_sep = 0x0,

  .....

  lun_ref = {
    count = {
      counter = 1
    },
    percpu_count_ptr = 3,
    release = 0xffffffffa02fa1e0 <core_tpg_lun_ref_release>,
    confirm_switch = 0x0,
    force_atomic = false,
    rcu = {
      next = 0xffff88154fa1a5d0,
      func = 0xffffffff8137c4c0 <percpu_ref_switch_to_atomic_rcu>
    }
  }
}

To address this bug, use percpu_ref_tryget_live() to ensure
once __PERCPU_REF_DEAD is visable on all CPUs and ->lun_ref
has switched to atomic_t, all new I/Os will fail to obtain
a new lun->lun_ref reference.

Also use an explicit percpu_ref_kill_and_confirm() callback
to block on ->lun_ref_comp to allow the first stage and
associated RCU grace period to complete, and then block on
->lun_ref_shutdown waiting for the final percpu_ref_put()
to drop the last reference via transport_lun_remove_cmd()
before continuing with core_tpg_remove_lun() shutdown.

Reported-by: Rob Millner <rlm@daterainc.com>
Tested-by: Rob Millner <rlm@daterainc.com>
Cc: Rob Millner <rlm@daterainc.com>
Tested-by: Vaibhav Tandon <vst@datera.io>
Cc: Vaibhav Tandon <vst@datera.io>
Tested-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> # v3.14+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-02-26 16:08:44 -08:00
Chris Wilson
2955b73def dma-buf/reservation: Wrap ww_mutex_trylock
In a similar fashion to reservation_object_lock() and
reservation_object_unlock(), ww_mutex_trylock is also useful and so is
worth wrapping for consistency.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
[danvet: Add __must_check Joonas wants.]
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170221093000.22802-1-chris@chris-wilson.co.uk
2017-02-26 22:43:44 +01:00
Linus Torvalds
e5d56efc97 Merge tag 'watchdog-for-linus-v4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull watchdog updates from Guenter Roeck:
 "Wim asked me to handle the watchdog pull request this time around.

  Key changes:

   - New drivers: Cortina Gemini, ZTE's zx2967 family, NIC7018

   - Convert to use device managed functions: ebc-c384_wdt, tegra_wdt,
     da9063_wdt, da9062_wdt, da9055_wdt, da9052_wdt, bcm2835_wdt,
     mena21_wdt, wm831x_wdt, digicolor_wdt, intel-mid_wdt, meson_wdt,
     sunxi_wdt, aspeed_wdt, coh901327_wdt, iTCO_wdt

   - Use watchdog core to install restart handler: tangox, dw_wdt,
     bcm2835_wdt, asm9260_wdt, bcm47xx_wdt

   - Convert ts72xx_wdt driver to watchdog core

   - Let core handle heartbeat in ep93xx_wdt driver

   - Enable COMPILE_TEST where possible

   - Various other improvements"

* tag 'watchdog-for-linus-v4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: (54 commits)
  watchdog: s3c2410: Add prefix to local function
  watchdog: s3c2410: Select MFD_SYSCON on all Exynos platforms
  watchdog: s3c2410: Use dev_dbg instead of pr_info
  watchdog: s3c2410: Fix infinite interrupt in soft mode
  watchdog: s3c2410: Remove confusing CONFIG prefix from local defines
  watchdog: softdog: make pretimeout support a compile option
  watchdog: zx2967: add watchdog controller driver for ZTE's zx2967 family
  dt: bindings: add documentation for zx2967 family watchdog controller
  watchdog: sama5d4: Implement resume hook
  watchdog: sama5d4: Cache MR instead of a partial config
  watchdog: ts72xx_wdt: convert driver to watchdog core
  watchdog: ep93xx_wdt: cleanup and let the core handle the heartbeat
  watchdog: RDC321X_WDT always depends on PCI
  watchdog: add driver for Cortina Gemini watchdog
  watchdog: add DT bindings for Cortina Gemini
  watchdog: constify watchdog_ops structures
  watchdog: Introduce watchdog_stop_on_unregister helper
  watchdog: ebc-c384_wdt: Utilize devm_ functions in driver probe callback
  watchdog: tegra_wdt: Convert to use device managed functions
  watchdog: da9063_wdt: Convert to use device managed functions
  ...
2017-02-26 13:19:17 -08:00
Joe Perches
3c6d6e0fbf drm: drm_printer: add __printf validation
drm_printf does not currently use the compiler to verify
format and arguments.  Make it do so.

Miscellanea:

o Add appropriate #include files for __printf and struct va_format
o Convert dev_printk to dev_info

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/133858f214e9b90f92bb8eb44c6b1dc04429933d.1487201526.git.joe@perches.com
2017-02-26 21:43:08 +01:00
Daniel Vetter
8e22e1b349 Merge airlied/drm-next into drm-misc-next
Backmerge the main pull request to sync up with all the newly landed
drivers. Otherwise we'll have chaos even before 4.12 started in
earnest.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2017-02-26 21:34:42 +01:00
Linus Torvalds
9003ed1fed Merge branch 'for-linus-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs updates from Chris Mason:
 "This has a series of fixes and cleanups that Dave Sterba has been
  collecting.

  There is a pretty big variety here, cleaning up internal APIs and
  fixing corner cases"

* 'for-linus-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (124 commits)
  Btrfs: use the correct type when creating cow dio extent
  Btrfs: fix deadlock between dedup on same file and starting writeback
  btrfs: use btrfs_debug instead of pr_debug in transaction abort
  btrfs: btrfs_truncate_free_space_cache always allocates path
  btrfs: free-space-cache, clean up unnecessary root arguments
  btrfs: convert btrfs_inc_block_group_ro to accept fs_info
  btrfs: flush_space always takes fs_info->fs_root
  btrfs: pass fs_info to (more) routines that are only called with extent_root
  btrfs: qgroup: Move half of the qgroup accounting time out of commit trans
  btrfs: remove unused parameter from adjust_slots_upwards
  btrfs: remove unused parameters from __btrfs_write_out_cache
  btrfs: remove unused parameter from cleanup_write_cache_enospc
  btrfs: remove unused parameter from __add_inode_ref
  btrfs: remove unused parameter from clone_copy_inline_extent
  btrfs: remove unused parameters from btrfs_cmp_data
  btrfs: remove unused parameter from __add_inline_refs
  btrfs: remove unused parameters from scrub_setup_wr_ctx
  btrfs: remove unused parameter from create_snapshot
  btrfs: remove unused parameter from init_first_rw_device
  btrfs: remove unused parameter from __btrfs_alloc_chunk
  ...
2017-02-25 14:53:58 -08:00
Linus Torvalds
5d8a00eee2 Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk updates from Stephen Boyd:
 "The usual collection of new drivers, non-critical fixes, and updates
  to existing clk drivers. The bulk of the work is on Allwinner and
  Rockchip SoCs, but there's also an Intel Atom driver in here too.

  New Drivers:
   - Tegra BPMP firmware
   - Hisilicon hi3660 SoCs
   - Rockchip rk3328 SoCs
   - Intel Atom PMC
   - STM32F746
   - IDT VersaClock 5P49V5923 and 5P49V5933
   - Marvell mv98dx3236 SoCs
   - Allwinner V3s SoCs

  Removed Drivers:
   - Samsung Exynos4415 SoCs

  Updates:
   - Migrate ABx500 to OF
   - Qualcomm IPQ4019 CPU clks and general PLL support
   - Qualcomm MSM8974 RPM
   - Rockchip non-critical fixes and clk id additions
   - Samsung Exynos4412 CPUs
   - Socionext UniPhier NAND and eMMC support
   - ZTE zx296718 i2s and other audio clks
   - Renesas CAN and MSIOF clks for R-Car M3-W
   - Renesas resets for R-Car Gen2 and Gen3 and RZ/G1
   - TI CDCE913, CDCE937, and CDCE949 clk generators
   - Marvell Armada ap806 CPU frequencies
   - STM32F4* I2S/SAI support
   - Broadcom BCM2835 DSI support
   - Allwinner sun5i and A80 conversion to new style clk bindings"

* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (130 commits)
  clk: renesas: mstp: ensure register writes complete
  clk: qcom: Do not drop device node twice
  clk: mvebu: adjust clock handling for the CP110 system controller
  clk: mvebu: Expand mv98dx3236-core-clock support
  clk: zte: add i2s clocks for zx296718
  clk: sunxi-ng: sun9i-a80: Fix wrong pointer passed to PTR_ERR()
  clk: sunxi-ng: select SUNXI_CCU_MULT for sun5i
  clk: sunxi-ng: Check kzalloc() for errors and cleanup error path
  clk: tegra: Add BPMP clock driver
  clk: uniphier: add eMMC clock for LD11 and LD20 SoCs
  clk: uniphier: add NAND clock for all UniPhier SoCs
  ARM: dts: sun9i: Switch to new clock bindings
  clk: sunxi-ng: Add A80 Display Engine CCU
  clk: sunxi-ng: Add A80 USB CCU
  clk: sunxi-ng: Add A80 CCU
  clk: sunxi-ng: Support separately grouped PLL lock status register
  clk: sunxi-ng: mux: Get closest parent rate possible with CLK_SET_RATE_PARENT
  clk: sunxi-ng: mux: honor CLK_SET_RATE_NO_REPARENT flag
  clk: sunxi-ng: mux: Fix determine_rate for mux clocks with pre-dividers
  clk: qcom: SDHCI enablement on Nexus 5X / 6P
  ...
2017-02-25 14:28:06 -08:00
Linus Torvalds
7067739df2 Merge branch 'i2c/for-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:
 "I2C has for you two new drivers (Tegra BPMP and STM32F4), interrupt
  support for pca954x muxes, and a bunch of driver bugfixes and
  improvements. Nothing really special this cycle.

  A few commits have been added to my tree just recently. Those are the
  Tegra BPMP driver and a few straightforward bugfixes or cleanups which
  I prefer to have upstream rather soonish. The rest had proper
  linux-next exposure"

* 'i2c/for-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (25 commits)
  i2c: thunderx: Replace pci_enable_msix()
  i2c: exynos5: fix arbitration lost handling
  i2c: exynos5: disable fifo-almost-empty irq signal when necessary
  i2c: at91: ensure state is restored after suspending
  i2c: bcm2835: Avoid possible NULL ptr dereference
  i2c: Add Tegra BPMP I2C proxy driver
  dt-bindings: Add Tegra186 BPMP I2C binding
  misc: eeprom: at24: use device_property_*() functions instead of of_get_property()
  i2c: mux: pca954x: Add interrupt controller support
  dt: bindings: i2c-mux-pca954x: Add documentation for interrupt controller
  i2c: mux: pca954x: Add missing pca9542 definition to chip_desc
  i2c: riic: correctly finish transfers
  i2c: i801: Add support for Intel Gemini Lake
  i2c: mux: pca9541: Export OF device ID table as module aliases
  i2c: mux: pca954x: Export OF device ID table as module aliases
  i2c: mux: mlxcpld: remove unused including <linux/version.h>
  i2c: busses: constify i2c_algorithm structures
  i2c: i2c-mux-gpio: rename i2c-gpio-mux to i2c-mux-gpio
  i2c: sh_mobile: document support for r8a7796 (R-Car M3-W)
  i2c: i2c-cros-ec-tunnel: Reduce logging noise
  ...
2017-02-25 14:21:18 -08:00
Linus Torvalds
ac1820fb28 Merge tag 'for-next-dma_ops' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma DMA mapping updates from Doug Ledford:
 "Drop IB DMA mapping code and use core DMA code instead.

  Bart Van Assche noted that the ib DMA mapping code was significantly
  similar enough to the core DMA mapping code that with a few changes it
  was possible to remove the IB DMA mapping code entirely and switch the
  RDMA stack to use the core DMA mapping code.

  This resulted in a nice set of cleanups, but touched the entire tree
  and has been kept separate for that reason."

* tag 'for-next-dma_ops' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (37 commits)
  IB/rxe, IB/rdmavt: Use dma_virt_ops instead of duplicating it
  IB/core: Remove ib_device.dma_device
  nvme-rdma: Switch from dma_device to dev.parent
  RDS: net: Switch from dma_device to dev.parent
  IB/srpt: Modify a debug statement
  IB/srp: Switch from dma_device to dev.parent
  IB/iser: Switch from dma_device to dev.parent
  IB/IPoIB: Switch from dma_device to dev.parent
  IB/rxe: Switch from dma_device to dev.parent
  IB/vmw_pvrdma: Switch from dma_device to dev.parent
  IB/usnic: Switch from dma_device to dev.parent
  IB/qib: Switch from dma_device to dev.parent
  IB/qedr: Switch from dma_device to dev.parent
  IB/ocrdma: Switch from dma_device to dev.parent
  IB/nes: Remove a superfluous assignment statement
  IB/mthca: Switch from dma_device to dev.parent
  IB/mlx5: Switch from dma_device to dev.parent
  IB/mlx4: Switch from dma_device to dev.parent
  IB/i40iw: Remove a superfluous assignment statement
  IB/hns: Switch from dma_device to dev.parent
  ...
2017-02-25 13:45:43 -08:00
Linus Torvalds
edccb59429 Merge tag 'fbdev-v4.11' of git://github.com/bzolnier/linux
Pull fbdev updates from Bartlomiej Zolnierkiewicz:

 - fix for font color when console is switched to another fb driver

 - deferred probing fixes for simplefb driver

 - preparations to add support of an optional GPIO to enable panel for
   ARM CLCD driver

 - some improvements for ssd1307fb driver

 - cleanups for OMAP fbdev LCD drivers

 - misc fixes/cleanups for various fb drivers

* tag 'fbdev-v4.11' of git://github.com/bzolnier/linux: (30 commits)
  video: fbdev: fsl-diu-fb: fix spelling mistake "palette"
  fbdev: ssd1307fb: include linux/gpio/consumer.h
  video: fbdev: fsl-diu-fb: remove impossible condition
  video: fbdev: amifb: remove impossible condition
  fbdev/ssd1307fb: clear screen in probe
  fbdev/ssd1307fb: add support to enable VBAT
  fbdev: ssd1307fb: Make reset gpio devicetree property optional
  fbdev: ssd1307fb: Remove reset-active-low from the DT binding document
  fbdev: ssd1307fb: Start to use gpiod API for reset gpio
  video: fbdev: sh_mobile_lcdcfb: fix error return code in sh_mobile_lcdc_probe()
  video: fbdev: offb: switch to using for_each_node_by_type
  video/console: use setup_timer and mod_timer instead of init_timer
  fbdev: omap/lcd: Make callbacks optional
  fbdev: omap/lcd: Staticize non-exported lcd_panel structs
  fbdev: omap/lcd: Remove no-op driver callbacks
  video/mbx: use simple_open()
  video: fbdev: stifb: handle NULL return value from ioremap_nocache
  video: fbdev: pmagb-b-fb: Remove bad `__init' annotation
  video: fbdev: pmag-ba-fb: Remove bad `__init' annotation
  video: ARM CLCD: use panel device node for getting backlight and mode
  ...
2017-02-25 13:20:22 -08:00
Joe Perches
6e5c8381d1 treewide: Remove remaining executable attributes from source files
These are the current source files that should not have
executable attributes set.

[ Normally this would be sent through Andrew Morton's tree
  but his quilt tools don't like permission only patches. ]

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-25 12:12:50 -08:00
Linus Torvalds
7b46588f36 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:

 - almost all of the rest of MM

 - misc bits

 - KASAN updates

 - procfs

 - lib/ updates

 - checkpatch updates

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (124 commits)
  checkpatch: remove false unbalanced braces warning
  checkpatch: notice unbalanced else braces in a patch
  checkpatch: add another old address for the FSF
  checkpatch: update $logFunctions
  checkpatch: warn on logging continuations
  checkpatch: warn on embedded function names
  lib/lz4: remove back-compat wrappers
  fs/pstore: fs/squashfs: change usage of LZ4 to work with new LZ4 version
  crypto: change LZ4 modules to work with new LZ4 module version
  lib/decompress_unlz4: change module to work with new LZ4 module version
  lib: update LZ4 compressor module
  lib/test_sort.c: make it explicitly non-modular
  lib: add CONFIG_TEST_SORT to enable self-test of sort()
  rbtree: use designated initializers
  linux/kernel.h: fix DIV_ROUND_CLOSEST to support negative divisors
  lib/find_bit.c: micro-optimise find_next_*_bit
  lib: add module support to atomic64 tests
  lib: add module support to glob tests
  lib: add module support to crc32 tests
  kernel/ksysfs.c: add __ro_after_init to bin_attribute structure
  ...
2017-02-25 10:29:09 -08:00
Dmitry V. Levin
f216827342 uapi: fix linux/netfilter/xt_hashlimit.h userspace compilation error
Include <linux/limits.h> like some of uapi/linux/netfilter/xt_*.h
headers do to fix the following linux/netfilter/xt_hashlimit.h
userspace compilation error:

/usr/include/linux/netfilter/xt_hashlimit.h:90:12: error: 'NAME_MAX' undeclared here (not in a function)
  char name[NAME_MAX];

Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-25 13:32:04 +01:00
Josh Poimboeuf
3d1e236022 objtool: Prevent GCC from merging annotate_unreachable()
0-day bot reported some new objtool warnings which were caused by the
new annotate_unreachable() macro:

  fs/afs/flock.o: warning: objtool: afs_do_unlk()+0x0: duplicate frame pointer save
  fs/afs/flock.o: warning: objtool: afs_do_unlk()+0x0: frame pointer state mismatch
  fs/btrfs/delayed-inode.o: warning: objtool: btrfs_delete_delayed_dir_index()+0x0: duplicate frame pointer save
  fs/btrfs/delayed-inode.o: warning: objtool: btrfs_delete_delayed_dir_index()+0x0: frame pointer state mismatch
  fs/dlm/lock.o: warning: objtool: _grant_lock()+0x0: duplicate frame pointer save
  fs/dlm/lock.o: warning: objtool: _grant_lock()+0x0: frame pointer state mismatch
  fs/ocfs2/alloc.o: warning: objtool: ocfs2_mv_path()+0x0: duplicate frame pointer save
  fs/ocfs2/alloc.o: warning: objtool: ocfs2_mv_path()+0x0: frame pointer state mismatch

It turns out that, for older versions of GCC, if a function has multiple
BUG() incantations, GCC will sometimes merge the corresponding
annotate_unreachable() inline asm statements into a single block.  That
has the undesirable effect of removing one of the entries in the
__unreachable section, confusing objtool greatly.

A workaround for this issue is to ensure that each instance of the
inline asm statement uses a different label, so that GCC sees the
statements are unique and leaves them alone.  The inline asm ‘%=’ token
could be used for that, but unfortunately older versions of GCC don't
support it.  So I implemented a poor man's version of it with the
__LINE__ macro.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: d1091c7fa3 ("objtool: Improve detection of BUG() and other dead ends")
Link: http://lkml.kernel.org/r/0c14b00baf9f68d1b0221ddb6c88b925181c8be8.1487997036.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-25 10:11:23 +01:00
Linus Torvalds
9e31489029 Merge tag 'openrisc-for-linus' of git://github.com/openrisc/linux
Pull OpenRISC updates from Stafford Horne:
 "Highlights include:

   - optimized memset and memcpy routines, ~20% boot time saving

   - support for cpu idling

   - adding support for l.swa and l.lwa atomic operations (in spec from
     2014)

   - use atomics to implement: bitops, cmpxchg, futex

   - the atomics are in preparation for SMP support"

* tag 'openrisc-for-linus' of git://github.com/openrisc/linux: (25 commits)
  openrisc: head: Init r0 to 0 on start
  openrisc: Export ioremap symbols used by modules
  arch/openrisc/lib/memcpy.c: use correct OR1200 option
  openrisc: head: Remove unused strings
  openrisc: head: Move init strings to rodata section
  openrisc: entry: Fix delay slot detection
  openrisc: entry: Whitespace and comment cleanups
  scripts/checkstack.pl: Add openrisc support
  MAINTAINERS: Add the openrisc official repository
  openrisc: Add .gitignore
  openrisc: Add optimized memcpy routine
  openrisc: Add optimized memset
  openrisc: Initial support for the idle state
  openrisc: Fix the bitmask for the unit present register
  openrisc: remove unnecessary stddef.h include
  openrisc: add futex_atomic_* implementations
  openrisc: add optimized atomic operations
  openrisc: add cmpxchg and xchg implementations
  openrisc: add atomic bitops
  openrisc: add l.lwa/l.swa emulation
  ...
2017-02-24 18:37:03 -08:00
Sven Schmidt
69c78423b8 lib/lz4: remove back-compat wrappers
Remove the functions introduced as wrappers for providing backwards
compatibility to the prior LZ4 version.  They're not needed anymore
since there's no callers left.

Link: http://lkml.kernel.org/r/1486321748-19085-6-git-send-email-4sschmid@informatik.uni-hamburg.de
Signed-off-by: Sven Schmidt <4sschmid@informatik.uni-hamburg.de>
Cc: Bongkyu Kim <bongkyu.kim@lge.com>
Cc: Rui Salvaterra <rsalvaterra@gmail.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David S. Miller <davem@davemloft.net>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:57 -08:00
Sven Schmidt
4e1a33b105 lib: update LZ4 compressor module
Patch series "Update LZ4 compressor module", v7.

This patchset updates the LZ4 compression module to a version based on
LZ4 v1.7.3 allowing to use the fast compression algorithm aka LZ4 fast
which provides an "acceleration" parameter as a tradeoff between high
compression ratio and high compression speed.

We want to use LZ4 fast in order to support compression in lustre and
(mostly, based on that) investigate data reduction techniques in behalf
of storage systems.

Also, it will be useful for other users of LZ4 compression, as with LZ4
fast it is possible to enable applications to use fast and/or high
compression depending on the usecase.  For instance, ZRAM is offering a
LZ4 backend and could benefit from an updated LZ4 in the kernel.

LZ4 homepage: http://www.lz4.org/
LZ4 source repository: https://github.com/lz4/lz4 Source version: 1.7.3

Benchmark (taken from [1], Core i5-4300U @1.9GHz):
----------------|--------------|----------------|----------
Compressor      | Compression  | Decompression  | Ratio
----------------|--------------|----------------|----------
memcpy          |  4200 MB/s   |  4200 MB/s     | 1.000
LZ4 fast 50     |  1080 MB/s   |  2650 MB/s     | 1.375
LZ4 fast 17     |   680 MB/s   |  2220 MB/s     | 1.607
LZ4 fast 5      |   475 MB/s   |  1920 MB/s     | 1.886
LZ4 default     |   385 MB/s   |  1850 MB/s     | 2.101

[1] http://fastcompression.blogspot.de/2015/04/sampling-or-faster-lz4.html

[PATCH 1/5] lib: Update LZ4 compressor module
[PATCH 2/5] lib/decompress_unlz4: Change module to work with new LZ4 module version
[PATCH 3/5] crypto: Change LZ4 modules to work with new LZ4 module version
[PATCH 4/5] fs/pstore: fs/squashfs: Change usage of LZ4 to work with new LZ4 version
[PATCH 5/5] lib/lz4: Remove back-compat wrappers

This patch (of 5):

Update the LZ4 kernel module to LZ4 v1.7.3 by Yann Collet.  The kernel
module is inspired by the previous work by Chanho Min.  The updated LZ4
module will not break existing code since the patchset contains
appropriate changes.

API changes:

New method LZ4_compress_fast which differs from the variant available in
kernel by the new acceleration parameter, allowing to trade compression
ratio for more compression speed and vice versa.

LZ4_decompress_fast is the respective decompression method, featuring a
very fast decoder (multiple GB/s per core), able to reach RAM speed in
multi-core systems.  The decompressor allows to decompress data
compressed with LZ4 fast as well as the LZ4 HC (high compression)
algorithm.

Also the useful functions LZ4_decompress_safe_partial and
LZ4_compress_destsize were added.  The latter reverses the logic by
trying to compress as much data as possible from source to dest while
the former aims to decompress partial blocks of data.

A bunch of streaming functions were also added which allow
compressig/decompressing data in multiple steps (so called "streaming
mode").

The methods lz4_compress and lz4_decompress_unknownoutputsize are now
known as LZ4_compress_default respectivley LZ4_decompress_safe.  The old
methods will be removed since there's no callers left in the code.

[arnd@arndb.de: fix KERNEL_LZ4 support]
  Link: http://lkml.kernel.org/r/20170208211946.2839649-1-arnd@arndb.de
[akpm@linux-foundation.org: simplify]
[akpm@linux-foundation.org: fix the simplification]
[4sschmid@informatik.uni-hamburg.de: fix performance regressions]
  Link: http://lkml.kernel.org/r/1486898178-17125-2-git-send-email-4sschmid@informatik.uni-hamburg.de
[4sschmid@informatik.uni-hamburg.de: v8]
  Link: http://lkml.kernel.org/r/1487182598-15351-2-git-send-email-4sschmid@informatik.uni-hamburg.de
Link: http://lkml.kernel.org/r/1486321748-19085-2-git-send-email-4sschmid@informatik.uni-hamburg.de
Signed-off-by: Sven Schmidt <4sschmid@informatik.uni-hamburg.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Bongkyu Kim <bongkyu.kim@lge.com>
Cc: Rui Salvaterra <rsalvaterra@gmail.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David S. Miller <davem@davemloft.net>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:57 -08:00
Kees Cook
f231aebfc4 rbtree: use designated initializers
Prepare to mark sensitive kernel structures for randomization by making
sure they're using designated initializers.  These were identified
during allyesconfig builds of x86, arm, and arm64, with most initializer
fixes extracted from grsecurity.

Link: http://lkml.kernel.org/r/20161217010253.GA140470@beast
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Jie Chen <fykcee1@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:57 -08:00
Niklas Söderlund
4f5901f5a6 linux/kernel.h: fix DIV_ROUND_CLOSEST to support negative divisors
While working on a thermal driver I encounter a scenario where the
divisor could be negative, instead of adding local code to handle this I
though I first try to add support for this in DIV_ROUND_CLOSEST.

Add support to DIV_ROUND_CLOSEST for negative divisors if both dividend
and divisor variable types are signed.  This should not alter current
behavior for users of the macro as previously negative divisors where
not supported.

Before:

DIV_ROUND_CLOSEST(  59,  4) =  15
DIV_ROUND_CLOSEST(  59, -4) = -14
DIV_ROUND_CLOSEST( -59,  4) = -15
DIV_ROUND_CLOSEST( -59, -4) =  14

After:

DIV_ROUND_CLOSEST(  59,  4) =  15
DIV_ROUND_CLOSEST(  59, -4) = -15
DIV_ROUND_CLOSEST( -59,  4) = -15
DIV_ROUND_CLOSEST( -59, -4) =  15

[akpm@linux-foundation.org: fix comment, per Guenter]
Link: http://lkml.kernel.org/r/20161222102217.29011-1-niklas.soderlund+renesas@ragnatech.se
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:57 -08:00
Kees Cook
85caa95b9f bug: switch data corruption check to __must_check
The CHECK_DATA_CORRUPTION() macro was designed to have callers do
something meaningful/protective on failure.  However, using "return
false" in the macro too strictly limits the design patterns of callers.
Instead, let callers handle the logic test directly, but make sure that
the result IS checked by forcing __must_check (which appears to not be
able to be used directly on macro expressions).

Link: http://lkml.kernel.org/r/20170206204547.GA125312@beast
Signed-off-by: Kees Cook <keescook@chromium.org>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:56 -08:00
Gideon Israel Dsouza
a3f0825e7e compiler-gcc.h: add a new macro to wrap gcc attribute
Add __mode(x) into compiler-gcc.h as part of a cleanup task I've taken
up, to replace gcc specific attributes with macros.

The next patch is a cleanup of the m68k subsystem and it requires a new
macro to wrap __attribute__ ((mode (...)))

Link: http://lkml.kernel.org/r/1485540901-1988-2-git-send-email-gidisrael@gmail.com
Signed-off-by: Gideon Israel Dsouza <gidisrael@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:56 -08:00
Masahiro Yamada
16f4e31951 include/linux/iopoll.h: include <linux/ktime.h> instead of <linux/hrtimer.h>
The timer APIs this header needs are ktime_get(), ktime_add_us(), and
ktime_compare().  So, including <linux/ktime.h> seems enough.  This
commit will cut unnecessary header file parsing.

Link: http://lkml.kernel.org/r/1481679225-10885-1-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:56 -08:00
Mike Frysinger
1ca5eebb89 uapi: mqueue.h: add missing linux/types.h include
Commit 63159f5dcc ("uapi: Use __kernel_long_t in struct mq_attr")
changed the types from long to __kernel_long_t, but didn't add a
linux/types.h include.  Code that tries to include this header directly
breaks:

  /usr/include/linux/mqueue.h:26:2: error: unknown type name '__kernel_long_t'
  __kernel_long_t mq_flags; /* message queue flags   */

This also upsets configure tests for this header:

  checking linux/mqueue.h usability... no
  checking linux/mqueue.h presence... yes
  configure: WARNING: linux/mqueue.h: present but cannot be compiled
  configure: WARNING: linux/mqueue.h:     check for missing prerequisite headers?
  configure: WARNING: linux/mqueue.h: see the Autoconf documentation
  configure: WARNING: linux/mqueue.h:     section "Present But Cannot Be Compiled"
  configure: WARNING: linux/mqueue.h: proceeding with the compiler's result
  checking for linux/mqueue.h... no

Link: http://lkml.kernel.org/r/20170119194644.4403-1-vapier@gentoo.org
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:56 -08:00
Lafcadio Wluiki
796f571b0c procfs: use an enum for possible hidepid values
Previously, the hidepid parameter was checked by comparing literal
integers 0, 1, 2.  Let's add a proper enum for this, to make the
checking more expressive:

        0 → HIDEPID_OFF
        1 → HIDEPID_NO_ACCESS
        2 → HIDEPID_INVISIBLE

This changes the internal labelling only, the userspace-facing interface
remains unmodified, and still works with literal integers 0, 1, 2.

No functional changes.

Link: http://lkml.kernel.org/r/1484572984-13388-2-git-send-email-djalal@gmail.com
Signed-off-by: Lafcadio Wluiki <wluikil@gmail.com>
Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:56 -08:00
Greg Thelen
f9fa1d919c kasan: drain quarantine of memcg slab objects
Per memcg slab accounting and kasan have a problem with kmem_cache
destruction.
 - kmem_cache_create() allocates a kmem_cache, which is used for
   allocations from processes running in root (top) memcg.
 - Processes running in non root memcg and allocating with either
   __GFP_ACCOUNT or from a SLAB_ACCOUNT cache use a per memcg
   kmem_cache.
 - Kasan catches use-after-free by having kfree() and kmem_cache_free()
   defer freeing of objects. Objects are placed in a quarantine.
 - kmem_cache_destroy() destroys root and non root kmem_caches. It takes
   care to drain the quarantine of objects from the root memcg's
   kmem_cache, but ignores objects associated with non root memcg. This
   causes leaks because quarantined per memcg objects refer to per memcg
   kmem cache being destroyed.

To see the problem:

 1) create a slab cache with kmem_cache_create(,,,SLAB_ACCOUNT,)
 2) from non root memcg, allocate and free a few objects from cache
 3) dispose of the cache with kmem_cache_destroy() kmem_cache_destroy()
    will trigger a "Slab cache still has objects" warning indicating
    that the per memcg kmem_cache structure was leaked.

Fix the leak by draining kasan quarantined objects allocated from non
root memcg.

Racing memcg deletion is tricky, but handled.  kmem_cache_destroy() =>
shutdown_memcg_caches() => __shutdown_memcg_cache() => shutdown_cache()
flushes per memcg quarantined objects, even if that memcg has been
rmdir'd and gone through memcg_deactivate_kmem_caches().

This leak only affects destroyed SLAB_ACCOUNT kmem caches when kasan is
enabled.  So I don't think it's worth patching stable kernels.

Link: http://lkml.kernel.org/r/1482257462-36948-1-git-send-email-gthelen@google.com
Signed-off-by: Greg Thelen <gthelen@google.com>
Reviewed-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:56 -08:00
Nathan Fontenot
dc18d706a4 memory-hotplug: use dev_online for memhp_auto_online
Commit 31bc3858ea ("add automatic onlining policy for the newly added
memory") provides the capability to have added memory automatically
onlined during add, but this appears to be slightly broken.

The current implementation uses walk_memory_range() to call
online_memory_block, which uses memory_block_change_state() to online
the memory.  Instead, we should be calling device_online() for the
memory block in online_memory_block().  This would online the memory
(the memory bus online routine memory_subsys_online() called from
device_online calls memory_block_change_state()) and properly update the
device struct offline flag.

As a result of the current implementation, attempting to remove a memory
block after adding it using auto online fails.  This is because doing a
remove, for instance

  echo offline > /sys/devices/system/memory/memoryXXX/state

uses device_offline() which checks the dev->offline flag.

Link: http://lkml.kernel.org/r/20170222220744.8119.19687.stgit@ltcalpine2-lp14.aus.stglabs.ibm.com
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:56 -08:00
Hugh Dickins
3a4f8a0b3f mm: remove shmem_mapping() shmem_zero_setup() duplicates
Remove the prototypes for shmem_mapping() and shmem_zero_setup() from
linux/mm.h, since they are already provided in linux/shmem_fs.h.  But
shmem_fs.h must then provide the inline stub for shmem_mapping() when
CONFIG_SHMEM is not set, and a few more cfiles now need to #include it.

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1702081658250.1549@eggly.anvils
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:56 -08:00
Aneesh Kumar K.V
595cd8f256 mm/ksm: handle protnone saved writes when making page write protect
Without this KSM will consider the page write protected, but a numa
fault can later mark the page writable.  This can result in memory
corruption.

Link: http://lkml.kernel.org/r/1487498625-10891-3-git-send-email-aneesh.kumar@linux.vnet.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:56 -08:00
Aneesh Kumar K.V
288bc54949 mm/autonuma: let architecture override how the write bit should be stashed in a protnone pte.
Patch series "Numabalancing preserve write fix", v2.

This patch series address an issue w.r.t THP migration and autonuma
preserve write feature.  migrate_misplaced_transhuge_page() cannot deal
with concurrent modification of the page.  It does a page copy without
following the migration pte sequence.  IIUC, this was done to keep the
migration simpler and at the time of implemenation we didn't had THP
page cache which would have required a more elaborate migration scheme.
That means thp autonuma migration expect the protnone with saved write
to be done such that both kernel and user cannot update the page
content.  This patch series enables archs like ppc64 to do that.  We are
good with the hash translation mode with the current code, because we
never create a hardware page table entry for a protnone pte.

This patch (of 2):

Autonuma preserves the write permission across numa fault to avoid
taking a writefault after a numa fault (Commit: b191f9b106 " mm: numa:
preserve PTE write permissions across a NUMA hinting fault").
Architecture can implement protnone in different ways and some may
choose to implement that by clearing Read/ Write/Exec bit of pte.
Setting the write bit on such pte can result in wrong behaviour.  Fix
this up by allowing arch to override how to save the write bit on a
protnone pte.

[aneesh.kumar@linux.vnet.ibm.com: don't mark pte saved write in case of dirty_accountable]
  Link: http://lkml.kernel.org/r/1487942884-16517-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com
[aneesh.kumar@linux.vnet.ibm.com: v3]
  Link: http://lkml.kernel.org/r/1487498625-10891-2-git-send-email-aneesh.kumar@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1487050314-3892-2-git-send-email-aneesh.kumar@linux.vnet.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Ellerman <michaele@au1.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:56 -08:00
David Rientjes
def5efe037 mm, madvise: fail with ENOMEM when splitting vma will hit max_map_count
If madvise(2) advice will result in the underlying vma being split and
the number of areas mapped by the process will exceed
/proc/sys/vm/max_map_count as a result, return ENOMEM instead of EAGAIN.

EAGAIN is returned by madvise(2) when a kernel resource, such as slab,
is temporarily unavailable.  It indicates that userspace should retry
the advice in the near future.  This is important for advice such as
MADV_DONTNEED which is often used by malloc implementations to free
memory back to the system: we really do want to free memory back when
madvise(2) returns EAGAIN because slab allocations (for vmas, anon_vmas,
or mempolicies) cannot be allocated.

Encountering /proc/sys/vm/max_map_count is not a temporary failure,
however, so return ENOMEM to indicate this is a more serious issue.  A
followup patch to the man page will specify this behavior.

Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1701241431120.42507@chino.kir.corp.google.com
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:55 -08:00
Lucas Stach
712c604dcd mm: wire up GFP flag passing in dma_alloc_from_contiguous
The callers of the DMA alloc functions already provide the proper
context GFP flags.  Make sure to pass them through to the CMA allocator,
to make the CMA compaction context aware.

Link: http://lkml.kernel.org/r/20170127172328.18574-3-l.stach@pengutronix.de
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Alexander Graf <agraf@suse.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:55 -08:00
Lucas Stach
e2f466e32f mm: cma_alloc: allow to specify GFP mask
Most users of this interface just want to use it with the default
GFP_KERNEL flags, but for cases where DMA memory is allocated it may be
called from a different context.

No functional change yet, just passing through the flag to the
underlying alloc_contig_range function.

Link: http://lkml.kernel.org/r/20170127172328.18574-2-l.stach@pengutronix.de
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Alexander Graf <agraf@suse.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:55 -08:00
Lucas Stach
ca96b62534 mm: alloc_contig_range: allow to specify GFP mask
Currently alloc_contig_range assumes that the compaction should be done
with the default GFP_KERNEL flags.  This is probably right for all
current uses of this interface, but may change as CMA is used in more
use-cases (including being the default DMA memory allocator on some
platforms).

Change the function prototype, to allow for passing through the GFP mask
set by upper layers.

Also respect global restrictions by applying memalloc_noio_flags to the
passed in flags.

Link: http://lkml.kernel.org/r/20170127172328.18574-1-l.stach@pengutronix.de
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Alexander Graf <agraf@suse.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:55 -08:00
Mike Rapoport
ca49ca7114 userfaultfd: non-cooperative: add event for exit() notification
Allow userfaultfd monitor track termination of the processes that have
memory backed by the uffd.

[rppt@linux.vnet.ibm.com: add comment]
  Link: http://lkml.kernel.org/r/20170202135448.GB19804@rapoport-lnxLink: http://lkml.kernel.org/r/1485542673-24387-4-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:55 -08:00
Mike Rapoport
897ab3e0c4 userfaultfd: non-cooperative: add event for memory unmaps
When a non-cooperative userfaultfd monitor copies pages in the
background, it may encounter regions that were already unmapped.
Addition of UFFD_EVENT_UNMAP allows the uffd monitor to track precisely
changes in the virtual memory layout.

Since there might be different uffd contexts for the affected VMAs, we
first should create a temporary representation for the unmap event for
each uffd context and then notify them one by one to the appropriate
userfault file descriptors.

The event notification occurs after the mmap_sem has been released.

[arnd@arndb.de: fix nommu build]
  Link: http://lkml.kernel.org/r/20170203165141.3665284-1-arnd@arndb.de
[mhocko@suse.com: fix nommu build]
  Link: http://lkml.kernel.org/r/20170202091503.GA22823@dhcp22.suse.cz
Link: http://lkml.kernel.org/r/1485542673-24387-3-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:55 -08:00
Kirill A. Shutemov
d53a8b49a6 mm: drop page_check_address{,_transhuge}
All users are gone. Let's drop them.

Link: http://lkml.kernel.org/r/20170129173858.45174-12-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:55 -08:00
Kirill A. Shutemov
ace71a19ce mm: introduce page_vma_mapped_walk()
Introduce a new interface to check if a page is mapped into a vma.  It
aims to address shortcomings of page_check_address{,_transhuge}.

Existing interface is not able to handle PTE-mapped THPs: it only finds
the first PTE.  The rest lefted unnoticed.

page_vma_mapped_walk() iterates over all possible mapping of the page in
the vma.

Link: http://lkml.kernel.org/r/20170129173858.45174-3-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:55 -08:00
Yisheng Xie
cbae0170e5 mm/migration: make isolate_movable_page always defined
Define isolate_movable_page as a static inline function when
CONFIG_MIGRATION is not enable.  It should return -EBUSY here which
means failed to isolate movable pages.

This patch do not have any functional change but prepare for later
patch.

Link: http://lkml.kernel.org/r/1485867981-16037-3-git-send-email-ysxie@foxmail.com
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Suggested-by: Michal Hocko <mhocko@kernel.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:55 -08:00