Commit Graph

48594 Commits

Author SHA1 Message Date
Yaowei Bai
79a3ed2e98 ceph: ceph_frag_contains_value can be boolean
This patch makes ceph_frag_contains_value return bool to improve
readability due to this particular function only using either one or
zero as its return value.

No functional change.

Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-01-21 19:36:07 +01:00
Yaowei Bai
eade1fe75f ceph: remove unused functions in ceph_frag.h
These functions were introduced in commit 3d14c5d2b ("ceph: factor
out libceph from Ceph file system"). Howover, there's no user of
these functions since then, so remove them for simplicity.

Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-01-21 19:36:07 +01:00
Peter Zijlstra
fae3fde651 perf: Collapse and fix event_function_call() users
There is one common bug left in all the event_function_call() users,
between loading ctx->task and getting to the remote_function(),
ctx->task can already have been changed.

Therefore we need to double check and retry if ctx->task != current.

Insert another trampoline specific to event_function_call() that
checks for this and further validates state. This also allows getting
rid of the active/inactive functions.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-21 18:54:24 +01:00
majd@mellanox.com
427c1e7bcd {IB, net}/mlx5: Move the modify QP operation table to mlx5_ib
When modifying a QP, the desired operation was determined in
the mlx5_core using a transition table that takes the current
state, the final state, and returns the desired operation.

Since this logic will be used for Raw Packet QP, move the
operation table to the mlx5_ib.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-21 12:01:09 -05:00
majd@mellanox.com
75850d0bce IB/mlx5: Support setting Ethernet priority for Raw Packet QPs
When the user changes the Address Vector(AV) in the modify QP, he
provides an SL. This SL should be translated to Ethernet Priority
by taking the 3 LSB bits, and modify the QP's TIS according to this
Ethernet priority.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-21 12:01:09 -05:00
majd@mellanox.com
6d2f89df04 IB/mlx5: Add Raw Packet QP query functionality
Since Raw Packet QP is composed of RQ and SQ, the IB QP's
state is derived from the sub-objects. Therefore we need
to query each one of the sub-objects, and decide on the
IB QP's state.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-21 12:01:09 -05:00
majd@mellanox.com
e2013b212f net/mlx5_core: Add RQ and SQ event handling
RQ/SQ will be used to implement IB verbs QPs, so the IB QP affiliated
events are affiliated also with SQs and RQs.

Since SQ, RQ and QP resource numbers do not share the same name
space, a queue type field was added to the event data to specify
the SW object that the event is affiliated with.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-21 12:01:09 -05:00
majd@mellanox.com
8d7f9ecb37 net/mlx5_core: Export transport objects
To be used by mlx5_ib in the following patches for implementing
RAW PACKET QP.

Add mlx5_core_ prefix to alloc and delloc transport_domain since
they are exposed now.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-21 12:01:08 -05:00
Linus Torvalds
30f05309bd Merge tag 'pm+acpi-4.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management and ACPI updates from Rafael Wysocki:
 "This includes fixes on top of the previous batch of PM+ACPI updates
  and some new material as well.

  From the new material perspective the most significant are the driver
  core changes that should allow USB devices to stay suspended over
  system suspend/resume cycles if they have been runtime-suspended
  already beforehand.  Apart from that, ACPICA is updated to upstream
  revision 20160108 (cosmetic mostly, but including one fixup on top of
  the previous ACPICA update) and there are some devfreq updates the
  didn't make it before (due to timing).

  A few recent regressions are fixed, most importantly in the cpuidle
  menu governor and in the ACPI backlight driver and some x86 platform
  drivers depending on it.

  Some more bugs are fixed and cleanups are made on top of that.

  Specifics:

   - Modify the driver core and the USB subsystem to allow USB devices
     to stay suspended over system suspend/resume cycles if they have
     been runtime-suspended already beforehand and fix some bugs on top
     of these changes (Tomeu Vizoso, Rafael Wysocki).

   - Update ACPICA to upstream revision 20160108, including updates of
     the ACPICA's copyright notices, a code fixup resulting from a
     regression fix that was necessary in the upstream code only (the
     regression fixed by it has never been present in Linux) and a
     compiler warning fix (Bob Moore, Lv Zheng).

   - Fix a recent regression in the cpuidle menu governor that broke it
     on practically all architectures other than x86 and make a couple
     of optimizations on top of that fix (Rafael Wysocki).

   - Clean up the selection of cpuidle governors depending on whether or
     not the kernel is configured for tickless systems (Jean Delvare).

   - Revert a recent commit that introduced a regression in the ACPI
     backlight driver, address the problem it attempted to fix in a
     different way and revert one more cosmetic change depending on the
     problematic commit (Hans de Goede).

   - Add two more ACPI backlight quirks (Hans de Goede).

   - Fix a few minor problems in the core devfreq code, clean it up a
     bit and update the MAINTAINERS information related to it (Chanwoo
     Choi, MyungJoo Ham).

   - Improve an error message in the ACPI fan driver (Andy Lutomirski).

   - Fix a recent build regression in the cpupower tool (Shreyas
     Prabhu)"

* tag 'pm+acpi-4.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (32 commits)
  cpuidle: menu: Avoid pointless checks in menu_select()
  sched / idle: Drop default_idle_call() fallback from call_cpuidle()
  cpupower: Fix build error in cpufreq-info
  cpuidle: Don't enable all governors by default
  cpuidle: Default to ladder governor on ticking systems
  time: nohz: Expose tick_nohz_enabled
  ACPICA: Update version to 20160108
  ACPICA: Silence a -Wbad-function-cast warning when acpi_uintptr_t is 'uintptr_t'
  ACPICA: Additional 2016 copyright changes
  ACPICA: Reduce regression fix divergence from upstream ACPICA
  ACPI / video: Add disable_backlight_sysfs_if quirk for the Toshiba Satellite R830
  ACPI / video: Revert "thinkpad_acpi: Use acpi_video_handles_brightness_key_presses()"
  ACPI / video: Document acpi_video_handles_brightness_key_presses() a bit
  ACPI / video: Fix using an uninitialized mutex / list_head in acpi_video_handles_brightness_key_presses()
  ACPI / video: Revert "ACPI / video: driver must be registered before checking for keypresses"
  ACPI / fan: Improve acpi_device_update_power error message
  ACPI / video: Add disable_backlight_sysfs_if quirk for the Toshiba Portege R700
  cpuidle: menu: Fix menu_select() for CPUIDLE_DRIVER_STATE_START == 0
  MAINTAINERS: Add devfreq-event entry
  MAINTAINERS: Add missing git repository and directory for devfreq
  ...
2016-01-20 19:06:49 -08:00
Linus Torvalds
3549d82279 Merge tag 'renesas-sh-drivers-for-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas
Pull SH driver updates from Simon Horman:
 "Clean up the clock API on legacy SH to make it more similar to the
  Common Clock Framework.  This will avoid different behaviour in
  drivers shared between legacy and CCF-based platforms (e.g.  SCIF)"

* tag 'renesas-sh-drivers-for-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
  drivers: sh: clk: Avoid crashes when passing NULL clocks
  drivers: sh: clk: Remove obsolete and unused clk_round_parent()
2016-01-20 19:00:46 -08:00
Linus Torvalds
9638685e32 Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC driver updates from Olof Johansson:
 "Driver updates for ARM SoCs.  Some for SoC-family code under
  drivers/soc, but also some other driver updates that don't belong
  anywhere else.  We also bring in the drivers/reset code through
  arm-soc.

  Some of the larger updates:

   - Qualcomm support for SMEM, SMSM, SMP2P.  All used to communicate
     with other parts of the chip/board on these platforms, all
     proprietary protocols that don't fit into other subsystems and live
     in drivers/soc for now.

   - System bus driver for UniPhier

   - Driver for the TI Wakeup M3 IPC device

   - Power management for Raspberry PI

  + Again a bunch of other smaller updates and patches"

* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (38 commits)
  bus: uniphier: allow only built-in driver
  ARM: bcm2835: clarify RASPBERRYPI_FIRMWARE dependency
  MAINTAINERS: Drop Kumar Gala from QCOM
  bus: uniphier-system-bus: add UniPhier System Bus driver
  ARM: bcm2835: add rpi power domain driver
  dt-bindings: add rpi power domain driver bindings
  ARM: bcm2835: Define two new packets from the latest firmware.
  drivers/soc: make mediatek/mtk-scpsys.c explicitly non-modular
  soc: mediatek: SCPSYS: Add regulator support
  MAINTAINERS: Change QCOM entries
  soc: qcom: smd-rpm: Add existing platform support
  memory/tegra: Add number of TLB lines for Tegra124
  reset: hi6220: fix modular build
  soc: qcom: Introduce WCNSS_CTRL SMD client
  ARM: qcom: select ARM_CPU_SUSPEND for power management
  MAINTAINERS: Add rules for Qualcomm dts files
  soc: qcom: enable smsm/smp2p modular build
  serial: msm_serial: Make config tristate
  soc: qcom: smp2p: Qualcomm Shared Memory Point to Point
  soc: qcom: smsm: Add driver for Qualcomm SMSM
  ...
2016-01-20 18:42:30 -08:00
Linus Torvalds
1305eda751 Merge tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC platform updates from Olof Johansson:
 "Updates for new platform support:

   - New platform: Tango4 from Sigma Designs.
   - Broadcom BCM2836 (Raspberry Pi 2 SoC)
   - Enable cpufreq on Freescale i.MX7D
   - Rockchip: SMP support for rk3036, general support for rk3228
   - SMP support on Broadcom Kona and NSP
   - Cleanups for OMAP removing legacy IOMMU data

  + a bunch of misc fixes and tweaks for various platforms"

* tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (46 commits)
  ARM: tango: Fix UP build issues
  ARM: tango: pass ARM arch level for smc.S
  ARM: bcm2835: Add Kconfig support for bcm2836
  ARM: OMAP2+: Add support for dm814x and dra62x usb
  ARM: OMAP2+: Add mmc hwmod entries for dm814x
  ARM: OMAP2+: Update 81xx clock and power domains for default, active and sgx
  ARM: OMAP2+: Fix SoC detection for dra62x j5-eco
  ARM: tango4: Initial platform support
  ARM: bcm2835: Add a compat string for bcm2836 machine probe
  dt-bindings: Add root properties for Raspberry Pi 2
  ARM: imx: select SRC for i.MX7
  ARM: uniphier: select PINCTRL
  ARM: OMAP2+: Remove device creation for omap-pcm-audio
  ARM: OMAP1: Remove device creation for omap-pcm-audio
  ARM: rockchip: enable support for RK3228 SoCs
  ARM: rockchip: use const and __initconst for rk3036 smp_operations
  ARM: zynq: Select ARCH_HAS_RESET_CONTROLLER
  ARM: BCM: Add SMP support for Broadcom 4708
  ARM: BCM: Add SMP support for Broadcom NSP
  ARM: BCM: Clean up SMP support for Broadcom Kona
  ...
2016-01-20 18:10:05 -08:00
Linus Torvalds
6b5a12dbca Merge tag 'armsoc-multiplatform' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC multiplatform code updates from Arnd Bergmann:
 "This branch is the culmination of 5 years of effort to bring the ARMv6
  and ARMv7 platforms together such that they can all be enabled and
  boot the same kernel.  It has been a tremendous amount of cleanup and
  refactoring by a huge number of people, and creation of several new
  (and major) subsystems to better abstract out all the platform details
  in an appropriate manner.

  The bulk of this branch is a large patchset from Arnd that brings
  several of the more minor and older platforms we have closer to
  multiplatform support.  Among these are MMP, S3C64xx, Orion5x, mv78xx0
  and realview Much of this is moving around header files from old mach
  directories, but there are also some cleanup patches of debug_ll
  (lowlevel debug per-platform options) and other parts.

  Linus Walleij also has some patchs to clean up the older ARM Realview
  platforms by finally introducing DT support, and Rob Herring has some
  for ARM Versatile which is now DT-only.  Both of these platforms are
  now multiplatform.

  Finally, a couple of patches from Russell for Dove PMU, and a fix from
  Valentin Rothberg for Exynos ADC, which were rebased on top of the
  series to avoid conflicts"

* tag 'armsoc-multiplatform' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (75 commits)
  ARM: realview: don't select SMP_ON_UP for UP builds
  ARM: s3c: simplify s3c_irqwake_{e,}intallow definition
  ARM: s3c64xx: fix pm-debug compilation
  iio: exynos-adc: fix irqf_oneshot.cocci warnings
  ARM: realview: build realview-dt SMP support only when used
  ARM: realview: select apropriate targets
  ARM: realview: clean up header files
  ARM: realview: make all header files local
  ARM: no longer make CPU targets visible separately
  ARM: integrator: use explicit core module options
  ARM: realview: enable multiplatform
  ARM: make default platform work for NOMMU
  ARM: debug-ll: move DEBUG_LL_UART_EFM32 to correct Kconfig location
  ARM: defconfig: use correct debug_ll settings
  ARM: versatile: convert to multi-platform
  ARM: versatile: merge mach code into a single file
  ARM: versatile: switch to DT only booting and remove legacy code
  ARM: versatile: add DT based PCI detection
  ARM: pxa: mark ezx structures as __maybe_unused
  ARM: pxa: mark raumfeld init functions as __maybe_unused
  ...
2016-01-20 18:03:56 -08:00
Linus Torvalds
5083c54264 Merge tag 'armsoc-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC cleanups from Olof Johansson:
 "A smallish number of general cleanup commits this release cycle.  Some
  of these are minor tweaks:

   - shmobile change of binding for their GIC (using arm,pl390 now)
   - ARCH_RENESAS introduction
   - Misc other renesas updates

  There's also a couple of treewide commits from Masahiro Yamada
  cleaning up const/__initconst for SMP operation structs and a switch
  to using "depends on" instead of if-constructs on most of the Kconfig
  platform targets"

* tag 'armsoc-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  staging: board: armadillo800eva: Use "arm,pl390"
  staging: board: kzm9d: Use "arm,pl390"
  ARM: shmobile: r8a7778 dtsi: Use "arm,pl390" for GIC
  ARM: shmobile: emev2 dtsi: Use "arm,pl390" for GIC
  ARM: shmobile: r8a7740 dtsi: Use "arm,pl390" for GIC
  ARM: shmobile: r7s72100 dtsi: Use "arm,pl390" for GIC
  ARM: use "depends on" for SoC configs instead of "if" after prompt
  ARM/clocksource: use automatic DT probing for ux500 PRCMU
  ARM: use const and __initconst for smp_operations
  ARM: hisi: do not export smp_operations structures
  ARM: mvebu: remove unused mach/gpio.h
  ARM: shmobile: Remove legacy mach/irqs.h
  ARM: shmobile: Introduce ARCH_RENESAS
  MAINTAINERS: Remove link to oss.renesas.com which is closed
2016-01-20 17:55:20 -08:00
Linus Torvalds
0c582826a1 Merge tag 'armsoc-fixes-nc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull non-urgent ARM SoC fixes from Olof Johansson:
 "As usual, we queue up a few fixes that don't seem urgent enough to go
  in through -rc.

   - MAINTAINERS updates to add a list for brcmstb and fix a typo
   - A handful of fixes for OMAP 81xx, a recently resurrected platform
     so these can't be considered real regressions and thus got queued.
   - A couple of other small fixes for scoop, sa1100 and davinci"

* tag 'armsoc-fixes-nc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: OMAP2+: Fix randconfig build warning for dm814_pllss_data
  ARM: sa1100/simpad: Be sure to clamp return value
  ARM: scoop: Be sure to clamp return value
  ARM: davinci: fix a problematic usage of WARN()
  ARM: davinci: only select WT cache if cache is enabled
  ARM: OMAP2+: Remove useless check for legacy booting for dm814x
  ARM: OMAP2+: Enable GPIO for dm814x
  ARM: dts: Fix dm814x pinctrl address and mask
  ARM: dts: Fix dm8148 control modules ranges
  ARM: OMAP2+: Fix timer entries for dm814x
  ARM: dts: Fix some mux and divider clocks to get dm814x-evm booting
  ARM: OMAP2+: Add DPPLS clock manager for dm814x
  clk: ti: Add few dm814x clock aliases
  ARM: dts: Fix dm814x entries for pllss and prcm
  MAINTAINERS: gpio-brcmstb: Remove stray '>'
  MAINTAINERS: brcmstb: Include Broadcom internal mailing-list
2016-01-20 17:44:16 -08:00
Linus Torvalds
71e4634e00 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "The highlights this round include:

   - Introduce configfs support for unlocked configfs_depend_item()
     (krzysztof + andrezej)
   - Conversion of usb-gadget target driver to new function registration
     interface (andrzej + sebastian)
   - Enable qla2xxx FC target mode support for Extended Logins (himansu +
     giridhar)
   - Enable qla2xxx FC target mode support for Exchange Offload (himansu +
     giridhar)
   - Add qla2xxx FC target mode irq affinity notification + selective
     command queuing.  (quinn + himanshu)
   - Fix iscsi-target deadlock in se_node_acl configfs deletion (sagi +
     nab)
   - Convert se_node_acl configfs deletion + se_node_acl->queue_depth to
     proper se_session->sess_kref + target_get_session() usage.  (hch +
     sagi + nab)
   - Fix long-standing race between se_node_acl->acl_kref get and
     get_initiator_node_acl() lookup.  (hch + nab)
   - Fix target/user block-size handling, and make sure netlink reaches
     all network namespaces (sheng + andy)

  Note there is an outstanding bug-fix series for remote I_T nexus port
  TMR LUN_RESET has been posted and still being tested, and will likely
  become post -rc1 material at this point"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (56 commits)
  scsi: qla2xxxx: avoid type mismatch in comparison
  target/user: Make sure netlink would reach all network namespaces
  target: Obtain se_node_acl->acl_kref during get_initiator_node_acl
  target: Convert ACL change queue_depth se_session reference usage
  iscsi-target: Fix potential dead-lock during node acl delete
  ib_srpt: Convert acl lookup to modern get_initiator_node_acl usage
  tcm_fc: Convert acl lookup to modern get_initiator_node_acl usage
  tcm_fc: Wait for command completion before freeing a session
  target: Fix a memory leak in target_dev_lba_map_store()
  target: Support aborting tasks with a 64-bit tag
  usb/gadget: Remove set-but-not-used variables
  target: Remove an unused variable
  target: Fix indentation in target_core_configfs.c
  target/user: Allow user to set block size before enabling device
  iser-target: Fix non negative ERR_PTR isert_device_get usage
  target/fcoe: Add tag support to tcm_fc
  qla2xxx: Check for online flag instead of active reset when transmitting responses
  qla2xxx: Set all queues to 4k
  qla2xxx: Disable ZIO at start time.
  qla2xxx: Move atioq to a different lock to reduce lock contention
  ...
2016-01-20 17:20:53 -08:00
Johannes Weiner
b2807f07f4 mm: memcontrol: add "sock" to cgroup2 memory.stat
Provide statistics on how much of a cgroup's memory footprint is made up
of socket buffers from network connections owned by the group.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Vladimir Davydov
5ccc5abaaf mm: free swap cache aggressively if memcg swap is full
Swap cache pages are freed aggressively if swap is nearly full (>50%
currently), because otherwise we are likely to stop scanning anonymous
when we near the swap limit even if there is plenty of freeable swap cache
pages.  We should follow the same trend in case of memory cgroup, which
has its own swap limit.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Vladimir Davydov
d8b38438a0 mm: vmscan: do not scan anon pages if memcg swap limit is hit
We don't scan anonymous memory if we ran out of swap, neither should we do
it in case memcg swap limit is hit, because swap out is impossible anyway.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Vladimir Davydov
6f2cb2f177 swap.h: move memcg related stuff to the end of the file
The following patches will add more functions to the memcg section of
include/linux/swap.h.  Some of them will need values defined below the
current location of the section.  So let's move the section to the end of
the file.  No functional changes intended.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Vladimir Davydov
eb01aaab43 mm: memcontrol: replace mem_cgroup_lruvec_online with mem_cgroup_online
mem_cgroup_lruvec_online() takes lruvec, but it only needs memcg.  Since
get_scan_count(), which is the only user of this function, now possesses
pointer to memcg, let's pass memcg directly to mem_cgroup_online() instead
of picking it out of lruvec and rename the function accordingly.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Vladimir Davydov
37e8435119 mm: memcontrol: charge swap to cgroup2
This patchset introduces swap accounting to cgroup2.

This patch (of 7):

In the legacy hierarchy we charge memsw, which is dubious, because:

 - memsw.limit must be >= memory.limit, so it is impossible to limit
   swap usage less than memory usage. Taking into account the fact that
   the primary limiting mechanism in the unified hierarchy is
   memory.high while memory.limit is either left unset or set to a very
   large value, moving memsw.limit knob to the unified hierarchy would
   effectively make it impossible to limit swap usage according to the
   user preference.

 - memsw.usage != memory.usage + swap.usage, because a page occupying
   both swap entry and a swap cache page is charged only once to memsw
   counter. As a result, it is possible to effectively eat up to
   memory.limit of memory pages *and* memsw.limit of swap entries, which
   looks unexpected.

That said, we should provide a different swap limiting mechanism for
cgroup2.

This patch adds mem_cgroup->swap counter, which charges the actual number
of swap entries used by a cgroup.  It is only charged in the unified
hierarchy, while the legacy hierarchy memsw logic is left intact.

The swap usage can be monitored using new memory.swap.current file and
limited using memory.swap.max.

Note, to charge swap resource properly in the unified hierarchy, we have
to make swap_entry_free uncharge swap only when ->usage reaches zero, not
just ->count, i.e.  when all references to a swap entry, including the one
taken by swap cache, are gone.  This is necessary, because otherwise
swap-in could result in uncharging swap even if the page is still in swap
cache and hence still occupies a swap entry.  At the same time, this
shouldn't break memsw counter logic, where a page is never charged twice
for using both memory and swap, because in case of legacy hierarchy we
uncharge swap on commit (see mem_cgroup_commit_charge).

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Johannes Weiner
0b8f73e104 mm: memcontrol: clean up alloc, online, offline, free functions
The creation and teardown of struct mem_cgroup is fairly messy and
that has attracted mistakes and subtle bugs before.

The main cause for this is that there is no clear model about what
needs to happen when, and that attracts more chaos. So create one:

1. mem_cgroup_alloc() should allocate struct mem_cgroup and its
   auxiliary members and initialize work items, locks etc. so that the
   object it returns is fully initialized and in a neutral state.

2. mem_cgroup_css_alloc() will use mem_cgroup_alloc() to obtain a new
   memcg object and configure it and the system according to the role
   of the new memory-controlled cgroup in the hierarchy.

3. mem_cgroup_css_online() is no longer needed to synchronize with
   iterators, but it verifies css->id which isn't available earlier.

4. mem_cgroup_css_offline() implements stuff that needs to happen upon
   the user-visible destruction of a cgroup, which includes stopping
   all user interfacing as well as releasing certain structures when
   continued memory consumption would be unexpected at that point.

5. mem_cgroup_css_free() prepares the system and the memcg object for
   the object's disappearance, neutralizes its state, and then gives
   it back to mem_cgroup_free().

6. mem_cgroup_free() releases struct mem_cgroup and auxiliary memory.

[arnd@arndb.de: fix SLOB build regression]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Johannes Weiner
0db1529817 mm: memcontrol: flatten struct cg_proto
There are no more external users of struct cg_proto, flatten the
structure into struct mem_cgroup.

Since using those struct members doesn't stand out as much anymore,
add cgroup2 static branches to make it clearer which code is legacy.

Suggested-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Johannes Weiner
d886f4e483 mm: memcontrol: rein in the CONFIG space madness
What CONFIG_INET and CONFIG_LEGACY_KMEM guard inside the memory
controller code is insignificant, having these conditionals is not
worth the complication and fragility that comes with them.

[akpm@linux-foundation.org: rework mem_cgroup_css_free() statement ordering]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Johannes Weiner
489c2a20a4 mm: memcontrol: introduce CONFIG_MEMCG_LEGACY_KMEM
Let the user know that CONFIG_MEMCG_KMEM does not apply to the cgroup2
interface. This also makes legacy-only code sections stand out better.

[arnd@arndb.de: mm: memcontrol: only manage socket pressure for CONFIG_INET]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Johannes Weiner
127424c86b mm: memcontrol: move kmem accounting code to CONFIG_MEMCG
The cgroup2 memory controller will account important in-kernel memory
consumers per default.  Move all necessary components to CONFIG_MEMCG.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Johannes Weiner
567e9ab2e6 mm: memcontrol: give the kmem states more descriptive names
On any given memcg, the kmem accounting feature has three separate
states: not initialized, structures allocated, and actively accounting
slab memory.  These are represented through a combination of the
kmem_acct_activated and kmem_acct_active flags, which is confusing.

Convert to a kmem_state enum with the states NONE, ALLOCATED, and
ONLINE.  Then rename the functions to modify the state accordingly.
This follows the nomenclature of css object states more closely.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Geliang Tang
8e99469ab0 dma-mapping: use offset_in_page macro
Use offset_in_page macro instead of (addr & ~PAGE_MASK).

Signed-off-by: Geliang Tang <geliangtang@163.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Christoph Hellwig
20d666e411 dma-mapping: remove <asm-generic/dma-coherent.h>
This wasn't an asm-generic header to start with, and can be merged into
dma-mapping.h trivially.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Helge Deller <deller@gmx.de>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Christoph Hellwig
e1c7e32453 dma-mapping: always provide the dma_map_ops based implementation
Move the generic implementation to <linux/dma-mapping.h> now that all
architectures support it and remove the HAVE_DMA_ATTR Kconfig symbol now
that everyone supports them.

[valentinrothberg@gmail.com: remove leftovers in Kconfig]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Helge Deller <deller@gmx.de>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Yaowei Bai
2954e440be ipc/shm.c: is_file_shm_hugepages() can be boolean
Make is_file_shm_hugepages() return bool to improve readability due to
this particular function only using either one or zero as its return
value.

No functional change.

Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Bongkyu Kim
06af1c52c9 lz4: fix wrong compress buffer size for 64-bits
The current lz4 compress buffer is 16kb on 32-bits, 32kb on 64-bits
system.  But, lz4 needs only 16kb on both.  On 64-bits, this causes
wasted cpu cycles for additional memset during every compression.

In case of lz4hc, the current buffer size is (256kb + 8) on 32-bits,
(512kb + 16) on 64-bits.  But, lz4hc needs only (256kb + 2 * pointer) on
both.

This patch fixes these wrong compress buffer sizes for 64-bits.

Signed-off-by: Bongkyu Kim <bongkyu.kim@lge.com>
Cc: Chanho Min <chanho.min@lge.com>
Cc: Yann Collet <yann.collet.73@gmail.com>
Cc: Kyungsik Lee <kyungsik.lee@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Andrey Ryabinin
c6d308534a UBSAN: run-time undefined behavior sanity checker
UBSAN uses compile-time instrumentation to catch undefined behavior
(UB).  Compiler inserts code that perform certain kinds of checks before
operations that could cause UB.  If check fails (i.e.  UB detected)
__ubsan_handle_* function called to print error message.

So the most of the work is done by compiler.  This patch just implements
ubsan handlers printing errors.

GCC has this capability since 4.9.x [1] (see -fsanitize=undefined
option and its suboptions).
However GCC 5.x has more checkers implemented [2].
Article [3] has a bit more details about UBSAN in the GCC.

[1] - https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/Debugging-Options.html
[2] - https://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html
[3] - http://developerblog.redhat.com/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/

Issues which UBSAN has found thus far are:

Found bugs:

 * out-of-bounds access - 97840cb67f ("netfilter: nfnetlink: fix
   insufficient validation in nfnetlink_bind")

undefined shifts:

 * d48458d4a7 ("jbd2: use a better hash function for the revoke
   table")

 * 10632008b9 ("clockevents: Prevent shift out of bounds")

 * 'x << -1' shift in ext4 -
   http://lkml.kernel.org/r/<5444EF21.8020501@samsung.com>

 * undefined rol32(0) -
   http://lkml.kernel.org/r/<1449198241-20654-1-git-send-email-sasha.levin@oracle.com>

 * undefined dirty_ratelimit calculation -
   http://lkml.kernel.org/r/<566594E2.3050306@odin.com>

 * undefined roundown_pow_of_two(0) -
   http://lkml.kernel.org/r/<1449156616-11474-1-git-send-email-sasha.levin@oracle.com>

 * [WONTFIX] undefined shift in __bpf_prog_run -
   http://lkml.kernel.org/r/<CACT4Y+ZxoR3UjLgcNdUm4fECLMx2VdtfrENMtRRCdgHB2n0bJA@mail.gmail.com>

   WONTFIX here because it should be fixed in bpf program, not in kernel.

signed overflows:

 * 32a8df4e0b ("sched: Fix odd values in effective_load()
   calculations")

 * mul overflow in ntp -
   http://lkml.kernel.org/r/<1449175608-1146-1-git-send-email-sasha.levin@oracle.com>

 * incorrect conversion into rtc_time in rtc_time64_to_tm() -
   http://lkml.kernel.org/r/<1449187944-11730-1-git-send-email-sasha.levin@oracle.com>

 * unvalidated timespec in io_getevents() -
   http://lkml.kernel.org/r/<CACT4Y+bBxVYLQ6LtOKrKtnLthqLHcw-BMp3aqP3mjdAvr9FULQ@mail.gmail.com>

 * [NOTABUG] signed overflow in ktime_add_safe() -
   http://lkml.kernel.org/r/<CACT4Y+aJ4muRnWxsUe1CMnA6P8nooO33kwG-c8YZg=0Xc8rJqw@mail.gmail.com>

[akpm@linux-foundation.org: fix unused local warning]
[akpm@linux-foundation.org: fix __int128 build woes]
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Yury Gribov <y.gribov@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Davidlohr Bueso
a460bece02 rbtree: use READ_ONCE in RB_EMPTY_ROOT
With commit d72da4a4d9 ("rbtree: Make lockless searches non-fatal") our
rbtrees provide weak guarantees that allows us to do lockless (and very
speculative) reads of the tree.  Such readers cannot see partial stores
on nodes, ie left/right as well as root.  As such, similar to the
WRITE_ONCE semantics when doing rotations, use READ_ONCE when checking
the root node in RB_EMPTY_ROOT.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Xunlei Pang
978e30c9b4 kexec: move some memembers and definitions within the scope of CONFIG_KEXEC_FILE
Move the stuff currently only used by the kexec file code within
CONFIG_KEXEC_FILE (and CONFIG_KEXEC_VERIFY_SIG).

Also move internal "struct kexec_sha_region" and "struct kexec_buf" into
"kexec_internal.h".

Signed-off-by: Xunlei Pang <xlpang@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Dave Young <dyoung@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Rasmus Villemoes
9425676a36 kernel/cpu.c: make set_cpu_* static inlines
Almost all callers of the set_cpu_* functions pass an explicit true or
false.  Making them static inline thus replaces the function calls with a
simple set_bit/clear_bit, saving some .text.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Rasmus Villemoes
5aec01b834 kernel/cpu.c: eliminate cpu_*_mask
Replace the variables cpu_possible_mask, cpu_online_mask, cpu_present_mask
and cpu_active_mask with macros expanding to expressions of the same type
and value, eliminating some indirection.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Rasmus Villemoes
4b804c85dc kernel/cpu.c: export __cpu_*_mask
Exporting the cpumasks __cpu_possible_mask and friends will allow us to
remove the extra indirection through the cpu_*_mask variables.  It will
also allow the set_cpu_* functions to become static inlines, which will
give a .text reduction.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Jann Horn
caaee6234d ptrace: use fsuid, fsgid, effective creds for fs access checks
By checking the effective credentials instead of the real UID / permitted
capabilities, ensure that the calling process actually intended to use its
credentials.

To ensure that all ptrace checks use the correct caller credentials (e.g.
in case out-of-tree code or newly added code omits the PTRACE_MODE_*CREDS
flag), use two new flags and require one of them to be set.

The problem was that when a privileged task had temporarily dropped its
privileges, e.g.  by calling setreuid(0, user_uid), with the intent to
perform following syscalls with the credentials of a user, it still passed
ptrace access checks that the user would not be able to pass.

While an attacker should not be able to convince the privileged task to
perform a ptrace() syscall, this is a problem because the ptrace access
check is reused for things in procfs.

In particular, the following somewhat interesting procfs entries only rely
on ptrace access checks:

 /proc/$pid/stat - uses the check for determining whether pointers
     should be visible, useful for bypassing ASLR
 /proc/$pid/maps - also useful for bypassing ASLR
 /proc/$pid/cwd - useful for gaining access to restricted
     directories that contain files with lax permissions, e.g. in
     this scenario:
     lrwxrwxrwx root root /proc/13020/cwd -> /root/foobar
     drwx------ root root /root
     drwxr-xr-x root root /root/foobar
     -rw-r--r-- root root /root/foobar/secret

Therefore, on a system where a root-owned mode 6755 binary changes its
effective credentials as described and then dumps a user-specified file,
this could be used by an attacker to reveal the memory layout of root's
processes or reveal the contents of files he is not allowed to access
(through /proc/$pid/cwd).

[akpm@linux-foundation.org: fix warning]
Signed-off-by: Jann Horn <jann@thejh.net>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: "Serge E. Hallyn" <serge.hallyn@ubuntu.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Adam Barth
243c2137cd include/linux/radix-tree.h: fix error in docs about locks
This text refers to the "first 7 functions", which was correct when
written but became incorrect when Johannes Weiner added another function
to the list in 139e561660 ("lib: radix_tree: tree node interface").

Change the text to correctly refer to the first 8 functions.

Signed-off-by: Adam Barth <aurorean@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Stephen Boyd
a9aec5881b lib/iomap_copy.c: add __ioread32_copy()
Some drivers need to read data out of iomem areas 32-bits at a time.
Add an API to do this.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Cc: <zajec5@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Hauke Mehrtens <hauke@hauke-m.de>
Cc: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Rafael J. Wysocki
f11aef69b2 Merge branch 'pm-cpuidle'
* pm-cpuidle:
  cpuidle: menu: Avoid pointless checks in menu_select()
  sched / idle: Drop default_idle_call() fallback from call_cpuidle()
  cpuidle: Don't enable all governors by default
  cpuidle: Default to ladder governor on ticking systems
  time: nohz: Expose tick_nohz_enabled
  cpuidle: menu: Fix menu_select() for CPUIDLE_DRIVER_STATE_START == 0
2016-01-21 00:43:21 +01:00
Rafael J. Wysocki
fa8bb45187 Merge branch 'pm-devfreq'
* pm-devfreq:
  MAINTAINERS: Add devfreq-event entry
  MAINTAINERS: Add missing git repository and directory for devfreq
  PM / devfreq: Do not show statistics if it's not ready.
  PM / devfreq: Modify the indentation of trans_stat sysfs for readability
  PM / devfreq: Set the freq_table of devfreq device
  PM / devfreq: Add show_one macro to delete the duplicate code
  PM / devfreq: event: Fix the error and warning from script/checkpatch.pl
  PM / devfreq: event: Remove the error log of devfreq_event_get_edev_by_phandle()
2016-01-21 00:43:13 +01:00
Rafael J. Wysocki
6efd3f8cde Merge branch 'pm-core'
* pm-core:
  driver core: Avoid NULL pointer dereferences in device_is_bound()
  platform: Do not detach from PM domains on shutdown
  USB / PM: Allow USB devices to remain runtime-suspended when sleeping
  PM / sleep: Go direct_complete if driver has no callbacks
  PM / Domains: add setter for dev.pm_domain
  device core: add device_is_bound()
2016-01-21 00:42:59 +01:00
Thierry Reding
386744425e swiotlb: Make linux/swiotlb.h standalone includible
This header file uses the enum dma_data_direction and struct page types
without explicitly including the corresponding header files. This makes
it rely on the includer to have included the proper headers before.

To fix this, include linux/dma-direction.h and forward-declare struct
page. The swiotlb_free() function is also annotated __init, therefore
requires linux/init.h to be included as well.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2016-01-20 17:29:52 -05:00
Bjorn Helgaas
9662e32c81 Merge branch 'pci/trivial' into next
* pci/trivial:
  PCI: shpchp: Constify hpc_ops structure
  PCI: Use kobj_to_dev() instead of open-coding it
  PCI: Use to_pci_dev() instead of open-coding it
  PCI: Fix all whitespace issues
  PCI/MSI: Fix typos in <linux/msi.h>
2016-01-20 11:48:25 -06:00
Bjorn Helgaas
904f664b58 Merge branches 'pci/iommu' and 'pci/misc' into next
* pci/iommu:
  PCI: Add function 1 DMA alias quirk for Lite-On/Plextor M6e/Marvell 88SS9183

* pci/misc:
  PCI: Limit config space size for Netronome NFP4000
  PCI: Add Netronome NFP4000 PF device ID
2016-01-20 11:47:54 -06:00
Willy Tarreau
759c01142a pipe: limit the per-user amount of pages allocated in pipes
On no-so-small systems, it is possible for a single process to cause an
OOM condition by filling large pipes with data that are never read. A
typical process filling 4000 pipes with 1 MB of data will use 4 GB of
memory. On small systems it may be tricky to set the pipe max size to
prevent this from happening.

This patch makes it possible to enforce a per-user soft limit above
which new pipes will be limited to a single page, effectively limiting
them to 4 kB each, as well as a hard limit above which no new pipes may
be created for this user. This has the effect of protecting the system
against memory abuse without hurting other users, and still allowing
pipes to work correctly though with less data at once.

The limit are controlled by two new sysctls : pipe-user-pages-soft, and
pipe-user-pages-hard. Both may be disabled by setting them to zero. The
default soft limit allows the default number of FDs per process (1024)
to create pipes of the default size (64kB), thus reaching a limit of 64MB
before starting to create only smaller pipes. With 256 processes limited
to 1024 FDs each, this results in 1024*64kB + (256*1024 - 1024) * 4kB =
1084 MB of memory allocated for a user. The hard limit is disabled by
default to avoid breaking existing applications that make intensive use
of pipes (eg: for splicing).

Reported-by: socketpair@gmail.com
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Mitigates: CVE-2013-4312 (Linux 2.0+)
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-19 19:25:21 -05:00
Linus Torvalds
7c24d9f3b2 Merge branch 'for-4.5/core' of git://git.kernel.dk/linux-block
Pull core block updates from Jens Axboe:
 "We don't have a lot of core changes this time around, it's mostly in
  drivers, which will come in a subsequent pull.

  The cores changes include:

   - blk-mq
        - Prep patch from Christoph, changing blk_mq_alloc_request() to
          take flags instead of just using gfp_t for sleep/nosleep.
        - Doc patch from me, clarifying the difference between legacy
          and blk-mq for timer usage.
        - Fixes from Raghavendra for memory-less numa nodes, and a reuse
          of CPU masks.

   - Cleanup from Geliang Tang, using offset_in_page() instead of open
     coding it.

   - From Ilya, rename request_queue slab to it reflects what it holds,
     and a fix for proper use of bdgrab/put.

   - A real fix for the split across stripe boundaries from Keith.  We
     yanked a broken version of this from 4.4-rc final, this one works.

   - From Mike Krinkin, emit a trace message when we split.

   - From Wei Tang, two small cleanups, not explicitly clearing memory
     that is already cleared"

* 'for-4.5/core' of git://git.kernel.dk/linux-block:
  block: use bd{grab,put}() instead of open-coding
  block: split bios to max possible length
  block: add call to split trace point
  blk-mq: Avoid memoryless numa node encoded in hctx numa_node
  blk-mq: Reuse hardware context cpumask for tags
  blk-mq: add a flags parameter to blk_mq_alloc_request
  Revert "blk-flush: Queue through IO scheduler when flush not required"
  block: clarify blk_add_timer() use case for blk-mq
  bio: use offset_in_page macro
  block: do not initialise statics to 0 or NULL
  block: do not initialise globals to 0 or NULL
  block: rename request_queue slab cache
2016-01-19 15:03:34 -08:00