Commit Graph

51418 Commits

Author SHA1 Message Date
Michal Hocko
44a70adec9 mm, oom_adj: make sure processes sharing mm have same view of oom_score_adj
oom_score_adj is shared for the thread groups (via struct signal) but this
is not sufficient to cover processes sharing mm (CLONE_VM without
CLONE_SIGHAND) and so we can easily end up in a situation when some
processes update their oom_score_adj and confuse the oom killer.  In the
worst case some of those processes might hide from the oom killer
altogether via OOM_SCORE_ADJ_MIN while others are eligible.  OOM killer
would then pick up those eligible but won't be allowed to kill others
sharing the same mm so the mm wouldn't release the mm and so the memory.

It would be ideal to have the oom_score_adj per mm_struct because that is
the natural entity OOM killer considers.  But this will not work because
some programs are doing

	vfork()
	set_oom_adj()
	exec()

We can achieve the same though.  oom_score_adj write handler can set the
oom_score_adj for all processes sharing the same mm if the task is not in
the middle of vfork.  As a result all the processes will share the same
oom_score_adj.  The current implementation is rather pessimistic and
checks all the existing processes by default if there is more than 1
holder of the mm but we do not have any reliable way to check for external
users yet.

Link: http://lkml.kernel.org/r/1466426628-15074-5-git-send-email-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Linus Torvalds
76d5b28bba Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull quota update from Jan Kara:
 "time64 support for quota"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  quota: use time64_t internally
2016-07-28 13:53:23 -07:00
Linus Torvalds
6784725ab0 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
 "Assorted cleanups and fixes.

  Probably the most interesting part long-term is ->d_init() - that will
  have a bunch of followups in (at least) ceph and lustre, but we'll
  need to sort the barrier-related rules before it can get used for
  really non-trivial stuff.

  Another fun thing is the merge of ->d_iput() callers (dentry_iput()
  and dentry_unlink_inode()) and a bunch of ->d_compare() ones (all
  except the one in __d_lookup_lru())"

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits)
  fs/dcache.c: avoid soft-lockup in dput()
  vfs: new d_init method
  vfs: Update lookup_dcache() comment
  bdev: get rid of ->bd_inodes
  Remove last traces of ->sync_page
  new helper: d_same_name()
  dentry_cmp(): use lockless_dereference() instead of smp_read_barrier_depends()
  vfs: clean up documentation
  vfs: document ->d_real()
  vfs: merge .d_select_inode() into .d_real()
  unify dentry_iput() and dentry_unlink_inode()
  binfmt_misc: ->s_root is not going anywhere
  drop redundant ->owner initializations
  ufs: get rid of redundant checks
  orangefs: constify inode_operations
  missed comment updates from ->direct_IO() prototype change
  file_inode(f)->i_mapping is f->f_mapping
  trim fsnotify hooks a bit
  9p: new helper - v9fs_parent_fid()
  debugfs: ->d_parent is never NULL or negative
  ...
2016-07-28 12:59:05 -07:00
Linus Torvalds
554828ee0d Merge branch 'salted-string-hash'
This changes the vfs dentry hashing to mix in the parent pointer at the
_beginning_ of the hash, rather than at the end.

That actually improves both the hash and the code generation, because we
can move more of the computation to the "static" part of the dcache
setup, and do less at lookup runtime.

It turns out that a lot of other hash users also really wanted to mix in
a base pointer as a 'salt' for the hash, and so the slightly extended
interface ends up working well for other cases too.

Users that want a string hash that is purely about the string pass in a
'salt' pointer of NULL.

* merge branch 'salted-string-hash':
  fs/dcache.c: Save one 32-bit multiply in dcache lookup
  vfs: make the string hashes salt the hash
2016-07-28 12:26:31 -07:00
Richard Cochran
4fae16dffb timers/core: Correct callback order during CPU hot plug
On the tear-down path, the dead CPU callback for the timers was
misplaced within the 'cpuhp_state' enumeration.  There is a hidden
dependency between the timers and block multiqueue.  The timers
callback must happen before the block multiqueue callback otherwise a
RCU stall occurs.

Move the timers callback to the proper place in the state machine.

Reported-and-tested-by: Jon Hunter <jonathanh@nvidia.com>
Reported-by: kbuild test robot <lkp@intel.com>
Fixes: 24f73b9971 ("timers/core: Convert to hotplug state machine")
Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: John Stultz <john.stultz@linaro.org>
Cc: rt@linutronix.de
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1469610498-25914-1-git-send-email-rcochran@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-28 18:56:22 +02:00
Arnd Bergmann
a0f2b65275 ceph: fix symbol versioning for ceph_monc_do_statfs
The genksyms helper in the kernel cannot parse a type definition
like "typeof(((type *)0)->keyfld)" that is used in the DEFINE_RB_FUNCS
helper, causing the following EXPORT_SYMBOL() statement to be ignored
when computing the crcs, and triggering a warning about this:

WARNING: "ceph_monc_do_statfs" [fs/ceph/ceph.ko] has no CRC

To work around the problem, we can rewrite the type to reference
an undefined 'extern' symbol instead of a NULL pointer. This is
evidently ok for genksyms, and it no longer complains about the
line when calling it with 'genksyms -w'.

I've looked briefly into extending genksyms instead, but it seems
really hard to do. Jan Beulich introduced basic support for 'typeof'
a while ago in dc53324060 ("genksyms: fix typeof() handling"),
but that is not sufficient for the expression we have here.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: fcd00b68bb ("libceph: DEFINE_RB_FUNCS macro")
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-07-28 15:32:53 +02:00
Rob Rice
a24532f8d1 mailbox: Add Broadcom PDC mailbox driver
The Broadcom PDC mailbox driver is a mailbox controller that
manages data transfers to and from one or more offload engines.

Signed-off-by: Rob Rice <rob.rice@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2016-07-28 09:34:47 +05:30
Yan, Zheng
0cabbd94ff libceph: fsmap.user subscription support
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-07-28 03:00:40 +02:00
Yan, Zheng
774a6a118c ceph: reduce i_nr_by_mode array size
Track usage count for individual fmode bit. This can reduce the
array size by half.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-28 02:55:39 +02:00
Yan, Zheng
30c156d995 libceph: rados pool namespace support
Add pool namesapce pointer to struct ceph_file_layout and struct
ceph_object_locator. Pool namespace is used by when mapping object
to PG, it's also used when composing OSD request.

The namespace pointer in struct ceph_file_layout is RCU protected.
So libceph can read namespace without taking lock.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
[idryomov@gmail.com: ceph_oloc_destroy(), misc minor changes]
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-07-28 02:55:37 +02:00
Yan, Zheng
51e9273796 libceph: introduce reference counted string
The data structure is for storing namesapce string. It allows namespace
string to be shared between cephfs inodes with same layout. This data
structure can also be referenced by OSD request.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-28 02:55:37 +02:00
Yan, Zheng
7627151ea3 libceph: define new ceph_file_layout structure
Define new ceph_file_layout structure and rename old ceph_file_layout
to ceph_file_layout_legacy. This is preparation for adding namespace
to ceph_file_layout structure.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-28 02:55:36 +02:00
Ilya Dryomov
22748f9d61 libceph: add start en/decoding block helpers
Add ceph_start_encoding() and ceph_start_decoding(), the equivalent of
ENCODE_START and DECODE_START in the userspace ceph code.

This is based on a patch from Mike Christie <michaelc@cs.wisc.edu>.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-07-28 02:55:36 +02:00
Ilya Dryomov
281dbe5db8 libceph: add an ONSTACK initializer for oids
An on-stack oid in ceph_ioctl_get_dataloc() is not initialized,
resulting in a WARN and a NULL pointer dereference later on.  We will
have more of these on-stack in the future, so fix it with a convenience
macro.

Fixes: d30291b985 ("libceph: variable-sized ceph_object_id")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-07-28 02:55:35 +02:00
Ilya Dryomov
b2aa5d0bc8 libceph: fix some missing includes
- decode.h needs slab.h for kmalloc()
- osd_client.h needs msgpool.h for struct ceph_msgpool
- msgpool.h doesn't need messenger.h

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-07-28 02:55:35 +02:00
Linus Torvalds
8448cefe49 Merge tag 'hsi-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-hsi
Pull HSI updates from Sebastian Reichel:

 - proper runtime pm support for omap-ssi and ssi-protocol

 - misc fixes

* tag 'hsi-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-hsi: (24 commits)
  HSI: omap_ssi: drop pm_runtime_irq_safe
  HSI: omap_ssi_port: use rpm autosuspend API
  HSI: omap_ssi: call msg->complete() from process context
  HSI: omap_ssi_port: ensure clocks are kept enabled during transfer
  HSI: omap_ssi_port: replace pm_runtime_put_sync with non-sync variant
  HSI: omap_ssi_port: avoid calling runtime_pm_*_sync inside spinlock
  HSI: omap_ssi_port: avoid pm_runtime_get_sync in ssi_start_dma and ssi_start_pio
  HSI: omap_ssi_port: switch to threaded pio irq
  HSI: omap_ssi_core: remove pm_runtime_get_sync call from tasklet
  HSI: omap_ssi_core: use pm_runtime_put instead of pm_runtime_put_sync
  HSI: omap_ssi_port: prepare start_tx/stop_tx for blocking pm_runtime calls
  HSI: core: switch port event notifier from atomic to blocking
  HSI: omap_ssi_port: replace wkin_cken with atomic bitmap operations
  HSI: omap_ssi: convert cawake irq handler to thread
  HSI: ssi_protocol: fix ssip_xmit invocation
  HSI: ssi_protocol: replace spin_lock with spin_lock_bh
  HSI: ssi_protocol: avoid ssi_waketest call with held spinlock
  HSI: omap_ssi: do not reset module
  HSI: omap_ssi_port: remove useless newline
  hsi: Only descend into hsi directory when CONFIG_HSI is set
  ...
2016-07-27 15:18:53 -07:00
Linus Torvalds
d85486d471 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
 "Updates for the input subsystem.  This contains the following new
  drivers promised in the last merge window:

   - driver for touchscreen controller found in Surface 3
   - driver for Pegasus Notetaker tablet
   - driver for Atmel Captouch Buttons
   - driver for Raydium I2C touchscreen controllers
   - powerkey driver for HISI 65xx SoC

  plus a few fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (40 commits)
  Input: tty/vt/keyboard - use memdup_user()
  Input: pegasus_notetaker - set device mode in reset_resume() if in use
  Input: pegasus_notetaker - cancel workqueue's work in suspend()
  Input: pegasus_notetaker - fix usb_autopm calls to be balanced
  Input: pegasus_notetaker - handle usb control msg errors
  Input: wacom_w8001 - handle errors from input_mt_init_slots()
  Input: wacom_w8001 - resolution wasn't set for ABS_MT_POSITION_X/Y
  Input: pixcir_ts - add support for axis inversion / swapping
  Input: icn8318 - use of_touchscreen helpers for inverting / swapping axes
  Input: edt-ft5x06 - add support for inverting / swapping axes
  Input: of_touchscreen - add support for inverted / swapped axes
  Input: synaptics-rmi4 - use the RMI_F11_REL_BYTES define in rmi_f11_rel_pos_report
  Input: synaptics-rmi4 - remove unneeded variable
  Input: synaptics-rmi4 - remove pointer to rmi_function in f12_data
  Input: synaptics-rmi4 - support regulator supplies
  Input: raydium_i2c_ts - check CRC of incoming packets
  Input: xen-kbdfront - prefer xenbus_write() over xenbus_printf() where possible
  Input: fix a double word "is is" in include/linux/input.h
  Input: add powerkey driver for HISI 65xx SoC
  Input: apanel - spelling mistake - "skiping" -> "skipping"
  ...
2016-07-27 14:30:41 -07:00
Linus Torvalds
66304207cd Merge branch 'i2c/for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:
 "Here is the I2C pull request for 4.8:

   - the core and i801 driver gained support for SMBus Host Notify

   - core support for more than one address in DT

   - i2c_add_adapter() has now better error messages.  We can remove all
     error messages from drivers calling it as a next step.

   - bigger updates to rk3x driver to support rk3399 SoC

   - the at24 eeprom driver got refactored and can now read special
     variants with unique serials or fixed MAC addresses.

  The rest is regular driver updates and bugfixes"

* 'i2c/for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (66 commits)
  i2c: i801: use IS_ENABLED() instead of checking for built-in or module
  Documentation: i2c: slave: give proper example for pm usage
  Documentation: i2c: slave: describe buffer problems a bit better
  i2c: bcm2835: Don't complain on -EPROBE_DEFER from getting our clock
  i2c: i2c-smbus: drop useless stubs
  i2c: efm32: fix a failure path in efm32_i2c_probe()
  Revert "i2c: core: Cleanup I2C ACPI namespace"
  Revert "i2c: core: Add function for finding the bus speed from ACPI"
  i2c: Update the description of I2C_SMBUS
  i2c: i2c-smbus: fix i2c_handle_smbus_host_notify documentation
  eeprom: at24: tweak the loop_until_timeout() macro
  eeprom: at24: add support for at24mac series
  eeprom: at24: support reading the serial number for 24csxx
  eeprom: at24: platform_data: use BIT() macro
  eeprom: at24: split at24_eeprom_write() into specialized functions
  eeprom: at24: split at24_eeprom_read() into specialized functions
  eeprom: at24: hide the read/write loop behind a macro
  eeprom: at24: call read/write functions via function pointers
  eeprom: at24: coding style fixes
  eeprom: at24: move at24_read() below at24_eeprom_write()
  ...
2016-07-27 14:19:25 -07:00
Linus Torvalds
7ae0ae4a02 Merge tag 'spi-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi updates from Mark Brown:
 "Quite a lot of cleanup and maintainence work going on this release in
  various drivers, and also a fix for a nasty locking issue in the core:

   - A fix for locking issues when external drivers explicitly locked
     the bus with spi_bus_lock() - we were using the same lock to both
     control access to the physical bus in multi-threaded I/O operations
     and exclude multiple callers.

     Confusion between these two caused us to have scenarios where we
     were dropping locks.  These are fixed by splitting into two
     separate locks like should have been done originally, making
     everything much clearer and correct.

   - Support for DMA in spi_flash_read().

   - Support for instantiating spidev on ACPI systems, including some
     test devices used in Windows validation.

   - Use of the core DMA mapping functionality in the McSPI driver.

   - Start of support for ThunderX SPI controllers, involving a very big
     set of changes to the Cavium driver.

   - Support for Braswell, Exynos 5433, Kaby Lake, Merrifield, RK3036,
     RK3228, RK3368 controllers"

* tag 'spi-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (64 commits)
  spi: Split bus and I/O locking
  spi: octeon: Split driver into Octeon specific and common parts
  spi: octeon: Move include file from arch/mips to drivers/spi
  spi: octeon: Put register offsets into a struct
  spi: octeon: Store system clock freqency in struct octeon_spi
  spi: octeon: Convert driver to use readq()/writeq() functions
  spi: pic32-sqi: fixup wait_for_completion_timeout return handling
  spi: pic32: fixup wait_for_completion_timeout return handling
  spi: rockchip: limit transfers to (64K - 1) bytes
  spi: xilinx: Return IRQ_NONE if no interrupts were detected
  spi: xilinx: Handle errors from platform_get_irq()
  spi: s3c64xx: restore removed comments
  spi: s3c64xx: add Exynos5433 compatible for ioclk handling
  spi: s3c64xx: use error code from clk_prepare_enable()
  spi: s3c64xx: rename goto labels to meaningful names
  spi: s3c64xx: document the clocks and the clock-name property
  spi: s3c64xx: add exynos5433 spi compatible
  spi: s3c64xx: fix reference leak to master in s3c64xx_spi_remove()
  spi: spi-sh: Remove deprecated create_singlethread_workqueue
  spi: spi-topcliff-pch: Remove deprecated create_singlethread_workqueue
  ...
2016-07-27 14:11:43 -07:00
Linus Torvalds
607e11ab66 Merge tag 'leds_for_4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds
Pull LED updates from Jacek Anaszewski:
 "New LED class driver:
   - LED driver for TI LP3952 6-Channel Color LED

  LED core improvements:
   - Only descend into leds directory when CONFIG_NEW_LEDS is set
   - Add no-op gpio_led_register_device when LED subsystem is disabled
   - MAINTAINERS: Add file patterns for led device tree bindings

  LED Trigger core improvements:
   - return error if invalid trigger name is provided via sysfs

  LED class drivers improvements
   - is31fl32xx: define complete i2c_device_id table
   - is31fl32xx: fix typo in id and match table names
   - leds-gpio: Set of_node for created LED devices
   - pca9532: Add device tree support

  Conversion of IDE trigger to common disk trigger:
   - leds: convert IDE trigger to common disk trigger
   - leds: documentation: 'ide-disk' to 'disk-activity'
   - unicore32: use the new LED disk activity trigger
   - parisc: use the new LED disk activity trigger
   - mips: use the new LED disk activity trigger
   - arm: use the new LED disk activity trigger
   - powerpc: use the new LED disk activity trigger"

* tag 'leds_for_4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
  leds: is31fl32xx: define complete i2c_device_id table
  leds: is31fl32xx: fix typo in id and match table names
  leds: LED driver for TI LP3952 6-Channel Color LED
  leds: leds-gpio: Set of_node for created LED devices
  leds: triggers: return error if invalid trigger name is provided via sysfs
  leds: Only descend into leds directory when CONFIG_NEW_LEDS is set
  leds: Add no-op gpio_led_register_device when LED subsystem is disabled
  unicore32: use the new LED disk activity trigger
  parisc: use the new LED disk activity trigger
  mips: use the new LED disk activity trigger
  arm: use the new LED disk activity trigger
  powerpc: use the new LED disk activity trigger
  leds: documentation: 'ide-disk' to 'disk-activity'
  leds: convert IDE trigger to common disk trigger
  leds: pca9532: Add device tree support
  MAINTAINERS: Add file patterns for led device tree bindings
2016-07-27 14:03:52 -07:00
Linus Torvalds
78d51aee04 Merge tag 'for-linus-4.8' of git://git.code.sf.net/p/openipmi/linux-ipmi
Pull IPMI updates from Corey Minyard:
 "Remove some old cruft that was disabled by default a long time ago.

  No modern hardware should need this, and anybody who really doesn't
  have something to automatically detect IPMI can add the device by hand
  on the module commandline or hot add it"

* tag 'for-linus-4.8' of git://git.code.sf.net/p/openipmi/linux-ipmi:
  ipmi: remove trydefaults parameter and default init
2016-07-27 13:46:07 -07:00
Linus Torvalds
468fc7ed55 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Unified UDP encapsulation offload methods for drivers, from
    Alexander Duyck.

 2) Make DSA binding more sane, from Andrew Lunn.

 3) Support QCA9888 chips in ath10k, from Anilkumar Kolli.

 4) Several workqueue usage cleanups, from Bhaktipriya Shridhar.

 5) Add XDP (eXpress Data Path), essentially running BPF programs on RX
    packets as soon as the device sees them, with the option to mirror
    the packet on TX via the same interface.  From Brenden Blanco and
    others.

 6) Allow qdisc/class stats dumps to run lockless, from Eric Dumazet.

 7) Add VLAN support to b53 and bcm_sf2, from Florian Fainelli.

 8) Simplify netlink conntrack entry layout, from Florian Westphal.

 9) Add ipv4 forwarding support to mlxsw spectrum driver, from Ido
    Schimmel, Yotam Gigi, and Jiri Pirko.

10) Add SKB array infrastructure and convert tun and macvtap over to it.
    From Michael S Tsirkin and Jason Wang.

11) Support qdisc packet injection in pktgen, from John Fastabend.

12) Add neighbour monitoring framework to TIPC, from Jon Paul Maloy.

13) Add NV congestion control support to TCP, from Lawrence Brakmo.

14) Add GSO support to SCTP, from Marcelo Ricardo Leitner.

15) Allow GRO and RPS to function on macsec devices, from Paolo Abeni.

16) Support MPLS over IPV4, from Simon Horman.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1622 commits)
  xgene: Fix build warning with ACPI disabled.
  be2net: perform temperature query in adapter regardless of its interface state
  l2tp: Correctly return -EBADF from pppol2tp_getname.
  net/mlx5_core/health: Remove deprecated create_singlethread_workqueue
  net: ipmr/ip6mr: update lastuse on entry change
  macsec: ensure rx_sa is set when validation is disabled
  tipc: dump monitor attributes
  tipc: add a function to get the bearer name
  tipc: get monitor threshold for the cluster
  tipc: make cluster size threshold for monitoring configurable
  tipc: introduce constants for tipc address validation
  net: neigh: disallow transition to NUD_STALE if lladdr is unchanged in neigh_update()
  MAINTAINERS: xgene: Add driver and documentation path
  Documentation: dtb: xgene: Add MDIO node
  dtb: xgene: Add MDIO node
  drivers: net: xgene: ethtool: Use phy_ethtool_gset and sset
  drivers: net: xgene: Use exported functions
  drivers: net: xgene: Enable MDIO driver
  drivers: net: xgene: Add backward compatibility
  drivers: net: phy: xgene: Add MDIO driver
  ...
2016-07-27 12:03:20 -07:00
Linus Torvalds
08fd8c1768 Merge tag 'for-linus-4.8-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from David Vrabel:
 "Features and fixes for 4.8-rc0:

   - ACPI support for guests on ARM platforms.
   - Generic steal time support for arm and x86.
   - Support cases where kernel cpu is not Xen VCPU number (e.g., if
     in-guest kexec is used).
   - Use the system workqueue instead of a custom workqueue in various
     places"

* tag 'for-linus-4.8-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (47 commits)
  xen: add static initialization of steal_clock op to xen_time_ops
  xen/pvhvm: run xen_vcpu_setup() for the boot CPU
  xen/evtchn: use xen_vcpu_id mapping
  xen/events: fifo: use xen_vcpu_id mapping
  xen/events: use xen_vcpu_id mapping in events_base
  x86/xen: use xen_vcpu_id mapping when pointing vcpu_info to shared_info
  x86/xen: use xen_vcpu_id mapping for HYPERVISOR_vcpu_op
  xen: introduce xen_vcpu_id mapping
  x86/acpi: store ACPI ids from MADT for future usage
  x86/xen: update cpuid.h from Xen-4.7
  xen/evtchn: add IOCTL_EVTCHN_RESTRICT
  xen-blkback: really don't leak mode property
  xen-blkback: constify instance of "struct attribute_group"
  xen-blkfront: prefer xenbus_scanf() over xenbus_gather()
  xen-blkback: prefer xenbus_scanf() over xenbus_gather()
  xen: support runqueue steal time on xen
  arm/xen: add support for vm_assist hypercall
  xen: update xen headers
  xen-pciback: drop superfluous variables
  xen-pciback: short-circuit read path used for merging write values
  ...
2016-07-27 11:35:37 -07:00
Linus Torvalds
0e6acf0204 Merge tag 'xfs-for-linus-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs
Pull xfs updates from Dave Chinner:
 "The major addition is the new iomap based block mapping
  infrastructure.  We've been kicking this about locally for years, but
  there are other filesystems want to use it too (e.g. gfs2).  Now it
  is fully working, reviewed and ready for merge and be used by other
  filesystems.

  There are a lot of other fixes and cleanups in the tree, but those are
  XFS internal things and none are of the scale or visibility of the
  iomap changes.  See below for details.

  I am likely to send another pull request next week - we're just about
  ready to merge some new functionality (on disk block->owner reverse
  mapping infrastructure), but that's a huge chunk of code (74 files
  changed, 7283 insertions(+), 1114 deletions(-)) so I'm keeping that
  separate to all the "normal" pull request changes so they don't get
  lost in the noise.

  Summary of changes in this update:
   - generic iomap based IO path infrastructure
   - generic iomap based fiemap implementation
   - xfs iomap based Io path implementation
   - buffer error handling fixes
   - tracking of in flight buffer IO for unmount serialisation
   - direct IO and DAX io path separation and simplification
   - shortform directory format definition changes for wider platform
     compatibility
   - various buffer cache fixes
   - cleanups in preparation for rmap merge
   - error injection cleanups and fixes
   - log item format buffer memory allocation restructuring to prevent
     rare OOM reclaim deadlocks
   - sparse inode chunks are now fully supported"

* tag 'xfs-for-linus-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (53 commits)
  xfs: remove EXPERIMENTAL tag from sparse inode feature
  xfs: bufferhead chains are invalid after end_page_writeback
  xfs: allocate log vector buffers outside CIL context lock
  libxfs: directory node splitting does not have an extra block
  xfs: remove dax code from object file when disabled
  xfs: skip dirty pages in ->releasepage()
  xfs: remove __arch_pack
  xfs: kill xfs_dir2_inou_t
  xfs: kill xfs_dir2_sf_off_t
  xfs: split direct I/O and DAX path
  xfs: direct calls in the direct I/O path
  xfs: stop using generic_file_read_iter for direct I/O
  xfs: split xfs_file_read_iter into buffered and direct I/O helpers
  xfs: remove s_maxbytes enforcement in xfs_file_read_iter
  xfs: kill ioflags
  xfs: don't pass ioflags around in the ioctl path
  xfs: track and serialize in-flight async buffers against unmount
  xfs: exclude never-released buffers from buftarg I/O accounting
  xfs: don't reset b_retries to 0 on every failure
  xfs: remove extraneous buffer flag changes
  ...
2016-07-27 09:53:35 -07:00
Tony Camuso
b07b58a3e4 ipmi: remove trydefaults parameter and default init
Parameter trydefaults=1 causes the ipmi_init to initialize ipmi through
the legacy port io space that was designated for ipmi. Architectures
that do not map legacy port io can panic when trydefaults=1.

Rather than implement build-time conditional exceptions for each
architecture that does not map legacy port io, we have removed legacy
port io from the driver.

Parameter 'trydefaults' has been removed. Attempts to use it hereafter
will evoke the "Unknown symbol in module, or unknown parameter" message.

The patch was built against a number of architectures and tested for
regressions and functionality on x86_64 and ARM64.

Signed-off-by: Tony Camuso <tcamuso@redhat.com>

Removed the config entry and the address source entry for default,
since neither were used any more.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
2016-07-27 10:24:38 -05:00
Linus Torvalds
0e06f5c0de Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton:

 - a few misc bits

 - ocfs2

 - most(?) of MM

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (125 commits)
  thp: fix comments of __pmd_trans_huge_lock()
  cgroup: remove unnecessary 0 check from css_from_id()
  cgroup: fix idr leak for the first cgroup root
  mm: memcontrol: fix documentation for compound parameter
  mm: memcontrol: remove BUG_ON in uncharge_list
  mm: fix build warnings in <linux/compaction.h>
  mm, thp: convert from optimistic swapin collapsing to conservative
  mm, thp: fix comment inconsistency for swapin readahead functions
  thp: update Documentation/{vm/transhuge,filesystems/proc}.txt
  shmem: split huge pages beyond i_size under memory pressure
  thp: introduce CONFIG_TRANSPARENT_HUGE_PAGECACHE
  khugepaged: add support of collapse for tmpfs/shmem pages
  shmem: make shmem_inode_info::lock irq-safe
  khugepaged: move up_read(mmap_sem) out of khugepaged_alloc_page()
  thp: extract khugepaged from mm/huge_memory.c
  shmem, thp: respect MADV_{NO,}HUGEPAGE for file mappings
  shmem: add huge pages support
  shmem: get_unmapped_area align huge page
  shmem: prepare huge= mount option and sysfs knob
  mm, rmap: account shmem thp pages
  ...
2016-07-26 19:55:54 -07:00
Linus Torvalds
f7816ad0f8 Merge tag 'for-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply
Pull power supply and reset updates from Sebastian Reichel:
 - introduce reboot mode driver
 - add DT support to max8903
 - add power supply support for axp221
 - misc fixes

* tag 'for-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
  power: reset: add reboot mode driver
  dt-bindings: power: reset: add document for reboot-mode driver
  power_supply: fix return value of get_property
  power: qcom_smbb: Make an extcon for usb cable detection
  max8903: adds support for initiation via device tree
  max8903: adds documentation for device tree bindings.
  max8903: remove unnecessary 'out of memory' error message.
  max8903: removes non zero validity checks on gpios.
  max8903: adds requesting of gpios.
  max8903: cleans up confusing relationship between dc_valid, dok and dcm.
  max8903: store pointer to pdata instead of copying it.
  power_supply: bq27xxx_battery: Group register mappings into one table
  docs: Move brcm,bcm21664-resetmgr.txt
  power/reset: make syscon_poweroff() static
  power: axp20x_usb: Add support for usb power-supply on axp22x pmics
  power_supply: bq27xxx_battery: Index register numbers by enum
  power_supply: bq27xxx_battery: Fix copy/paste error in header comment
  MAINTAINERS: Add file patterns for power supply device tree bindings
  power: reset: keystone: Enable COMPILE_TEST
2016-07-26 19:49:09 -07:00
Linus Torvalds
6097d55e10 Merge tag 'regulator-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator updates from Mark Brown:
 "A quiet regulator API release, a few new drivers and some fixes but
  nothing too notable.  There will also be some updates for the PWM
  regulator coming through the PWM tree which provide much smoother
  operation when taking over an already running PWM regulator after boot
  using some new PWM APIs.

  Summary:

   - Support for configuration of the initial suspend state from DT.

   - New drivers for Mediatek MT6323, Ricoh RN5T567 and X-Powers AXP809"

* tag 'regulator-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (38 commits)
  regulator: da9053/52: Fix incorrectly stated minimum and maximum voltage limits
  regulator: mt6323: Constify struct regulator_ops
  regulator: mt6323: Fix module description
  regulator: mt6323: Add support for MT6323 regulator
  regulator: Add document for MT6323 regulator
  regulator: da9210: addition of device tree support
  regulator: act8865: Fix missing of_node_put() in act8865_pdata_from_dt()
  regulator: qcom_smd: Avoid overlapping linear voltage ranges
  regulator: s2mps11: Fix the voltage linear range for s2mps15
  regulator: pwm: Fix regulator ramp delay for continuous mode
  regulator: da9211: add descriptions for da9212/da9214
  mfd: rn5t618: Register restart handler
  mfd: rn5t618: Register power off callback optionally
  regulator: rn5t618: Add RN5T567 PMIC support
  mfd: rn5t618: Add Ricoh RN5T567 PMIC support
  ARM: dts: meson: minix-neo-x8: define PMIC as power controller
  regulator: tps65218: force set power-up/down strobe to 3 for dcdc3
  regulator: tps65218: Enable suspend configuration
  regulator: tps65217: Enable suspend configuration
  regulator: qcom_spmi: Add support for get_mode/set_mode on switches
  ...
2016-07-26 19:40:43 -07:00
Linus Torvalds
ae9799975c Merge tag 'regmap-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
 "Several small updates and API enhancements:

   - provide transparent unrolling of bulk writes into individual writes
     so they can be used with devices without raw formatting.

   - fix compatibility between I2C controllers supporting block commands
     and devices with more than 8 bit wide registers.

   - add some helpers for iopoll-like functionality and workarounds for
     weird interrupt controllers"

* tag 'regmap-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: add iopoll-like polling macro
  regmap: Support bulk writes for devices without raw formatting
  regmap-i2c: Use i2c block command only if register value width is 8 bit
  regmap: irq: Add support to call client specific pre/post interrupt service
  regmap: Add file patterns for regmap device tree bindings
2016-07-26 19:24:33 -07:00
Linus Torvalds
9c1958fc32 Merge tag 'media/v4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media updates from Mauro Carvalho Chehab:

 - new framework support for HDMI CEC and remote control support

 - new encoding codec driver for Mediatek SoC

 - new frontend driver: helene tuner

 - added support for NetUp almost universal devices, with supports
   DVB-C/S/S2/T/T2 and ISDB-T

 - the mn88472 frontend driver got promoted from staging

 - a new driver for RCar video input

 - some soc_camera legacy drivers got removed: timb, omap1, mx2, mx3

 - lots of driver cleanups, improvements and fixups

* tag 'media/v4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (377 commits)
  [media] cec: always check all_device_types and features
  [media] cec: poll should check if there is room in the tx queue
  [media] vivid: support monitor all mode
  [media] cec: fix test for unconfigured adapter in main message loop
  [media] cec: limit the size of the transmit queue
  [media] cec: zero unused msg part after msg->len
  [media] cec: don't set fh to NULL in CEC_TRANSMIT
  [media] cec: clear all status fields before transmit and always fill in sequence
  [media] cec: CEC_RECEIVE overwrote the timeout field
  [media] cxd2841er: Reading SNR for DVB-C added
  [media] cxd2841er: Reading BER and UCB for DVB-C added
  [media] cxd2841er: fix switch-case for DVB-C
  [media] cxd2841er: fix signal strength scale for ISDB-T
  [media] cxd2841er: adjust the dB scale for DVB-C
  [media] cxd2841er: provide signal strength for DVB-C
  [media] cxd2841er: fix BER report via DVBv5 stats API
  [media] mb86a20s: apply mask to val after checking for read failure
  [media] airspy: fix error logic during device register
  [media] s5p-cec/TODO: add TODO item
  [media] cec/TODO: drop comment about sphinx documentation
  ...
2016-07-26 18:59:59 -07:00
Linus Torvalds
1b3fc0bef8 Merge tag 'pstore-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull pstore subsystem updates from Kees Cook:
 "This expands the supported compressors, fixes some bugs, and finally
  adds DT bindings"

* tag 'pstore-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  pstore/ram: add Device Tree bindings
  efi-pstore: implement efivars_pstore_exit()
  pstore: drop file opened reference count
  pstore: add lzo/lz4 compression support
  pstore: Cleanup pstore_dump()
  pstore: Enable compression on normal path (again)
  ramoops: Only unregister when registered
2016-07-26 18:48:23 -07:00
Linus Torvalds
396d10993f Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
 "The major change this cycle is deleting ext4's copy of the file system
  encryption code and switching things over to using the copies in
  fs/crypto.  I've updated the MAINTAINERS file to add an entry for
  fs/crypto listing Jaeguk Kim and myself as the maintainers.

  There are also a number of bug fixes, most notably for some problems
  found by American Fuzzy Lop (AFL) courtesy of Vegard Nossum.  Also
  fixed is a writeback deadlock detected by generic/130, and some
  potential races in the metadata checksum code"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (21 commits)
  ext4: verify extent header depth
  ext4: short-cut orphan cleanup on error
  ext4: fix reference counting bug on block allocation error
  MAINTAINRES: fs-crypto maintainers update
  ext4 crypto: migrate into vfs's crypto engine
  ext2: fix filesystem deadlock while reading corrupted xattr block
  ext4: fix project quota accounting without quota limits enabled
  ext4: validate s_reserved_gdt_blocks on mount
  ext4: remove unused page_idx
  ext4: don't call ext4_should_journal_data() on the journal inode
  ext4: Fix WARN_ON_ONCE in ext4_commit_super()
  ext4: fix deadlock during page writeback
  ext4: correct error value of function verifying dx checksum
  ext4: avoid modifying checksum fields directly during checksum verification
  ext4: check for extents that wrap around
  jbd2: make journal y2038 safe
  jbd2: track more dependencies on transaction commit
  jbd2: move lockdep tracking to journal_s
  jbd2: move lockdep instrumentation for jbd2 handles
  ext4: respect the nobarrier mount option in nojournal mode
  ...
2016-07-26 18:35:55 -07:00
Linus Torvalds
e663107fa1 Merge tag 'acpi-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
 "The new feaures here are the support for ACPI overlays (allowing ACPI
  tables to be loaded at any time from EFI variables or via configfs)
  and the LPI (Low-Power Idle) support.  Also notable is the ACPI-based
  NUMA support for ARM64.

  Apart from that we have two new drivers, for the DPTF (Dynamic Power
  and Thermal Framework) power participant device and for the Intel
  Broxton WhiskeyCove PMIC, some more PMIC-related changes, support for
  the Boot Error Record Table (BERT) in APEI and support for
  platform-initiated graceful shutdown.

  Plus two new pieces of documentation and usual assorted fixes and
  cleanups in quite a few places.

  Specifics:

   - Support for ACPI SSDT overlays allowing Secondary System
     Description Tables (SSDTs) to be loaded at any time from EFI
     variables or via configfs (Octavian Purdila, Mika Westerberg).

   - Support for the ACPI LPI (Low-Power Idle) feature introduced in
     ACPI 6.0 and allowing processor idle states to be represented in
     ACPI tables in a hierarchical way (with the help of Processor
     Container objects) and support for ACPI idle states management on
     ARM64, based on LPI (Sudeep Holla).

   - General improvements of ACPI support for NUMA and ARM64 support for
     ACPI-based NUMA (Hanjun Guo, David Daney, Robert Richter).

   - General improvements of the ACPI table upgrade mechanism and ARM64
     support for that feature (Aleksey Makarov, Jon Masters).

   - Support for the Boot Error Record Table (BERT) in APEI and
     improvements of kernel messages printed by the error injection code
     (Huang Ying, Borislav Petkov).

   - New driver for the Intel Broxton WhiskeyCove PMIC operation region
     and support for the REGS operation region on Broxton, PMIC code
     cleanups (Bin Gao, Felipe Balbi, Paul Gortmaker).

   - New driver for the power participant device which is part of the
     Dynamic Power and Thermal Framework (DPTF) and DPTF-related code
     reorganization (Srinivas Pandruvada).

   - Support for the platform-initiated graceful shutdown feature
     introduced in ACPI 6.1 (Prashanth Prakash).

   - ACPI button driver update related to lid input events generated
     automatically on initialization and system resume that have been
     problematic for some time (Lv Zheng).

   - ACPI EC driver cleanups (Lv Zheng).

   - Documentation of the ACPICA release automation process and the
     in-kernel ACPI AML debugger (Lv Zheng).

   - New blacklist entry and two fixes for the ACPI backlight driver
     (Alex Hung, Arvind Yadav, Ralf Gerbig).

   - Cleanups of the ACPI pci_slot driver (Joe Perches, Paul Gortmaker).

   - ACPI CPPC code changes to make it more robust against possible
     defects in ACPI tables and new symbol definitions for PCC (Hoan
     Tran).

   - System reboot code modification to execute the ACPI _PTS (Prepare
     To Sleep) method in addition to _TTS (Ocean He).

   - ACPICA-related change to carry out lock ordering checks in ACPICA
     if ACPICA debug is enabled in the kernel (Lv Zheng).

   - Assorted minor fixes and cleanups (Andy Shevchenko, Baoquan He,
     Bhaktipriya Shridhar, Paul Gortmaker, Rafael Wysocki)"

* tag 'acpi-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (71 commits)
  ACPI: enable ACPI_PROCESSOR_IDLE on ARM64
  arm64: add support for ACPI Low Power Idle(LPI)
  drivers: firmware: psci: initialise idle states using ACPI LPI
  cpuidle: introduce CPU_PM_CPU_IDLE_ENTER macro for ARM{32, 64}
  arm64: cpuidle: drop __init section marker to arm_cpuidle_init
  ACPI / processor_idle: Add support for Low Power Idle(LPI) states
  ACPI / processor_idle: introduce ACPI_PROCESSOR_CSTATE
  ACPI / DPTF: move int340x_thermal.c to the DPTF folder
  ACPI / DPTF: Add DPTF power participant driver
  ACPI / lpat: make it explicitly non-modular
  ACPI / dock: make dock explicitly non-modular
  ACPI / PCI: make pci_slot explicitly non-modular
  ACPI / PMIC: remove modular references from non-modular code
  ACPICA: Linux: Enable ACPI_MUTEX_DEBUG for Linux kernel
  ACPI: Rename configfs.c to acpi_configfs.c to prevent link error
  ACPI / debugger: Add AML debugger documentation
  ACPI: Add documentation describing ACPICA release automation
  ACPI: add support for loading SSDTs via configfs
  ACPI: add support for configfs
  efi / ACPI: load SSTDs from EFI variables
  ...
2016-07-26 17:56:45 -07:00
Linus Torvalds
6453dbdda3 Merge tag 'pm-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael  Wysocki:
 "Again, the majority of changes go into the cpufreq subsystem, but
  there are no big features this time.  The cpufreq changes that stand
  out somewhat are the governor interface rework and improvements
  related to the handling of frequency tables.  Apart from those, there
  are fixes and new device/CPU IDs in drivers, cleanups and an
  improvement of the new schedutil governor.

  Next, there are some changes in the hibernation core, including a fix
  for a nasty problem related to the MONITOR/MWAIT usage by CPU offline
  during resume from hibernation, a few core improvements related to
  memory management during resume, a couple of additional debug features
  and cleanups.

  Finally, we have some fixes and cleanups in the devfreq subsystem,
  generic power domains framework improvements related to system
  suspend/resume, support for some new chips in intel_idle and in the
  power capping RAPL driver, a new version of the AnalyzeSuspend utility
  and some assorted fixes and cleanups.

  Specifics:

   - Rework the cpufreq governor interface to make it more
     straightforward and modify the conservative governor to avoid using
     transition notifications (Rafael Wysocki).

   - Rework the handling of frequency tables by the cpufreq core to make
     it more efficient (Viresh Kumar).

   - Modify the schedutil governor to reduce the number of wakeups it
     causes to occur in cases when the CPU frequency doesn't need to be
     changed (Steve Muckle, Viresh Kumar).

   - Fix some minor issues and clean up code in the cpufreq core and
     governors (Rafael Wysocki, Viresh Kumar).

   - Add Intel Broxton support to the intel_pstate driver (Srinivas
     Pandruvada).

   - Fix problems related to the config TDP feature and to the validity
     of the MSR_HWP_INTERRUPT register in intel_pstate (Jan Kiszka,
     Srinivas Pandruvada).

   - Make intel_pstate update the cpu_frequency tracepoint even if the
     frequency doesn't change to avoid confusing powertop (Rafael
     Wysocki).

   - Clean up the usage of __init/__initdata in intel_pstate, mark some
     of its internal variables as __read_mostly and drop an unused
     structure element from it (Jisheng Zhang, Carsten Emde).

   - Clean up the usage of some duplicate MSR symbols in intel_pstate
     and turbostat (Srinivas Pandruvada).

   - Update/fix the powernv, s3c24xx and mvebu cpufreq drivers (Akshay
     Adiga, Viresh Kumar, Ben Dooks).

   - Fix a regression (introduced during the 4.5 cycle) in the
     pcc-cpufreq driver by reverting the problematic commit (Andreas
     Herrmann).

   - Add support for Intel Denverton to intel_idle, clean up Broxton
     support in it and make it explicitly non-modular (Jacob Pan, Jan
     Beulich, Paul Gortmaker).

   - Add support for Denverton and Ivy Bridge server to the Intel RAPL
     power capping driver and make it more careful about the handing of
     MSRs that may not be present (Jacob Pan, Xiaolong Wang).

   - Fix resume from hibernation on x86-64 by making the CPU offline
     during resume avoid using MONITOR/MWAIT in the "play dead" loop
     which may lead to an inadvertent "revival" of a "dead" CPU and a
     page fault leading to a kernel crash from it (Rafael Wysocki).

   - Make memory management during resume from hibernation more
     straightforward (Rafael Wysocki).

   - Add debug features that should help to detect problems related to
     hibernation and resume from it (Rafael Wysocki, Chen Yu).

   - Clean up hibernation core somewhat (Rafael Wysocki).

   - Prevent KASAN from instrumenting the hibernation core which leads
     to large numbers of false-positives from it (James Morse).

   - Prevent PM (hibernate and suspend) notifiers from being called
     during the cleanup phase if they have not been called during the
     corresponding preparation phase which is possible if one of the
     other notifiers returns an error at that time (Lianwei Wang).

   - Improve suspend-related debug printout in the tasks freezer and
     clean up suspend-related console handling (Roger Lu, Borislav
     Petkov).

   - Update the AnalyzeSuspend script in the kernel sources to version
     4.2 (Todd Brandt).

   - Modify the generic power domains framework to make it handle system
     suspend/resume better (Ulf Hansson).

   - Make the runtime PM framework avoid resuming devices synchronously
     when user space changes the runtime PM settings for them and
     improve its error reporting (Rafael Wysocki, Linus Walleij).

   - Fix error paths in devfreq drivers (exynos, exynos-ppmu,
     exynos-bus) and in the core, make some devfreq code explicitly
     non-modular and change some of it into tristate (Bartlomiej
     Zolnierkiewicz, Peter Chen, Paul Gortmaker).

   - Add DT support to the generic PM clocks management code and make it
     export some more symbols (Jon Hunter, Paul Gortmaker).

   - Make the PCI PM core code slightly more robust against possible
     driver errors (Andy Shevchenko).

   - Make it possible to change DESTDIR and PREFIX in turbostat (Andy
     Shevchenko)"

* tag 'pm-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (89 commits)
  Revert "cpufreq: pcc-cpufreq: update default value of cpuinfo_transition_latency"
  PM / hibernate: Introduce test_resume mode for hibernation
  cpufreq: export cpufreq_driver_resolve_freq()
  cpufreq: Disallow ->resolve_freq() for drivers providing ->target_index()
  PCI / PM: check all fields in pci_set_platform_pm()
  cpufreq: acpi-cpufreq: use cached frequency mapping when possible
  cpufreq: schedutil: map raw required frequency to driver frequency
  cpufreq: add cpufreq_driver_resolve_freq()
  cpufreq: intel_pstate: Check cpuid for MSR_HWP_INTERRUPT
  intel_pstate: Update cpu_frequency tracepoint every time
  cpufreq: intel_pstate: clean remnant struct element
  PM / tools: scripts: AnalyzeSuspend v4.2
  x86 / hibernate: Use hlt_play_dead() when resuming from hibernation
  cpufreq: powernv: Replacing pstate_id with frequency table index
  intel_pstate: Fix MSR_CONFIG_TDP_x addressing in core_get_max_pstate()
  PM / hibernate: Image data protection during restoration
  PM / hibernate: Add missing braces in __register_nosave_region()
  PM / hibernate: Clean up comments in snapshot.c
  PM / hibernate: Clean up function headers in snapshot.c
  PM / hibernate: Add missing braces in hibernate_setup()
  ...
2016-07-26 17:29:07 -07:00
Linus Torvalds
f7e6816994 Merge tag 'dm-4.8-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper updates from Mike Snitzer:

 - initially based on Jens' 'for-4.8/core' (given all the flag churn)
   and later merged with 'for-4.8/core' to pickup the QUEUE_FLAG_DAX
   commits that DM depends on to provide its DAX support

 - clean up the bio-based vs request-based DM core code by moving the
   request-based DM core code out to dm-rq.[hc]

 - reinstate bio-based support in the DM multipath target (done with the
   idea that fast storage like NVMe over Fabrics could benefit) -- while
   preserving support for request_fn and blk-mq request-based DM mpath

 - SCSI and DM multipath persistent reservation fixes that were
   coordinated with Martin Petersen.

 - the DM raid target saw the most extensive change this cycle; it now
   provides reshape and takeover support (by layering ontop of the
   corresponding MD capabilities)

 - DAX support for DM core and the linear, stripe and error targets

 - a DM thin-provisioning block discard vs allocation race fix that
   addresses potential for corruption

 - a stable fix for DM verity-fec's block calculation during decode

 - a few cleanups and fixes to DM core and various targets

* tag 'dm-4.8-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (73 commits)
  dm: allow bio-based table to be upgraded to bio-based with DAX support
  dm snap: add fake origin_direct_access
  dm stripe: add DAX support
  dm error: add DAX support
  dm linear: add DAX support
  dm: add infrastructure for DAX support
  dm thin: fix a race condition between discarding and provisioning a block
  dm btree: fix a bug in dm_btree_find_next_single()
  dm raid: fix random optimal_io_size for raid0
  dm raid: address checkpatch.pl complaints
  dm: call PR reserve/unreserve on each underlying device
  sd: don't use the ALL_TG_PT bit for reservations
  dm: fix second blk_delay_queue() parameter to be in msec units not jiffies
  dm raid: change logical functions to actually return bool
  dm raid: use rdev_for_each in status
  dm raid: use rs->raid_disks to avoid memory leaks on free
  dm raid: support delta_disks for raid1, fix table output
  dm raid: enhance reshape check and factor out reshape setup
  dm raid: allow resize during recovery
  dm raid: fix rs_is_recovering() to allow for lvextend
  ...
2016-07-26 17:12:11 -07:00
Minchan Kim
dd4123f324 mm: fix build warnings in <linux/compaction.h>
Randy reported below build error.

> In file included from ../include/linux/balloon_compaction.h:48:0,
>                  from ../mm/balloon_compaction.c:11:
> ../include/linux/compaction.h:237:51: warning: 'struct node' declared inside parameter list [enabled by default]
>  static inline int compaction_register_node(struct node *node)
> ../include/linux/compaction.h:237:51: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default]
> ../include/linux/compaction.h:242:54: warning: 'struct node' declared inside parameter list [enabled by default]
>  static inline void compaction_unregister_node(struct node *node)
>

It was caused by non-lru page migration which needs compaction.h but
compaction.h doesn't include any header to be standalone.

I think proper header for non-lru page migration is migrate.h rather
than compaction.h because migrate.h has already headers needed to work
non-lru page migration indirectly like isolate_mode_t, migrate_mode
MIGRATEPAGE_SUCCESS.

[akpm@linux-foundation.org: revert mm-balloon-use-general-non-lru-movable-page-feature-fix.patch temp fix]
Link: http://lkml.kernel.org/r/20160610003304.GE29779@bbox
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Gioh Kim <gi-oh.kim@profitbricks.com>
Cc: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov
779750d20b shmem: split huge pages beyond i_size under memory pressure
Even if user asked to allocate huge pages always (huge=always), we
should be able to free up some memory by splitting pages which are
partly byound i_size if memory presure comes or once we hit limit on
filesystem size (-o size=).

In order to do this we maintain per-superblock list of inodes, which
potentially have huge pages on the border of file size.

Per-fs shrinker can reclaim memory by splitting such pages.

If we hit -ENOSPC during shmem_getpage_gfp(), we try to split a page to
free up space on the filesystem and retry allocation if it succeed.

Link: http://lkml.kernel.org/r/1466021202-61880-37-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov
e496cf3d78 thp: introduce CONFIG_TRANSPARENT_HUGE_PAGECACHE
For file mappings, we don't deposit page tables on THP allocation
because it's not strictly required to implement split_huge_pmd(): we can
just clear pmd and let following page faults to reconstruct the page
table.

But Power makes use of deposited page table to address MMU quirk.

Let's hide THP page cache, including huge tmpfs, under separate config
option, so it can be forbidden on Power.

We can revert the patch later once solution for Power found.

Link: http://lkml.kernel.org/r/1466021202-61880-36-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov
f3f0e1d215 khugepaged: add support of collapse for tmpfs/shmem pages
This patch extends khugepaged to support collapse of tmpfs/shmem pages.
We share fair amount of infrastructure with anon-THP collapse.

Few design points:

  - First we are looking for VMA which can be suitable for mapping huge
    page;

  - If the VMA maps shmem file, the rest scan/collapse operations
    operates on page cache, not on page tables as in anon VMA case.

  - khugepaged_scan_shmem() finds a range which is suitable for huge
    page. The scan is lockless and shouldn't disturb system too much.

  - once the candidate for collapse is found, collapse_shmem() attempts
    to create a huge page:

      + scan over radix tree, making the range point to new huge page;

      + new huge page is not-uptodate, locked and freezed (refcount
        is 0), so nobody can touch them until we say so.

      + we swap in pages during the scan. khugepaged_scan_shmem()
        filters out ranges with more than khugepaged_max_ptes_swap
	swapped out pages. It's HPAGE_PMD_NR/8 by default.

      + old pages are isolated, unmapped and put to local list in case
        to be restored back if collapse failed.

  - if collapse succeed, we retract pte page tables from VMAs where huge
    pages mapping is possible. The huge page will be mapped as PMD on
    next minor fault into the range.

Link: http://lkml.kernel.org/r/1466021202-61880-35-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov
b46e756f5e thp: extract khugepaged from mm/huge_memory.c
khugepaged implementation grew to the point when it deserve separate
file in source.

Let's move it to mm/khugepaged.c.

Link: http://lkml.kernel.org/r/1466021202-61880-32-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov
800d8c63b2 shmem: add huge pages support
Here's basic implementation of huge pages support for shmem/tmpfs.

It's all pretty streight-forward:

  - shmem_getpage() allcoates huge page if it can and try to inserd into
    radix tree with shmem_add_to_page_cache();

  - shmem_add_to_page_cache() puts the page onto radix-tree if there's
    space for it;

  - shmem_undo_range() removes huge pages, if it fully within range.
    Partial truncate of huge pages zero out this part of THP.

    This have visible effect on fallocate(FALLOC_FL_PUNCH_HOLE)
    behaviour. As we don't really create hole in this case,
    lseek(SEEK_HOLE) may have inconsistent results depending what
    pages happened to be allocated.

  - no need to change shmem_fault: core-mm will map an compound page as
    huge if VMA is suitable;

Link: http://lkml.kernel.org/r/1466021202-61880-30-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Hugh Dickins
c01d5b3007 shmem: get_unmapped_area align huge page
Provide a shmem_get_unmapped_area method in file_operations, called at
mmap time to decide the mapping address.  It could be conditional on
CONFIG_TRANSPARENT_HUGEPAGE, but save #ifdefs in other places by making
it unconditional.

shmem_get_unmapped_area() first calls the usual mm->get_unmapped_area
(which we treat as a black box, highly dependent on architecture and
config and executable layout).  Lots of conditions, and in most cases it
just goes with the address that chose; but when our huge stars are
rightly aligned, yet that did not provide a suitable address, go back to
ask for a larger arena, within which to align the mapping suitably.

There have to be some direct calls to shmem_get_unmapped_area(), not via
the file_operations: because of the way shmem_zero_setup() is called to
create a shmem object late in the mmap sequence, when MAP_SHARED is
requested with MAP_ANONYMOUS or /dev/zero.  Though this only matters
when /proc/sys/vm/shmem_huge has been set.

Link: http://lkml.kernel.org/r/1466021202-61880-29-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov
5a6e75f811 shmem: prepare huge= mount option and sysfs knob
This patch adds new mount option "huge=".  It can have following values:

  - "always":
	Attempt to allocate huge pages every time we need a new page;

  - "never":
	Do not allocate huge pages;

  - "within_size":
	Only allocate huge page if it will be fully within i_size.
	Also respect fadvise()/madvise() hints;

  - "advise:
	Only allocate huge pages if requested with fadvise()/madvise();

Default is "never" for now.

"mount -o remount,huge= /mountpoint" works fine after mount: remounting
huge=never will not attempt to break up huge pages at all, just stop
more from being allocated.

No new config option: put this under CONFIG_TRANSPARENT_HUGEPAGE, which
is the appropriate option to protect those who don't want the new bloat,
and with which we shall share some pmd code.

Prohibit the option when !CONFIG_TRANSPARENT_HUGEPAGE, just as mpol is
invalid without CONFIG_NUMA (was hidden in mpol_parse_str(): make it
explicit).

Allow enabling THP only if the machine has_transparent_hugepage().

But what about Shmem with no user-visible mount? SysV SHM, memfds,
shared anonymous mmaps (of /dev/zero or MAP_ANONYMOUS), GPU drivers' DRM
objects, Ashmem.  Though unlikely to suit all usages, provide sysfs knob
/sys/kernel/mm/transparent_hugepage/shmem_enabled to experiment with
huge on those.

And allow shmem_enabled two further values:

  - "deny":
	For use in emergencies, to force the huge option off from
	all mounts;
  - "force":
	Force the huge option on for all - very useful for testing;

Based on patch by Hugh Dickins.

Link: http://lkml.kernel.org/r/1466021202-61880-28-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov
65c453778a mm, rmap: account shmem thp pages
Let's add ShmemHugePages and ShmemPmdMapped fields into meminfo and
smaps.  It indicates how many times we allocate and map shmem THP.

NR_ANON_TRANSPARENT_HUGEPAGES is renamed to NR_ANON_THPS.

Link: http://lkml.kernel.org/r/1466021202-61880-27-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov
c78c66d1dd radix-tree: implement radix_tree_maybe_preload_order()
The new helper is similar to radix_tree_maybe_preload(), but tries to
preload number of nodes required to insert (1 << order) continuous
naturally-aligned elements.

This is required to push huge pages into pagecache.

Link: http://lkml.kernel.org/r/1466021202-61880-24-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov
e2f0a0db95 page-flags: relax policy for PG_mappedtodisk and PG_reclaim
These flags are in use for file THP.

Link: http://lkml.kernel.org/r/1466021202-61880-23-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov
9a73f61bdb thp, mlock: do not mlock PTE-mapped file huge pages
As with anon THP, we only mlock file huge pages if we can prove that the
page is not mapped with PTE.  This way we can avoid mlock leak into
non-mlocked vma on split.

We rely on PageDoubleMap() under lock_page() to check if the the page
may be PTE mapped.  PG_double_map is set by page_add_file_rmap() when
the page mapped with PTEs.

Link: http://lkml.kernel.org/r/1466021202-61880-21-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov
95ecedcd6a thp, vmstats: add counters for huge file pages
THP_FILE_ALLOC: how many times huge page was allocated and put page
cache.

THP_FILE_MAPPED: how many times file huge page was mapped.

Link: http://lkml.kernel.org/r/1466021202-61880-13-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov
1010245964 mm: introduce do_set_pmd()
With postponed page table allocation we have chance to setup huge pages.
do_set_pte() calls do_set_pmd() if following criteria met:

 - page is compound;
 - pmd entry in pmd_none();
 - vma has suitable size and alignment;

Link: http://lkml.kernel.org/r/1466021202-61880-12-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kirill A. Shutemov
dd78fedde4 rmap: support file thp
Naive approach: on mapping/unmapping the page as compound we update
->_mapcount on each 4k page.  That's not efficient, but it's not obvious
how we can optimize this.  We can look into optimization later.

PG_double_map optimization doesn't work for file pages since lifecycle
of file pages is different comparing to anon pages: file page can be
mapped again at any time.

Link: http://lkml.kernel.org/r/1466021202-61880-11-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00