Commit Graph

54654 Commits

Author SHA1 Message Date
Linus Torvalds
bb6bbc7ca2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix wrong TCP checksums on MTU probing when checksum offloading is
    disabled, from Douglas Caetano dos Santos.

 2) Fix qdisc backlog updates in qfq and sfb schedulers, from Cong Wang.

 3) Route lookup flow key protocol value is wrong in ip6gre_xmit_other(),
    fix from Lance Richardson.

 4) Scheduling while atomic in multicast routing code of ipv4 and ipv6,
    fix from Nikolay Aleksandrov.

 5) Fix packet alignment in fec driver, from Eric Nelson.

 6) Fix perf regression in sctp due to struct layout and cache misses,
    from Xin Long.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  sctp: fix the issue sctp_diag uses lock_sock in rcu_read_lock
  sctp: change to check peer prsctp_capable when using prsctp polices
  sctp: remove prsctp_param from sctp_chunk
  sctp: move sent_count to the memory hole in sctp_chunk
  tg3: Avoid NULL pointer dereference in tg3_io_error_detected()
  act_ife: Fix false encoding
  act_ife: Fix external mac header on encode
  VSOCK: Don't dec ack backlog twice for rejected connections
  Revert "net: ethernet: bcmgenet: use phydev from struct net_device"
  net: fec: align IP header in hardware
  net: fec: remove QUIRK_HAS_RACC from i.mx27
  net: fec: remove QUIRK_HAS_RACC from i.mx25
  ipmr, ip6mr: fix scheduling while atomic and a deadlock with ipmr_get_route
  ip6_gre: fix flowi6_proto value in ip6gre_xmit_other()
  tcp: fix a compile error in DBGUNDO()
  tcp: fix wrong checksum calculation on MTU probing
  sch_sfb: keep backlog updated with qlen
  sch_qfq: keep backlog updated with qlen
  can: dev: fix deadlock reported after bus-off
2016-10-02 10:36:41 -07:00
Rafael J. Wysocki
993eb0aeae Merge branches 'pm-devfreq' and 'pm-sleep'
* pm-devfreq:
  PM / devfreq: rk3399_dmc: Remove explictly regulator_put call in .remove
  PM / devfreq: rockchip: add PM_DEVFREQ_EVENT dependency
  partial revert of "PM / devfreq: Add COMPILE_TEST for build coverage"
  PM / devfreq: rockchip: add devfreq driver for rk3399 dmc
  Documentation: bindings: add dt documentation for rk3399 dmc
  PM / devfreq: event: support rockchip dfi controller
  Documentation: bindings: add dt documentation for dfi controller
  PM / devfreq: event: remove duplicate devfreq_event_get_drvdata()
  PM / devfreq: fix Kconfig indent style
  PM / devfreq: Add COMPILE_TEST for build coverage
  PM / devfreq: exynos-ppmu: remove unneeded of_node_put()

* pm-sleep:
  PM / Hibernate: allow hibernation with PAGE_POISONING_ZERO
  PM / sleep: enable suspend-to-idle even without registered suspend_ops
  PM / sleep: Increase default DPM watchdog timeout to 120
2016-10-02 01:43:45 +02:00
Rafael J. Wysocki
7005f6dc69 Merge branch 'pm-cpufreq'
* pm-cpufreq: (24 commits)
  cpufreq: st: add missing \n to end of dev_err message
  cpufreq: kirkwood: add missing \n to end of dev_err messages
  cpufreq: CPPC: Avoid overflow when calculating desired_perf
  cpufreq: ti: Use generic platdev driver
  cpufreq: intel_pstate: Add io_boost trace
  cpufreq: intel_pstate: Use IOWAIT flag in Atom algorithm
  cpufreq: schedutil: Add iowait boosting
  cpufreq / sched: SCHED_CPUFREQ_IOWAIT flag to indicate iowait condition
  cpufreq: CPPC: Force reporting values in KHz to fix user space interface
  cpufreq: create link to policy only for registered CPUs
  intel_pstate: constify local structures
  cpufreq: dt: Support governor tunables per policy
  cpufreq: dt: Update kconfig description
  cpufreq: dt: Remove unused code
  MAINTAINERS: Add Documentation/cpu-freq/
  cpufreq: dt: Add support for r8a7792
  cpufreq / sched: ignore SMT when determining max cpu capacity
  cpufreq: Drop unnecessary check from cpufreq_policy_alloc()
  ARM: multi_v7_defconfig: Don't attempt to enable schedutil governor as module
  ARM: exynos_defconfig: Don't attempt to enable schedutil governor as module
  ...
2016-10-02 01:42:45 +02:00
Rafael J. Wysocki
b6e2511782 Merge branch 'pm-cpufreq-sched' into pm-cpufreq 2016-10-02 01:42:33 +02:00
Rafael J. Wysocki
2dc3c72cd0 Merge branch 'pm-domains'
* pm-domains:
  PM / Domains: Rename pm_genpd_sync_poweron|poweroff()
  PM / Domains: Don't measure latency of ->power_on|off() during system PM
  PM / Domains: Remove redundant system PM callbacks
  PM / Domains: Simplify detaching a device from its genpd
  PM / Domains: Allow holes in genpd_data.domains array
  PM / Domains: Add support for removing nested PM domains by provider
  PM / Domains: Add support for removing PM domains
  PM / Domains: Store the provider in the PM domain structure
  PM / Domains: Prepare for adding support to remove PM domains
  PM / Domains: Verify the PM domain is present when adding a provider
  PM / Domains: Don't expose xlate and provider helper functions
  PM / Domains: Don't expose generic_pm_domain structure to clients
  staging: board: Remove calls to of_genpd_get_from_provider()
  ARM: EXYNOS: Remove calls to of_genpd_get_from_provider()
  PM / Domains: Add new helper functions for device-tree
  PM / Domains: Always enable debugfs support if available
2016-10-02 01:41:29 +02:00
Rafael J. Wysocki
0137a337d7 Merge branches 'acpi-wdat' and 'acpi-ec'
* acpi-wdat:
  watchdog: wdat_wdt: Fix warning for using 0 as NULL
  watchdog: wdat_wdt: fix return value check in wdat_wdt_probe()
  platform/x86: intel_pmc_ipc: Do not create iTCO watchdog when WDAT table exists
  i2c: i801: Do not create iTCO watchdog when WDAT table exists
  mfd: lpc_ich: Do not create iTCO watchdog when WDAT table exists
  ACPI / watchdog: Add support for WDAT hardware watchdog

* acpi-ec:
  ACPI / EC: Fix issues related to boot_ec
  ACPI / EC: Fix a gap that ECDT EC cannot handle EC events
  ACPI / EC: Fix a memory leakage issue in acpi_ec_add()
  ACPI / EC: Cleanup first_ec/boot_ec code
  ACPI / EC: Enable event freeze mode to improve event handling for suspend process
  ACPI / EC: Add PM operations to improve event handling for suspend process
  ACPI / EC: Add PM operations to improve event handling for resume process
  ACPI / EC: Fix an issue that SCI_EVT cannot be detected after event is enabled
  ACPI / EC: Add EC_FLAGS_QUERY_ENABLED to reveal a hidden logic
  ACPI / EC: Add PM operations for suspend/resume noirq stage
2016-10-02 01:40:07 +02:00
Rafael J. Wysocki
4ec9e2890a Merge branch 'acpi-bus'
* acpi-bus:
  ACPI / bus: Adjust ACPI subsystem initialization for new table loading mode
  ACPI / bus: Make acpi_get_first_physical_node() public
2016-10-02 01:38:44 +02:00
Rafael J. Wysocki
84a78c724b Merge branches 'acpi-sysfs', 'acpi-pci' and 'acpi-tables'
* acpi-sysfs:
  ACPI / sysfs: Update sysfs signature handling code
  ACPI / sysfs: Fix an issue for LoadTable opcode
  ACPI / sysfs: Use new GPE masking mechanism in GPE interface

* acpi-pci:
  ACPI / platform: Pay attention to parent device's resources
  PCI: Add pci_find_resource()
  ACPI / PCI: fix GIC irq model default PCI IRQ polarity

* acpi-tables:
  ACPI / tables: Remove duplicated include from tables.c
  ACPI / tables: do not report the number of entries ignored by acpi_parse_entries()
  ACPI / tables: fix acpi_parse_entries_array() so it traverses all subtables
  ACPI / tables: fix incorrect counts returned by acpi_parse_entries_array()
2016-10-02 01:38:34 +02:00
Brian Masney
f3b0deea89 include: linux: iio: add IIO_ATTR_{RO, WO, RW} and IIO_DEVICE_ATTR_{RO, WO, RW} macros
Add new macros: IIO_ATTR_RO, IIO_ATTR_WO, IIO_ATTR_RW,
IIO_DEVICE_ATTR_RO, IIO_DEVICE_ATTR_WO and IIO_DEVICE_ATTR_RW to reduce
the amount of boiler plate code that is needed for creating new
attributes. This mimics the *_RO, *_WO, and *_RW macros that are found
in include/linux/device.h and include/linux/sysfs.h.

Signed-off-by: Brian Masney <masneyb@onstation.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2016-10-01 14:43:59 +01:00
Bhumika Goyal
a9a0d64a8b iio: Declare event_attrs field of iio_info structure as const
The event_attrs field of iio_info structure is only initialized once
whenever an object of iio_info is created. After that this field
is never modified again anywhere in the kernel. So, declare event_attrs
field of iio_info as a const struct attribute_group.
Checked for occurences throughout the kernel using grep and
coccinelle.

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2016-10-01 14:37:35 +01:00
Dan Williams
44c462eb9e libnvdimm, region: move region-mapping input-paramters to nd_mapping_desc
Before we add more libnvdimm-private fields to nd_mapping make it clear
which parameters are input vs libnvdimm internals. Use struct
nd_mapping_desc instead of struct nd_mapping in nd_region_desc and make
struct nd_mapping private to libnvdimm.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-09-30 19:13:42 -07:00
Vishal Verma
e046114af5 libnvdimm: clear the internal poison_list when clearing badblocks
nvdimm_clear_poison cleared the user-visible badblocks, and sent
commands to the NVDIMM to clear the areas marked as 'poison', but it
neglected to clear the same areas from the internal poison_list which is
used to marshal ARS results before sorting them by namespace. As a
result, once on-demand ARS functionality was added:

37b137f nfit, libnvdimm: allow an ARS scrub to be triggered on demand

A scrub triggered from either sysfs or an MCE was found to be adding
stale entries that had been cleared from gendisk->badblocks, but were
still present in nvdimm_bus->poison_list. Additionally, the stale entries
could be triggered into producing stale disk->badblocks by simply disabling
and re-enabling the namespace or region.

This adds the missing step of clearing poison_list entries when clearing
poison, so that it is always in sync with badblocks.

Fixes: 37b137f ("nfit, libnvdimm: allow an ARS scrub to be triggered on demand")
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-09-30 17:03:45 -07:00
John Youn
37aa7271d9 include/linux/property.h: fix typo/compile error
This fixes commit d76eebfa17 ("include/linux/property.h: fix build
issues with gcc-4.4.4").

With that commit we get the following compile error when using the
PROPERTY_ENTRY_INTEGER_ARRAY macro.

 include/linux/property.h:201:39: error: `u32_data' undeclared (first
                 use in this function)
  PROPERTY_ENTRY_INTEGER_ARRAY(_name_, u32, _val_)
                                       ^
 include/linux/property.h:193:17: note: in definition of macro
                 `PROPERTY_ENTRY_INTEGER_ARRAY'
  { .pointer = { _type_##_data = _val_ } },  \
                 ^

This needs a '.' to reference the union member.  It seems this was just
overlooked here since it is done correctly in similar constructs in
other parts of the original commit.

This fix is in preparation of upcoming commits that will use this macro.

Fixes: commit d76eebfa17 ("include/linux/property.h: fix build issues with gcc-4.4.4")
Link: http://lkml.kernel.org/r/2de3b929290d88a723ed829a3e3cbd02044714df.1475114627.git.johnyoun@synopsys.com
Signed-off-by: John Youn <johnyoun@synopsys.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-30 15:26:52 -07:00
Johannes Weiner
22f2ac51b6 mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page()
Antonio reports the following crash when using fuse under memory pressure:

  kernel BUG at /build/linux-a2WvEb/linux-4.4.0/mm/workingset.c:346!
  invalid opcode: 0000 [#1] SMP
  Modules linked in: all of them
  CPU: 2 PID: 63 Comm: kswapd0 Not tainted 4.4.0-36-generic #55-Ubuntu
  Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
  task: ffff88040cae6040 ti: ffff880407488000 task.ti: ffff880407488000
  RIP: shadow_lru_isolate+0x181/0x190
  Call Trace:
    __list_lru_walk_one.isra.3+0x8f/0x130
    list_lru_walk_one+0x23/0x30
    scan_shadow_nodes+0x34/0x50
    shrink_slab.part.40+0x1ed/0x3d0
    shrink_zone+0x2ca/0x2e0
    kswapd+0x51e/0x990
    kthread+0xd8/0xf0
    ret_from_fork+0x3f/0x70

which corresponds to the following sanity check in the shadow node
tracking:

  BUG_ON(node->count & RADIX_TREE_COUNT_MASK);

The workingset code tracks radix tree nodes that exclusively contain
shadow entries of evicted pages in them, and this (somewhat obscure)
line checks whether there are real pages left that would interfere with
reclaim of the radix tree node under memory pressure.

While discussing ways how fuse might sneak pages into the radix tree
past the workingset code, Miklos pointed to replace_page_cache_page(),
and indeed there is a problem there: it properly accounts for the old
page being removed - __delete_from_page_cache() does that - but then
does a raw raw radix_tree_insert(), not accounting for the replacement
page.  Eventually the page count bits in node->count underflow while
leaving the node incorrectly linked to the shadow node LRU.

To address this, make sure replace_page_cache_page() uses the tracked
page insertion code, page_cache_tree_insert().  This fixes the page
accounting and makes sure page-containing nodes are properly unlinked
from the shadow node LRU again.

Also, make the sanity checks a bit less obscure by using the helpers for
checking the number of pages and shadows in a radix tree node.

Fixes: 449dd6984d ("mm: keep page cache radix tree nodes in check")
Link: http://lkml.kernel.org/r/20160919155822.29498-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Antonio SJ Musumeci <trapexit@spawn.link>
Debugged-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org>	[3.15+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-30 15:26:52 -07:00
Frank Sorenson
e856a231d5 sunrpc: add hash_cred() function to rpc_authops struct
Currently, a single hash algorithm is used to hash the auth_cred for
the credcache for all rpc_auth types.  Add a hash_cred() function to
the rpc_authops struct to allow a hash function specific to each
auth flavor.

Signed-off-by: Frank Sorenson <sorenson@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-09-30 15:28:46 -04:00
Eric W. Biederman
d29216842a mnt: Add a per mount namespace limit on the number of mounts
CAI Qian <caiqian@redhat.com> pointed out that the semantics
of shared subtrees make it possible to create an exponentially
increasing number of mounts in a mount namespace.

    mkdir /tmp/1 /tmp/2
    mount --make-rshared /
    for i in $(seq 1 20) ; do mount --bind /tmp/1 /tmp/2 ; done

Will create create 2^20 or 1048576 mounts, which is a practical problem
as some people have managed to hit this by accident.

As such CVE-2016-6213 was assigned.

Ian Kent <raven@themaw.net> described the situation for autofs users
as follows:

> The number of mounts for direct mount maps is usually not very large because of
> the way they are implemented, large direct mount maps can have performance
> problems. There can be anywhere from a few (likely case a few hundred) to less
> than 10000, plus mounts that have been triggered and not yet expired.
>
> Indirect mounts have one autofs mount at the root plus the number of mounts that
> have been triggered and not yet expired.
>
> The number of autofs indirect map entries can range from a few to the common
> case of several thousand and in rare cases up to between 30000 and 50000. I've
> not heard of people with maps larger than 50000 entries.
>
> The larger the number of map entries the greater the possibility for a large
> number of active mounts so it's not hard to expect cases of a 1000 or somewhat
> more active mounts.

So I am setting the default number of mounts allowed per mount
namespace at 100,000.  This is more than enough for any use case I
know of, but small enough to quickly stop an exponential increase
in mounts.  Which should be perfect to catch misconfigurations and
malfunctioning programs.

For anyone who needs a higher limit this can be changed by writing
to the new /proc/sys/fs/mount-max sysctl.

Tested-by: CAI Qian <caiqian@redhat.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-09-30 12:46:48 -05:00
Jaegeuk Kim
a468f0ef51 f2fs: use crc and cp version to determine roll-forward recovery
Previously, we used cp_version only to detect recoverable dnodes.
In order to avoid same garbage cp_version, we needed to truncate the next
dnode during checkpoint, resulting in additional discard or data write.
If we can distinguish this by using crc in addition to cp_version, we can
remove this overhead.

There is backward compatibility concern where it changes node_footer layout.
So, this patch introduces a new checkpoint flag, CP_CRC_RECOVERY_FLAG, to
detect new layout. New layout will be activated only when this flag is set.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-30 10:05:46 -07:00
Mark Brown
66b5a337d0 Merge remote-tracking branches 'spi/topic/octeon', 'spi/topic/pic32-sqi', 'spi/topic/pxa2xx' and 'spi/topic/qup' into spi-next 2016-09-30 09:14:14 -07:00
Mark Brown
e2df04ed3b Merge remote-tracking branches 'spi/topic/fsl-espi', 'spi/topic/imx', 'spi/topic/jcore', 'spi/topic/loopback' and 'spi/topic/meson' into spi-next 2016-09-30 09:14:10 -07:00
Mark Brown
1a8dabf88d Merge remote-tracking branch 'spi/topic/core' into spi-next 2016-09-30 09:14:06 -07:00
Mark Brown
2dfcb921da Merge remote-tracking branches 'regulator/topic/of', 'regulator/topic/pv88080', 'regulator/topic/rk808', 'regulator/topic/set-voltage' and 'regulator/topic/tps65218' into regulator-next 2016-09-30 09:13:58 -07:00
Mark Brown
81c383c9ba Merge remote-tracking branches 'regulator/topic/bulk', 'regulator/topic/dbx500', 'regulator/topic/hi6421', 'regulator/topic/load' and 'regulator/topic/ltc3676' into regulator-next 2016-09-30 09:13:55 -07:00
Thomas Gleixner
d7e25c66c9 Merge branch 'x86/urgent' into x86/asm
Get the cr4 fixes so we can apply the final cleanup
2016-09-30 12:38:28 +02:00
Sebastian Andrzej Siewior
c8761e2016 xen/events: Convert to hotplug state machine
Install the callbacks via the state machine.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-09-30 11:23:04 +01:00
Boris Ostrovsky
4d737042d6 xen/x86: Convert to hotplug state machine
Switch to new CPU hotplug infrastructure.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-09-30 11:22:59 +01:00
Frederic Weisbecker
68107df5f2 u64_stats: Introduce IRQs disabled helpers
Introduce light versions of u64_stats helpers for context where
either preempt or IRQs are disabled. This way we can make this library
usable by scheduler irqtime accounting which currenty implement its
ad-hoc version.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1474849761-12678-4-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-30 11:46:40 +02:00
Peter Zijlstra
a458ae2ea6 sched/core, ia64: Rename set_curr_task()
Rename the ia64 only set_curr_task() function to free up the name.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-30 11:03:27 +02:00
Peter Zijlstra
10e2f1acd0 sched/core: Rewrite and improve select_idle_siblings()
select_idle_siblings() is a known pain point for a number of
workloads; it either does too much or not enough and sometimes just
does plain wrong.

This rewrite attempts to address a number of issues (but sadly not
all).

The current code does an unconditional sched_domain iteration; with
the intent of finding an idle core (on SMT hardware). The problems
which this patch tries to address are:

 - its pointless to look for idle cores if the machine is real busy;
   at which point you're just wasting cycles.

 - it's behaviour is inconsistent between SMT and !SMT hardware in
   that !SMT hardware ends up doing a scan for any idle CPU in the LLC
   domain, while SMT hardware does a scan for idle cores and if that
   fails, falls back to a scan for idle threads on the 'target' core.

The new code replaces the sched_domain scan with 3 explicit scans:

 1) search for an idle core in the LLC
 2) search for an idle CPU in the LLC
 3) search for an idle thread in the 'target' core

where 1 and 3 are conditional on SMT support and 1 and 2 have runtime
heuristics to skip the step.

Step 1) is conditional on sd_llc_shared->has_idle_cores; when a cpu
goes idle and sd_llc_shared->has_idle_cores is false, we scan all SMT
siblings of the CPU going idle. Similarly, we clear
sd_llc_shared->has_idle_cores when we fail to find an idle core.

Step 2) tracks the average cost of the scan and compares this to the
average idle time guestimate for the CPU doing the wakeup. There is a
significant fudge factor involved to deal with the variability of the
averages. Esp. hackbench was sensitive to this.

Step 3) is unconditional; we assume (also per step 1) that scanning
all SMT siblings in a core is 'cheap'.

With this; SMT systems gain step 2, which cures a few benchmarks --
notably one from Facebook.

One 'feature' of the sched_domain iteration, which we preserve in the
new code, is that it would start scanning from the 'target' CPU,
instead of scanning the cpumask in cpu id order. This avoids multiple
CPUs in the LLC scanning for idle to gang up and find the same CPU
quite as much. The down side is that tasks can end up hopping across
the LLC for no apparent reason.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-30 11:03:09 +02:00
Ingo Molnar
0b429e18c2 Merge branch 'linus' into locking/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-30 10:54:46 +02:00
Peter Zijlstra
0e369d7575 sched/core: Replace sd_busy/nr_busy_cpus with sched_domain_shared
Move the nr_busy_cpus thing from its hacky sd->parent->groups->sgc
location into the much more natural sched_domain_shared location.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-30 10:54:07 +02:00
Peter Zijlstra
24fc7edb92 sched/core: Introduce 'struct sched_domain_shared'
Since struct sched_domain is strictly per cpu; introduce a structure
that is shared between all 'identical' sched_domains.

Limit to SD_SHARE_PKG_RESOURCES domains for now, as we'll only use it
for shared cache state; if another use comes up later we can easily
relax this.

While the sched_group's are normally shared between CPUs, these are
not natural to use when we need some shared state on a domain level --
since that would require the domain to have a parent, which is not a
given.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-30 10:54:06 +02:00
Oleg Nesterov
0176beaffb sched/wait: Introduce init_wait_entry()
The partial initialization of wait_queue_t in prepare_to_wait_event() looks
ugly. This was done to shrink .text, but we can simply add the new helper
which does the full initialization and shrink the compiled code a bit more.

And. This way prepare_to_wait_event() can have more users. In particular we
are ready to remove the signal_pending_state() checks from wait_bit_action_f
helpers and change __wait_on_bit_lock() to use prepare_to_wait_event().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Neil Brown <neilb@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160906140055.GA6167@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-30 10:54:03 +02:00
Oleg Nesterov
eaf9ef5224 sched/wait: Avoid abort_exclusive_wait() in __wait_on_bit_lock()
__wait_on_bit_lock() doesn't need abort_exclusive_wait() too. Right
now it can't use prepare_to_wait_event() (see the next change), but
it can do the additional finish_wait() if action() fails.

abort_exclusive_wait() no longer has callers, remove it.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Neil Brown <neilb@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160906140053.GA6164@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-30 10:54:03 +02:00
Oleg Nesterov
b1ea06a90f sched/wait: Avoid abort_exclusive_wait() in ___wait_event()
___wait_event() doesn't really need abort_exclusive_wait(), we can simply
change prepare_to_wait_event() to remove the waiter from q->task_list if
it was interrupted.

This simplifies the code/logic, and this way prepare_to_wait_event() can
have more users, see the next change.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Neil Brown <neilb@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160908164815.GA18801@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
--
 include/linux/wait.h |    7 +------
 kernel/sched/wait.c  |   35 +++++++++++++++++++++++++----------
 2 files changed, 26 insertions(+), 16 deletions(-)
2016-09-30 10:53:44 +02:00
Oleg Nesterov
38a3e1fc1d sched/wait: Fix abort_exclusive_wait(), it should pass TASK_NORMAL to wake_up()
Otherwise this logic only works if mode is "compatible" with another
exclusive waiter.

If some wq has both TASK_INTERRUPTIBLE and TASK_UNINTERRUPTIBLE waiters,
abort_exclusive_wait() won't wait an uninterruptible waiter.

The main user is __wait_on_bit_lock() and currently it is fine but only
because TASK_KILLABLE includes TASK_UNINTERRUPTIBLE and we do not have
lock_page_interruptible() yet.

Just use TASK_NORMAL and remove the "mode" arg from abort_exclusive_wait().
Yes, this means that (say) wake_up_interruptible() can wake up the non-
interruptible waiter(s), but I think this is fine. And in fact I think
that abort_exclusive_wait() must die, see the next change.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Neil Brown <neilb@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160906140047.GA6157@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-30 10:53:19 +02:00
Ingo Molnar
536e0e81e0 Merge branch 'linus' into sched/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-30 10:44:27 +02:00
Maciej Żenczykowski
bd11f0741f ipv6 addrconf: implement RFC7559 router solicitation backoff
This implements:
  https://tools.ietf.org/html/rfc7559

Backoff is performed according to RFC3315 section 14:
  https://tools.ietf.org/html/rfc3315#section-14

We allow setting /proc/sys/net/ipv6/conf/*/router_solicitations
to a negative value meaning an unlimited number of retransmits,
and we make this the new default (inline with the RFC).

We also add a new setting:
  /proc/sys/net/ipv6/conf/*/router_solicitation_max_interval
defaulting to 1 hour (per RFC recommendation).

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Acked-by: Erik Kline <ek@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-30 01:54:28 -04:00
Niklas Söderlund
3757dc48a6 dma-mapping: fix m32r build warning
kbuild test robot reports:

   In file included from include/linux/skbuff.h:34:0,
                    from include/linux/icmpv6.h:4,
                    from include/linux/ipv6.h:75,
                    from include/net/ipv6.h:16,
                    from include/linux/sunrpc/clnt.h:27,
                    from include/linux/nfs_fs.h:30,
                    from fs/lockd/clntlock.c:13:
   include/linux/dma-mapping.h: In function 'dma_map_resource':
>> include/linux/dma-mapping.h:274:16: warning: unused variable 'pfn' [-Wunused-variable]
     unsigned long pfn = __phys_to_pfn(phys_addr);
                   ^~~

The pfn value is only used once in the call to pfn_valid(), remove the
variable and calculate the pfn when it's needed. Note that the kbuild
report is old and PHYS_PFN() is now used instead of __phys_to_pfn() to
calculate the pfn.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-09-29 17:38:02 +05:30
Niklas Söderlund
2895e1f804 dma-mapping: fix ia64 build, use PHYS_PFN
kbuild test robot reports:

   In file included from include/linux/skbuff.h:34:0,
                    from include/linux/tcp.h:21,
                    from drivers/net/ethernet/amd/xgbe/xgbe-drv.c:119:
   include/linux/dma-mapping.h: In function 'dma_map_resource':
>> include/linux/dma-mapping.h:274:22: error: implicit declaration of function '__phys_to_pfn' [-Werror=implicit-function-declaration]
     unsigned long pfn = __phys_to_pfn(phys_addr);
                         ^~~~~~~~~~~~~

ia64 does not provide __phys_to_pfn(), use the PHYS_PFN() alias.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-09-29 17:38:02 +05:30
Colin Ian King
6c3f70ac7c HID: add missing \n to end of dev_warn messages
Trival fix, dev_warn messages are missing a \n, so add it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2016-09-29 10:40:13 +02:00
Josef Bacik
484611357c bpf: allow access into map value arrays
Suppose you have a map array value that is something like this

struct foo {
	unsigned iter;
	int array[SOME_CONSTANT];
};

You can easily insert this into an array, but you cannot modify the contents of
foo->array[] after the fact.  This is because we have no way to verify we won't
go off the end of the array at verification time.  This patch provides a start
for this work.  We accomplish this by keeping track of a minimum and maximum
value a register could be while we're checking the code.  Then at the time we
try to do an access into a MAP_VALUE we verify that the maximum offset into that
region is a valid access into that memory region.  So in practice, code such as
this

unsigned index = 0;

if (foo->iter >= SOME_CONSTANT)
	foo->iter = index;
else
	index = foo->iter++;
foo->array[index] = bar;

would be allowed, as we can verify that index will always be between 0 and
SOME_CONSTANT-1.  If you wish to use signed values you'll have to have an extra
check to make sure the index isn't less than 0, or do something like index %=
SOME_CONSTANT.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-29 01:35:35 -04:00
Andrey Smirnov
2481366afd dma-mapping.h: preserve unmap info for CONFIG_DMA_API_DEBUG
When CONFIG_DMA_API_DEBUG is enabled we need to preserve unmapping address
even if "unmap" is a no-op for our architecutre because we need
debug_dma_unmap_page() to correctly cleanup all of the debug bookkeeping.
Failing to do so results in a false positive warnings about previously
mapped areas never being unmapped.

Link: http://lkml.kernel.org/r/1474387125-3713-1-git-send-email-andrew.smirnov@gmail.com
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Cc: "Luis R. Rodriguez" <mcgrof@suse.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Geliang Tang <geliangtang@163.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-28 16:19:01 -07:00
Linus Walleij
ac2a8bca03 Merge branch 'ib-move-htc-egpio' into devel 2016-09-28 09:30:21 -07:00
Linus Walleij
3c6e8d05d6 mfd/gpio: Move HTC GPIO driver to GPIO subsystem
The HTC GPIO driver is a pure GPIO driver and I just can not
see what it is doing inside MFD. Let's just move it to GPIO
and take this opportunity to move the platform data to
<linux/platform_data/gpio-htc-egpio.h>

Cc: arm@kernel.org
Cc: Russell King <linux@armlinux.org.uk>
Acked-by: Lee Jones <lee.jones@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-09-28 09:28:34 -07:00
Aleksey Makarov
ad1696f6f0 ACPI: parse SPCR and enable matching console
'ARM Server Base Boot Requiremets' [1] mentions SPCR (Serial Port
Console Redirection Table) [2] as a mandatory ACPI table that
specifies the configuration of serial console.

Defer initialization of DT earlycon until ACPI/DT decision is made.

Parse the ACPI SPCR table, setup earlycon if required,
enable specified console.

Thanks to Peter Hurley for explaining how this should work.

[1] http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0044a/index.html
[2] https://msdn.microsoft.com/en-us/library/windows/hardware/dn639132(v=vs.85).aspx

Signed-off-by: Aleksey Makarov <aleksey.makarov@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Peter Hurley <peter@hurleysoftware.com>
Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Tested-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-28 17:46:46 +02:00
Leif Lindholm
d503187b6c of/serial: move earlycon early_param handling to serial
We have multiple "earlycon" early_param handlers - merge the DT one into
the main earlycon one.  It's a cleanup that also will be useful
to defer setting up DT console until ACPI/DT decision is made.

Rename the exported function to avoid clashing with the function from
arch/microblaze/kernel/prom.c

Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Signed-off-by: Aleksey Makarov <aleksey.makarov@linaro.org>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Peter Hurley <peter@hurleysoftware.com>
Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Tested-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-28 17:43:15 +02:00
Peter Ujfalusi
a8db115e47 dmaengine/ARM: omap-dma: Fix the DMAengine compile test on non OMAP configs
The DMAengine driver for omap-dma use three function calls from the
plat-omap legacy driver. When the DMAengine driver is built when ARCH_OMAP
is not set, the compilation will fail due to missing symbols.
Add empty inline functions to allow the DMAengine driver to be compiled
with COMPILE_TEST.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-09-28 08:53:14 +05:30
Dave Airlie
ca09fb9f60 Merge tag 'v4.8-rc8' into drm-next
Linux 4.8-rc8

There was a lot of fallout in the imx/amdgpu/i915 drivers, so backmerge
it now to avoid troubles.

* tag 'v4.8-rc8': (1442 commits)
  Linux 4.8-rc8
  fault_in_multipages_readable() throws set-but-unused error
  mm: check VMA flags to avoid invalid PROT_NONE NUMA balancing
  radix tree: fix sibling entry handling in radix_tree_descend()
  radix tree test suite: Test radix_tree_replace_slot() for multiorder entries
  fix memory leaks in tracing_buffers_splice_read()
  tracing: Move mutex to protect against resetting of seq data
  MIPS: Fix delay slot emulation count in debugfs
  MIPS: SMP: Fix possibility of deadlock when bringing CPUs online
  mm: delete unnecessary and unsafe init_tlb_ubc()
  huge tmpfs: fix Committed_AS leak
  shmem: fix tmpfs to handle the huge= option properly
  blk-mq: skip unmapped queues in blk_mq_alloc_request_hctx
  MIPS: Fix pre-r6 emulation FPU initialisation
  arm64: kgdb: handle read-only text / modules
  arm64: Call numa_store_cpu_info() earlier.
  locking/hung_task: Fix typo in CONFIG_DETECT_HUNG_TASK help text
  nvme-rdma: only clear queue flags after successful connect
  i2c: qup: skip qup_i2c_suspend if the device is already runtime suspended
  perf/core: Limit matching exclusive events to one PMU
  ...
2016-09-28 12:08:49 +10:00
Andreas Gruenbacher
bc8bcf3b15 posix_acl: uapi header split
Export the base definitions and the xattr representation of POSIX ACLs
to user space.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-27 21:52:00 -04:00
Andreas Gruenbacher
2211d5ba5c posix_acl: xattr representation cleanups
Remove the unnecessary typedefs and the zero-length a_entries array in
struct posix_acl_xattr_header.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-27 21:52:00 -04:00