Commit Graph

53479 Commits

Author SHA1 Message Date
Al Viro
0e0162bb8c Merge branch 'ovl-fixes' into for-linus
Backmerge to resolve a conflict in ovl_lookup_real();
"ovl_lookup_real(): use lookup_one_len_unlocked()" instead,
but it was too late in the cycle to rebase.
2016-05-17 02:17:59 -04:00
Vinod Koul
f9114a54c1 Merge branch 'topic/xilinx' into for-linus 2016-05-17 10:15:34 +05:30
Vinod Koul
4dc50060c9 Merge branch 'topic/pl08x' into for-linus 2016-05-17 10:14:59 +05:30
Vinod Koul
56214883c5 Merge branch 'topic/dw' into for-linus 2016-05-17 10:14:16 +05:30
Linus Torvalds
16490980e3 Merge tag 'device-properties-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull device properties update from Rafael Wysocki:
 "Generic device properties framework update.

  Just one commit reworking the handling of built-in properties
  initialization and updating a few drivers in accordance with the core
  framework changes"

* tag 'device-properties-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  device property: don't bother the drivers with struct property_set
2016-05-16 19:51:04 -07:00
Linus Torvalds
46c1345062 Merge tag 'acpi-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
 "The new features here are ACPI 6.1 support (and some previously
  missing bits of ACPI 6.0 support) in ACPICA and two new drivers, a
  driver for the ACPI Generic Event Device (GED) feature introduced by
  ACPI 6.1 and the INT3406 thermal driver for display thermal
  management.  Also the value returned by the _HRV (hardware revision)
  ACPI object will be exported to user space via sysfs now.

  In addition to that, ACPI on ARM64 will not depend on EXPERT any more.

  The rest is mostly fixes and cleanups and some code reorganization.

  Specifics:

   - In-kernel ACPICA code update to the upstream release 20160422
     adding support for ACPI 6.1 along with some previously missing bits
     of ACPI 6.0 support, making a fair amount of fixes and cleanups and
     reducing divergences between the upstream ACPICA and the in-kernel
     code (Bob Moore, Lv Zheng, Al Stone, Aleksey Makarov, Will Miles)

   - ACPI Generic Event Device (GED) support and a fix for it (Sinan
     Kaya, Paul Gortmaker)

   - INT3406 thermal driver for display thermal management and ACPI
     backlight support code reorganization related to it (Aaron Lu, Arnd
     Bergmann)

   - Support for exporting the value returned by the _HRV (hardware
     revision) ACPI object via sysfs (Betty Dall)

   - Removal of the EXPERT dependency for ACPI on ARM64 (Mark Brown)

   - Rework of the handling of ACPI _OSI mechanism allowing the
     _OSI("Darwin") support to be overridden from the kernel command
     line among other things (Lv Zheng, Chen Yu)

   - Rework of the ACPI tables override mechanism to prepare it for the
     introduction of overlays support going forward (Lv Zheng, Rafael
     Wysocki)

   - Fixes related to the ECDT support and module-level execution of AML
     (Lv Zheng)

   - ACPI PCI interrupts management update to make it work better on
     ARM64 mostly (Sinan Kaya)

   - ACPI SRAT handling update to make the code process all entires in
     the table order regardless of the entry type (Lukasz Anaczkowski)

   - EFI power off support for full-hardware ACPI platforms that don't
     support ACPI S5 (Chen Yu)

   - Fixes and cleanups related to the ACPI core's sysfs interface (Dan
     Carpenter, Betty Dall)

   - acpi_dev_present() API rework to reduce possible confusion related
     to it (Lukas Wunner)

   - Removal of CLK_IS_ROOT from two ACPI drivers (Stephen Boyd)"

* tag 'acpi-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (82 commits)
  ACPI / video: mark acpi_video_get_levels() inline
  Thermal / ACPI / video: add INT3406 thermal driver
  ACPI / GED: make evged.c explicitly non-modular
  ACPI / tables: Fix DSDT override mechanism
  ACPI / sysfs: fix error code in get_status()
  ACPICA: Update version to 20160422
  ACPICA: Move all ASCII utilities to a common file
  ACPICA: ACPI 2.0, Hardware: Add access_width/bit_offset support for acpi_hw_write()
  ACPICA: ACPI 2.0, Hardware: Add access_width/bit_offset support in acpi_hw_read()
  ACPICA: Executer: Introduce a set of macros to handle bit width mask generation
  ACPICA: Hardware: Add optimized access bit width support
  ACPICA: Utilities: Add ACPI_IS_ALIGNED() macro
  ACPICA: Renamed some #defined flag constants for clarity
  ACPICA: ACPI 6.0, tools/iasl: Add support for new resource descriptors
  ACPICA: ACPI 6.0: Update _BIX support for new package element
  ACPICA: ACPI 6.1: Support for new PCCT subtable
  ACPICA: Refactor evaluate_object to reduce nesting
  ACPICA: Divergence: remove unwanted spaces for typedef
  ACPI,PCI,IRQ: remove SCI penalize function
  ACPI,PCI,IRQ: remove redundant code in acpi_irq_penalty_init()
  ..
2016-05-16 19:41:41 -07:00
Linus Torvalds
d57d394319 Merge tag 'pm-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
 "The majority of changes go into the cpufreq subsystem this time.

  To me, quite obviously, the biggest ticket item is the new "schedutil"
  governor.  Interestingly enough, it's the first new cpufreq governor
  since the beginning of the git era (except for some out-of-the-tree
  ones).

  There are two main differences between it and the existing governors.
  First, it uses the information provided by the scheduler directly for
  making its decisions, so it doesn't have to track anything by itself.
  Second, it can invoke drivers (supporting that feature) to adjust CPU
  performance right away without having to spawn work items to be
  executed in process context or similar.  Currently, the acpi-cpufreq
  driver is the only one supporting that mode of operation, but then it
  is used on a large number of systems.

  The "schedutil" governor as included here is very simple and mostly
  regarded as a foundation for future work on the integration of the
  scheduler with CPU power management (in fact, there is work in
  progress on top of it already).  Nevertheless it works and the
  preliminary results obtained with it are encouraging.

  There also is some consolidation of CPU frequency management for ARM
  platforms that can add their machine IDs the the new stub dt-platdev
  driver now and that will take care of creating the requisite platform
  device for cpufreq-dt, so it is not necessary to do that in platform
  code any more.  Several ARM platforms are switched over to using this
  generic mechanism.

  In addition to that, the intel_pstate driver is now going to respect
  CPU frequency limits set by the platform firmware (or a BMC) and
  provided via the ACPI _PPC object.

  The devfreq subsystem is getting a new "passive" governor for SoCs
  subsystems that will depend on somebody else to manage their voltage
  rails and its support for Samsung Exynos SoCs is consolidated.

  The rest is support for new hardware (Intel Broxton support in
  intel_idle for one example), bug fixes, optimizations and cleanups in
  a number of places.

  Specifics:

   - New cpufreq "schedutil" governor (making decisions based on CPU
     utilization information provided by the scheduler and capable of
     switching CPU frequencies right away if the underlying driver
     supports that) and support for fast frequency switching in the
     acpi-cpufreq driver (Rafael Wysocki)

   - Consolidation of CPU frequency management on ARM platforms allowing
     them to get rid of some platform-specific boilerplate code if they
     are going to use the cpufreq-dt driver (Viresh Kumar, Finley Xiao,
     Marc Gonzalez)

   - Support for ACPI _PPC and CPU frequency limits in the intel_pstate
     driver (Srinivas Pandruvada)

   - Fixes and cleanups in the cpufreq core and generic governor code
     (Rafael Wysocki, Sai Gurrappadi)

   - intel_pstate driver optimizations and cleanups (Rafael Wysocki,
     Philippe Longepe, Chen Yu, Joe Perches)

   - cpufreq powernv driver fixes and cleanups (Akshay Adiga, Shilpasri
     Bhat)

   - cpufreq qoriq driver fixes and cleanups (Jia Hongtao)

   - ACPI cpufreq driver cleanups (Viresh Kumar)

   - Assorted cpufreq driver updates (Ashwin Chaugule, Geliang Tang,
     Javier Martinez Canillas, Paul Gortmaker, Sudeep Holla)

   - Assorted cpufreq fixes and cleanups (Joe Perches, Arnd Bergmann)

   - Fixes and cleanups in the OPP (Operating Performance Points)
     framework, mostly related to OPP sharing, and reorganization of
     OF-dependent code in it (Viresh Kumar, Arnd Bergmann, Sudeep Holla)

   - New "passive" governor for devfreq (for SoC subsystems that will
     rely on someone else for the management of their power resources)
     and consolidation of devfreq support for Exynos platforms, coding
     style and typo fixes for devfreq (Chanwoo Choi, MyungJoo Ham)

   - PM core fixes and cleanups, mostly to make it work better with the
     generic power domains (genpd) framework, and updates for that
     framework (Ulf Hansson, Thierry Reding, Colin Ian King)

   - Intel Broxton support for the intel_idle driver (Len Brown)

   - cpuidle core optimization and fix (Daniel Lezcano, Dave Gerlach)

   - ARM cpuidle cleanups (Jisheng Zhang)

   - Intel Kabylake support for the RAPL power capping driver (Jacob
     Pan)

   - AVS (Adaptive Voltage Switching) rockchip-io driver update (Heiko
     Stuebner)

   - Updates for the cpupower tool (Arjun Sreedharan, Colin Ian King,
     Mattia Dongili, Thomas Renninger)"

* tag 'pm-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (112 commits)
  intel_pstate: Clean up get_target_pstate_use_performance()
  intel_pstate: Use sample.core_avg_perf in get_avg_pstate()
  intel_pstate: Clarify average performance computation
  intel_pstate: Avoid unnecessary synchronize_sched() during initialization
  cpufreq: schedutil: Make default depend on CONFIG_SMP
  cpufreq: powernv: del_timer_sync when global and local pstate are equal
  cpufreq: powernv: Move smp_call_function_any() out of irq safe block
  intel_pstate: Clean up intel_pstate_get()
  cpufreq: schedutil: Make it depend on CONFIG_SMP
  cpufreq: governor: Fix handling of special cases in dbs_update()
  PM / OPP: Move CONFIG_OF dependent code in a separate file
  cpufreq: intel_pstate: Ignore _PPC processing under HWP
  cpufreq: arm_big_little: use generic OPP functions for {init, free}_opp_table
  PM / OPP: add non-OF versions of dev_pm_opp_{cpumask_, }remove_table
  cpufreq: tango: Use generic platdev driver
  PM / OPP: pass cpumask by reference
  cpufreq: Fix GOV_LIMITS handling for the userspace governor
  cpupower: fix potential memory leak
  PM / devfreq: style/typo fixes
  PM / devfreq: exynos: Add the detailed correlation for Exynos5422 bus
  ..
2016-05-16 19:17:22 -07:00
Arnaldo Carvalho de Melo
c85b033496 perf core: Separate accounting of contexts and real addresses in a stack trace
The perf_sample->ip_callchain->nr value includes all the entries in the
ip_callchain->ip[] array, real addresses and PERF_CONTEXT_{KERNEL,USER,etc},
while what the user expects is that what is in the kernel.perf_event_max_stack
sysctl or in the upcoming per event perf_event_attr.sample_max_stack knob be
honoured in terms of IP addresses in the stack trace.

So allocate a bunch of extra entries for contexts, and do the accounting
via perf_callchain_entry_ctx struct members.

A new sysctl, kernel.perf_event_max_contexts_per_stack is also
introduced for investigating possible bugs in the callchain
implementation by some arch.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/n/tip-3b4wnqk340c4sg4gwkfdi9yk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-16 23:11:53 -03:00
Arnaldo Carvalho de Melo
3e4de4ec4c perf core: Add perf_callchain_store_context() helper
We need have different helpers to account how many contexts we have in
the sample and for real addresses, so do it now as a prep patch, to
ease review.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-q964tnyuqrxw5gld18vizs3c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-16 23:11:52 -03:00
Arnaldo Carvalho de Melo
3b1fff0803 perf core: Add a 'nr' field to perf_event_callchain_context
We will use it to count how many addresses are in the entry->ip[] array,
excluding PERF_CONTEXT_{KERNEL,USER,etc} entries, so that we can really
return the number of entries specified by the user via the relevant
sysctl, kernel.perf_event_max_contexts, or via the per event
perf_event_attr.sample_max_stack knob.

This way we keep the perf_sample->ip_callchain->nr meaning, that is the
number of entries, be it real addresses or PERF_CONTEXT_ entries, while
honouring the max_stack knobs, i.e. the end result will be max_stack
entries if we have at least that many entries in a given stack trace.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-s8teto51tdqvlfhefndtat9r@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-16 23:11:51 -03:00
Arnaldo Carvalho de Melo
cfbcf46845 perf core: Pass max stack as a perf_callchain_entry context
This makes perf_callchain_{user,kernel}() receive the max stack
as context for the perf_callchain_entry, instead of accessing
the global sysctl_perf_event_max_stack.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-16 23:11:50 -03:00
Linus Torvalds
3e21e5dda4 Merge tag 'mmc-v4.7' of git://git.linaro.org/people/ulf.hansson/mmc
Pull MMC updates from Ulf Hansson:
 "MMC core:
   - Add TRACE support to be able to debug request flow
   - Extend/improve reset support for (e)MMC
   - Convert MMC pwrseq to platform device drivers
   - Use IDA for indexes
   - Some additional minor improvements

  MMC host:
   - sdhci: Re-factoring, clean-ups and improvements
   - sdhci-acpi|pci: Use MMC_CAP_AGGRESSIVE_PM for Broxton
   - omap/omap_hsmmc: Convert to use dma_request_chan()
   - usdhi6rol0: Add support for UHS modes
   - sh_mmcif: Update runtime PM support
   - tmio: Wolfram Sang steps in as maintainer
   - tmio: Add UHS-I mode support
   - sh_mobile_sdhi: Add UHS-I mode support
   - tmio/sdhi: Re-factoring, clean-ups and improvements
   - dw_mmc: Re-factoring and clean-ups
   - davinci: Convert to use dma_request_chan()"

* tag 'mmc-v4.7' of git://git.linaro.org/people/ulf.hansson/mmc: (99 commits)
  mmc: mmc: Fix partition switch timeout for some eMMCs
  mmc: sh_mobile_sdhi: enable SDIO IRQs for RCar Gen3
  mmc: sdio: fall back to SDIO 1.0 for broken 1.1 cards
  mmc: sdhci-st: correct name of sd-uhs-sdr50 property
  MAINTAINERS: update entry for TMIO MMC driver
  mmc: block: improve logging of handling emmc timeouts
  mmc: sdhci: removed unneeded function wrappers
  mmc: core: remove the invalid message in mmc_select_timing
  mmc: core: fix using wrong io voltage if mmc_select_hs200 fails
  mmc: sdhci-of-arasan: fix set_clock when a phy is supported
  mmc: omap: Use dma_request_chan() for requesting DMA channel
  mmc: mmc: Attempt to flush cache before reset
  mmc: sh_mobile_sdhi: check return value when changing clk
  mmc: sh_mobile_sdhi: only change the clock on RCar Gen2+
  mmc: tmio/sdhi: introduce flag for RCar 2+ specific features
  mmc: sh_mobile_sdhi: make clk_update function more compact
  mmc: omap_hsmmc: Use dma_request_chan() for requesting DMA channel
  mmc: sdhci-of-at91: add presets setup
  mmc: usdhi6rol0: add pinctrl to set pin drive strength
  mmc: usdhi6rol0: add support for UHS modes
  ...
2016-05-16 19:10:40 -07:00
Linus Torvalds
d9dce51c9b Merge tag 'regulator-v4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator updates from Mark Brown:
 "A few core enhancements to deal with some of the slightly more
  complicated edge cases that have started cropping up in systems, both
  new ones and old ones that people started worrying about upstream, but
  otherwise a quiet release for the regulator API:

   - When applying constraints at system image if we have a voltage
     range specified and the regulator is currently configured outside
     the bounds of that range bring the regulator to the nearest end of
     the range.

   - When regulators are in non-regulating bypass modes make sure that
     we always use the voltage from the parent regulator.

   - Support for LP873x, PV88080, PM8894 and FAN53555 chips"

* tag 'regulator-v4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (71 commits)
  regulator: rk808: Migrate to regulator core's simplified DT parsing code
  regulator: lp873x: Add support for lp873x PMIC regulators
  regulator: tps65917/palmas: Simplify multiple dereference of match->of_node
  regulator: tps65917/palmas: Handle possible memory allocation failure
  regulator: tps65917/palmas: Simplify multiple dereference of pdata->reg_init[idx]
  regulator: tps65917/palmas: Simplify multiple dereference of ddata->palmas_matches[idx]
  regulator: pwm: Use pwm_get_args() where appropriate
  pwm: Introduce the pwm_args concept
  regulator: max77686: Configure enable time to properly handle regulator enable
  regulator: rk808: Add rk808_reg_ops_ranges for LDO3
  regulator: core: Add early supply resolution for regulators
  regulator: axp20x: Fix axp22x ldo_io voltage ranges
  regulator: tps65917/palmas: Add bypass "On" value
  regulator: rk808: remove unused rk808_reg_ops_ranges
  regulator: refactor valid_ops_mask checking code
  regulator: rk808: remove linear range definitions with a single range
  regulator: max77620: Add support for device specific ramp rate setting
  regulator: max77620: Add details of device specific ramp rate setting
  regulator: helpers: Ensure bypass register field matches ON value
  regulator: core: Move registration of regulator device
  ...
2016-05-16 19:04:53 -07:00
Linus Torvalds
490e142209 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds
Pull LED updates from Jacek Anaszewski:
 "In this merge cycle we had an interaction with MTD subsystem, that
  included converting drivers/mtd/nand/nand_base.c to use newly
  introduced MTD (NAND/NOR) LED trigger instead of implementing it on
  its own.

  Related MTD patches are intended to be merged through the LED tree,
  before MTD tree is merged, since further MTD development is based on
  those modifications.

  Summary:

  LEDs:
   - Introduce a kernel panic LED trigger

   - Introduce a MTD (NAND/NOR) trigger

   - led-tca6507: silence an uninitialized variable warning

   - ledtrig-ide-disk: Move ide_blink_delay to ledtrig_ide_activity()

   - leds-ss4200: Add depend on x86 arch

   - leds-ss4200: add DMI data for FSC SCALEO Home Server

   - leds-triggers: Allow to switch the trigger to "panic" on a kernel panic

   - devicetree: leds: Introduce "panic-indicator" optional property

   - leds-gpio: Support the "panic-indicator" firmware property

  MTD:
   - Uninline mtd_write_oob and move it to mtdcore.c

   - Remove the "nand-disk" LED trigger

   - Hook I/O activity to the MTD LED trigger"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
  leds: gpio: Support the "panic-indicator" firmware property
  devicetree: leds: Introduce "panic-indicator" optional property
  leds: triggers: Allow to switch the trigger to "panic" on a kernel panic
  leds: ss4200: add DMI data for FSC SCALEO Home Server
  leds: ss4200: Add depend on x86 arch
  leds: ledtrig-ide-disk: Move ide_blink_delay to ledtrig_ide_activity()
  leds: tca6507: silence an uninitialized variable warning
  mtd: Hook I/O activity to the MTD LED trigger
  mtd: nand: Remove the "nand-disk" LED trigger
  leds: trigger: Introduce a MTD (NAND/NOR) trigger
  mtd: Uninline mtd_write_oob and move it to mtdcore.c
  leds: trigger: Introduce a kernel panic LED trigger
2016-05-16 18:37:06 -07:00
Linus Torvalds
b6ae4055f4 Merge tag 'arm64-perf' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 perf updates from Will Deacon:
 "The main addition here is support for Broadcom's Vulcan core using the
  architected ID registers for discovering supported events.

   - Support for the PMU in Broadcom's Vulcan CPU

   - Dynamic event detection using the PMCEIDn_EL0 ID registers"

* tag 'arm64-perf' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: perf: don't expose CHAIN event in sysfs
  arm64/perf: Add Broadcom Vulcan PMU support
  arm64/perf: Filter common events based on PMCEIDn_EL0
  arm64/perf: Access pmu register using <read/write>_sys_reg
  arm64/perf: Define complete ARMv8 recommended implementation defined events
  arm64/perf: Changed events naming as per the ARM ARM
  arm64: dts: Add Broadcom Vulcan PMU in dts
  Documentation: arm64: pmu: Add Broadcom Vulcan PMU binding
2016-05-16 17:39:29 -07:00
Linus Torvalds
be092017b6 Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:

 - virt_to_page/page_address optimisations

 - support for NUMA systems described using device-tree

 - support for hibernate/suspend-to-disk

 - proper support for maxcpus= command line parameter

 - detection and graceful handling of AArch64-only CPUs

 - miscellaneous cleanups and non-critical fixes

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (92 commits)
  arm64: do not enforce strict 16 byte alignment to stack pointer
  arm64: kernel: Fix incorrect brk randomization
  arm64: cpuinfo: Missing NULL terminator in compat_hwcap_str
  arm64: secondary_start_kernel: Remove unnecessary barrier
  arm64: Ensure pmd_present() returns false after pmd_mknotpresent()
  arm64: Replace hard-coded values in the pmd/pud_bad() macros
  arm64: Implement pmdp_set_access_flags() for hardware AF/DBM
  arm64: Fix typo in the pmdp_huge_get_and_clear() definition
  arm64: mm: remove unnecessary EXPORT_SYMBOL_GPL
  arm64: always use STRICT_MM_TYPECHECKS
  arm64: kvm: Fix kvm teardown for systems using the extended idmap
  arm64: kaslr: increase randomization granularity
  arm64: kconfig: drop CONFIG_RTC_LIB dependency
  arm64: make ARCH_SUPPORTS_DEBUG_PAGEALLOC depend on !HIBERNATION
  arm64: hibernate: Refuse to hibernate if the boot cpu is offline
  arm64: kernel: Add support for hibernate/suspend-to-disk
  PM / Hibernate: Call flush_icache_range() on pages restored in-place
  arm64: Add new asm macro copy_page
  arm64: Promote KERNEL_START/KERNEL_END definitions to a header file
  arm64: kernel: Include _AC definition in page.h
  ...
2016-05-16 17:17:24 -07:00
Jan Kara
02fbd13975 dax: Remove complete_unwritten argument
Fault handlers currently take complete_unwritten argument to convert
unwritten extents after PTEs are updated. However no filesystem uses
this anymore as the code is racy. Remove the unused argument.

Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2016-05-16 18:11:51 -06:00
NeilBrown
e4b2749158 DAX: move RADIX_DAX_ definitions to dax.c
These don't belong in radix-tree.c any more than PAGECACHE_TAG_* do.
Let's try to maintain the idea that radix-tree simply implements an
abstract data type.

Acked-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2016-05-16 18:11:51 -06:00
Linus Torvalds
9a45f036af Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 boot updates from Ingo Molnar:
 "The biggest changes in this cycle were:

   - prepare for more KASLR related changes, by restructuring, cleaning
     up and fixing the existing boot code.  (Kees Cook, Baoquan He,
     Yinghai Lu)

   - simplifly/concentrate subarch handling code, eliminate
     paravirt_enabled() usage.  (Luis R Rodriguez)"

* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
  x86/KASLR: Clarify purpose of each get_random_long()
  x86/KASLR: Add virtual address choosing function
  x86/KASLR: Return earliest overlap when avoiding regions
  x86/KASLR: Add 'struct slot_area' to manage random_addr slots
  x86/boot: Add missing file header comments
  x86/KASLR: Initialize mapping_info every time
  x86/boot: Comment what finalize_identity_maps() does
  x86/KASLR: Build identity mappings on demand
  x86/boot: Split out kernel_ident_mapping_init()
  x86/boot: Clean up indenting for asm/boot.h
  x86/KASLR: Improve comments around the mem_avoid[] logic
  x86/boot: Simplify pointer casting in choose_random_location()
  x86/KASLR: Consolidate mem_avoid[] entries
  x86/boot: Clean up pointer casting
  x86/boot: Warn on future overlapping memcpy() use
  x86/boot: Extract error reporting functions
  x86/boot: Correctly bounds-check relocations
  x86/KASLR: Clean up unused code from old 'run_size' and rename it to 'kernel_total_size'
  x86/boot: Fix "run_size" calculation
  x86/boot: Calculate decompression size during boot not build
  ...
2016-05-16 15:54:01 -07:00
Linus Torvalds
825a3b2605 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:

 - massive CPU hotplug rework (Thomas Gleixner)

 - improve migration fairness (Peter Zijlstra)

 - CPU load calculation updates/cleanups (Yuyang Du)

 - cpufreq updates (Steve Muckle)

 - nohz optimizations (Frederic Weisbecker)

 - switch_mm() micro-optimization on x86 (Andy Lutomirski)

 - ... lots of other enhancements, fixes and cleanups.

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (66 commits)
  ARM: Hide finish_arch_post_lock_switch() from modules
  sched/core: Provide a tsk_nr_cpus_allowed() helper
  sched/core: Use tsk_cpus_allowed() instead of accessing ->cpus_allowed
  sched/loadavg: Fix loadavg artifacts on fully idle and on fully loaded systems
  sched/fair: Correct unit of load_above_capacity
  sched/fair: Clean up scale confusion
  sched/nohz: Fix affine unpinned timers mess
  sched/fair: Fix fairness issue on migration
  sched/core: Kill sched_class::task_waking to clean up the migration logic
  sched/fair: Prepare to fix fairness problems on migration
  sched/fair: Move record_wakee()
  sched/core: Fix comment typo in wake_q_add()
  sched/core: Remove unused variable
  sched: Make hrtick_notifier an explicit call
  sched/fair: Make ilb_notifier an explicit call
  sched/hotplug: Make activate() the last hotplug step
  sched/hotplug: Move migration CPU_DYING to sched_cpu_dying()
  sched/migration: Move CPU_ONLINE into scheduler state
  sched/migration: Move calc_load_migrate() into CPU_DYING
  sched/migration: Move prepare transition to SCHED_STARTING state
  ...
2016-05-16 14:47:16 -07:00
Linus Torvalds
36db171cc7 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Bigger kernel side changes:

   - Add backwards writing capability to the perf ring-buffer code,
     which is preparation for future advanced features like robust
     'overwrite support' and snapshot mode.  (Wang Nan)

   - Add pause and resume ioctls for the perf ringbuffer (Wang Nan)

   - x86 Intel cstate code cleanups and reorgnization (Thomas Gleixner)

   - x86 Intel uncore and CPU PMU driver updates (Kan Liang, Peter
     Zijlstra)

   - x86 AUX (Intel PT) related enhancements and updates (Alexander
     Shishkin)

   - x86 MSR PMU driver enhancements and updates (Huang Rui)

   - ... and lots of other changes spread out over 40+ commits.

  Biggest tooling side changes:

   - 'perf trace' features and enhancements.  (Arnaldo Carvalho de Melo)

   - BPF tooling updates (Wang Nan)

   - 'perf sched' updates (Jiri Olsa)

   - 'perf probe' updates (Masami Hiramatsu)

   - ... plus 200+ other enhancements, fixes and cleanups to tools/

  The merge commits, the shortlog and the changelogs contain a lot more
  details"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (249 commits)
  perf/core: Disable the event on a truncated AUX record
  perf/x86/intel/pt: Generate PMI in the STOP region as well
  perf buildid-cache: Use lsdir() for looking up buildid caches
  perf symbols: Use lsdir() for the search in kcore cache directory
  perf tools: Use SBUILD_ID_SIZE where applicable
  perf tools: Fix lsdir to set errno correctly
  perf trace: Move seccomp args beautifiers to tools/perf/trace/beauty/
  perf trace: Move flock op beautifier to tools/perf/trace/beauty/
  perf build: Add build-test for debug-frame on arm/arm64
  perf build: Add build-test for libunwind cross-platforms support
  perf script: Fix export of callchains with recursion in db-export
  perf script: Fix callchain addresses in db-export
  perf script: Fix symbol insertion behavior in db-export
  perf symbols: Add dso__insert_symbol function
  perf scripting python: Use Py_FatalError instead of die()
  perf tools: Remove xrealloc and ALLOC_GROW
  perf help: Do not use ALLOC_GROW in add_cmd_list
  perf pmu: Make pmu_formats_string to check return value of strbuf
  perf header: Make topology checkers to check return value of strbuf
  perf tools: Make alias handler to check return value of strbuf
  ...
2016-05-16 14:08:43 -07:00
Linus Torvalds
3469d261ea Merge branch 'locking-rwsem-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull support for killable rwsems from Ingo Molnar:
 "This, by Michal Hocko, implements down_write_killable().

  The main usecase will be to update mm_sem usage sites to use this new
  API, to allow the mm-reaper introduced in commit aac4536355 ("mm,
  oom: introduce oom reaper") to tear down oom victim address spaces
  asynchronously with minimum latencies and without deadlock worries"

[ The vfs will want it too as the inode lock is changed from a mutex to
  a rwsem due to the parallel lookup and readdir updates ]

* 'locking-rwsem-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/rwsem: Fix comment on register clobbering
  locking/rwsem: Fix down_write_killable()
  locking/rwsem, x86: Add frame annotation for call_rwsem_down_write_failed_killable()
  locking/rwsem: Provide down_write_killable()
  locking/rwsem, x86: Provide __down_write_killable()
  locking/rwsem, s390: Provide __down_write_killable()
  locking/rwsem, ia64: Provide __down_write_killable()
  locking/rwsem, alpha: Provide __down_write_killable()
  locking/rwsem: Introduce basis for down_write_killable()
  locking/rwsem, sparc: Drop superfluous arch specific implementation
  locking/rwsem, sh: Drop superfluous arch specific implementation
  locking/rwsem, xtensa: Drop superfluous arch specific implementation
  locking/rwsem: Drop explicit memory barriers
  locking/rwsem: Get rid of __down_write_nested()
2016-05-16 13:41:02 -07:00
Linus Torvalds
1c19b68a27 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking changes from Ingo Molnar:
 "The main changes in this cycle were:

   - pvqspinlock statistics fixes (Davidlohr Bueso)

   - flip atomic_fetch_or() arguments (Peter Zijlstra)

   - locktorture simplification (Paul E.  McKenney)

   - documentation updates (SeongJae Park, David Howells, Davidlohr
     Bueso, Paul E McKenney, Peter Zijlstra, Will Deacon)

   - various fixes"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/atomics: Flip atomic_fetch_or() arguments
  locking/pvqspinlock: Robustify init_qspinlock_stat()
  locking/pvqspinlock: Avoid double resetting of stats
  lcoking/locktorture: Simplify the torture_runnable computation
  locking/Documentation: Clarify that ACQUIRE applies to loads, RELEASE applies to stores
  locking/Documentation: State purpose of memory-barriers.txt
  locking/Documentation: Add disclaimer
  locking/Documentation/lockdep: Fix spelling mistakes
  locking/lockdep: Deinline register_lock_class(), save 2328 bytes
  locking/locktorture: Fix NULL pointer dereference for cleanup paths
  locking/locktorture: Fix deboosting NULL pointer dereference
  locking/Documentation: Mention smp_cond_acquire()
  locking/Documentation: Insert white spaces consistently
  locking/Documentation: Fix formatting inconsistencies
  locking/Documentation: Add missed subsection in TOC
  locking/Documentation: Fix missed s/lock/acquire renames
  locking/Documentation: Clarify relationship of barrier() to control dependencies
2016-05-16 13:17:56 -07:00
Alex Williamson
92efb1bd9b PCI: Identify Enhanced Allocation (EA) BAR Equivalent resources in sysfs
Resource flags are exposed to userspace via the sysfs "resource" file.
lspci reads the sysfs file to determine resource properties.

Add a "BAR Equivalent Indicator" flag so lspci can distinguish between
[virtual] and [enhanced] resources.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sean O. Stalley <sean.stalley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-05-16 15:12:02 -05:00
Linus Torvalds
49817c3343 Merge branch 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI updates from Ingo Molnar:
 "The main changes in this cycle were:

   - Drop the unused EFI_SYSTEM_TABLES efi.flags bit and ensure the
     ARM/arm64 EFI System Table mapping is read-only (Ard Biesheuvel)

   - Add a comment to explain that one of the code paths in the x86/pat
     code is only executed for EFI boot (Matt Fleming)

   - Improve Secure Boot status checks on arm64 and handle unexpected
     errors (Linn Crosetto)

   - Remove the global EFI memory map variable 'memmap' as the same
     information is already available in efi::memmap (Matt Fleming)

   - Add EFI Memory Attribute table support for ARM/arm64 (Ard
     Biesheuvel)

   - Add EFI GOP framebuffer support for ARM/arm64 (Ard Biesheuvel)

   - Add EFI Bootloader Control driver for storing reboot(2) data in EFI
     variables for consumption by bootloaders (Jeremy Compostella)

   - Add Core EFI capsule support (Matt Fleming)

   - Add EFI capsule char driver (Kweh, Hock Leong)

   - Unify EFI memory map code for ARM and arm64 (Ard Biesheuvel)

   - Add generic EFI support for detecting when firmware corrupts CPU
     status register bits (like IRQ flags) when performing EFI runtime
     service calls (Mark Rutland)

  ... and other misc cleanups"

* 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
  efivarfs: Make efivarfs_file_ioctl() static
  efi: Merge boolean flag arguments
  efi/capsule: Move 'capsule' to the stack in efi_capsule_supported()
  efibc: Fix excessive stack footprint warning
  efi/capsule: Make efi_capsule_pending() lockless
  efi: Remove unnecessary (and buggy) .memmap initialization from the Xen EFI driver
  efi/runtime-wrappers: Remove ARCH_EFI_IRQ_FLAGS_MASK #ifdef
  x86/efi: Enable runtime call flag checking
  arm/efi: Enable runtime call flag checking
  arm64/efi: Enable runtime call flag checking
  efi/runtime-wrappers: Detect firmware IRQ flag corruption
  efi/runtime-wrappers: Remove redundant #ifdefs
  x86/efi: Move to generic {__,}efi_call_virt()
  arm/efi: Move to generic {__,}efi_call_virt()
  arm64/efi: Move to generic {__,}efi_call_virt()
  efi/runtime-wrappers: Add {__,}efi_call_virt() templates
  efi/arm-init: Reserve rather than unmap the memory map for ARM as well
  efi: Add misc char driver interface to update EFI firmware
  x86/efi: Force EFI reboot to process pending capsules
  efi: Add 'capsule' update support
  ...
2016-05-16 13:06:27 -07:00
Linus Torvalds
230e51f211 Merge branch 'core-signals-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core signal updates from Ingo Molnar:
 "These updates from Stas Sergeev and Andy Lutomirski, improve the
  sigaltstack interface by extending its ABI with the SS_AUTODISARM
  feature, which makes it possible to use swapcontext() in a sighandler
  that works on sigaltstack.  Without this flag, the subsequent signal
  will corrupt the state of the switched-away sighandler.

  The inspiration is more robust dosemu signal handling"

* 'core-signals-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  signals/sigaltstack: Change SS_AUTODISARM to (1U << 31)
  signals/sigaltstack: Report current flag bits in sigaltstack()
  selftests/sigaltstack: Fix the sigaltstack test on old kernels
  signals/sigaltstack: If SS_AUTODISARM, bypass on_sig_stack()
  selftests/sigaltstack: Add new testcase for sigaltstack(SS_ONSTACK|SS_AUTODISARM)
  signals/sigaltstack: Implement SS_AUTODISARM flag
  signals/sigaltstack: Prepare to add new SS_xxx flags
  signals/sigaltstack, x86/signals: Unify the x86 sigaltstack check with other architectures
2016-05-16 12:25:25 -07:00
Linus Torvalds
a3871bd434 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
 "The main changes are:

   - Documentation updates, including fixes to the design-level
     requirements documentation and a fixed version of the design-level
     data-structure documentation.  These fixes include removing
     cartoons and getting rid of the html/htmlx duplication.

   - Further improvements to the new-age expedited grace periods.

   - Miscellaneous fixes.

   - Torture-test changes, including a new rcuperf module for measuring
     RCU grace-period performance and scalability, which is useful for
     the expedited-grace-period changes"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (56 commits)
  rcutorture: Add boot-time adjustment of leaf fanout
  rcutorture: Add irqs-disabled test for call_rcu()
  rcutorture: Dump trace buffer upon shutdown
  rcutorture: Don't rebuild identical kernel
  rcutorture: Add OS-jitter capability
  documentation: Add documentation for RCU's major data structures
  rcutorture: Convert test duration to seconds early
  torture: Kill qemu, not parent process
  torture: Clarify refusal to run more than one torture test
  rcutorture: Consider FROZEN hotplug notifier transitions
  rcutorture: Remove redundant initialization to zero
  rcuperf: Do not wake up shutdown wait queue if "shutdown" is false.
  rcutorture: Add largish-system rcuperf scenario
  rcutorture: Avoid RCU CPU stall warning and RT throttling
  rcutorture: Add rcuperf holdoff boot parameter to reduce interference
  rcutorture: Make scripts analyze rcuperf trace data, if present
  rcutorture: Make rcuperf collect expedited event-trace data
  rcutorture: Print measure of batching efficiency
  rcutorture: Set rcuperf writer kthreads to real-time priority
  rcutorture: Bind rcuperf reader/writer kthreads to CPUs
  ...
2016-05-16 12:02:08 -07:00
Linus Torvalds
0052af4411 Merge branch 'core-lib-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core/lib update from Ingo Molnar:
 "This contains a single commit that removes an unused facility that the
  scheduler used to make use of"

* 'core-lib-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  lib/proportions: Remove unused code
2016-05-16 11:36:02 -07:00
Daniel Borkmann
4f3446bb80 bpf: add generic constant blinding for use in jits
This work adds a generic facility for use from eBPF JIT compilers
that allows for further hardening of JIT generated images through
blinding constants. In response to the original work on BPF JIT
spraying published by Keegan McAllister [1], most BPF JITs were
changed to make images read-only and start at a randomized offset
in the page, where the rest was filled with trap instructions. We
have this nowadays in x86, arm, arm64 and s390 JIT compilers.
Additionally, later work also made eBPF interpreter images read
only for kernels supporting DEBUG_SET_MODULE_RONX, that is, x86,
arm, arm64 and s390 archs as well currently. This is done by
default for mentioned JITs when JITing is enabled. Furthermore,
we had a generic and configurable constant blinding facility on our
todo for quite some time now to further make spraying harder, and
first implementation since around netconf 2016.

We found that for systems where untrusted users can load cBPF/eBPF
code where JIT is enabled, start offset randomization helps a bit
to make jumps into crafted payload harder, but in case where larger
programs that cross page boundary are injected, we again have some
part of the program opcodes at a page start offset. With improved
guessing and more reliable payload injection, chances can increase
to jump into such payload. Elena Reshetova recently wrote a test
case for it [2, 3]. Moreover, eBPF comes with 64 bit constants, which
can leave some more room for payloads. Note that for all this,
additional bugs in the kernel are still required to make the jump
(and of course to guess right, to not jump into a trap) and naturally
the JIT must be enabled, which is disabled by default.

For helping mitigation, the general idea is to provide an option
bpf_jit_harden that admins can tweak along with bpf_jit_enable, so
that for cases where JIT should be enabled for performance reasons,
the generated image can be further hardened with blinding constants
for unpriviledged users (bpf_jit_harden == 1), with trading off
performance for these, but not for privileged ones. We also added
the option of blinding for all users (bpf_jit_harden == 2), which
is quite helpful for testing f.e. with test_bpf.ko. There are no
further e.g. hardening levels of bpf_jit_harden switch intended,
rationale is to have it dead simple to use as on/off. Since this
functionality would need to be duplicated over and over for JIT
compilers to use, which are already complex enough, we provide a
generic eBPF byte-code level based blinding implementation, which is
then just transparently JITed. JIT compilers need to make only a few
changes to integrate this facility and can be migrated one by one.

This option is for eBPF JITs and will be used in x86, arm64, s390
without too much effort, and soon ppc64 JITs, thus that native eBPF
can be blinded as well as cBPF to eBPF migrations, so that both can
be covered with a single implementation. The rule for JITs is that
bpf_jit_blind_constants() must be called from bpf_int_jit_compile(),
and in case blinding is disabled, we follow normally with JITing the
passed program. In case blinding is enabled and we fail during the
process of blinding itself, we must return with the interpreter.
Similarly, in case the JITing process after the blinding failed, we
return normally to the interpreter with the non-blinded code. Meaning,
interpreter doesn't change in any way and operates on eBPF code as
usual. For doing this pre-JIT blinding step, we need to make use of
a helper/auxiliary register, here BPF_REG_AX. This is strictly internal
to the JIT and not in any way part of the eBPF architecture. Just like
in the same way as JITs internally make use of some helper registers
when emitting code, only that here the helper register is one
abstraction level higher in eBPF bytecode, but nevertheless in JIT
phase. That helper register is needed since f.e. manually written
program can issue loads to all registers of eBPF architecture.

The core concept with the additional register is: blind out all 32
and 64 bit constants by converting BPF_K based instructions into a
small sequence from K_VAL into ((RND ^ K_VAL) ^ RND). Therefore, this
is transformed into: BPF_REG_AX := (RND ^ K_VAL), BPF_REG_AX ^= RND,
and REG <OP> BPF_REG_AX, so actual operation on the target register
is translated from BPF_K into BPF_X one that is operating on
BPF_REG_AX's content. During rewriting phase when blinding, RND is
newly generated via prandom_u32() for each processed instruction.
64 bit loads are split into two 32 bit loads to make translation and
patching not too complex. Only basic thing required by JITs is to
call the helper bpf_jit_blind_constants()/bpf_jit_prog_release_other()
pair, and to map BPF_REG_AX into an unused register.

Small bpf_jit_disasm extract from [2] when applied to x86 JIT:

echo 0 > /proc/sys/net/core/bpf_jit_harden

  ffffffffa034f5e9 + <x>:
  [...]
  39:   mov    $0xa8909090,%eax
  3e:   mov    $0xa8909090,%eax
  43:   mov    $0xa8ff3148,%eax
  48:   mov    $0xa89081b4,%eax
  4d:   mov    $0xa8900bb0,%eax
  52:   mov    $0xa810e0c1,%eax
  57:   mov    $0xa8908eb4,%eax
  5c:   mov    $0xa89020b0,%eax
  [...]

echo 1 > /proc/sys/net/core/bpf_jit_harden

  ffffffffa034f1e5 + <x>:
  [...]
  39:   mov    $0xe1192563,%r10d
  3f:   xor    $0x4989b5f3,%r10d
  46:   mov    %r10d,%eax
  49:   mov    $0xb8296d93,%r10d
  4f:   xor    $0x10b9fd03,%r10d
  56:   mov    %r10d,%eax
  59:   mov    $0x8c381146,%r10d
  5f:   xor    $0x24c7200e,%r10d
  66:   mov    %r10d,%eax
  69:   mov    $0xeb2a830e,%r10d
  6f:   xor    $0x43ba02ba,%r10d
  76:   mov    %r10d,%eax
  79:   mov    $0xd9730af,%r10d
  7f:   xor    $0xa5073b1f,%r10d
  86:   mov    %r10d,%eax
  89:   mov    $0x9a45662b,%r10d
  8f:   xor    $0x325586ea,%r10d
  96:   mov    %r10d,%eax
  [...]

As can be seen, original constants that carry payload are hidden
when enabled, actual operations are transformed from constant-based
to register-based ones, making jumps into constants ineffective.
Above extract/example uses single BPF load instruction over and
over, but of course all instructions with constants are blinded.

Performance wise, JIT with blinding performs a bit slower than just
JIT and faster than interpreter case. This is expected, since we
still get all the performance benefits from JITing and in normal
use-cases not every single instruction needs to be blinded. Summing
up all 296 test cases averaged over multiple runs from test_bpf.ko
suite, interpreter was 55% slower than JIT only and JIT with blinding
was 8% slower than JIT only. Since there are also some extremes in
the test suite, I expect for ordinary workloads that the performance
for the JIT with blinding case is even closer to JIT only case,
f.e. nmap test case from suite has averaged timings in ns 29 (JIT),
35 (+ blinding), and 151 (interpreter).

BPF test suite, seccomp test suite, eBPF sample code and various
bigger networking eBPF programs have been tested with this and were
running fine. For testing purposes, I also adapted interpreter and
redirected blinded eBPF image to interpreter and also here all tests
pass.

  [1] http://mainisusuallyafunction.blogspot.com/2012/11/attacking-hardened-linux-systems-with.html
  [2] https://github.com/01org/jit-spray-poc-for-ksp/
  [3] http://www.openwall.com/lists/kernel-hardening/2016/05/03/5

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Elena Reshetova <elena.reshetova@intel.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-16 13:49:32 -04:00
Daniel Borkmann
d1c55ab5e4 bpf: prepare bpf_int_jit_compile/bpf_prog_select_runtime apis
Since the blinding is strictly only called from inside eBPF JITs,
we need to change signatures for bpf_int_jit_compile() and
bpf_prog_select_runtime() first in order to prepare that the
eBPF program we're dealing with can change underneath. Hence,
for call sites, we need to return the latest prog. No functional
change in this patch.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-16 13:49:32 -04:00
Daniel Borkmann
c237ee5eb3 bpf: add bpf_patch_insn_single helper
Move the functionality to patch instructions out of the verifier
code and into the core as the new bpf_patch_insn_single() helper
will be needed later on for blinding as well. No changes in
functionality.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-16 13:49:32 -04:00
Daniel Borkmann
c94987e40e bpf: move bpf_jit_enable declaration
Move the bpf_jit_enable declaration to the filter.h file where
most other core code is declared, also since we're going to add
a second knob there.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-16 13:49:31 -04:00
Amir Vadai
43a335e055 net/mlx5_core: Flow counters infrastructure
If a counter has the aging flag set when created, it is added to a list
of counters that will be queried periodically from a workqueue.  query
result and last use timestamp are cached.
add/del counter must be very efficient since thousands of such
operations might be issued in a second.
There is only a single reference to counters without aging, therefore
no need for locks.
But, counters with aging enabled are stored in a list. In order to make
code as lockless as possible, all the list manipulation and access to
hardware is done from a single context - the periodic counters query
thread.

The hardware supports multiple counters per FTE, however currently we
are using one counter for each FTE.

Signed-off-by: Amir Vadai <amirva@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-16 13:43:51 -04:00
Amir Vadai
bd5251dbf1 net/mlx5_core: Introduce flow steering destination of type counter
When adding a flow steering rule with a counter, need to supply a
destination of type MLX5_FLOW_DESTINATION_TYPE_COUNTER, with a pointer
to a struct mlx5_fc.
Also, MLX5_FLOW_CONTEXT_ACTION_COUNT bit should be set in the action.

Signed-off-by: Amir Vadai <amirva@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-16 13:43:51 -04:00
Amir Vadai
9dc0b289c4 net/mlx5_core: Firmware commands to support flow counters
Getting packet/byte statistics on flows is done through flow counters.
Implement the firmware commands to alloc, free and query flow counters.

Signed-off-by: Amir Vadai <amirva@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-16 13:43:51 -04:00
David Bond
b3c8eb5038 ibft: Expose iBFT acpi header via sysfs
Some ethernet adapter vendors are supplying products which support optional
(payed license) features. On some adapters this includes a hardware iscsi
initiator.  The same adapters in a normal (no extra licenses) mode of
operation can be used as a software iscsi initiator.  In addition, software
iscsi boot initiators are becoming a standard part of many vendors uefi
implementations.  This is creating difficulties during early boot/install
determining the proper configuration method for these adapters when they
are used as a boot device.

The attached patch creates sysfs entries to expose information from the
acpi header of the ibft table.  This information allows for a method to
easily determining if an ibft table was created by a ethernet card's
firmware or the system uefi/bios.  In the case of a hardware initiator this
information in combination with the pci vendor and device id can be used
to ascertain any vendor specific behaviors that need to be accommodated.

Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: David Bond <dbond@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2016-05-16 11:14:29 -04:00
Hannes Reinecke
9a99425f07 iscsi_ibft: Add prefix-len attr and display netmask
The iBFT table only specifies a prefix length, not a netmask.
And the netmask is pretty much pointless for IPv6.
So introduce a new attribute 'prefix-len'.

Some older user-space code might rely on the netmask attribute
being present, so we should always display it.

Changes from v1:
 - Combined two patches into one

Changes from v2:
 - Cleaned up/corrected wording for patch description

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Konrad Rzeszutek Wilk <konrad@kernel.org>
2016-05-16 11:14:23 -04:00
Rafael J. Wysocki
fc72395780 Merge branches 'acpi-pci', 'acpi-misc' and 'acpi-tools'
* acpi-pci:
  ACPI,PCI,IRQ: remove SCI penalize function
  ACPI,PCI,IRQ: remove redundant code in acpi_irq_penalty_init()
  ACPI,PCI,IRQ: reduce static IRQ array size to 16
  ACPI,PCI,IRQ: reduce resource requirements

* acpi-misc:
  ACPI / sysfs: fix error code in get_status()
  ACPI / device_sysfs: Clean up checkpatch errors
  ACPI / device_sysfs: Change _SUN and _STA show functions error return to EIO
  ACPI / device_sysfs: Add sysfs support for _HRV hardware revision
  arm64: defconfig: Enable ACPI
  ACPI / ARM64: Remove EXPERT dependency for ACPI on ARM64
  ACPI / ARM64: Don't enable ACPI by default on ARM64
  acer-wmi: Use acpi_dev_found()
  eeepc-wmi: Use acpi_dev_found()
  ACPI / utils: Rename acpi_dev_present()

* acpi-tools:
  tools/power/acpi: close file only if it is open
2016-05-16 16:45:48 +02:00
Rafael J. Wysocki
efc499f980 Merge branches 'acpi-numa', 'acpi-tables' and 'acpi-osi'
* acpi-numa:
  ACPI / SRAT: fix SRAT parsing order with both LAPIC and X2APIC present

* acpi-tables:
  ACPI / tables: Fix DSDT override mechanism
  ACPI / tables: Convert initrd table override to table upgrade mechanism
  ACPI / x86: Cleanup initrd related code
  ACPI / tables: Move table override mechanisms to tables.c

* acpi-osi:
  ACPI / osi: Collect _OSI handling into one single file
  ACPI / osi: Cleanup coding style issues before creating a separate OSI source file
  ACPI / osi: Cleanup OSI handling code to use bool
  ACPI / osi: Fix default _OSI(Darwin) support
  ACPI / osi: Add acpi_osi=!! to allow reverting acpi_osi=!
  ACPI / osi: Cleanup _OSI("Linux") related code before introducing new support
  ACPI / osi: Fix an issue that acpi_osi=!* cannot disable ACPICA internal strings

Conflicts:
	drivers/acpi/internal.h
2016-05-16 16:45:25 +02:00
Rafael J. Wysocki
a6becfbaba Merge branches 'acpi-drivers', 'acpi-pm', 'acpi-ec' and 'acpi-video'
* acpi-drivers:
  ACPI / GED: make evged.c explicitly non-modular
  ACPI / amba: Remove CLK_IS_ROOT
  ACPI / APD: Remove CLK_IS_ROOT
  ACPI: implement Generic Event Device

* acpi-pm:
  ACPI / PM: Introduce efi poweroff for HW-full platforms without _S5

* acpi-ec:
  ACPI 2.0 / AML: Improve module level execution by moving the If/Else/While execution to per-table basis
  ACPI 2.0 / ECDT: Enable correct ECDT initialization order
  ACPI 2.0 / ECDT: Remove early namespace reference from EC
  ACPI 2.0 / ECDT: Split EC_FLAGS_HANDLERS_INSTALLED

* acpi-video:
  ACPI / video: mark acpi_video_get_levels() inline
  Thermal / ACPI / video: add INT3406 thermal driver
  ACPI/video: export acpi_video_get_levels
  video / backlight: remove the backlight_device_registered API
  video / backlight: add two APIs for drivers to use
2016-05-16 16:44:41 +02:00
Rafael J. Wysocki
aa24781b1c Merge branches 'pm-core' and 'pm-domains'
* pm-core:
  PM / sleep: Drop unused `info' variable
  PM / Runtime: Move ignore_children flag under CONFIG_PM
  PM / Runtime: Fix error path in pm_runtime_force_resume()

* pm-domains:
  PM / Domains: Drop unnecessary wakeup code from pm_genpd_prepare()
  PM / Domains: Remove redundant pm_runtime_get|put*() in pm_genpd_prepare()
  PM / Domains: Remove ->save|restore_state() callbacks
  PM / Domains: Rename pm_genpd_runtime_suspend|resume()
  PM / Domains: Rename stop_ok to suspend_ok for the genpd governor
2016-05-16 14:31:29 +02:00
Rafael J. Wysocki
7777c2785b Merge branch 'pm-devfreq'
* pm-devfreq:
  PM / devfreq: style/typo fixes
  PM / devfreq: exynos: Add the detailed correlation for Exynos5422 bus
  PM / devfreq: event: Find the instance of devfreq-event device by using phandle
  PM / devfreq: event: Add new Exynos NoC probe driver
  MAINTAINERS: Add samsung bus frequency driver entry
  PM / devfreq: exynos: Remove unused exynos4/5 busfreq driver
  PM / devfreq: exynos: Add the detailed correlation between sub-blocks and power line
  PM / devfreq: exynos: Update documentation for bus devices using passive governor
  PM / devfreq: exynos: Add support of bus frequency of sub-blocks using passive governor
  PM / devfreq: Add new passive governor
  PM / devfreq: Add new DEVFREQ_TRANSITION_NOTIFIER notifier
  PM / devfreq: Add devfreq_get_devfreq_by_phandle()
  PM / devfreq: exynos: Add documentation for generic exynos bus frequency driver
  PM / devfreq: exynos: Add generic exynos bus frequency driver
2016-05-16 14:31:15 +02:00
Rafael J. Wysocki
c47b3bd0d3 Merge branch 'pm-cpufreq'
* pm-cpufreq: (63 commits)
  intel_pstate: Clean up get_target_pstate_use_performance()
  intel_pstate: Use sample.core_avg_perf in get_avg_pstate()
  intel_pstate: Clarify average performance computation
  intel_pstate: Avoid unnecessary synchronize_sched() during initialization
  cpufreq: schedutil: Make default depend on CONFIG_SMP
  cpufreq: powernv: del_timer_sync when global and local pstate are equal
  cpufreq: powernv: Move smp_call_function_any() out of irq safe block
  intel_pstate: Clean up intel_pstate_get()
  cpufreq: schedutil: Make it depend on CONFIG_SMP
  cpufreq: governor: Fix handling of special cases in dbs_update()
  cpufreq: intel_pstate: Ignore _PPC processing under HWP
  cpufreq: arm_big_little: use generic OPP functions for {init, free}_opp_table
  cpufreq: tango: Use generic platdev driver
  cpufreq: Fix GOV_LIMITS handling for the userspace governor
  cpufreq: mvebu: Move cpufreq code into drivers/cpufreq/
  cpufreq: dt: Kill platform-data
  mvebu: Use dev_pm_opp_set_sharing_cpus() to mark OPP tables as shared
  cpufreq: dt: Identify cpu-sharing for platforms without operating-points-v2
  cpufreq: governor: Change confusing struct field and variable names
  cpufreq: intel_pstate: Enable PPC enforcement for servers
  ...
2016-05-16 14:30:43 +02:00
Rafael J. Wysocki
29cff1844a Merge branch 'pm-opp'
* pm-opp:
  PM / OPP: Move CONFIG_OF dependent code in a separate file
  PM / OPP: add non-OF versions of dev_pm_opp_{cpumask_, }remove_table
  PM / OPP: pass cpumask by reference
  PM / OPP: Add dev_pm_opp_get_sharing_cpus()
  PM / OPP: Mark cpumask as const in dev_pm_opp_set_sharing_cpus()
  PM / OPP: -ENOSYS is applicable only to syscalls
  PM / OPP: Mark shared-opp for non-dt case
  PM / OPP: Relocate dev_pm_opp_set_sharing_cpus()
  PM / OPP: dev_pm_opp_set_sharing_cpus() doesn't depend on CONFIG_OF
  PM / OPP: Add missing doc style comments
  PM / OPP: Propagate the error returned by _find_opp_table()
2016-05-16 14:30:14 +02:00
Gavin Shan
83262418b0 drivers/of: Return allocated memory from of_fdt_unflatten_tree()
This returns the allocate memory chunk, storing the unflattened device
tree, from of_fdt_unflatten_tree() so that memory chunk can be released
on demand in PowerNV PCI hotplug driver.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
2016-05-16 07:22:35 -05:00
Gavin Shan
c4263233f3 drivers/of: Specify parent node in of_fdt_unflatten_tree()
This adds one more argument to of_fdt_unflatten_tree() to specify
the parent node of the FDT blob that is going to be unflattened.
In the result, the function can be used to unflatten FDT blob that
represents device sub-tree in PowerNV PCI hotplug driver.

Cc: Jyri Sarha <jsarha@ti.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Jyri Sarha <jsarha@ti.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2016-05-16 07:22:35 -05:00
David S. Miller
909b27f706 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
The nf_conntrack_core.c fix in 'net' is not relevant in 'net-next'
because we no longer have a per-netns conntrack hash.

The ip_gre.c conflict as well as the iwlwifi ones were cases of
overlapping changes.

Conflicts:
	drivers/net/wireless/intel/iwlwifi/mvm/tx.c
	net/ipv4/ip_gre.c
	net/netfilter/nf_conntrack_core.c

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-15 13:32:48 -04:00
Linus Torvalds
6ba5b85fd4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
 "Overlayfs fixes from Miklos, assorted fixes from me.

  Stable fodder of varying severity, all sat in -next for a while"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  ovl: ignore permissions on underlying lookup
  vfs: add lookup_hash() helper
  vfs: rename: check backing inode being equal
  vfs: add vfs_select_inode() helper
  get_rock_ridge_filename(): handle malformed NM entries
  ecryptfs: fix handling of directory opening
  atomic_open(): fix the handling of create_error
  fix the copy vs. map logics in blk_rq_map_user_iov()
  do_splice_to(): cap the size before passing to ->splice_read()
2016-05-14 11:59:43 -07:00
Linus Torvalds
1410b74e40 Merge branch 'for-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
 "During v4.6-rc1 cgroup namespace support was merged.  There is an
  issue where it's impossible to tell whether a given cgroup mount point
  is bind mounted or namespaced.  Serge has been working on the issue
  but it took longer than expected to resolve, so the late pull request.

  Given that it's a completely new feature and the patches don't touch
  anything else, the risk seems acceptable.  However, if this is too
  late, an alternative is plugging new cgroup ns creation for v4.6 and
  retrying for v4.7"

* 'for-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: fix compile warning
  kernfs: kernfs_sop_show_path: don't return 0 after seq_dentry call
  cgroup, kernfs: make mountinfo show properly scoped path for cgroup namespaces
  kernfs_path_from_node_locked: don't overwrite nlen
2016-05-13 16:26:46 -07:00
Bjorn Andersson
f79a917e69 Merge tag 'qcom-soc-for-4.7-2' into net-next
This merges the Qualcomm SOC tree with the net-next, solving the
merge conflict in the SMD API between the two.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2016-05-13 14:42:23 -07:00