Commit Graph

43938 Commits

Author SHA1 Message Date
Linus Torvalds
072bc448cc Merge branch 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI updates from Ingo Molnar:
 "Main changes:

   - Move efivarfs from the misc filesystem section to pseudo filesystem

   - Expose firmware platform size in sysfs

   - Improve robustness of get_memory_map() by removing assumptions on
     the size of efi_memory_desc_t.

  - various cleanups and fixes

  The biggest risk is the get_memory_map() change, which changes the way
  that both the arm64 and x86 EFI boot stub build the early memory map.
  There are no known regressions with it at the moment, BYMMV"

* 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efi: Don't look for chosen@0 node on DT platforms
  firmware: efi: Remove unneeded guid unparse
  efi/libstub: Call get_memory_map() to obtain map and desc sizes
  efi: Small leak on error in runtime map code
  efi: rtc-efi: Mark UIE as unsupported
  arm64/efi: efistub: Apply __init annotation
  efi: Expose underlying UEFI firmware platform size to userland
  efi: Rename efi_guid_unparse to efi_guid_to_str
  efi: Update the URLs for efibootmgr
  fs: Make efivarfs a pseudo filesystem, built by default with EFI
2015-02-09 17:53:53 -08:00
Linus Torvalds
9d43bade34 Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 APIC updates from Ingo Molnar:
 "Continued fallout of the conversion of the x86 IRQ code to the
  hierarchical irqdomain framework: more cleanups, simplifications,
  memory allocation behavior enhancements, mainly in the interrupt
  remapping and APIC code"

* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
  x86, init: Fix UP boot regression on x86_64
  iommu/amd: Fix irq remapping detection logic
  x86/acpi: Make acpi_[un]register_gsi_ioapic() depend on CONFIG_X86_LOCAL_APIC
  x86: Consolidate boot cpu timer setup
  x86/apic: Reuse apic_bsp_setup() for UP APIC setup
  x86/smpboot: Sanitize uniprocessor init
  x86/smpboot: Move apic init code to apic.c
  init: Get rid of x86isms
  x86/apic: Move apic_init_uniprocessor code
  x86/smpboot: Cleanup ioapic handling
  x86/apic: Sanitize ioapic handling
  x86/ioapic: Add proper checks to setp/enable_IO_APIC()
  x86/ioapic: Provide stub functions for IOAPIC%3Dn
  x86/smpboot: Move smpboot inlines to code
  x86/x2apic: Use state information for disable
  x86/x2apic: Split enable and setup function
  x86/x2apic: Disable x2apic from nox2apic setup
  x86/x2apic: Add proper state tracking
  x86/x2apic: Clarify remapping mode for x2apic enablement
  x86/x2apic: Move code in conditional region
  ...
2015-02-09 16:57:56 -08:00
Linus Torvalds
0ba97bc4b4 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Ingo Molnar:
 "The main changes in this cycle were:

   - rework hrtimer expiry calculation in hrtimer_interrupt(): the
     previous code had a subtle bug where expiry caching would miss an
     expiry, resulting in occasional bogus (late) expiry of hrtimers.

   - continuing Y2038 fixes

   - ktime division optimization

   - misc smaller fixes and cleanups"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  hrtimer: Make __hrtimer_get_next_event() static
  rtc: Convert rtc_set_ntp_time() to use timespec64
  rtc: Remove redundant rtc_valid_tm() from rtc_hctosys()
  rtc: Modify rtc_hctosys() to address y2038 issues
  rtc: Update rtc-dev to use y2038-safe time interfaces
  rtc: Update interface.c to use y2038-safe time interfaces
  time: Expose get_monotonic_boottime64 for in-kernel use
  time: Expose getboottime64 for in-kernel uses
  ktime: Optimize ktime_divns for constant divisors
  hrtimer: Prevent stale expiry time in hrtimer_interrupt()
  ktime.h: Introduce ktime_ms_delta
2015-02-09 16:33:07 -08:00
Linus Torvalds
5b9b28a63f Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The main scheduler changes in this cycle were:

   - various sched/deadline fixes and enhancements

   - rescheduling latency fixes/cleanups

   - rework the rq->clock code to be more consistent and more robust.

   - minor micro-optimizations

   - ->avg.decay_count fixes

   - add a stack overflow check to might_sleep()

   - idle-poll handler fix, possibly resulting in power savings

   - misc smaller updates and fixes"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/Documentation: Remove unneeded word
  sched/wait: Introduce wait_on_bit_timeout()
  sched: Pull resched loop to __schedule() callers
  sched/deadline: Remove cpu_active_mask from cpudl_find()
  sched: Fix hrtick_start() on UP
  sched/deadline: Avoid pointless __setscheduler()
  sched/deadline: Fix stale yield state
  sched/deadline: Fix hrtick for a non-leftmost task
  sched/deadline: Modify cpudl::free_cpus to reflect rd->online
  sched/idle: Add missing checks to the exit condition of cpu_idle_poll()
  sched: Fix missing preemption opportunity
  sched/rt: Reduce rq lock contention by eliminating locking of non-feasible target
  sched/debug: Print rq->clock_task
  sched/core: Rework rq->clock update skips
  sched/core: Validate rq_clock*() serialization
  sched/core: Remove check of p->sched_class
  sched/fair: Fix sched_entity::avg::decay_count initialization
  sched/debug: Fix potential call to __ffs(0) in sched_show_task()
  sched/debug: Check for stack overflow in ___might_sleep()
  sched/fair: Fix the dealing with decay_count in __synchronize_entity_decay()
2015-02-09 16:06:06 -08:00
Linus Torvalds
a4cbbf549a Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Kernel side changes:

   - AMD range breakpoints support:

     Extend breakpoint tools and core to support address range through
     perf event with initial backend support for AMD extended
     breakpoints.

     The syntax is:

         perf record -e mem:addr/len:type

     For example set write breakpoint from 0x1000 to 0x1200 (0x1000 + 512)

         perf record -e mem:0x1000/512:w

   - event throttling/rotating fixes

   - various event group handling fixes, cleanups and general paranoia
     code to be more robust against bugs in the future.

    - kernel stack overhead fixes

  User-visible tooling side changes:

   - Show precise number of samples in at the end of a 'record' session,
     if processing build ids, since we will then traverse the whole
     perf.data file and see all the PERF_RECORD_SAMPLE records,
     otherwise stop showing the previous off-base heuristicly counted
     number of "samples" (Namhyung Kim).

   - Support to read compressed module from build-id cache (Namhyung
     Kim)

   - Enable sampling loads and stores simultaneously in 'perf mem'
     (Stephane Eranian)

   - 'perf diff' output improvements (Namhyung Kim)

   - Fix error reporting for evsel pgfault constructor (Arnaldo Carvalho
     de Melo)

  Tooling side infrastructure changes:

   - Cache eh/debug frame offset for dwarf unwind (Namhyung Kim)

   - Support parsing parameterized events (Cody P Schafer)

   - Add support for IP address formats in libtraceevent (David Ahern)

  Plus other misc fixes"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (48 commits)
  perf: Decouple unthrottling and rotating
  perf: Drop module reference on event init failure
  perf: Use POLLIN instead of POLL_IN for perf poll data in flag
  perf: Fix put_event() ctx lock
  perf: Fix move_group() order
  perf: Fix event->ctx locking
  perf: Add a bit of paranoia
  perf symbols: Convert lseek + read to pread
  perf tools: Use perf_data_file__fd() consistently
  perf symbols: Support to read compressed module from build-id cache
  perf evsel: Set attr.task bit for a tracking event
  perf header: Set header version correctly
  perf record: Show precise number of samples
  perf tools: Do not use __perf_session__process_events() directly
  perf callchain: Cache eh/debug frame offset for dwarf unwind
  perf tools: Provide stub for missing pthread_attr_setaffinity_np
  perf evsel: Don't rely on malloc working for sz 0
  tools lib traceevent: Add support for IP address formats
  perf ui/tui: Show fatal error message only if exists
  perf tests: Fix typo in sample-parsing.c
  ...
2015-02-09 15:43:55 -08:00
Linus Torvalds
8308756f45 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core locking updates from Ingo Molnar:
 "The main changes are:

   - mutex, completions and rtmutex micro-optimizations
   - lock debugging fix
   - various cleanups in the MCS and the futex code"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/rtmutex: Optimize setting task running after being blocked
  locking/rwsem: Use task->state helpers
  sched/completion: Add lock-free checking of the blocking case
  sched/completion: Remove unnecessary ->wait.lock serialization when reading completion state
  locking/mutex: Explicitly mark task as running after wakeup
  futex: Fix argument handling in futex_lock_pi() calls
  doc: Fix misnamed FUTEX_CMP_REQUEUE_PI op constants
  locking/Documentation: Update code path
  softirq/preempt: Add missing current->preempt_disable_ip update
  locking/osq: No need for load/acquire when acquire-polling
  locking/mcs: Better differentiate between MCS variants
  locking/mutex: Introduce ww_mutex_set_context_slowpath()
  locking/mutex: Move MCS related comments to proper location
  locking/mutex: Checking the stamp is WW only
2015-02-09 15:24:03 -08:00
Linus Torvalds
23e8fe2e16 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
 "The main RCU changes in this cycle are:

   - Documentation updates.

   - Miscellaneous fixes.

   - Preemptible-RCU fixes, including fixing an old bug in the
     interaction of RCU priority boosting and CPU hotplug.

   - SRCU updates.

   - RCU CPU stall-warning updates.

   - RCU torture-test updates"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
  rcu: Initialize tiny RCU stall-warning timeouts at boot
  rcu: Fix RCU CPU stall detection in tiny implementation
  rcu: Add GP-kthread-starvation checks to CPU stall warnings
  rcu: Make cond_resched_rcu_qs() apply to normal RCU flavors
  rcu: Optionally run grace-period kthreads at real-time priority
  ksoftirqd: Use new cond_resched_rcu_qs() function
  ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
  rcutorture: Add more diagnostics in rcu_barrier() test failure case
  torture: Flag console.log file to prevent holdovers from earlier runs
  torture: Add "-enable-kvm -soundhw pcspk" to qemu command line
  rcutorture: Handle different mpstat versions
  rcutorture: Check from beginning to end of grace period
  rcu: Remove redundant rcu_batches_completed() declaration
  rcutorture: Drop rcu_torture_completed() and friends
  rcu: Provide rcu_batches_completed_sched() for TINY_RCU
  rcutorture: Use unsigned for Reader Batch computations
  rcutorture: Make build-output parsing correctly flag RCU's warnings
  rcu: Make _batches_completed() functions return unsigned long
  rcutorture: Issue warnings on close calls due to Reader Batch blows
  documentation: Fix smp typo in memory-barriers.txt
  ...
2015-02-09 14:28:42 -08:00
Yishai Hadas
35f05dabf9 IB/mlx4: Reset flow support for IB kernel ULPs
The driver exposes interfaces that directly relate to HW state. Upon fatal
error, consumers of these interfaces (ULPs) that rely on completion of
all their posted work-request could hang, thereby introducing dependencies
in shutdown order.  To prevent this from happening, we manage the
relevant resources (CQs, QPs) that are used by the device. Upon a fatal error,
we now generate simulated completions for outstanding WQEs that were not
completed at the time the HW was reset.

It includes invoking the completion event handler for all involved CQs so that
the ULPs will poll those CQs. When polled we return simulated CQEs with
IB_WC_WR_FLUSH_ERR return code enabling ULPs to clean up their resources and
not wait forever for completions upon receiving remove_one.

The above change requires an extra check in the data path to make sure that when
device is in error state, the simulated CQEs will be returned and no further
WQEs will be posted.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-09 14:03:53 -08:00
Linus Torvalds
30d46827c2 Merge tag 'regulator-v3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator updates from Mark Brown:
 "This has not been a busy release for the regulator framework, though
  we do have the first parts of some ongoing work from Bjorn Andersson
  to allow us to support more complex modern systems with dynamic
  configuration of regulators in suspend and idle states.

   - Support for device-specific properties on regulator nodes when
     using simplified DT parsing in the core from Krzysztof Kozlowski.

   - Restructuring of the load tracking code, intended to support future
     improvements in this area for more complex system designs.

   - New drivers for Maxim MAX77843 and Mediatek MT6397.

   - Lots of smaller fixes and improvements"

* tag 'regulator-v3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (29 commits)
  regulator: max77843: Add max77843 regulator driver
  regulator: Fix build breakage on !REGULATOR
  regulator: Build sysfs entries with static attribute groups
  regulator: qcom-rpm: Make it possible to specify supply
  regulator: core: Consolidate drms update handling
  regulator: qcom-rpm: signedness bug in probe()
  regulator: da9211: Add gpio control for enable/disable of buck
  regulator: qcom_rpm: Don't update vreg->uV/mV if rpm_reg_write fails
  regulator: lp872x: Remove **regulators from struct lp872x
  regulator: da9211: fix unmatched of_node
  regulator: Update documentation after renaming function argument
  regulator: axp20x: Migrate to regulator core's simplified DT parsing code
  regulator: axp20x: Fill regulators_node and of_match descriptor fields
  regulator: pfuze100-regulator: add pfuze3000 support
  regulator: max77686: Document gpio properties
  regulator: Allow parsing custom properties when using simplified DT parsing
  regulator: max77686: Add GPIO control
  regulator: Copy config passed during registration
  regulator: tps65023: Constify struct regmap_config and regulator_ops
  regulator: max8649: Constify struct regmap_config and regulator_ops
  ...
2015-02-09 13:46:28 -08:00
Linus Torvalds
b0c1936c44 Merge tag 'spi-v3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi updates from Mark Brown:
 "The major highlight this release is a refactoring of the core to allow
  us to run synchronous transfers in the context of the caller when
  there is no contention for the bus.  This improves performance in the
  very common case by eliminating context switches and reducing the
  number of hardware setup and teardown operations we need to perform.

  Other changes:

   - New drivers for DLN-2 USB-SPI adapter and ST SPI controllers.

   - A big round of cleanups, performance and feature improvements for
     the xilinx driver from Ricardo Ribalda Delgado.

   - A wide range of smaller cleanups, fixes and feature improvements
     throughout the subsystem"

* tag 'spi-v3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (68 commits)
  spi: mxs: cleanup wait_for_completion return handling
  spi: ti-qspi: cleanup wait_for_completion return handling
  spi: spi-imx: cleanup wait_for_completion handling
  spi: sh-msiof: cleanup wait_for_completion return handling
  spi: match var type to return type of wait_for_completion
  spi: spi-pxa2xx: only include mach/dma.h for legacy DMA
  spi: atmel: cleanup wait_for_completion return handling
  spi: fsl-dspi: Remove possible memory leak of 'chip'
  spi: sh-msiof: Update calculation of frequency dividing
  spi: spidev: Convert buf pointers for 32-bit compat SPI_IOC_MESSAGE(n)
  spi/xilinx: Fix access invalid memory on xilinx_spi_tx
  spi: Revert "spi/xilinx: Remove iowrite/ioread wrappers"
  spi/xilinx: Check number of slaves range
  spi/xilinx: Use polling mode on small transfers
  spi/xilinx: Remove remaining_words driver data variable
  spi/xilinx: Remove iowrite/ioread wrappers
  spi/xilinx: Convert bits_per_word in bytes_per_word
  spi/xilinx: Convert remainding_bytes in remaining words
  spi/xilinx: Make spi_tx and spi_rx simmetric
  spi/xilinx: Remove rx_fn and tx_fn pointer
  ...
2015-02-09 13:36:20 -08:00
Linus Torvalds
f381f90695 Merge tag 'regmap-v3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
 "A very quiet release for regmap this time around:

   - Fix an endianness issue for I2C devices connected via SMBus where
     we were getting two layers both trying to do endianness handling.
   - Use a union to reduce the size of the regmap struct.
   - A couple of smaller fixes"

* tag 'regmap-v3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: Fix i2c word access when using SMBus access functions
  regmap: Export regmap_get_val_endian
  regmap: ac97: Clean up indentation
  regmap: correct the description of structure element in reg_field
  regmap: Move spinlock_flags into the union
2015-02-09 12:52:18 -08:00
David S. Miller
c8ac18f200 Merge tag 'wireless-drivers-next-for-davem-2015-02-07' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Major changes:

iwlwifi:

* more work for new devices (4165 / 8260)
* cleanups / improvemnts in rate control
* fixes for TDLS
* major statistics work from Johannes - more to come
* improvements for the fw error dump infrastructure
* usual amount of small fixes here and there (scan, D0i3 etc...)
* add support for beamforming
* enable stuck queue detection for iwlmvm
* a few fixes for EBS scan
* fixes for various failure paths
* improvements for TDLS Offchannel

wil6210:

* performance tuning
* some AP features

brcm80211:

* rework some code in SDIO part of the brcmfmac driver related to
  suspend/resume that were found doing stress testing
* in PCIe part scheduling of worker thread needed to be relaxed
* minor fixes and exposing firmware revision information to
  user-space, ie. ethtool.

mwifiex:

* enhancements for change virtual interface handling
* remove coupling between netdev and FW supported interface
  combination, now conversion from any type of supported interface
  types to any other type is possible
* DFS support in AP mode

ath9k:

* fix calibration issues on some boards
* Wake-on-WLAN improvements

ath10k:

* add support for qca6174 hardware
* enable RX batching to reduce CPU load

Conflicts:
	drivers/net/wireless/rtlwifi/pci.c

Conflict resolution is to get rid of the 'end' label and keep
the rest.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-09 12:13:58 -08:00
Mike Snitzer
e5863d9ad7 dm: allocate requests in target when stacking on blk-mq devices
For blk-mq request-based DM the responsibility of allocating a cloned
request is transfered from DM core to the target type.  Doing so
enables the cloned request to be allocated from the appropriate
blk-mq request_queue's pool (only the DM target, e.g. multipath, can
know which block device to send a given cloned request to).

Care was taken to preserve compatibility with old-style block request
completion that requires request-based DM _not_ acquire the clone
request's queue lock in the completion path.  As such, there are now 2
different request-based DM target_type interfaces:
1) the original .map_rq() interface will continue to be used for
   non-blk-mq devices -- the preallocated clone request is passed in
   from DM core.
2) a new .clone_and_map_rq() and .release_clone_rq() will be used for
   blk-mq devices -- blk_get_request() and blk_put_request() are used
   respectively from these hooks.

dm_table_set_type() was updated to detect if the request-based target is
being stacked on blk-mq devices, if so DM_TYPE_MQ_REQUEST_BASED is set.
DM core disallows switching the DM table's type after it is set.  This
means that there is no mixing of non-blk-mq and blk-mq devices within
the same request-based DM table.

[This patch was started by Keith and later heavily modified by Mike]

Tested-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2015-02-09 13:06:47 -05:00
Mike Snitzer
dbf9782c10 dm: remove exports for request-based interfaces without external callers
Remove exports for dm_dispatch_request, dm_requeue_unmapped_request,
and dm_kill_unmapped_request.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2015-02-09 12:59:48 -05:00
Trond Myklebust
9e2b9f3776 SUNRPC: Remove the redundant XPRT_CONNECTION_CLOSE flag
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-02-09 11:26:06 -05:00
Trond Myklebust
505936f59f SUNRPC: Cleanup to remove remaining uses of XPRT_CONNECTION_ABORT
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-02-09 09:20:40 -05:00
Tejun Heo
b12aa1f25e Merge branch 'for-3.19-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata into for-3.20
09c32aaa36 ("ahci_xgene: Fix the dma state machine lockup for the
ATA_CMD_SMART PIO mode command.") missed 3.19 release.  Fold it into
for-3.20.

Signed-off-by: Tejun Heo <tj@kernel.org>
2015-02-09 07:54:41 -05:00
Eric Dumazet
93c1af6ca9 net:rfs: adjust table size checking
Make sure root user does not try something stupid.

Also make sure mask field in struct rps_sock_flow_table
does not share a cache line with the potentially often dirtied
flow table.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: 567e4b7973 ("net: rfs: add hash collision detection")
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-08 21:54:09 -08:00
Trond Myklebust
4efdd92c92 SUNRPC: Remove TCP client connection reset hack
Instead we rely on SO_REUSEPORT to provide the reconnection semantics
that we need for NFSv2/v3.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-02-08 21:47:30 -05:00
Trond Myklebust
718ba5b873 SUNRPC: Add helpers to prevent socket create from racing
The socket lock is currently held by the task that is requesting the
connection be established. While that is efficient in the case where
the connection happens quickly, it is racy in the case where it doesn't.
What we really want is for the connect helper to be able to block access
to the socket while it is being set up.

This patch does so by arranging to transfer the socket lock from the
task that is requesting the connect attempt, and then releasing that
lock once everything is done.
This scheme also gives us automatic protection against collisions with
the RPC close code, so we can kill the cancel_delayed_work_sync()
call in xs_close().

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-02-08 21:47:29 -05:00
Linus Torvalds
4e02370f64 Merge tag 'trace-fixes-v3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull ftrace fixes from Steven Rostedt:
 "During testing Sedat Dilek hit a "suspicious RCU usage" splat that
  pointed out a real bug.  During suspend and resume the tlb_flush
  tracepoint is called when the CPU is going offline.  As the CPU has
  been noted as offline, RCU is ignoring that CPU, which means that it
  can not use RCU protected locks.  When tracepoints are activated, they
  require RCU locking, and if RCU is ignoring a CPU that runs a
  tracepoint, there is a chance that the tracepoint could cause
  corruption.

  The solution was to change the tracepoint into a
  TRACE_EVENT_CONDITION() which allows us to check a condition to
  determine if the tracepoint should be called or not.  If the condition
  is not met, the rcu protected code will not be executed.  By adding
  the condition "cpu_online(smp_processor_id())", this will prevent the
  RCU protected code from being executed if the CPU is marked offline.

  After adding this, another bug was discovered.  As RCU checks rcu
  callers, if a rcu call is not done, there is no check (obviously).  We
  found that tracepoints could be added in RCU ignored locations and not
  have lockdep complain until the tracepoint is activated.  This missed
  places where tracepoints were added in places they should not have
  been.  To fix this, code was added in 3.18 that if lockdep is enabled,
  any tracepoint will still call the rcu checks even if the tracepoint
  is not enabled.  The bug here, is that the check does not take the
  CONDITION into account.  As the condition may prevent tracepoints from
  being activated in RCU ignored areas (as the one patch does), we get
  false positives when we enable lockdep and hit a tracepoint that the
  condition prevents it from being called in a RCU ignored location.

  The fix for this is to add the CONDITION to the rcu checks, even if
  the tracepoint is not enabled"

* tag 'trace-fixes-v3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  x86/tlb/trace: Do not trace on CPU that is offline
  tracing: Add condition check to RCU lockdep checks
2015-02-08 18:08:14 -08:00
Eric Dumazet
567e4b7973 net: rfs: add hash collision detection
Receive Flow Steering is a nice solution but suffers from
hash collisions when a mix of connected and unconnected traffic
is received on the host, when flow hash table is populated.

Also, clearing flow in inet_release() makes RFS not very good
for short lived flows, as many packets can follow close().
(FIN , ACK packets, ...)

This patch extends the information stored into global hash table
to not only include cpu number, but upper part of the hash value.

I use a 32bit value, and dynamically split it in two parts.

For host with less than 64 possible cpus, this gives 6 bits for the
cpu number, and 26 (32-6) bits for the upper part of the hash.

Since hash bucket selection use low order bits of the hash, we have
a full hash match, if /proc/sys/net/core/rps_sock_flow_entries is big
enough.

If the hash found in flow table does not match, we fallback to RPS (if
it is enabled for the rxqueue).

This means that a packet for an non connected flow can avoid the
IPI through a unrelated/victim CPU.

This also means we no longer have to clear the table at socket
close time, and this helps short lived flows performance.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-08 16:53:57 -08:00
Sabrina Dubroca
096a4cfa58 net: fix a typo in skb_checksum_validate_zero_check
Remove trailing underscore.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-08 16:28:26 -08:00
Neal Cardwell
4fb17a6091 tcp: mitigate ACK loops for connections as tcp_timewait_sock
Ensure that in state FIN_WAIT2 or TIME_WAIT, where the connection is
represented by a tcp_timewait_sock, we rate limit dupacks in response
to incoming packets (a) with TCP timestamps that fail PAWS checks, or
(b) with sequence numbers that are out of the acceptable window.

We do not send a dupack in response to out-of-window packets if it has
been less than sysctl_tcp_invalid_ratelimit (default 500ms) since we
last sent a dupack in response to an out-of-window packet.

Reported-by: Avery Fay <avery@mixpanel.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-08 01:03:13 -08:00
Neal Cardwell
f2b2c582e8 tcp: mitigate ACK loops for connections as tcp_sock
Ensure that in state ESTABLISHED, where the connection is represented
by a tcp_sock, we rate limit dupacks in response to incoming packets
(a) with TCP timestamps that fail PAWS checks, or (b) with sequence
numbers or ACK numbers that are out of the acceptable window.

We do not send a dupack in response to out-of-window packets if it has
been less than sysctl_tcp_invalid_ratelimit (default 500ms) since we
last sent a dupack in response to an out-of-window packet.

There is already a similar (although global) rate-limiting mechanism
for "challenge ACKs". When deciding whether to send a challence ACK,
we first consult the new per-connection rate limit, and then the
global rate limit.

Reported-by: Avery Fay <avery@mixpanel.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-08 01:03:12 -08:00
Neal Cardwell
a9b2c06dbe tcp: mitigate ACK loops for connections as tcp_request_sock
In the SYN_RECV state, where the TCP connection is represented by
tcp_request_sock, we now rate-limit SYNACKs in response to a client's
retransmitted SYNs: we do not send a SYNACK in response to client SYN
if it has been less than sysctl_tcp_invalid_ratelimit (default 500ms)
since we last sent a SYNACK in response to a client's retransmitted
SYN.

This allows the vast majority of legitimate client connections to
proceed unimpeded, even for the most aggressive platforms, iOS and
MacOS, which actually retransmit SYNs 1-second intervals for several
times in a row. They use SYN RTO timeouts following the progression:
1,1,1,1,1,2,4,8,16,32.

Reported-by: Avery Fay <avery@mixpanel.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-08 01:03:12 -08:00
Mark Brown
4f9f4548a5 Merge remote-tracking branches 'spi/topic/orion', 'spi/topic/pxa2xx', 'spi/topic/qup', 'spi/topic/rockchip' and 'spi/topic/samsung' into spi-next 2015-02-08 11:16:55 +08:00
Mark Brown
8328509c4b Merge remote-tracking branches 'spi/topic/img-spfi', 'spi/topic/imx', 'spi/topic/inline', 'spi/topic/meson' and 'spi/topic/mxs' into spi-next 2015-02-08 11:16:52 +08:00
Mark Brown
d6cd09bea9 Merge remote-tracking branches 'spi/topic/falcon', 'spi/topic/fsf', 'spi/topic/fsl', 'spi/topic/fsl-dspi' and 'spi/topic/gpio' into spi-next 2015-02-08 11:16:46 +08:00
Mark Brown
81306d53da Merge remote-tracking branch 'spi/topic/sh-msiof' into spi-next 2015-02-08 11:16:43 +08:00
Mark Brown
ffe167b0f2 Merge remote-tracking branches 'regulator/topic/max8649', 'regulator/topic/mode', 'regulator/topic/mt6397', 'regulator/topic/pfuze100' and 'regulator/topic/qcom-rpm' into regulator-next 2015-02-08 11:16:27 +08:00
Mark Brown
30c5c53042 Merge remote-tracking branches 'regulator/topic/axp20x', 'regulator/topic/da9211' and 'regulator/topic/fan53555' into regulator-next 2015-02-08 11:16:23 +08:00
Mark Brown
fca8e13f50 Merge remote-tracking branch 'regulator/topic/dt-cb' into regulator-next 2015-02-08 11:16:22 +08:00
Mark Brown
14ac3213ba Merge tag 'regulator-v3.19-rc7' into regulator-linus
regulator: Fix !REGULATOR builds of systems using system suspend calls

The system suspend calls (used to synchronize between system suspend of
the CPU and it's PMIC) are called from board code but not stubbed when
regulator is disabled causing build failures in such cases.  This patch
fixes those build failures.

# gpg: Signature made Thu 05 Feb 2015 05:10:43 HKT using RSA key ID 5D5487D0
# gpg: WARNING: digest algorithm MD5 is deprecated
# gpg: please see http://www.gnupg.org/faq/weak-digest-algos.html for more information
# gpg: Oops: keyid_from_fingerprint: no pubkey
# gpg: key AF88CD16: no public key for trusted key - skipped
# gpg: key AF88CD16 marked as ultimately trusted
# gpg: key 5621E907: no public key for trusted key - skipped
# gpg: key 5621E907 marked as ultimately trusted
# gpg: Good signature from "Mark Brown <broonie@sirena.org.uk>"
# gpg:                 aka "Mark Brown <broonie@debian.org>"
# gpg:                 aka "Mark Brown <broonie@kernel.org>"
# gpg:                 aka "Mark Brown <broonie@tardis.ed.ac.uk>"
# gpg:                 aka "Mark Brown <broonie@linaro.org>"
# gpg:                 aka "Mark Brown <Mark.Brown@linaro.org>"
2015-02-08 11:16:17 +08:00
Mark Brown
1aff0310eb Merge remote-tracking branches 'regmap/topic/ac97', 'regmap/topic/doc' and 'regmap/topic/smbus' into regmap-next 2015-02-08 11:16:11 +08:00
Steven Rostedt (Red Hat)
a05d59a567 tracing: Add condition check to RCU lockdep checks
The trace_tlb_flush() tracepoint can be called when a CPU is going offline.
When a CPU is offline, RCU is no longer watching that CPU and since the
tracepoint is protected by RCU, it must not be called. To prevent the
tlb_flush tracepoint from being called when the CPU is offline, it was
converted to a TRACE_EVENT_CONDITION where the condition checks if the
CPU is online before calling the tracepoint.

Unfortunately, this was not enough to stop lockdep from complaining about
it. Even though the RCU protected code of the tracepoint will never be
called, the condition is hidden within the tracepoint, and even though the
condition prevents RCU code from being called, the lockdep checks are
outside the tracepoint (this is to test tracepoints even when they are not
enabled).

Even though tracepoints should be checked to be RCU safe when they are not
enabled, the condition should still be considered when checking RCU.

Link: http://lkml.kernel.org/r/CA+icZUUGiGDoL5NU8RuxKzFjoLjEKRtUWx=JB8B9a0EQv-eGzQ@mail.gmail.com

Fixes: 3a630178fd "tracing: generate RCU warnings even when tracepoints are disabled"
Cc: stable@vger.kernel.org # 3.18+
Acked-by: Dave Hansen <dave@sr71.net>
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-02-07 19:34:25 -05:00
Bjorn Helgaas
cb8e92d8e4 Merge branch 'pci/misc' into next
* pci/misc:
  PCI: Add pci_device_to_OF_node() stub for !CONFIG_OF
2015-02-07 08:22:24 -06:00
Linus Torvalds
396e9099ea Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
 "Misc fixes"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/deadline: Fix deadline parameter modification handling
  sched/wait: Remove might_sleep() from wait_event_cmd()
  sched: Fix crash if cpuset_cpumask_can_shrink() is passed an empty cpumask
  sched/fair: Avoid using uninitialized variable in preferred_group_nid()
2015-02-06 13:34:26 -08:00
Olof Johansson
6f4554bdff Merge tag 'samsung-cpuidle' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung into next/drivers
Merge "Samsung CPUIdle updates for v3.20" from Kukjin Kim:

- adds coupled cpuidle support for exynos4210
  : fix for Exynos platform PM code preparing it for the coupled
  cpuidle support and adds coupled cpuidle AFTR mode on exynos4210

Note this is mostrly based on earlier cpuidle-exynos4210 driver
from Daniel Lezcano and Bart updated.

* tag 'samsung-cpuidle' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
  cpuidle: exynos: add coupled cpuidle support for exynos4210
  ARM: EXYNOS: apply S5P_CENTRAL_SEQ_OPTION fix only when necessary

Signed-off-by: Olof Johansson <olof@lixom.net>
2015-02-06 01:07:33 -08:00
Trond Myklebust
5a0ec8acb9 NFSv4.1: Pin the inode and super block in asynchronous layoutreturns
If we're sending an asynchronous layoutreturn, then we need to ensure
that the inode and the super block remain pinned.

Cc: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Reviewed-by: Peng Tao <tao.peng@primarydata.com>
2015-02-05 22:16:45 -05:00
Trond Myklebust
472e259449 NFSv4.1: Pin the inode and super block in asynchronous layoutcommit
If we're sending an asynchronous layoutcommit, then we need to ensure
that the inode and the super block remain pinned.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Reviewed-by: Peng Tao <tao.peng@primarydata.com>
2015-02-05 22:16:33 -05:00
David S. Miller
6e03f896b5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/vxlan.c
	drivers/vhost/net.c
	include/linux/if_vlan.h
	net/core/dev.c

The net/core/dev.c conflict was the overlap of one commit marking an
existing function static whilst another was adding a new function.

In the include/linux/if_vlan.h case, the type used for a local
variable was changed in 'net', whereas the function got rewritten
to fix a stacked vlan bug in 'net-next'.

In drivers/vhost/net.c, Al Viro's iov_iter conversions in 'net-next'
overlapped with an endainness fix for VHOST 1.0 in 'net'.

In drivers/net/vxlan.c, vxlan_find_vni() added a 'flags' parameter
in 'net-next' whereas in 'net' there was a bug fix to pass in the
correct network namespace pointer in calls to this function.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-05 14:33:28 -08:00
Linus Torvalds
9d82f5eb33 MMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Stretch ACKs can kill performance with Reno and CUBIC congestion
    control, largely due to LRO and GRO.  Fix from Neal Cardwell.

 2) Fix userland breakage because we accidently emit zero length netlink
    messages from the bridging code.  From Roopa Prabhu.

 3) Carry handling in generic csum_tcpudp_nofold is broken, fix from
    Karl Beldan.

 4) Remove bogus dev_set_net() calls from CAIF driver, from Nicolas
    Dichtel.

 5) Make sure PPP deflation never returns a length greater then the
    output buffer, otherwise we overflow and trigger skb_over_panic().
    Fix from Florian Westphal.

 6) COSA driver needs VIRT_TO_BUS Kconfig dependencies, from Arnd
    Bergmann.

 7) Don't increase route cached MTU on datagram too big ICMPs.  From Li
    Wei.

 8) Fix error path leaks in nf_tables, from Pablo Neira Ayuso.

 9) Fix bitmask handling regression in netlink that broke things like
    acpi userland tools.  From Pablo Neira Ayuso.

10) Wrong header pointer passed to param_type2af() in SCTP code, from
    Saran Maruti Ramanara.

11) Stacked vlans not handled correctly by vlan_get_protocol(), from
    Toshiaki Makita.

12) Add missing DMA memory barrier to xgene driver, from Iyappan
    Subramanian.

13) Fix crash in rate estimators, from Eric Dumazet.

14) We've been adding various workarounds, one after another, for the
    change which added the per-net tcp_sock.  It was meant to reduce
    socket contention but added lots of problems.

    Reduce this instead to a proper per-cpu socket and that rids us of
    all the daemons.

    From Eric Dumazet.

15) Fix memory corruption and OOPS in mlx4 driver, from Jack
    Morgenstein.

16) When we disabled UFO in the virtio_net device, it introduces some
    serious performance regressions.  The orignal problem was IPV6
    fragment ID generation, so fix that properly instead.  From Vlad
    Yasevich.

17) sr9700 driver build breaks on xtensa because it defines macros with
    the same name as those used by the arch code.  Use more unique
    names.  From Chen Gang.

18) Fix endianness in new virio 1.0 mode of the vhost net driver, from
    Michael S Tsirkin.

19) Several sysctls were setting the maxlen attribute incorrectly, from
    Sasha Levin.

20) Don't accept an FQ scheduler quantum of zero, that leads to crashes.
    From Kenneth Klette Jonassen.

21) Fix dumping of non-existing actions in the packet scheduler
    classifier.  From Ignacy Gawędzki.

22) Return the write work_done value when doing TX work in the qlcnic
    driver.

23) ip6gre_err accesses the info field with the wrong endianness, from
    Sabrina Dubroca.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (54 commits)
  sit: fix some __be16/u16 mismatches
  ipv6: fix sparse errors in ip6_make_flowlabel()
  net: remove some sparse warnings
  flow_keys: n_proto type should be __be16
  ip6_gre: fix endianness errors in ip6gre_err
  qlcnic: Fix NAPI poll routine for Tx completion
  amd-xgbe: Set RSS enablement based on hardware features
  amd-xgbe: Adjust for zero-based traffic class count
  cls_api.c: Fix dumping of non-existing actions' stats.
  pkt_sched: fq: avoid hang when quantum 0
  net: rds: use correct size for max unacked packets and bytes
  vhost/net: fix up num_buffers endian-ness
  gianfar: correct the bad expression while writing bit-pattern
  net: usb: sr9700: Use 'SR_' prefix for the common register macros
  Revert "drivers/net: Disable UFO through virtio"
  Revert "drivers/net, ipv6: Select IPv6 fragment idents for virtio UFO packets"
  ipv6: Select fragment id during UFO segmentation if not set.
  xen-netback: stop the guest rx thread after a fatal error
  net/mlx4_core: Fix kernel Oops (mem corruption) when working with more than 80 VFs
  isdn: off by one in connect_res()
  ...
2015-02-05 11:23:45 -08:00
Christoph Hellwig
37f19e57a0 block: merge __bio_map_user_iov into bio_map_user_iov
And also remove the unused bdev argument.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-02-05 09:30:43 -07:00
Kent Overstreet
26e49cfc7e block: pass iov_iter to the BLOCK_PC mapping functions
Make use of a new interface provided by iov_iter, backed by
scatter-gather list of iovec, instead of the old interface based on
sg_iovec. Also use iov_iter_advance() instead of manual iteration.

This commit should contain only literal replacements, without
functional changes.

Cc: Christoph Hellwig <hch@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Doug Gilbert <dgilbert@interlog.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
[dpark: add more description in commit message]
Signed-off-by: Dongsu Park <dongsu.park@profitbricks.com>
[hch: fixed to do a deep clone of the iov_iter, and to properly use
      the iov_iter direction]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-02-05 09:30:40 -07:00
Christoph Hellwig
ddad8dd0a1 block: use blk_rq_map_user_iov to implement blk_rq_map_user
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-02-05 09:30:37 -07:00
Yinghai Lu
ecf5636dcd ACPI: Add interfaces to parse IOAPIC ID for IOAPIC hotplug
We need to parse APIC ID for IOAPIC registration for IOAPIC hotplug.
ACPI _MAT method and MADT table are used to figure out IOAPIC ID, just
like parsing CPU APIC ID for CPU hotplug.

[ tglx: Fixed docbook comment ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/1414387308-27148-8-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-05 15:09:26 +01:00
Jiang Liu
14d76b68f2 PCI: Use common resource list management code instead of private implementation
Use common resource list management data structure and interfaces
instead of private implementation.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-05 15:09:25 +01:00
Jiang Liu
90e9782061 resources: Move struct resource_list_entry from ACPI into resource core
Currently ACPI, PCI and pnp all implement the same resource list
management with different data structure. We need to transfer from
one data structure into another when passing resources from one
subsystem into another subsystem. So move struct resource_list_entry
from ACPI into resource core and rename it as resource_entry,
then it could be reused by different subystems and avoid the data
structure conversion.

Introduce dedicated header file resource_ext.h instead of embedding
it into ioport.h to avoid header file inclusion order issues.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Acked-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-05 15:09:25 +01:00
Christoph Hellwig
7fbc1067f0 exportfs: add methods for block layout exports
Add three methods to allow exporting pnfs block layout volumes:

 - get_uuid: get a filesystem unique signature exposed to clients
 - map_blocks: map and if nessecary allocate blocks for a layout
 - commit_blocks: commit blocks in a layout once the client is done with them

For now we stick the external pnfs block layout interfaces into s_export_op to
avoid mixing them up with the internal interface between the NFS server and
the layout drivers.  Once we've fully internalized the latter interface we
can redecide if these methods should stay in s_export_ops.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2015-02-05 14:35:17 +01:00