Commit Graph

90886 Commits

Author SHA1 Message Date
David S. Miller
f9aa9dc7d2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
All conflicts were simple overlapping changes except perhaps
for the Thunder driver.

That driver has a change_mtu method explicitly for sending
a message to the hardware.  If that fails it returns an
error.

Normally a driver doesn't need an ndo_change_mtu method becuase those
are usually just range changes, which are now handled generically.
But since this extra operation is needed in the Thunder driver, it has
to stay.

However, if the message send fails we have to restore the original
MTU before the change because the entire call chain expects that if
an error is thrown by ndo_change_mtu then the MTU did not change.
Therefore code is added to nicvf_change_mtu to remember the original
MTU, and to restore it upon nicvf_update_hw_max_frs() failue.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-22 13:27:16 -05:00
Eric W. Biederman
64b875f7ac ptrace: Capture the ptracer's creds not PT_PTRACE_CAP
When the flag PT_PTRACE_CAP was added the PTRACE_TRACEME path was
overlooked.  This can result in incorrect behavior when an application
like strace traces an exec of a setuid executable.

Further PT_PTRACE_CAP does not have enough information for making good
security decisions as it does not report which user namespace the
capability is in.  This has already allowed one mistake through
insufficient granulariy.

I found this issue when I was testing another corner case of exec and
discovered that I could not get strace to set PT_PTRACE_CAP even when
running strace as root with a full set of caps.

This change fixes the above issue with strace allowing stracing as
root a setuid executable without disabling setuid.  More fundamentaly
this change allows what is allowable at all times, by using the correct
information in it's decision.

Cc: stable@vger.kernel.org
Fixes: 4214e42f96d4 ("v2.4.9.11 -> v2.4.9.12")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-11-22 11:49:49 -06:00
Eric W. Biederman
bfedb58925 mm: Add a user_ns owner to mm_struct and fix ptrace permission checks
During exec dumpable is cleared if the file that is being executed is
not readable by the user executing the file.  A bug in
ptrace_may_access allows reading the file if the executable happens to
enter into a subordinate user namespace (aka clone(CLONE_NEWUSER),
unshare(CLONE_NEWUSER), or setns(fd, CLONE_NEWUSER).

This problem is fixed with only necessary userspace breakage by adding
a user namespace owner to mm_struct, captured at the time of exec, so
it is clear in which user namespace CAP_SYS_PTRACE must be present in
to be able to safely give read permission to the executable.

The function ptrace_may_access is modified to verify that the ptracer
has CAP_SYS_ADMIN in task->mm->user_ns instead of task->cred->user_ns.
This ensures that if the task changes it's cred into a subordinate
user namespace it does not become ptraceable.

The function ptrace_attach is modified to only set PT_PTRACE_CAP when
CAP_SYS_PTRACE is held over task->mm->user_ns.  The intent of
PT_PTRACE_CAP is to be a flag to note that whatever permission changes
the task might go through the tracer has sufficient permissions for
it not to be an issue.  task->cred->user_ns is always the same
as or descendent of mm->user_ns.  Which guarantees that having
CAP_SYS_PTRACE over mm->user_ns is the worst case for the tasks
credentials.

To prevent regressions mm->dumpable and mm->user_ns are not considered
when a task has no mm.  As simply failing ptrace_may_attach causes
regressions in privileged applications attempting to read things
such as /proc/<pid>/stat

Cc: stable@vger.kernel.org
Acked-by: Kees Cook <keescook@chromium.org>
Tested-by: Cyrill Gorcunov <gorcunov@openvz.org>
Fixes: 8409cca705 ("userns: allow ptrace from non-init user namespaces")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-11-22 11:49:48 -06:00
Bandan Das
ae0f549951 kvm: x86: don't print warning messages for unimplemented msrs
Change unimplemented msrs messages to use pr_debug.
If CONFIG_DYNAMIC_DEBUG is set, then these messages can be
enabled at run time or else -DDEBUG can be used at compile
time to enable them. These messages will still be printed if
ignore_msrs=1.

Signed-off-by: Bandan Das <bsd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-11-22 18:29:10 +01:00
NeilBrown
688834e6ae md/failfast: add failfast flag for md to be used by some personalities.
This patch just adds a 'failfast' per-device flag which can be stored
in v0.90 or v1.x metadata.
The flag is not used yet but the intent is that it can be used for
mirrored (raid1/raid10) arrays where low latency is more important
than keeping all devices on-line.

Setting the flag for a device effectively gives permission for that
device to be marked as Faulty and excluded from the array on the first
error.  The underlying driver will be directed not to retry requests
that result in failures.  There is a proviso that the device must not
be marked faulty if that would cause the array as a whole to fail, it
may only be marked Faulty if the array remains functional, but is
degraded.

Failures on read requests will cause the device to be marked
as Faulty immediately so that further reads will avoid that
device.  No attempt will be made to correct read errors by
over-writing with the correct data.

It is expected that if transient errors, such as cable unplug, are
possible, then something in user-space will revalidate failed
devices and re-add them when they appear to be working again.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2016-11-22 08:58:17 -08:00
Ming Lei
3a83f46775 block: bio: pass bvec table to bio_init()
Some drivers often use external bvec table, so introduce
this helper for this case. It is always safe to access the
bio->bi_io_vec in this way for this case.

After converting to this usage, it will becomes a bit easier
to evaluate the remaining direct access to bio->bi_io_vec,
so it can help to prepare for the following multipage bvec
support.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

Fixed up the new O_DIRECT cases.

Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-22 08:57:21 -07:00
Robert Bragg
d79651522e drm/i915: Enable i915 perf stream for Haswell OA unit
Gen graphics hardware can be set up to periodically write snapshots of
performance counters into a circular buffer via its Observation
Architecture and this patch exposes that capability to userspace via the
i915 perf interface.

v2:
   Make sure to initialize ->specific_ctx_id when opening, without
   relying on _pin_notify hook, in case ctx already pinned.
v3:
   Revert back to pinning ctx upfront when opening stream, removing
   need to hook in to pinning and to update OACONTROL on the fly.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Sourab Gupta <sourab.gupta@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161107194957.3385-7-robert@sixbynine.org
2016-11-22 14:38:13 +01:00
Robert Bragg
eec688e142 drm/i915: Add i915 perf infrastructure
Adds base i915 perf infrastructure for Gen performance metrics.

This adds a DRM_IOCTL_I915_PERF_OPEN ioctl that takes an array of uint64
properties to configure a stream of metrics and returns a new fd usable
with standard VFS system calls including read() to read typed and sized
records; ioctl() to enable or disable capture and poll() to wait for
data.

A stream is opened something like:

  uint64_t properties[] = {
      /* Single context sampling */
      DRM_I915_PERF_PROP_CTX_HANDLE,        ctx_handle,

      /* Include OA reports in samples */
      DRM_I915_PERF_PROP_SAMPLE_OA,         true,

      /* OA unit configuration */
      DRM_I915_PERF_PROP_OA_METRICS_SET,    metrics_set_id,
      DRM_I915_PERF_PROP_OA_FORMAT,         report_format,
      DRM_I915_PERF_PROP_OA_EXPONENT,       period_exponent,
   };
   struct drm_i915_perf_open_param parm = {
      .flags = I915_PERF_FLAG_FD_CLOEXEC |
               I915_PERF_FLAG_FD_NONBLOCK |
               I915_PERF_FLAG_DISABLED,
      .properties_ptr = (uint64_t)properties,
      .num_properties = sizeof(properties) / 16,
   };
   int fd = drmIoctl(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param);

Records read all start with a common { type, size } header with
DRM_I915_PERF_RECORD_SAMPLE being of most interest. Sample records
contain an extensible number of fields and it's the
DRM_I915_PERF_PROP_SAMPLE_xyz properties given when opening that
determine what's included in every sample.

No specific streams are supported yet so any attempt to open a stream
will return an error.

v2:
    use i915_gem_context_get() - Chris Wilson
v3:
    update read() interface to avoid passing state struct - Chris Wilson
    fix some rebase fallout, with i915-perf init/deinit
v4:
    s/DRM_IORW/DRM_IOW/ - Emil Velikov

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Sourab Gupta <sourab.gupta@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161107194957.3385-2-robert@sixbynine.org
2016-11-22 14:27:18 +01:00
Pan Xinhui
4ec6e86362 kvm: Introduce kvm_write_guest_offset_cached()
It allows us to update some status or field of a structure partially.

We can also save a kvm_read_guest_cached() call if we just update one
fild of the struct regardless of its current value.

Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: David.Laight@ACULAB.COM
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: benh@kernel.crashing.org
Cc: boqun.feng@gmail.com
Cc: borntraeger@de.ibm.com
Cc: bsingharora@gmail.com
Cc: dave@stgolabs.net
Cc: jgross@suse.com
Cc: kernellwp@gmail.com
Cc: konrad.wilk@oracle.com
Cc: linuxppc-dev@lists.ozlabs.org
Cc: mpe@ellerman.id.au
Cc: paulmck@linux.vnet.ibm.com
Cc: paulus@samba.org
Cc: rkrcmar@redhat.com
Cc: virtualization@lists.linux-foundation.org
Cc: will.deacon@arm.com
Cc: xen-devel-request@lists.xenproject.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1478077718-37424-8-git-send-email-xinhui.pan@linux.vnet.ibm.com
[ Typo fixes. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-22 12:48:07 +01:00
Pan Xinhui
d9345c65eb sched/core: Introduce the vcpu_is_preempted(cpu) interface
This patch is the first step to add support to improve lock holder
preemption beaviour.

vcpu_is_preempted(cpu) does the obvious thing: it tells us whether a
vCPU is preempted or not.

Defaults to false on architectures that don't support it.

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[ Translated the changelog to English. ]
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: David.Laight@ACULAB.COM
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: benh@kernel.crashing.org
Cc: boqun.feng@gmail.com
Cc: bsingharora@gmail.com
Cc: dave@stgolabs.net
Cc: kernellwp@gmail.com
Cc: konrad.wilk@oracle.com
Cc: linuxppc-dev@lists.ozlabs.org
Cc: mpe@ellerman.id.au
Cc: paulmck@linux.vnet.ibm.com
Cc: paulus@samba.org
Cc: rkrcmar@redhat.com
Cc: virtualization@lists.linux-foundation.org
Cc: will.deacon@arm.com
Cc: xen-devel-request@lists.xenproject.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1478077718-37424-2-git-send-email-xinhui.pan@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-22 12:48:05 +01:00
Ingo Molnar
02cb689b2c Merge branch 'linus' into locking/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-22 12:37:38 +01:00
Oleg Nesterov
8e5bfa8c1f sched/autogroup: Do not use autogroup->tg in zombie threads
Exactly because for_each_thread() in autogroup_move_group() can't see it
and update its ->sched_task_group before _put() and possibly free().

So the exiting task needs another sched_move_task() before exit_notify()
and we need to re-introduce the PF_EXITING (or similar) check removed by
the previous change for another reason.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: hartsjc@redhat.com
Cc: vbendel@redhat.com
Cc: vlovejoy@redhat.com
Link: http://lkml.kernel.org/r/20161114184612.GA15968@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-22 12:33:43 +01:00
Hans de Goede
eb1610b4c2 led: core: Fix blink_brightness setting race
All 3 of led_timer_func, led_set_brightness and led_set_software_blink
set blink_brightness. If led_timer_func or led_set_software_blink race
with led_set_brightness they may end up overwriting the new
blink_brightness. The new atomic work_flags does not protect against
this as it just protects the flags and not blink_brightness.

This commit introduces a new new_blink_brightness value which gets
set by led_set_brightness and read by led_timer_func on LED on, fixing
this.

Dealing with the new brightness at LED on time, makes the new
brightness apply sooner, which also fixes a led_set_brightness which
happens while a oneshot blink which ends in LED on is running not
getting applied.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
2016-11-22 12:07:05 +01:00
Hans de Goede
a9c6ce57ec led: core: Use atomic bit-field for the blink-flags
All the LED_BLINK* flags are accessed read-modify-write from e.g.
led_set_brightness and led_blink_set_oneshot while both
set_brightness_work and the blink_timer may be running.

If these race then the modify step done by one of them may be lost,
switch the LED_BLINK* flags to a new atomic work_flags bit-field
to avoid this race.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
2016-11-22 12:07:05 +01:00
David Lechner
e381322b01 leds: Introduce userspace LED class driver
This driver creates a userspace leds driver similar to uinput.

New LEDs are created by opening /dev/uleds and writing a uleds_user_dev
struct. A new LED class device is registered with the name given in the
struct. Reading will return a single byte that is the current brightness.
The poll() syscall is also supported. It will be triggered whenever the
brightness changes. Closing the file handle to /dev/uleds will remove
the leds class device.

Signed-off-by: David Lechner <david@lechnology.com>
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
2016-11-22 12:07:02 +01:00
Daniel Vetter
1ea0c02e70 drm/atomic: Unconfuse the old_state mess in commmit_tail
I totally butcherd the job on typing the kernel-doc for these, and no
one realized. Noticed by Russell. Maarten has a more complete approach
to this confusion, by making it more explicit what the new/old state
is, instead of this magic switching behaviour.

v2:
- Liviu pointed out that wait_for_fences is even more magic. Leave
that as @state, and document @pre_swap better.
- While at it, patch in header for the reference section.
- Fix spelling issues Russell noticed.

v3: Fix up the @pre_swap note (Liviu): Also s/synchronous/blocking/,
since async flip is something else than non-blocking.

Cc: Liviu Dudau <liviu.dudau@arm.com>
Reported-by: Russell King - ARM Linux <linux@armlinux.org.uk>
Cc: Russell King - ARM Linux <linux@armlinux.org.uk>
Fixes: 9f2a7950e7 ("drm/atomic-helper: nonblocking commit support")
Cc: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Tomeu Vizoso <tomeu.vizoso@gmail.com>
Cc: Daniel Stone <daniels@collabora.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161121171802.24147-1-daniel.vetter@ffwll.ch
2016-11-22 11:11:57 +01:00
Mauro Carvalho Chehab
820b1a93f4 Merge tag 'v4.9-rc6' into patchwork
Linux 4.9-rc6

* tag 'v4.9-rc6': (305 commits)
  Linux 4.9-rc6
  ext4: sanity check the block and cluster size at mount time
  fscrypto: don't use on-stack buffer for key derivation
  fscrypto: don't use on-stack buffer for filename encryption
  i2c: i2c-mux-pca954x: fix deselect enabling for device-tree
  kvm: x86: merge kvm_arch_set_irq and kvm_arch_set_irq_inatomic
  KVM: x86: fix missed SRCU usage in kvm_lapic_set_vapic_addr
  KVM: async_pf: avoid recursive flushing of work items
  kvm: kvmclock: let KVM_GET_CLOCK return whether the master clock is in use
  KVM: Disable irq while unregistering user notifier
  KVM: x86: do not go through vcpu in __get_kvmclock_ns
  MAINTAINERS: Add LED subsystem co-maintainer
  crypto: algif_hash - Fix NULL hash crash with shash
  powerpc/mm: Fix missing update of HID register on secondary CPUs
  KVM: arm64: Fix the issues when guest PMCCFILTR is configured
  arm64: KVM: pmu: Fix AArch32 cycle counter access
  powerpc/mm/radix: Invalidate ERAT on tlbiel for POWER9 DD1
  i2c: digicolor: use clk_disable_unprepare instead of clk_unprepare
  ipmi/bt-bmc: change compatible node to 'aspeed, ast2400-ibt-bmc'
  Revert "drm/mediatek: set vblank_disable_allowed to true"
  ...
2016-11-22 05:20:06 -02:00
Rafael J. Wysocki
08b98d3291 PM / sleep / ACPI: Use the ACPI_FADT_LOW_POWER_S0 flag
Modify the ACPI system sleep support setup code to select
suspend-to-idle as the default system sleep state if the
ACPI_FADT_LOW_POWER_S0 flag is set in the FADT and the
default sleep state was not selected from the kernel command
line.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mario Limonciello <mario.limonciello@dell.com>
2016-11-21 22:48:10 +01:00
Linus Torvalds
27e7ab99db Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Clear congestion control state when changing algorithms on an
    existing socket, from Florian Westphal.

 2) Fix register bit values in altr_tse_pcs portion of stmmac driver,
    from Jia Jie Ho.

 3) Fix PTP handling in stammc driver for GMAC4, from Giuseppe
    CAVALLARO.

 4) Fix udplite multicast delivery handling, it ignores the udp_table
    parameter passed into the lookups, from Pablo Neira Ayuso.

 5) Synchronize the space estimated by rtnl_vfinfo_size and the space
    actually used by rtnl_fill_vfinfo. From Sabrina Dubroca.

 6) Fix memory leak in fib_info when splitting nodes, from Alexander
    Duyck.

 7) If a driver does a napi_hash_del() explicitily and not via
    netif_napi_del(), it must perform RCU synchronization as needed. Fix
    this in virtio-net and bnxt drivers, from Eric Dumazet.

 8) Likewise, it is not necessary to invoke napi_hash_del() is we are
    also doing neif_napi_del() in the same code path. Remove such calls
    from be2net and cxgb4 drivers, also from Eric Dumazet.

 9) Don't allocate an ID in peernet2id_alloc() if the netns is dead,
    from WANG Cong.

10) Fix OF node and device struct leaks in of_mdio, from Johan Hovold.

11) We cannot cache routes in ip6_tunnel when using inherited traffic
    classes, from Paolo Abeni.

12) Fix several crashes and leaks in cpsw driver, from Johan Hovold.

13) Splice operations cannot use freezable blocking calls in AF_UNIX,
    from WANG Cong.

14) Link dump filtering by master device and kind support added an error
    in loop index updates during the dump if we actually do filter, fix
    from Zhang Shengju.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (59 commits)
  tcp: zero ca_priv area when switching cc algorithms
  net: l2tp: Treat NET_XMIT_CN as success in l2tp_eth_dev_xmit
  ethernet: stmmac: make DWMAC_STM32 depend on it's associated SoC
  tipc: eliminate obsolete socket locking policy description
  rtnl: fix the loop index update error in rtnl_dump_ifinfo()
  l2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind()
  net: macb: add check for dma mapping error in start_xmit()
  rtnetlink: fix FDB size computation
  netns: fix get_net_ns_by_fd(int pid) typo
  af_unix: conditionally use freezable blocking calls in read
  net: ethernet: ti: cpsw: fix fixed-link phy probe deferral
  net: ethernet: ti: cpsw: add missing sanity check
  net: ethernet: ti: cpsw: fix secondary-emac probe error path
  net: ethernet: ti: cpsw: fix of_node and phydev leaks
  net: ethernet: ti: cpsw: fix deferred probe
  net: ethernet: ti: cpsw: fix mdio device reference leak
  net: ethernet: ti: cpsw: fix bad register access in probe error path
  net: sky2: Fix shutdown crash
  cfg80211: limit scan results cache size
  net sched filters: pass netlink message flags in event notification
  ...
2016-11-21 13:26:28 -08:00
Liviu Dudau
8c0b55e22a drm/atomic: cleanup debugfs entries on un-registering the driver.
Cleanup the debugfs entries created by
commit 6559c901cb:  drm/atomic: add debugfs file to dump out atomic state
when the driver's minor gets un-registered. Without it, DRM drivers
compiled as modules cannot be rmmod-ed and modprobed again.

Tested-by: Brian Starkey <brian.starkey@arm.com>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20161117114129.2627-1-Liviu.Dudau@arm.com
Fixes: 6559c901cb ("drm/atomic: add debugfs file to dump out atomic state")
2016-11-21 13:22:08 -05:00
Florian Westphal
e97991832a tcp: make undo_cwnd mandatory for congestion modules
The undo_cwnd fallback in the stack doubles cwnd based on ssthresh,
which un-does reno halving behaviour.

It seems more appropriate to let congctl algorithms pair .ssthresh
and .undo_cwnd properly. Add a 'tcp_reno_undo_cwnd' function and wire it
up for all congestion algorithms that used to rely on the fallback.

Cc: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 13:20:17 -05:00
Nikolay Aleksandrov
aa2ae3e71c bridge: mcast: add MLDv2 querier support
This patch adds basic support for MLDv2 queries, the default is MLDv1
as before. A new multicast option - multicast_mld_version, adds the
ability to change it between 1 and 2 via netlink and sysfs.
The MLD option is disabled if CONFIG_IPV6 is disabled.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 13:16:58 -05:00
Nikolay Aleksandrov
5e9235853d bridge: mcast: add IGMPv3 query support
This patch adds basic support for IGMPv3 queries, the default is IGMPv2
as before. A new multicast option - multicast_igmp_version, adds the
ability to change it between 2 and 3 via netlink and sysfs. The option
struct member is in a 4 byte hole in net_bridge.

There also a few minor style adjustments in br_multicast_new_group and
br_multicast_add_group.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 13:16:58 -05:00
Christoph Hellwig
93c5bdf7ab block: clear all of bi_opf in bio_set_op_attrs
Since commit 87374179 ("block: add a proper block layer data direction
encoding") we only or the new op and flags into bi_opf in bio_set_op_attrs
instead of clearing the old value.  I've not seen any breakage with the
new behavior, but it seems dangerous.

Also convert it to an inline function to make the argument passing
safer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-21 09:35:05 -07:00
Daniel Borkmann
6d67942dd0 bpf: add __must_check attributes to refcount manipulating helpers
Helpers like bpf_prog_add(), bpf_prog_inc(), bpf_map_inc() can fail
with an error, so make sure the caller properly checks their return
value and not just ignores it, which could worst-case lead to use
after free.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 11:25:58 -05:00
Mikko Rapeli
209e236791 uapi dm-log-userspace.h: use __u32, __s32, __u64 and __s64 from linux/types.h
Fixes userspace compilation errors like:

linux/dm-log-userspace.h:416:2: error: unknown type name ‘uint64_t’

Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-11-21 09:52:02 -05:00
PrasannaKumar Muralidharan
ed424bb368 hwrng: Make explicit that max >= 32 always
As hw_random core calls ->read with max > 32 or more, make it explicit.
Also remove checks involving 'max' being less than 8.

Signed-off-by: PrasannaKumar Muralidharan <prasannatsmkumar@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-11-21 22:50:45 +08:00
Rafael J. Wysocki
30248feff5 cpufreq: Make cpufreq_update_policy() void
The return value of cpufreq_update_policy() is never used, so make
it void.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-11-21 14:35:43 +01:00
Rafael J. Wysocki
bca5f557dc ACPI / processor: Make acpi_processor_ppc_has_changed() void
The return value of acpi_processor_ppc_has_changed() is never used,
so make it void.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-11-21 14:35:42 +01:00
Icenowy Zheng
3f89586bc1 mfd: axp20x: Add adc volatile ranges for axp22x
AXP22x has also some different register map than axp20x, they're also
added here.

Signed-off-by: Icenowy Zheng <icenowy@aosc.xyz>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-11-21 13:00:17 +00:00
Mika Kuoppala
841021713a drm/i915: Add bannable context parameter
Now when driver has per context scoring of 'hanging badness'
and also subsequent hangs during short windows are allowed,
if there is progress made in between, it does not make sense
to expose a ban timing window as a context parameter anymore.

Let the scoring be the sole indicator for ban policy and substitute
ban period context parameter as a boolean to get/set context
bannable property.

v2: allow non root to opt into being banned (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-11-21 14:36:40 +02:00
Yazen Ghannam
d12a969ebb EDAC, amd64: Add Deferred Error type
Currently, deferred errors are classified as correctable in EDAC. Add a
new error type for deferred errors so that they are correctly reported
to the user.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1479423463-8536-7-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-11-21 10:57:19 +01:00
Waiman Long
194a6b5b9c sched/wake_q: Rename WAKE_Q to DEFINE_WAKE_Q
Currently the wake_q data structure is defined by the WAKE_Q() macro.
This macro, however, looks like a function doing something as "wake" is
a verb. Even checkpatch.pl was confused as it reported warnings like

  WARNING: Missing a blank line after declarations
  #548: FILE: kernel/futex.c:3665:
  +	int ret;
  +	WAKE_Q(wake_q);

This patch renames the WAKE_Q() macro to DEFINE_WAKE_Q() which clarifies
what the macro is doing and eliminates the checkpatch.pl warnings.

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1479401198-1765-1-git-send-email-longman@redhat.com
[ Resolved conflict and added missing rename. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-21 10:29:01 +01:00
Yazen Ghannam
1e8096bb20 EDAC: Add LRDDR4 DRAM type
AMD Fam17h systems can support Load-Reduced DDR4 DIMMs. So add this new
type to edac.h in preparation for the Fam17h EDAC update. Also, let's
fix a format issue with the LRDDR3 line while we're here.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1479423463-8536-3-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-11-21 09:31:59 +01:00
Jan Kara
dd936e4313 dax: rip out get_block based IO support
No one uses functions using the get_block callback anymore. Rip them
out and update documentation.

Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-11-20 20:48:36 -05:00
Alexey Dobriyan
c72d8cdaa5 net: fix bogus cast in skb_pagelen() and use unsigned variables
1) cast to "int" is unnecessary:
   u8 will be promoted to int before decrementing,
   small positive numbers fit into "int", so their values won't be changed
   during promotion.

   Once everything is int including loop counters, signedness doesn't
   matter: 32-bit operations will stay 32-bit operations.

   But! Someone tried to make this loop smart by making everything of
   the same type apparently in an attempt to optimise it.
   Do the optimization, just differently.
   Do the cast where it matters. :^)

2) frag size is unsigned entity and sum of fragments sizes is also
   unsigned.

Make everything unsigned, leave no MOVSX instruction behind.

	add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-4 (-4)
	function                                     old     new   delta
	skb_cow_data                                 835     834      -1
	ip_do_fragment                              2549    2548      -1
	ip6_fragment                                3130    3128      -2
	Total: Before=154865032, After=154865028, chg -0.00%

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-19 22:11:25 -05:00
Alexey Dobriyan
3b2c75d371 netlink: use "unsigned int" in nla_next()
->nla_len is unsigned entity (it's length after all) and u16,
thus it can't overflow when being aligned into int/unsigned int.

(nlmsg_next has the same code, but I didn't yet convince myself
it is correct to do so).

There is pointer arithmetic in this function and offset being
unsigned is better:

	add/remove: 0/0 grow/shrink: 1/64 up/down: 5/-309 (-304)
	function                                     old     new   delta
	nl80211_set_wiphy                           1444    1449      +5
	team_nl_cmd_options_set                      997     995      -2
	tcf_em_tree_validate                         872     870      -2
	switchdev_port_bridge_setlink                352     350      -2
	switchdev_port_br_afspec                     312     310      -2
	rtm_to_fib_config                            428     426      -2
	qla4xxx_sysfs_ddb_set_param                 2193    2191      -2
	qla4xxx_iface_set_param                     4470    4468      -2
	ovs_nla_free_flow_actions                    152     150      -2
	output_userspace                             518     516      -2
		...
	nl80211_set_reg                              654     649      -5
	validate_scan_freqs                          148     142      -6
	validate_linkmsg                             288     282      -6
	nl80211_parse_connkeys                       489     483      -6
	nlattr_set                                   231     224      -7
	nf_tables_delsetelem                         267     260      -7
	do_setlink                                  3416    3408      -8
	netlbl_cipsov4_add_std                      1672    1659     -13
	nl80211_parse_sched_scan                    2902    2888     -14
	nl80211_trigger_scan                        1738    1720     -18
	do_execute_actions                          2821    2738     -83
	Total: Before=154865355, After=154865051, chg -0.00%

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-19 22:11:25 -05:00
Linus Torvalds
dce9ce3615 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Radim Krčmář:
 "ARM:
   - Fix handling of the 32bit cycle counter
   - Fix cycle counter filtering

  x86:
   - Fix a race leading to double unregistering of user notifiers
   - Amend oversight in kvm_arch_set_irq that turned Hyper-V code dead
   - Use SRCU around kvm_lapic_set_vapic_addr
   - Avoid recursive flushing of asynchronous page faults
   - Do not rely on deferred update in KVM_GET_CLOCK, which fixes #GP
   - Let userspace know that KVM_GET_CLOCK is useful with master clock;
     4.9 changed the return value to better match the guest clock, but
     didn't provide means to let guests take advantage of it"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm: x86: merge kvm_arch_set_irq and kvm_arch_set_irq_inatomic
  KVM: x86: fix missed SRCU usage in kvm_lapic_set_vapic_addr
  KVM: async_pf: avoid recursive flushing of work items
  kvm: kvmclock: let KVM_GET_CLOCK return whether the master clock is in use
  KVM: Disable irq while unregistering user notifier
  KVM: x86: do not go through vcpu in __get_kvmclock_ns
  KVM: arm64: Fix the issues when guest PMCCFILTR is configured
  arm64: KVM: pmu: Fix AArch32 cycle counter access
2016-11-19 13:31:40 -08:00
Paolo Bonzini
e3fd9a93a1 kvm: kvmclock: let KVM_GET_CLOCK return whether the master clock is in use
Userspace can read the exact value of kvmclock by reading the TSC
and fetching the timekeeping parameters out of guest memory.  This
however is brittle and not necessary anymore with KVM 4.11.  Provide
a mechanism that lets userspace know if the new KVM_GET_CLOCK
semantics are in effect, and---since we are at it---if the clock
is stable across all VCPUs.

Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-11-19 19:04:16 +01:00
David Woodhouse
9101704429 iommu/vt-d: Fix PASID table allocation
Somehow I ended up with an off-by-three error in calculating the size of
the PASID and PASID State tables, which triggers allocations failures as
those tables unfortunately have to be physically contiguous.

In fact, even the *correct* maximum size of 8MiB is problematic and is
wont to lead to allocation failures. Since I have extracted a promise
that this *will* be fixed in hardware, I'm happy to limit it on the
current hardware to a maximum of 0x20000 PASIDs, which gives us 1MiB
tables — still not ideal, but better than before.

Reported by Mika Kuoppala <mika.kuoppala@linux.intel.com> and also by
Xunlei Pang <xlpang@redhat.com> who submitted a simpler patch to fix
only the allocation (and not the free) to the "correct" limit... which
was still problematic.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Cc: stable@vger.kernel.org
2016-11-19 09:42:35 -08:00
Jarno Rajahalme
9403cd7cbb virtio_net: Do not clear memory for struct virtio_net_hdr twice.
virtio_net_hdr_from_skb() clears the memory for the header, so there
is no point for the callers to do the same.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-19 10:37:03 -05:00
Jarno Rajahalme
d66016a777 virtio_net.h: Fix comment.
Fix incorrent comment after the final #endif.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-19 10:37:03 -05:00
Marc Gonzalez
3371d663bb mtd: nand: Support controllers with custom page
If your controller already sends the required NAND commands when
reading or writing a page, then the framework is not supposed to
send READ0 and SEQIN/PAGEPROG respectively.

Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
2016-11-19 09:43:07 +01:00
Olof Johansson
b029ffe00c Merge tag 'tegra-for-4.10-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into next/drivers
soc: tegra: Core SoC changes for v4.10-rc1

This contains mostly cleanup and new feature work on the power
management controller as well as the addition of a Kconfig symbol for
the new Tegra186 (Parker) SoC generation.

* tag 'tegra-for-4.10-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux:
  soc/tegra: pmc: Use consistent naming for PM domains
  soc/tegra: pmc: Remove genpd when adding provider fails
  soc/tegra: pmc: Check return code for pm_genpd_init()
  soc/tegra: pmc: Clean-up I/O rail error messages
  soc/tegra: pmc: Simplify IO rail bit handling
  soc/tegra: pmc: Guard against uninitialised PMC clock
  soc/tegra: pmc: Add I/O pad voltage support
  soc/tegra: pmc: Use consistent ordering of bit definitions
  soc/tegra: pmc: Correct type of variable for tegra_pmc_readl()
  soc/tegra: pmc: Use BIT macro for register field definition

Signed-off-by: Olof Johansson <olof@lixom.net>
2016-11-18 18:42:33 -08:00
Olof Johansson
e40719dd01 Merge tag 'tegra-for-4.10-firmware' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into next/drivers
firmware: Add Tegra IVC and BPMP support

IVC is an inter-processor communication protocol that uses shared memory
to exchange data between processors. The BPMP driver makes use of this
to communicate with the Boot and Power Management Processor (BPMP) and
uses an additional hardware synchronization primitive from the HSP block
to signal availability of new data (doorbell).

Firmware running on the BPMP implements a number of services such as the
control of clocks and resets within the system, or the ability to ungate
or gate power partitions.

* tag 'tegra-for-4.10-firmware' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux:
  dt-bindings: firmware: Allow child nodes inside the Tegra BPMP
  dt-bindings: Add power domains to Tegra BPMP firmware
  firmware: tegra: Add BPMP support
  firmware: tegra: Add IVC library
  dt-bindings: firmware: Add bindings for Tegra BPMP

Signed-off-by: Olof Johansson <olof@lixom.net>
2016-11-18 18:28:14 -08:00
Olof Johansson
dd3eedd338 Merge tag 'tegra-for-4.10-mailbox' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into next/drivers
mailbox: Add Tegra HSP driver

This contains the device tree bindings and a driver for the Tegra HSP, a
hardware block that provides hardware synchronization primitives and is
the foundation for inter-processor communication between CPU and BPMP.

* tag 'tegra-for-4.10-mailbox' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux:
  mailbox: tegra-hsp: Use after free in tegra_hsp_remove_doorbells()
  mailbox: Add Tegra HSP driver
  dt-bindings: mailbox: Add Tegra HSP binding
  soc/tegra: Add Tegra186 support

Signed-off-by: Olof Johansson <olof@lixom.net>
2016-11-18 18:23:06 -08:00
Linus Torvalds
20afa6e2f9 Merge tag 'acpi-4.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
 "They fix an ACPI thermal management regression introduced by a recent
  FADT handling cleanup, an ACPI tools build issue introduced by a
  recent ACPICA commit and a PCC mailbox initialization bug causing
  lockdep to complain loudly.

  Specifics:

   - Revert a recent ACPICA cleanup that attempted to get rid of all
     FADT version 2 legacy, but broke ACPI thermal management on at
     least one system (Rafael Wysocki).

   - Fix cross-compiled builds of ACPI tools that stopped working after
     a recent cleanup related to the handling of header files in ACPICA
     (Lv Zheng).

   - Fix a locking issue in the PCC channel initialization code that
     invokes devm_request_irq() under a spinlock (among other things)
     and causes lockdep to complain (Hoan Tran)"

* tag 'acpi-4.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  tools/power/acpi: Remove direct kernel source include reference
  mailbox: PCC: Fix lockdep warning when request PCC channel
  Revert "ACPICA: FADT support cleanup"
2016-11-18 17:21:58 -08:00
Olof Johansson
a90a6f9c3d Merge tag 'omap-for-v4.10/dt-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into next/dt
Device tree changes for omaps for v4.10 merge window:

- A series of patches to configure tps65217 PMIC interrupts for
  power button, charger and usb and use them on am335x

- Configure EEPROM, LEDs and USR1 button for omap5 boards

- Add tscadc DMA properites for am33xx and am4372

- Configure baltos-ir5221 both musb channels to host mode

- Configure internal and external RTC clocks for am335x boards

- Don't reset gpio3 block on baltos

- Remove pinmux for dra72-evm for erratum i869, fix the regulators
  and seprate out tps65917 support

- Add dra718-evm support

- Add minimal droid 4 xt894 support

* tag 'omap-for-v4.10/dt-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: (22 commits)
  ARM: dts: Add minimal support for motorola droid 4 xt894
  ARM: dts: Add support for dra718-evm
  ARM: dts: dra72: Add separate dtsi for tps65917
  ARM: dts: dra72-evm: Fix modelling of regulators
  ARM: dts: dra72-evm: Remove pinmux configurations for erratum i869
  ARM: dts: am335x-baltos: don't reset gpio3 block
  ARM: dts: AM335X-evmsk: Add the internal and external clock nodes for rtc
  ARM: dts: AM335X-evm: Add the internal and external clock nodes for rtc
  ARM: dts: AM335X-bone-common: Add the internal and external clock nodes for rtc
  ARM: dts: am335x-baltos-ir5221: use both musb channels in host mode
  ARM: dts: am4372: add DMA properties for tscadc
  ARM: dts: am33xx: add DMA properties for tscadc
  ARM: dts: omap5 uevm: add USR1 button
  ARM: dts: omap5 uevm: add LEDs
  ARM: dts: omap5 uevm: add EEPROM
  ARM: dts: am335x: Add the power button interrupt
  ARM: dts: am335x: Add the charger interrupt
  dt-bindings: mfd: Provide human readable defines for TPS65217 interrupts
  ARM: dts: am335x: Support the PMIC interrupt
  ARM: dts: tps65217: Add the power button device
  ...

Signed-off-by: Olof Johansson <olof@lixom.net>
2016-11-18 16:43:19 -08:00
Olof Johansson
9aa29e9e5e Merge tag 'omap-for-v4.10/fixes-not-urgent-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into next/fixes-non-critical
Non-urgent fixes for omaps for v4.10 merge window:

- Fix mismatched interrupt numbers for tps65217, these are not yet
  used

- Remove unused omapdss_early_init_of()

- Use seq_putc() for pm-debug.c

* tag 'omap-for-v4.10/fixes-not-urgent-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: OMAP2+: pm-debug: Use seq_putc() in two functions
  ARM: OMAP2+: Remove the omapdss_early_init_of() function
  mfd: tps65217: Fix mismatched interrupt number

Signed-off-by: Olof Johansson <olof@lixom.net>
2016-11-18 16:41:41 -08:00
Linus Torvalds
aad931a30f Merge tag 'nfsd-4.9-2' of git://linux-nfs.org/~bfields/linux
Pull nfsd bugfix from Bruce Fields:
 "Just one fix for an NFS/RDMA crash"

* tag 'nfsd-4.9-2' of git://linux-nfs.org/~bfields/linux:
  sunrpc: svc_age_temp_xprts_now should not call setsockopt non-tcp transports
2016-11-18 16:32:21 -08:00