Commit Graph

55673 Commits

Author SHA1 Message Date
Thomas Gleixner
d59f6617ee genirq: Allow fwnode to carry name information only
In order to provide proper debug interface it's required to have domain
names available when the domain is added. Non fwnode based architectures
like x86 have no way to do so.

It's not possible to use domain ops or host data for this as domain ops
might be the same for several instances, but the names have to be unique.

Extend the irqchip fwnode to allow transporting the domain name. If no node
is supplied, create a 'unknown-N' placeholder.

Warn if an invalid node is supplied and treat it like no node. This happens
e.g. with i2 devices on x86 which hand in an ACPI type node which has no
interface for retrieving the name.

[ Folded a fix from Marc to make DT name parsing work ]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235443.588784933@linutronix.de
2017-06-22 18:21:08 +02:00
Bartosz Golaszewski
30fd8fc5c9 irq/generic-chip: Provide devm_irq_setup_generic_chip()
Provide a resource managed variant of irq_setup_generic_chip().

Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-doc@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Link: http://lkml.kernel.org/r/1496246820-13250-6-git-send-email-brgl@bgdev.pl
2017-06-21 15:53:11 +02:00
Bartosz Golaszewski
1c3e36309f irq/generic-chip: Provide devm_irq_alloc_generic_chip()
Provide a resource managed variant of irq_alloc_generic_chip().

Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-doc@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Link: http://lkml.kernel.org/r/1496246820-13250-5-git-send-email-brgl@bgdev.pl
2017-06-21 15:53:11 +02:00
Bartosz Golaszewski
32bb6cbb3b irq/generic-chip: Provide irq_destroy_generic_chip()
Most users of irq_alloc_generic_chip() call irq_setup_generic_chip()
too. To simplify the cleanup provide a function that both removes a
generic chip and frees its memory.

Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-doc@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Link: http://lkml.kernel.org/r/1496246820-13250-3-git-send-email-brgl@bgdev.pl
2017-06-21 15:53:10 +02:00
Bartosz Golaszewski
707188f5f2 irq/generic-chip: Provide irq_free_generic_chip()
Currently there's no way for users of irq_alloc_generic_chip() to free
the allocated memory other than calling kfree() manually on the
returned pointer. This may lead to errors if the internals of
irq_alloc_generic_chip() ever change. Provide a routine to free the
generic chip.

Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-doc@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Link: http://lkml.kernel.org/r/1496246820-13250-2-git-send-email-brgl@bgdev.pl
2017-06-21 15:53:10 +02:00
Thomas Gleixner
b50fb7c992 Merge branch 'linus' into irq/core
Get upstream changes so pending patches won't conflict.
2017-06-20 22:08:32 +02:00
Hugh Dickins
1be7107fbe mm: larger stack guard gap, between vmas
Stack guard page is a useful feature to reduce a risk of stack smashing
into a different mapping. We have been using a single page gap which
is sufficient to prevent having stack adjacent to a different mapping.
But this seems to be insufficient in the light of the stack usage in
userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX]
which is 256kB or stack strings with MAX_ARG_STRLEN.

This will become especially dangerous for suid binaries and the default
no limit for the stack size limit because those applications can be
tricked to consume a large portion of the stack and a single glibc call
could jump over the guard page. These attacks are not theoretical,
unfortunatelly.

Make those attacks less probable by increasing the stack guard gap
to 1MB (on systems with 4k pages; but make it depend on the page size
because systems with larger base pages might cap stack allocations in
the PAGE_SIZE units) which should cover larger alloca() and VLA stack
allocations. It is obviously not a full fix because the problem is
somehow inherent, but it should reduce attack space a lot.

One could argue that the gap size should be configurable from userspace,
but that can be done later when somebody finds that the new 1MB is wrong
for some special case applications.  For now, add a kernel command line
option (stack_guard_gap) to specify the stack gap size (in page units).

Implementation wise, first delete all the old code for stack guard page:
because although we could get away with accounting one extra page in a
stack vma, accounting a larger gap can break userspace - case in point,
a program run with "ulimit -S -v 20000" failed when the 1MB gap was
counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
and strict non-overcommit mode.

Instead of keeping gap inside the stack vma, maintain the stack guard
gap as a gap between vmas: using vm_start_gap() in place of vm_start
(or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
places which need to respect the gap - mainly arch_get_unmapped_area(),
and and the vma tree's subtree_gap support for that.

Original-patch-by: Oleg Nesterov <oleg@redhat.com>
Original-patch-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-19 21:50:20 +08:00
Linus Torvalds
ab2789b72d Merge tag 'configfs-for-4.12' of git://git.infradead.org/users/hch/configfs
Pull configfs updates from Christoph Hellwig:
 "A fix from Nic for a race seen in production (including a stable tag).

  And while I'm sending you this I'm also sneaking in a trivial new
  helper from Bart so that we don't need inter-tree dependencies for the
  next merge window"

* tag 'configfs-for-4.12' of git://git.infradead.org/users/hch/configfs:
  configfs: Introduce config_item_get_unless_zero()
  configfs: Fix race between create_link and configfs_rmdir
2017-06-16 18:45:47 +09:00
Linus Torvalds
e78e4626d4 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block layer fix from Jens Axboe:
 "Just a single fix this week, fixing a regression introduced in this
  release.

  When we put the final reference to the queue, we may need to block.
  Ensure that we can safely do so. From Bart"

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: Fix a blk_exit_rl() regression
2017-06-16 17:26:10 +09:00
Linus Torvalds
cbfb749737 Merge branch 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
Pull dmi fixes from Jean Delvare.

* 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  firmware: dmi_scan: Check DMI structure length
  firmware: dmi: Fix permissions of product_family
  firmware: dmi_scan: Make dmi_walk and dmi_walk_early return real error codes
  firmware: dmi_scan: Look for SMBIOS 3 entry point first
2017-06-16 17:13:06 +09:00
Andy Lutomirski
c926820085 firmware: dmi_scan: Make dmi_walk and dmi_walk_early return real error codes
Currently they return -1 on error, which will confuse callers if
they try to interpret it as a normal negative error code.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
2017-06-15 13:46:00 +02:00
Linus Torvalds
a090bd4ff8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) The netlink attribute passed in to dev_set_alias() is not
    necessarily NULL terminated, don't use strlcpy() on it. From
    Alexander Potapenko.

 2) Fix implementation of atomics in arm64 bpf JIT, from Daniel
    Borkmann.

 3) Correct the release of netdevs and driver private data in certain
    circumstances.

 4) Sanitize netlink message length properly in decnet, from Mateusz
    Jurczyk.

 5) Don't leak kernel data in rtnl_fill_vfinfo() netlink blobs. From
    Yuval Mintz.

 6) Hash secret is never initialized in ipv6 ILA translation code, from
    Arnd Bergmann. I guess those clang warnings about unused inline
    functions are useful for something!

 7) Fix endian selection in bpf_endian.h, from Daniel Borkmann.

 8) Sanitize sockaddr length before dereferncing any fields in AF_UNIX
    and CAIF. From Mateusz Jurczyk.

 9) Fix timestamping for GMAC3 chips in stmmac driver, from Mario
    Molitor.

10) Do not leak netdev on dev_alloc_name() errors in mac80211, from
    Johannes Berg.

11) Fix locking in sctp_for_each_endpoint(), from Xin Long.

12) Fix wrong memset size on 32-bit in snmp6, from Christian Perle.

13) Fix use after free in ip_mc_clear_src(), from WANG Cong.

14) Fix regressions caused by ICMP rate limiting changes in 4.11, from
    Jesper Dangaard Brouer.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (91 commits)
  i40e: Fix a sleep-in-atomic bug
  net: don't global ICMP rate limit packets originating from loopback
  net/act_pedit: fix an error code
  net: update undefined ->ndo_change_mtu() comment
  net_sched: move tcf_lock down after gen_replace_estimator()
  caif: Add sockaddr length check before accessing sa_family in connect handler
  qed: fix dump of context data
  qmi_wwan: new Telewell and Sierra device IDs
  net: phy: Fix MDIO_THUNDER dependencies
  netconsole: Remove duplicate "netconsole: " logging prefix
  igmp: acquire pmc lock for ip_mc_clear_src()
  r8152: give the device version
  net: rps: fix uninitialized symbol warning
  mac80211: don't send SMPS action frame in AP mode when not needed
  mac80211/wpa: use constant time memory comparison for MACs
  mac80211: set bss_info data before configuring the channel
  mac80211: remove 5/10 MHz rate code from station MLME
  mac80211: Fix incorrect condition when checking rx timestamp
  mac80211: don't look at the PM bit of BAR frames
  i40e: fix handling of HW ATR eviction
  ...
2017-06-15 18:09:47 +09:00
Bart Van Assche
dc9edc44de block: Fix a blk_exit_rl() regression
Avoid that the following complaint is reported:

 BUG: sleeping function called from invalid context at kernel/workqueue.c:2790
 in_atomic(): 1, irqs_disabled(): 0, pid: 41, name: rcuop/3
 1 lock held by rcuop/3/41:
  #0:  (rcu_callback){......}, at: [<ffffffff8111f9a2>] rcu_nocb_kthread+0x282/0x500
 Call Trace:
  dump_stack+0x86/0xcf
  ___might_sleep+0x174/0x260
  __might_sleep+0x4a/0x80
  flush_work+0x7e/0x2e0
  __cancel_work_timer+0x143/0x1c0
  cancel_work_sync+0x10/0x20
  blk_throtl_exit+0x25/0x60
  blkcg_exit_queue+0x35/0x40
  blk_release_queue+0x42/0x130
  kobject_put+0xa9/0x190

This happens since we invoke callbacks that need to block from the
queue release handler. Fix this by pushing the final release to
a workqueue.

Reported-by: Ross Zwisler <zwisler@gmail.com>
Fixes: commit b425e50492 ("block: Avoid that blk_exit_rl() triggers a use-after-free")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com>

Updated changelog
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-06-14 13:27:50 -06:00
Magnus Damm
db46a0e1be net: update undefined ->ndo_change_mtu() comment
Update ->ndo_change_mtu() callback comment to remove text
about returning error in case of undefined callback. This
change makes the comment match the existing code behavior.

Signed-off-by: Magnus Damm <damm+renesas@opensource.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-14 15:14:51 -04:00
Bart Van Assche
19e72d3abb configfs: Introduce config_item_get_unless_zero()
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
[hch: minor style tweak]
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-06-12 13:20:20 +02:00
Linus Torvalds
32627645e9 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull key subsystem fixes from James Morris:
 "Here are a bunch of fixes for Linux keyrings, including:

   - Fix up the refcount handling now that key structs use the
     refcount_t type and the refcount_t ops don't allow a 0->1
     transition.

   - Fix a potential NULL deref after error in x509_cert_parse().

   - Don't put data for the crypto algorithms to use on the stack.

   - Fix the handling of a null payload being passed to add_key().

   - Fix incorrect cleanup an uninitialised key_preparsed_payload in
     key_update().

   - Explicit sanitisation of potentially secure data before freeing.

   - Fixes for the Diffie-Helman code"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (23 commits)
  KEYS: fix refcount_inc() on zero
  KEYS: Convert KEYCTL_DH_COMPUTE to use the crypto KPP API
  crypto : asymmetric_keys : verify_pefile:zero memory content before freeing
  KEYS: DH: add __user annotations to keyctl_kdf_params
  KEYS: DH: ensure the KDF counter is properly aligned
  KEYS: DH: don't feed uninitialized "otherinfo" into KDF
  KEYS: DH: forbid using digest_null as the KDF hash
  KEYS: sanitize key structs before freeing
  KEYS: trusted: sanitize all key material
  KEYS: encrypted: sanitize all key material
  KEYS: user_defined: sanitize key payloads
  KEYS: sanitize add_key() and keyctl() key payloads
  KEYS: fix freeing uninitialized memory in key_update()
  KEYS: fix dereferencing NULL payload with nonzero length
  KEYS: encrypted: use constant-time HMAC comparison
  KEYS: encrypted: fix race causing incorrect HMAC calculations
  KEYS: encrypted: fix buffer overread in valid_master_desc()
  KEYS: encrypted: avoid encrypting/decrypting stack buffers
  KEYS: put keyring if install_session_keyring_to_cred() fails
  KEYS: Delete an error message for a failed memory allocation in get_derived_key()
  ...
2017-06-11 16:17:29 -07:00
Linus Torvalds
6d53cefb18 compiler, clang: properly override 'inline' for clang
Commit abb2ea7dfd ("compiler, clang: suppress warning for unused
static inline functions") just caused more warnings due to re-defining
the 'inline' macro.

So undef it before re-defining it, and also add the 'notrace' attribute
like the gcc version that this is overriding does.

Maybe this makes clang happier.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-11 15:51:56 -07:00
Linus Torvalds
5e38b72ac1 Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
 "Fix various bug fixes in ext4 caused by races and memory allocation
  failures"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix fdatasync(2) after extent manipulation operations
  ext4: fix data corruption for mmap writes
  ext4: fix data corruption with EXT4_GET_BLOCKS_ZERO
  ext4: fix quota charging for shared xattr blocks
  ext4: remove redundant check for encrypted file on dio write path
  ext4: remove unused d_name argument from ext4_search_dir() et al.
  ext4: fix off-by-one error when writing back pages before dio read
  ext4: fix off-by-one on max nr_pages in ext4_find_unwritten_pgoff()
  ext4: keep existing extra fields when inode expands
  ext4: handle the rest of ext4_mb_load_buddy() ENOMEM errors
  ext4: fix off-by-in in loop termination in ext4_find_unwritten_pgoff()
  ext4: fix SEEK_HOLE
  jbd2: preserve original nofs flag during journal restart
  ext4: clear lockdep subtype for quota files on quota off
2017-06-11 11:57:47 -07:00
Linus Torvalds
9d0eb46246 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
 "Bug fixes (ARM, s390, x86)"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: async_pf: avoid async pf injection when in guest mode
  KVM: cpuid: Fix read/write out-of-bounds vulnerability in cpuid emulation
  arm: KVM: Allow unaligned accesses at HYP
  arm64: KVM: Allow unaligned accesses at EL2
  arm64: KVM: Preserve RES1 bits in SCTLR_EL2
  KVM: arm/arm64: Handle possible NULL stage2 pud when ageing pages
  KVM: nVMX: Fix exception injection
  kvm: async_pf: fix rcu_irq_enter() with irqs enabled
  KVM: arm/arm64: vgic-v3: Fix nr_pre_bits bitfield extraction
  KVM: s390: fix ais handling vs cpu model
  KVM: arm/arm64: Fix isues with GICv2 on GICv3 migration
2017-06-11 11:07:25 -07:00
Linus Torvalds
6b7ed4588c Merge branch 'rcu-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU fixes from Ingo Molnar:
 "Fix an SRCU bug affecting KVM IRQ injection"

* 'rcu-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  srcu: Allow use of Classic SRCU from both process and interrupt context
  srcu: Allow use of Tiny/Tree SRCU from both process and interrupt context
2017-06-10 10:22:35 -07:00
Linus Torvalds
179145e631 Merge tag 'iommu-fixes-v4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:

 - another compile-fix for my header cleanup

 - a couple of fixes for the recently merged IOMMU probe deferal code

 - fixes for ACPI/IORT code necessary with IOMMU probe deferal

* tag 'iommu-fixes-v4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  arm: dma-mapping: Reset the device's dma_ops
  ACPI/IORT: Move the check to get iommu_ops from translated fwspec
  ARM: dma-mapping: Don't tear down third-party mappings
  ACPI/IORT: Ignore all errors except EPROBE_DEFER
  iommu/of: Ignore all errors except EPROBE_DEFER
  iommu/of: Fix check for returning EPROBE_DEFER
  iommu/dma: Fix function declaration
2017-06-09 22:30:55 -07:00
Linus Torvalds
42211f6cb6 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "A set of fixes in the area of block IO, that should go into the next
  -rc release. This contains:

   - An OOPS fix from Dmitry, fixing a regression with the bio integrity
     code in this series.

   - Fix truncation of elevator io context cache name, from Eric
     Biggers.

   - NVMe pull from Christoph includes FC fixes from James, APST
     fixes/tweaks from Kai-Heng, removal fix from Rakesh, and an RDMA
     fix from Sagi.

   - Two tweaks for the block throttling code. One from Joseph Qi,
     fixing an oops from the timer code, and one from Shaohua, improving
     the behavior on rotatonal storage.

   - Two blk-mq fixes from Ming, fixing corner cases with the direct
     issue code.

   - Locking fix for bfq cgroups from Paolo"

* 'for-linus' of git://git.kernel.dk/linux-block:
  block, bfq: access and cache blkg data only when safe
  Fix loop device flush before configure v3
  blk-throttle: set default latency baseline for harddisk
  blk-throttle: fix NULL pointer dereference in throtl_schedule_pending_timer
  nvme: relax APST default max latency to 100ms
  nvme: only consider exit latency when choosing useful non-op power states
  nvme-fc: fix missing put reference on controller create failure
  nvme-fc: on lldd/transport io error, terminate association
  nvme-rdma: fast fail incoming requests while we reconnect
  nvme-pci: fix multiple ctrl removal scheduling
  nvme: fix hang in remove path
  elevator: fix truncation of icq_cache_name
  blk-mq: fix direct issue
  blk-mq: pass correct hctx to blk_mq_try_issue_directly
  bio-integrity: Do not allocate integrity context for bio w/o data
2017-06-09 22:18:41 -07:00
Ingo Molnar
8affb06737 Merge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into rcu/urgent
Pull RCU fix from Paul E. McKenney:

" This series enables srcu_read_lock() and srcu_read_unlock() to be used from
  interrupt handlers, which fixes a bug in KVM's use of SRCU in delivery
  of interrupts to guest OSes. "

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-09 08:17:10 +02:00
Eric Biggers
0620fddb56 KEYS: sanitize key structs before freeing
While a 'struct key' itself normally does not contain sensitive
information, Documentation/security/keys.txt actually encourages this:

     "Having a payload is not required; and the payload can, in fact,
     just be a value stored in the struct key itself."

In case someone has taken this advice, or will take this advice in the
future, zero the key structure before freeing it.  We might as well, and
as a bonus this could make it a bit more difficult for an adversary to
determine which keys have recently been in use.

This is safe because the key_jar cache does not use a constructor.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-06-09 13:29:48 +10:00
Linus Torvalds
0d22df90c7 Merge tag 'pm-4.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
 "These revert one problematic commit related to system sleep and fix
  one recent intel_pstate regression.

  Specifics:

   - Revert a recent commit that attempted to avoid spurious wakeups
     from suspend-to-idle via ACPI SCI, but introduced regressions on
     some systems (Rafael Wysocki).

     We will get back to the problem it tried to address in the next
     cycle.

   - Fix a possible division by 0 during intel_pstate initialization
     due to a missing check (Rafael Wysocki)"

* tag 'pm-4.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  Revert "ACPI / sleep: Ignore spurious SCI wakeups from suspend-to-idle"
  cpufreq: intel_pstate: Avoid division by 0 in min_perf_pct_min()
2017-06-08 17:40:32 -07:00
Rafael J. Wysocki
fbd78afe34 Merge branches 'intel_pstate' and 'pm-sleep'
* intel_pstate:
  cpufreq: intel_pstate: Avoid division by 0 in min_perf_pct_min()

* pm-sleep:
  Revert "ACPI / sleep: Ignore spurious SCI wakeups from suspend-to-idle"
2017-06-09 01:25:16 +02:00
Paolo Bonzini
1123a60416 srcu: Allow use of Classic SRCU from both process and interrupt context
Linu Cherian reported a WARN in cleanup_srcu_struct() when shutting
down a guest running iperf on a VFIO assigned device.  This happens
because irqfd_wakeup() calls srcu_read_lock(&kvm->irq_srcu) in interrupt
context, while a worker thread does the same inside kvm_set_irq().  If the
interrupt happens while the worker thread is executing __srcu_read_lock(),
updates to the Classic SRCU ->lock_count[] field or the Tree SRCU
->srcu_lock_count[] field can be lost.

The docs say you are not supposed to call srcu_read_lock() and
srcu_read_unlock() from irq context, but KVM interrupt injection happens
from (host) interrupt context and it would be nice if SRCU supported the
use case.  KVM is using SRCU here not really for the "sleepable" part,
but rather due to its IPI-free fast detection of grace periods.  It is
therefore not desirable to switch back to RCU, which would effectively
revert commit 719d93cd5f ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING",
2014-01-16).

However, the docs are overly conservative.  You can have an SRCU instance
only has users in irq context, and you can mix process and irq context
as long as process context users disable interrupts.  In addition,
__srcu_read_unlock() actually uses this_cpu_dec() on both Tree SRCU and
Classic SRCU.  For those two implementations, only srcu_read_lock()
is unsafe.

When Classic SRCU's __srcu_read_unlock() was changed to use this_cpu_dec(),
in commit 5a41344a3d ("srcu: Simplify __srcu_read_unlock() via
this_cpu_dec()", 2012-11-29), __srcu_read_lock() did two increments.
Therefore it kept __this_cpu_inc(), with preempt_disable/enable in
the caller.  Tree SRCU however only does one increment, so on most
architectures it is more efficient for __srcu_read_lock() to use
this_cpu_inc(), and any performance differences appear to be down in
the noise.

Cc: stable@vger.kernel.org
Fixes: 719d93cd5f ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING")
Reported-by: Linu Cherian <linuc.decode@gmail.com>
Suggested-by: Linu Cherian <linuc.decode@gmail.com>
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08 08:25:19 -07:00
David Ahern
8397ed36b7 net: ipv6: Release route when device is unregistering
Roopa reported attempts to delete a bond device that is referenced in a
multipath route is hanging:

$ ifdown bond2    # ifupdown2 command that deletes virtual devices
unregister_netdevice: waiting for bond2 to become free. Usage count = 2

Steps to reproduce:
    echo 1 > /proc/sys/net/ipv6/conf/all/ignore_routes_with_linkdown
    ip link add dev bond12 type bond
    ip link add dev bond13 type bond
    ip addr add 2001:db8:2::0/64 dev bond12
    ip addr add 2001:db8:3::0/64 dev bond13
    ip route add 2001:db8:33::0/64 nexthop via 2001:db8:2::2 nexthop via 2001:db8:3::2
    ip link del dev bond12
    ip link del dev bond13

The root cause is the recent change to keep routes on a linkdown. Update
the check to detect when the device is unregistering and release the
route for that case.

Fixes: a1a22c1206 ("net: ipv6: Keep nexthop of multipath route on admin down")
Reported-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08 11:12:39 -04:00
Paolo Bonzini
38a4f43d56 Merge tag 'kvm-arm-for-v4.12-rc5-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/ARM Fixes for v4.12-rc5 - Take 2

Changes include:
 - Fix an issue with migrating GICv2 VMs on GICv3 systems.
 - Squashed a bug for gicv3 when figuring out preemption levels.
 - Fix a potential null pointer derefence in KVM happening under memory
   pressure.
 - Maintain RES1 bits in the SCTLR_EL2 to make sure KVM works on new
   architecture revisions.
 - Allow unaligned accesses at EL2/HYP
2017-06-08 15:04:38 +02:00
David S. Miller
cf124db566 net: Fix inconsistent teardown and release of private netdev state.
Network devices can allocate reasources and private memory using
netdev_ops->ndo_init().  However, the release of these resources
can occur in one of two different places.

Either netdev_ops->ndo_uninit() or netdev->destructor().

The decision of which operation frees the resources depends upon
whether it is necessary for all netdev refs to be released before it
is safe to perform the freeing.

netdev_ops->ndo_uninit() presumably can occur right after the
NETDEV_UNREGISTER notifier completes and the unicast and multicast
address lists are flushed.

netdev->destructor(), on the other hand, does not run until the
netdev references all go away.

Further complicating the situation is that netdev->destructor()
almost universally does also a free_netdev().

This creates a problem for the logic in register_netdevice().
Because all callers of register_netdevice() manage the freeing
of the netdev, and invoke free_netdev(dev) if register_netdevice()
fails.

If netdev_ops->ndo_init() succeeds, but something else fails inside
of register_netdevice(), it does call ndo_ops->ndo_uninit().  But
it is not able to invoke netdev->destructor().

This is because netdev->destructor() will do a free_netdev() and
then the caller of register_netdevice() will do the same.

However, this means that the resources that would normally be released
by netdev->destructor() will not be.

Over the years drivers have added local hacks to deal with this, by
invoking their destructor parts by hand when register_netdevice()
fails.

Many drivers do not try to deal with this, and instead we have leaks.

Let's close this hole by formalizing the distinction between what
private things need to be freed up by netdev->destructor() and whether
the driver needs unregister_netdevice() to perform the free_netdev().

netdev->priv_destructor() performs all actions to free up the private
resources that used to be freed by netdev->destructor(), except for
free_netdev().

netdev->needs_free_netdev is a boolean that indicates whether
free_netdev() should be done at the end of unregister_netdevice().

Now, register_netdevice() can sanely release all resources after
ndo_ops->ndo_init() succeeds, by invoking both ndo_ops->ndo_uninit()
and netdev->priv_destructor().

And at the end of unregister_netdevice(), we invoke
netdev->priv_destructor() and optionally call free_netdev().

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-07 15:53:24 -04:00
Rafael J. Wysocki
f3b7eaae1b Revert "ACPI / sleep: Ignore spurious SCI wakeups from suspend-to-idle"
Revert commit eed4d47efe (ACPI / sleep: Ignore spurious SCI wakeups
from suspend-to-idle) as it turned out to be premature and triggered
a number of different issues on various systems.

That includes, but is not limited to, premature suspend-to-RAM aborts
on Dell XPS 13 (9343) reported by Dominik.

The issue the commit in question attempted to address is real and
will need to be taken care of going forward, but evidently more work
is needed for this purpose.

Reported-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-07 00:57:37 +02:00
Linus Torvalds
b29794ec95 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Made TCP congestion control documentation match current reality,
    from Anmol Sarma.

 2) Various build warning and failure fixes from Arnd Bergmann.

 3) Fix SKB list leak in ipv6_gso_segment().

 4) Use after free in ravb driver, from Eugeniu Rosca.

 5) Don't use udp_poll() in ping protocol driver, from Eric Dumazet.

 6) Don't crash in PCI error recovery of cxgb4 driver, from Guilherme
    Piccoli.

 7) _SRC_NAT_DONE_BIT needs to be cleared using atomics, from Liping
    Zhang.

 8) Use after free in vxlan deletion, from Mark Bloch.

 9) Fix ordering of NAPI poll enabled in ethoc driver, from Max
    Filippov.

10) Fix stmmac hangs with TSO, from Niklas Cassel.

11) Fix crash in CALIPSO ipv6, from Richard Haines.

12) Clear nh_flags properly on mpls link up. From Roopa Prabhu.

13) Fix regression in sk_err socket error queue handling, noticed by
    ping applications. From Soheil Hassas Yeganeh.

14) Update mlx4/mlx5 MAINTAINERS information.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (78 commits)
  net: stmmac: fix a broken u32 less than zero check
  net: stmmac: fix completely hung TX when using TSO
  net: ethoc: enable NAPI before poll may be scheduled
  net: bridge: fix a null pointer dereference in br_afspec
  ravb: Fix use-after-free on `ifconfig eth0 down`
  net/ipv6: Fix CALIPSO causing GPF with datagram support
  net: stmmac: ensure jumbo_frm error return is correctly checked for -ve value
  Revert "sit: reload iphdr in ipip6_rcv"
  i40e/i40evf: proper update of the page_offset field
  i40e: Fix state flags for bit set and clean operations of PF
  iwlwifi: fix host command memory leaks
  iwlwifi: fix min API version for 7265D, 3168, 8000 and 8265
  iwlwifi: mvm: clear new beacon command template struct
  iwlwifi: mvm: don't fail when removing a key from an inexisting sta
  iwlwifi: pcie: only use d0i3 in suspend/resume if system_pm is set to d0i3
  iwlwifi: mvm: fix firmware debug restart recording
  iwlwifi: tt: move ucode_loaded check under mutex
  iwlwifi: mvm: support ibss in dqa mode
  iwlwifi: mvm: Fix command queue number on d0i3 flow
  iwlwifi: mvm: rs: start using LQ command color
  ...
2017-06-06 14:30:17 -07:00
David Rientjes
abb2ea7dfd compiler, clang: suppress warning for unused static inline functions
GCC explicitly does not warn for unused static inline functions for
-Wunused-function.  The manual states:

	Warn whenever a static function is declared but not defined or
	a non-inline static function is unused.

Clang does warn for static inline functions that are unused.

It turns out that suppressing the warnings avoids potentially complex
#ifdef directives, which also reduces LOC.

Suppress the warning for clang.

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-06 14:09:22 -07:00
Eric Biggers
9bd2bbc01d elevator: fix truncation of icq_cache_name
gcc 7.1 reports the following warning:

    block/elevator.c: In function ‘elv_register’:
    block/elevator.c:898:5: warning: ‘snprintf’ output may be truncated before the last format character [-Wformat-truncation=]
         "%s_io_cq", e->elevator_name);
         ^~~~~~~~~~
    block/elevator.c:897:3: note: ‘snprintf’ output between 7 and 22 bytes into a destination of size 21
       snprintf(e->icq_cache_name, sizeof(e->icq_cache_name),
       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         "%s_io_cq", e->elevator_name);
         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The bug is that the name of the icq_cache is 6 characters longer than
the elevator name, but only ELV_NAME_MAX + 5 characters were reserved
for it --- so in the case of a maximum-length elevator name, the 'q'
character in "_io_cq" would be truncated by snprintf().  Fix it by
reserving ELV_NAME_MAX + 6 characters instead.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-06-06 11:20:47 -06:00
Linus Torvalds
ba7b2387ad Merge branch 'for-4.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
 "Two cgroup fixes. One to address RCU delay of cpuset removal affecting
  userland visible behaviors. The other fixes a race condition between
  controller disable and cgroup removal"

* 'for-4.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cpuset: consider dying css as offline
  cgroup: Prevent kill_css() from being called more than once
2017-06-05 15:37:03 -07:00
Talat Batheesh
6dc06c08be net/mlx4: Fix the check in attaching steering rules
Our previous patch (cited below) introduced a regression
for RAW Eth QPs.

Fix it by checking if the QP number provided by user-space
exists, hence allowing steering rules to be added for valid
QPs only.

Fixes: 89c557687a ("net/mlx4_en: Avoid adding steering rules with invalid ring")
Reported-by: Or Gerlitz <gerlitz.or@gmail.com>
Signed-off-by: Talat Batheesh <talatb@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-04 23:10:05 -04:00
Linus Torvalds
55cbdaf639 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
 "For the most part this is just a minor -rc cycle for the rdma
  subsystem. Even given that this is all of the -rc patches since the
  merge window closed, it's still only about 25 patches:

   - Multiple i40iw, nes, iw_cxgb4, hfi1, qib, mlx4, mlx5 fixes

   - A few upper layer protocol fixes (IPoIB, iSER, SRP)

   - A modest number of core fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (26 commits)
  RDMA/SA: Fix kernel panic in CMA request handler flow
  RDMA/umem: Fix missing mmap_sem in get umem ODP call
  RDMA/core: not to set page dirty bit if it's already set.
  RDMA/uverbs: Declare local function static and add brackets to sizeof
  RDMA/netlink: Reduce exposure of RDMA netlink functions
  RDMA/srp: Fix NULL deref at srp_destroy_qp()
  RDMA/IPoIB: Limit the ipoib_dev_uninit_default scope
  RDMA/IPoIB: Replace netdev_priv with ipoib_priv for ipoib_get_link_ksettings
  RDMA/qedr: add null check before pointer dereference
  RDMA/mlx5: set UMR wqe fence according to HCA cap
  net/mlx5: Define interface bits for fencing UMR wqe
  RDMA/mlx4: Fix MAD tunneling when SRIOV is enabled
  RDMA/qib,hfi1: Fix MR reference count leak on write with immediate
  RDMA/hfi1: Defer setting VL15 credits to link-up interrupt
  RDMA/hfi1: change PCI bar addr assignments to Linux API functions
  RDMA/hfi1: fix array termination by appending NULL to attr array
  RDMA/iw_cxgb4: fix the calculation of ipv6 header size
  RDMA/iw_cxgb4: calculate t4_eq_status_entries properly
  RDMA/iw_cxgb4: Avoid touch after free error in ARP failure handlers
  RDMA/nes: ACK MPA Reply frame
  ...
2017-06-04 10:41:32 -07:00
Thomas Gleixner
201d7f47f3 genirq: Handle NOAUTOEN interrupt setup proper
If an interrupt is marked NOAUTOEN then request_irq() installs the action,
but does not enable the interrupt via startup_irq().  The interrupt is
enabled via enable_irq() later from the driver. enable_irq() calls
irq_enable().

That means that for interrupts which have a irq_startup() callback this
callback is never invoked. Neither is irq_domain_activate_irq() invoked for
such interrupts.

If an interrupt depends on irq_startup() or irq_domain_activate_irq() then
the enable via irq_enable() is not enough.

Add a status flag IRQD_IRQ_STARTED_UP and use this to select the proper
mechanism in enable_irq(). Use the flag also to avoid pointless calls into
the low level functions.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: dianders@chromium.org
Cc: jeffy <jeffy.chen@rock-chips.com>
Cc: Brian Norris <briannorris@chromium.org>
Cc: tfiga@chromium.org
Link: http://lkml.kernel.org/r/20170531100212.130986205@linutronix.de
2017-06-04 14:35:13 +02:00
Michal Hocko
864b9a393d mm: consider memblock reservations for deferred memory initialization sizing
We have seen an early OOM killer invocation on ppc64 systems with
crashkernel=4096M:

	kthreadd invoked oom-killer: gfp_mask=0x16040c0(GFP_KERNEL|__GFP_COMP|__GFP_NOTRACK), nodemask=7, order=0, oom_score_adj=0
	kthreadd cpuset=/ mems_allowed=7
	CPU: 0 PID: 2 Comm: kthreadd Not tainted 4.4.68-1.gd7fe927-default #1
	Call Trace:
	  dump_stack+0xb0/0xf0 (unreliable)
	  dump_header+0xb0/0x258
	  out_of_memory+0x5f0/0x640
	  __alloc_pages_nodemask+0xa8c/0xc80
	  kmem_getpages+0x84/0x1a0
	  fallback_alloc+0x2a4/0x320
	  kmem_cache_alloc_node+0xc0/0x2e0
	  copy_process.isra.25+0x260/0x1b30
	  _do_fork+0x94/0x470
	  kernel_thread+0x48/0x60
	  kthreadd+0x264/0x330
	  ret_from_kernel_thread+0x5c/0xa4

	Mem-Info:
	active_anon:0 inactive_anon:0 isolated_anon:0
	 active_file:0 inactive_file:0 isolated_file:0
	 unevictable:0 dirty:0 writeback:0 unstable:0
	 slab_reclaimable:5 slab_unreclaimable:73
	 mapped:0 shmem:0 pagetables:0 bounce:0
	 free:0 free_pcp:0 free_cma:0
	Node 7 DMA free:0kB min:0kB low:0kB high:0kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:52428800kB managed:110016kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:320kB slab_unreclaimable:4672kB kernel_stack:1152kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
	lowmem_reserve[]: 0 0 0 0
	Node 7 DMA: 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 0kB
	0 total pagecache pages
	0 pages in swap cache
	Swap cache stats: add 0, delete 0, find 0/0
	Free swap  = 0kB
	Total swap = 0kB
	819200 pages RAM
	0 pages HighMem/MovableOnly
	817481 pages reserved
	0 pages cma reserved
	0 pages hwpoisoned

the reason is that the managed memory is too low (only 110MB) while the
rest of the the 50GB is still waiting for the deferred intialization to
be done.  update_defer_init estimates the initial memoty to initialize
to 2GB at least but it doesn't consider any memory allocated in that
range.  In this particular case we've had

	Reserving 4096MB of memory at 128MB for crashkernel (System RAM: 51200MB)

so the low 2GB is mostly depleted.

Fix this by considering memblock allocations in the initial static
initialization estimation.  Move the max_initialise to
reset_deferred_meminit and implement a simple memblock_reserved_memory
helper which iterates all reserved blocks and sums the size of all that
start below the given address.  The cumulative size is than added on top
of the initial estimation.  This is still not ideal because
reset_deferred_meminit doesn't consider holes and so reservation might
be above the initial estimation whihch we ignore but let's make the
logic simpler until we really need to handle more complicated cases.

Fixes: 3a80a7fa79 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
Link: http://lkml.kernel.org/r/20170531104010.GI27783@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>	[4.2+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-02 15:07:38 -07:00
James Morse
9a291a7c94 mm/hugetlb: report -EHWPOISON not -EFAULT when FOLL_HWPOISON is specified
KVM uses get_user_pages() to resolve its stage2 faults.  KVM sets the
FOLL_HWPOISON flag causing faultin_page() to return -EHWPOISON when it
finds a VM_FAULT_HWPOISON.  KVM handles these hwpoison pages as a
special case.  (check_user_page_hwpoison())

When huge pages are involved, this doesn't work so well.
get_user_pages() calls follow_hugetlb_page(), which stops early if it
receives VM_FAULT_HWPOISON from hugetlb_fault(), eventually returning
-EFAULT to the caller.  The step to map this to -EHWPOISON based on the
FOLL_ flags is missing.  The hwpoison special case is skipped, and
-EFAULT is returned to user-space, causing Qemu or kvmtool to exit.

Instead, move this VM_FAULT_ to errno mapping code into a header file
and use it from faultin_page() and follow_hugetlb_page().

With this, KVM works as expected.

This isn't a problem for arm64 today as we haven't enabled
MEMORY_FAILURE, but I can't see any reason this doesn't happen on x86
too, so I think this should be a fix.  This doesn't apply earlier than
stable's v4.11.1 due to all sorts of cleanup.

[james.morse@arm.com: add vm_fault_to_errno() call to faultin_page()]
suggested.
  Link: http://lkml.kernel.org/r/20170525171035.16359-1-james.morse@arm.com
[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20170524160900.28786-1-james.morse@arm.com
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>	[4.11.1+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-02 15:07:38 -07:00
Matthias Kaehlcke
60b0a8c3d2 frv: declare jiffies to be located in the .data section
Commit 7c30f352c8 ("jiffies.h: declare jiffies and jiffies_64 with
____cacheline_aligned_in_smp") removed a section specification from the
jiffies declaration that caused conflicts on some platforms.

Unfortunately this change broke the build for frv:

  kernel/built-in.o: In function `__do_softirq': (.text+0x6460): relocation truncated to fit: R_FRV_GPREL12 against symbol
      `jiffies' defined in *ABS* section in .tmp_vmlinux1
  kernel/built-in.o: In function `__do_softirq': (.text+0x6574): relocation truncated to fit: R_FRV_GPREL12 against symbol
      `jiffies' defined in *ABS* section in .tmp_vmlinux1
  kernel/built-in.o: In function `pwq_activate_delayed_work': workqueue.c:(.text+0x15b9c): relocation truncated to fit: R_FRV_GPREL12 against
      symbol `jiffies' defined in *ABS* section in .tmp_vmlinux1
  ...

Add __jiffy_arch_data to the declaration of jiffies and use it on frv to
include the section specification.  For all other platforms
__jiffy_arch_data (currently) has no effect.

Fixes: 7c30f352c8 ("jiffies.h: declare jiffies and jiffies_64 with ____cacheline_aligned_in_smp")
Link: http://lkml.kernel.org/r/20170516221333.177280-1-mka@chromium.org
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: David Howells <dhowells@redhat.com>
Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-02 15:07:37 -07:00
Michal Hocko
1bde33e051 include/linux/gfp.h: fix ___GFP_NOLOCKDEP value
Igor Stoppa has noticed that __GFP_NOLOCKDEP can use a lower bit.  At
the time commit 7e7844226f ("lockdep: allow to disable reclaim lockup
detection") was written we still had __GFP_OTHER_NODE but I have removed
it in commit 41b6167e8f ("mm: get rid of __GFP_OTHER_NODE") and forgot
to lower the bit value.

The current value is outside of __GFP_BITS_SHIFT so it cannot be used
actually.

Fixes: 7e7844226f ("lockdep: allow to disable reclaim lockup detection")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Igor Stoppa <igor.stoppa@nokia.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-02 15:07:37 -07:00
Linus Torvalds
3b1e342be2 Merge tag 'nfsd-4.12-1' of git://linux-nfs.org/~bfields/linux
Pull nfsd fixes from Bruce Fields:
 "Revert patch accidentally included in the merge window pull request,
  and fix a crash that was likely a result of buggy client behavior"

* tag 'nfsd-4.12-1' of git://linux-nfs.org/~bfields/linux:
  nfsd4: fix null dereference on replay
  nfsd: Revert "nfsd: check for oversized NFSv2/v3 arguments"
2017-06-01 16:24:48 -07:00
Max Gurtovoy
1410a90ae4 net/mlx5: Define interface bits for fencing UMR wqe
HW can implement UMR wqe re-transmission in various ways.
Thus, add HCA cap to distinguish the needed fence for UMR to make
sure that the wqe wouldn't fail on mkey checks.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Acked-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 17:05:04 -04:00
Arnd Bergmann
4b1c88984c iommu/dma: Fix function declaration
Newly added code in the ipmmu-vmsa driver showed a small mistake
in a header file that can't be included by itself without CONFIG_IOMMU_DMA
enabled:

In file included from drivers/iommu/ipmmu-vmsa.c:13:0:
include/linux/dma-iommu.h:105:94: error: 'struct device' declared inside parameter list will not be visible outside of this definition or declaration [-Werror]

This adds a forward declaration for 'struct device', similar to how
we treat the other struct types in this case.

Fixes: 3ae4729202 ("iommu/ipmmu-vmsa: Add new IOMMU_DOMAIN_DMA ops")
Fixes: 273df96353 ("iommu/dma: Make PCI window reservation generic")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-05-30 11:25:45 +02:00
Linus Torvalds
3f173bde7e Merge tag 'pinctrl-v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
 "Here is an overdue pull request for pin control fixes, the most
  prominent feature is to make Intel Chromebooks (and I suspect any
  other Cherryview-based Intel thing) happy again, which we really want
  to see.

  There is a patch hitting drivers/firmware/* that I was uncertain to
  who actually manages, but I got Andy Shevchenko's and Dmitry Torokov's
  review tags on it and I trust them both 100% to do the right thing for
  Intel platform drivers.

  Summary:

   - Make a few Intel Chromebooks with Cherryview DMI firmware work
     smoothly.

   - A fix for some bogus allocations in the generic group management
     code.

   - Some GPIO descriptor lookup table stubs. Merged through the pin
     control tree for administrative reasons.

   - Revert the "bi-directional" and "output-enable" generic properties:
     we need more discussions around this. It seems other SoCs are using
     input/output gate enablement and these terms are not correct.

   - Fix mux and drive strength atomically in the MXS driver.

   - Fix the SPDIF function on sunxi A83T.

   - OF table terminators and other small fixes"

* tag 'pinctrl-v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: sunxi: Fix SPDIF function name for A83T
  pinctrl: mxs: atomically switch mux and drive strength config
  pinctrl: cherryview: Extend the Chromebook DMI quirk to Intel_Strago systems
  firmware: dmi: Add DMI_PRODUCT_FAMILY identification string
  pinctrl: core: Fix warning by removing bogus code
  gpiolib: Add stubs for gpiod lookup table interface
  Revert "pinctrl: generic: Add bi-directional and output-enable"
  pinctrl: cherryview: Add terminate entry for dmi_system_id tables
2017-05-29 10:05:19 -07:00
Linus Torvalds
249f1efd8e Merge tag 'tty-4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg KH:
 "Here are some serial and tty fixes for 4.12-rc3. They are a bit bigger
  than normal, which is why I had them bake in linux-next for a few
  weeks and didn't send them to you for -rc2.

  They revert a few of the serdev patches from 4.12-rc1, and bring
  things back to how they were in 4.11, to try to make things a bit more
  stable there. Rob and Johan both agree that this is the way forward,
  so this isn't people squabbling over semantics. Other than that, just
  a few minor serial driver fixes that people have had problems with.

  All of these have been in linux-next for a few weeks with no reported
  issues"

* tag 'tty-4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  serial: altera_uart: call iounmap() at driver remove
  serial: imx: ensure UCR3 and UFCR are setup correctly
  MAINTAINERS/serial: Change maintainer of jsm driver
  serial: enable serdev support
  tty/serdev: add serdev registration interface
  serdev: Restore serdev_device_write_buf for atomic context
  serial: core: fix crash in uart_suspend_port
  tty: fix port buffer locking
  tty: ehv_bytechan: clean up init error handling
  serial: ifx6x60: fix use-after-free on module unload
  serial: altera_jtaguart: adding iounmap()
  serial: exar: Fix stuck MSIs
  serial: efm32: Fix parity management in 'efm32_uart_console_get_options()'
  serdev: fix tty-port client deregistration
  Revert "tty_port: register tty ports with serdev bus"
  drivers/tty: 8250: only call fintek_8250_probe when doing port I/O
2017-05-27 09:39:09 -07:00
Linus Torvalds
6741d51699 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix state pruning in bpf verifier wrt. alignment, from Daniel
    Borkmann.

 2) Handle non-linear SKBs properly in SCTP ICMP parsing, from Davide
    Caratti.

 3) Fix bit field definitions for rss_hash_type of descriptors in mlx5
    driver, from Jesper Brouer.

 4) Defer slave->link updates until bonding is ready to do a full commit
    to the new settings, from Nithin Sujir.

 5) Properly reference count ipv4 FIB metrics to avoid use after free
    situations, from Eric Dumazet and several others including Cong Wang
    and Julian Anastasov.

 6) Fix races in llc_ui_bind(), from Lin Zhang.

 7) Fix regression of ESP UDP encapsulation for TCP packets, from
    Steffen Klassert.

 8) Fix mdio-octeon driver Kconfig deps, from Randy Dunlap.

 9) Fix regression in setting DSCP on ipv6/GRE encapsulation, from Peter
    Dawson.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (43 commits)
  ipv4: add reference counting to metrics
  net: ethernet: ax88796: don't call free_irq without request_irq first
  ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets
  sctp: fix ICMP processing if skb is non-linear
  net: llc: add lock_sock in llc_ui_bind to avoid a race condition
  bonding: Don't update slave->link until ready to commit
  test_bpf: Add a couple of tests for BPF_JSGE.
  bpf: add various verifier test cases
  bpf: fix wrong exposure of map_flags into fdinfo for lpm
  bpf: add bpf_clone_redirect to bpf_helper_changes_pkt_data
  bpf: properly reset caller saved regs after helper call and ld_abs/ind
  bpf: fix incorrect pruning decision when alignment must be tracked
  arp: fixed -Wuninitialized compiler warning
  tcp: avoid fastopen API to be used on AF_UNSPEC
  net: move somaxconn init from sysctl code
  net: fix potential null pointer dereference
  geneve: fix fill_info when using collect_metadata
  virtio-net: enable TSO/checksum offloads for Q-in-Q vlans
  be2net: Fix offload features for Q-in-Q packets
  vlan: Fix tcp checksum offloads in Q-in-Q vlans
  ...
2017-05-26 13:51:01 -07:00
Linus Torvalds
1b8f2ffc79 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "A collection of fixes that should go into this series. This contains:

   - A set of NVMe fixes, pulled from Christoph. This includes a set of
     fixes for the fiber channel bits from James Smart, rdma queue depth
     fix from Marta, controller removal fixes from Ming, and some more
     APST quirk updates from Andy.

   - A blk-mq debugfs fix from Bart, fixing a problem with the
     untangling of the sysfs and debugfs blk-mq bits that was added in
     this series.

   - Error code fix in add_partition() from Dan.

   - A small series of fixes for the new blk-throttle code from Shaohua"

* 'for-linus' of git://git.kernel.dk/linux-block: (21 commits)
  blk-mq: Only register debugfs attributes for blk-mq queues
  nvme: Quirk APST on Intel 600P/P3100 devices
  nvme: only setup block integrity if supported by the driver
  nvme: replace is_flags field in nvme_ctrl_ops with a flags field
  nvme-pci: consistencly use ctrl->device for logging
  partitions/msdos: FreeBSD UFS2 file systems are not recognized
  block: fix an error code in add_partition()
  blk-throttle: force user to configure all settings for io.low
  blk-throttle: respect 0 bps/iops settings for io.low
  blk-throttle: output some debug info in trace
  blk-throttle: add hierarchy support for latency target and idle time
  nvme_fc: remove extra controller reference taken on reconnect
  nvme_fc: correct nvme status set on abort
  nvme_fc: set logging level on resets/deletes
  nvme_fc: revise comment on teardown
  nvme_fc: Support ctrl_loss_tmo
  nvme_fc: get rid of local reconnect_delay
  blk-mq: remove blk_mq_abort_requeue_list()
  nvme: avoid to use blk_mq_abort_requeue_list()
  nvme: use blk_mq_start_hw_queues() in nvme_kill_queues()
  ...
2017-05-26 11:05:22 -07:00
Linus Torvalds
6ce4782911 Merge tag 'pci-v4.12-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:

 - fix PCI_ENDPOINT build error (merged for v4.12)

 - fix Switchtec driver (merged for v4.12)

 - fix imx6 config read timeouts, fallout from changing to non-postable
   reads

 - add PM "needs_resume" flag for i915 suspend issue

* tag 'pci-v4.12-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI/PM: Add needs_resume flag to avoid suspend complete optimization
  PCI: imx6: Fix config read timeout handling
  switchtec: Fix minor bug with partition ID register
  switchtec: Use new cdev_device_add() helper function
  PCI: endpoint: Make PCI_ENDPOINT depend on HAS_DMA
2017-05-26 10:51:18 -07:00