Commit Graph

48698 Commits

Author SHA1 Message Date
Ingo Molnar
fe36d8912c Merge branch 'linus' into irq/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08 12:26:07 +01:00
Linus Torvalds
e2857b8f11 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix ordering of WEXT netlink messages so we don't see a newlink
    after a dellink, from Johannes Berg.

 2) Out of bounds access in minstrel_ht_set_best_prob_rage, from
    Konstantin Khlebnikov.

 3) Paging buffer memory leak in iwlwifi, from Matti Gottlieb.

 4) Wrong units used to set initial TCP rto from cached metrics, also
    from Konstantin Khlebnikov.

 5) Fix stale IP options data in the SKB control block from leaking
    through layers of encapsulation, from Bernie Harris.

 6) Zero padding len miscalculated in bnxt_en, from Michael Chan.

 7) Only CHECKSUM_PARTIAL packets should be passed down through GSO, fix
    from Hannes Frederic Sowa.

 8) Fix suspend/resume with JME networking devices, from Diego Violat
    and Guo-Fu Tseng.

 9) Checksums not validated properly in bridge multicast support due to
    the placement of the SKB header pointers at the time of the check,
    fix from Álvaro Fernández Rojas.

10) Fix hang/tiemout with r8169 if a stats fetch is done while the
    device is runtime suspended.  From Chun-Hao Lin.

11) The forwarding database netlink dump facilities don't track the
    state of the dump properly, resulting in skipped/missed entries.
    From Minoura Makoto.

12) Fix regression from a recent 3c59x bug fix, from Neil Horman.

13) Fix list corruption in bna driver, from Ivan Vecera.

14) Big endian machines crash on vlan add in bnx2x, fix from Michal
    Schmidt.

15) Ethtool RSS configuration not propagated properly in mlx5 driver,
    from Tariq Toukan.

16) Fix regression in PHY probing in stmmac driver, from Gabriel
    Fernandez.

17) Fix SKB tailroom calculation in igmp/mld code, from Benjamin
    Poirier.

18) A past change to skip empty routing headers in ipv6 extention header
    parsing accidently caused fragment headers to not be matched any
    longer.  Fix from Florian Westphal.

19) eTSEC-106 erratum needs to be applied to more gianfar chips, from
    Atsushi Nemoto.

20) Fix netdev reference after free via workqueues in usb networking
    drivers, from Oliver Neukum and Bjørn Mork.

21) mdio->irq is now an array rather than a pointer to dynamic memory,
    but several drivers were still trying to free it :-/ Fixes from
    Colin Ian King.

22) act_ipt iptables action forgets to set the family field, thus LOG
    netfilter targets don't work with it.  Fix from Phil Sutter.

23) SKB leak in ibmveth when skb_linearize() fails, from Thomas Falcon.

24) pskb_may_pull() cannot be called with interrupts disabled, fix code
    that tries to do this in vmxnet3 driver, from Neil Horman.

25) be2net driver leaks iomap'd memory on removal, fix from Douglas
    Miller.

26) Forgotton RTNL mutex unlock in ppp_create_interface() error paths,
    from Guillaume Nault.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (97 commits)
  ppp: release rtnl mutex when interface creation fails
  cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind
  tcp: fix tcpi_segs_in after connection establishment
  net: hns: fix the bug about loopback
  jme: Fix device PM wakeup API usage
  jme: Do not enable NIC WoL functions on S0
  udp6: fix UDP/IPv6 encap resubmit path
  be2net: Don't leak iomapped memory on removal.
  vmxnet3: avoid calling pskb_may_pull with interrupts disabled
  net: ethernet: Add missing MFD_SYSCON dependency on HAS_IOMEM
  ibmveth: check return of skb_linearize in ibmveth_start_xmit
  cdc_ncm: toggle altsetting to force reset before setup
  usbnet: cleanup after bind() in probe()
  mlxsw: pci: Correctly determine if descriptor queue is full
  mlxsw: spectrum: Always decrement bridge's ref count
  tipc: fix nullptr crash during subscription cancel
  net: eth: altera: do not free array priv->mdio->irq
  net/ethoc: do not free array priv->mdio->irq
  net: sched: fix act_ipt for LOG target
  asix: do not free array priv->mdio->irq
  ...
2016-03-07 15:41:10 -08:00
Linus Torvalds
21b27a74ec Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull ceph fix from Sage Weil:
 "This is a final commit we missed to align the protocol compatibility
  with the feature bits.

  It decodes a few extra fields in two different messages and reports
  EIO when they are used (not yet supported)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: initial CEPH_FEATURE_FS_FILE_LAYOUT_V2 support
2016-03-06 11:31:13 -08:00
Linus Torvalds
fab3e94a62 Merge branch 'for-4.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
 "Assorted fixes for libata drivers.

   - Turns out HDIO_GET_32BIT ioctl was subtly broken all along.

   - Recent update to ahci external port handling was incorrectly
     marking hotpluggable ports as external making userland handle
     devices connected to those ports incorrectly.

   - ahci_xgene needs its own irq handler to work around a hardware
     erratum.  libahci updated to allow irq handler override.

   - Misc driver specific updates"

* 'for-4.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
  ata: ahci: don't mark HotPlugCapable Ports as external/removable
  ahci: Workaround for ThunderX Errata#22536
  libata: Align ata_device's id on a cacheline
  Adding Intel Lewisburg device IDs for SATA
  pata-rb532-cf: get rid of the irq_to_gpio() call
  libata: fix HDIO_GET_32BIT ioctl
  ahci_xgene: Implement the workaround to fix the missing of the edge interrupt for the HOST_IRQ_STAT.
  ata: Remove the AHCI_HFLAG_EDGE_IRQ support from libahci.
  libahci: Implement the capability to override the generic ahci interrupt handler.
2016-03-04 18:31:36 -08:00
Linus Torvalds
e5322c5406 Merge branch 'for-linus2' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "Round 2 of this.  I cut back to the bare necessities, the patch is
  still larger than it usually would be at this time, due to the number
  of NVMe fixes in there.  This pull request contains:

   - The 4 core fixes from Ming, that fix both problems with exceeding
     the virtual boundary limit in case of merging, and the gap checking
     for cloned bio's.

   - NVMe fixes from Keith and Christoph:

        - Regression on larger user commands, causing problems with
          reading log pages (for instance). This touches both NVMe,
          and the block core since that is now generally utilized also
          for these types of commands.

        - Hot removal fixes.

        - User exploitable issue with passthrough IO commands, if !length
          is given, causing us to fault on writing to the zero
          page.

        - Fix for a hang under error conditions

   - And finally, the current series regression for umount with cgroup
     writeback, where the final flush would happen async and hence open
     up window after umount where the device wasn't consistent.  fsck
     right after umount would show this.  From Tejun"

* 'for-linus2' of git://git.kernel.dk/linux-block:
  block: support large requests in blk_rq_map_user_iov
  block: fix blk_rq_get_max_sectors for driver private requests
  nvme: fix max_segments integer truncation
  nvme: set queue limits for the admin queue
  writeback: flush inode cgroup wb switches instead of pinning super_block
  NVMe: Fix 0-length integrity payload
  NVMe: Don't allow unsupported flags
  NVMe: Move error handling to failed reset handler
  NVMe: Simplify device reset failure
  NVMe: Fix namespace removal deadlock
  NVMe: Use IDA for namespace disk naming
  NVMe: Don't unmap controller registers on reset
  block: merge: get the 1st and last bvec via helpers
  block: get the 1st and last bvec via helpers
  block: check virt boundary in bio_will_gap()
  block: bio: introduce helpers to get the 1st and last bvec
2016-03-04 18:17:17 -08:00
Linus Torvalds
78baab7aa8 Merge tag 'trace-fixes-v4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
 "A feature was added in 4.3 that allowed users to filter trace points
  on a tasks "comm" field.  But this prevented filtering on a comm field
  that is within a trace event (like sched_migrate_task).

  When trying to filter on when a program migrated, this change
  prevented the filtering of the sched_migrate_task.

  To fix this, the event fields are examined first, and then the extra
  fields like "comm" and "cpu" are examined.  Also, instead of testing
  to assign the comm filter function based on the field's name, the
  generic comm field is given a new filter type (FILTER_COMM).  When
  this field is used to filter the type is checked.  The same is done
  for the cpu filter field.

  Two new special filter types are added: "COMM" and "CPU".  This allows
  users to still filter the tasks comm for events that have "comm" as
  one of their fields, in cases that users would like to filter
  sched_migrate_task on the comm of the task that called the event, and
  not the comm of the task that is being migrated"

* tag 'trace-fixes-v4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Do not have 'comm' filter override event 'comm' field
2016-03-04 16:57:04 -08:00
Yan, Zheng
5ea5c5e0a7 ceph: initial CEPH_FEATURE_FS_FILE_LAYOUT_V2 support
Add support for the format change of MClientReply/MclientCaps.
Also add code that denies access to inodes with pool_ns layouts.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2016-03-04 21:00:37 +01:00
Steven Rostedt (Red Hat)
e57cbaf0eb tracing: Do not have 'comm' filter override event 'comm' field
Commit 9f61668073 "tracing: Allow triggers to filter for CPU ids and
process names" added a 'comm' filter that will filter events based on the
current tasks struct 'comm'. But this now hides the ability to filter events
that have a 'comm' field too. For example, sched_migrate_task trace event.
That has a 'comm' field of the task to be migrated.

 echo 'comm == "bash"' > events/sched_migrate_task/filter

will now filter all sched_migrate_task events for tasks named "bash" that
migrates other tasks (in interrupt context), instead of seeing when "bash"
itself gets migrated.

This fix requires a couple of changes.

1) Change the look up order for filter predicates to look at the events
   fields before looking at the generic filters.

2) Instead of basing the filter function off of the "comm" name, have the
   generic "comm" filter have its own filter_type (FILTER_COMM). Test
   against the type instead of the name to assign the filter function.

3) Add a new "COMM" filter that works just like "comm" but will filter based
   on the current task, even if the trace event contains a "comm" field.

Do the same for "cpu" field, adding a FILTER_CPU and a filter "CPU".

Cc: stable@vger.kernel.org # v4.3+
Fixes: 9f61668073 "tracing: Allow triggers to filter for CPU ids and process names"
Reported-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-03-04 09:57:10 -05:00
Christoph Hellwig
f21018427c block: fix blk_rq_get_max_sectors for driver private requests
Driver private request types should not get the artifical cap for the
FS requests.  This is important to use the full device capabilities
for internal command or NVMe pass through commands.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Jeff Lien <Jeff.Lien@hgst.com>
Tested-by: Jeff Lien <Jeff.Lien@hgst.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>

Updated by me to use an explicit check for the one command type that
does support extended checking, instead of relying on the ordering
of the enum command values - as suggested by Keith.

Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-03 14:43:45 -07:00
Tejun Heo
a1a0e23e49 writeback: flush inode cgroup wb switches instead of pinning super_block
If cgroup writeback is in use, inodes can be scheduled for
asynchronous wb switching.  Before 5ff8eaac16 ("writeback: keep
superblock pinned during cgroup writeback association switches"), this
could race with umount leading to super_block being destroyed while
inodes are pinned for wb switching.  5ff8eaac16 fixed it by bumping
s_active while wb switches are in flight; however, this allowed
in-flight wb switches to make umounts asynchronous when the userland
expected synchronosity - e.g. fsck immediately following umount may
fail because the device is still busy.

This patch removes the problematic super_block pinning and instead
makes generic_shutdown_super() flush in-flight wb switches.  wb
switches are now executed on a dedicated isw_wq so that they can be
flushed and isw_nr_in_flight keeps track of the number of in-flight wb
switches so that flushing can be avoided in most cases.

v2: Move cgroup_writeback_umount() further below and add MS_ACTIVE
    check in inode_switch_wbs() as Jan an Al suggested.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Tahsin Erdogan <tahsin@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Link: http://lkml.kernel.org/g/CAAeU0aNCq7LGODvVGRU-oU_o-6enii5ey0p1c26D1ZzYwkDc5A@mail.gmail.com
Fixes: 5ff8eaac16 ("writeback: keep superblock pinned during cgroup writeback association switches")
Cc: stable@vger.kernel.org #v4.5
Reviewed-by: Jan Kara <jack@suse.cz>
Tested-by: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-03 14:42:50 -07:00
Ming Lei
25e71a99f1 block: get the 1st and last bvec via helpers
This patch applies the two introduced helpers to
figure out the 1st and last bvec, and fixes the
original way after bio splitting.

Cc: stable@vger.kernel.org
Reported-by: Sagi Grimberg <sagig@dev.mellanox.co.il>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-03 14:42:49 -07:00
Ming Lei
e0af29171a block: check virt boundary in bio_will_gap()
In the following patch, the way for figuring out
the last bvec will be changed with a bit cost introduced,
so return immediately if the queue doesn't have virt
boundary limit. Actually most of devices have not
this limit.

Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-03 14:42:49 -07:00
Ming Lei
7bcd79ac50 block: bio: introduce helpers to get the 1st and last bvec
The bio passed to bio_will_gap() may be fast cloned from upper
layer(dm, md, bcache, fs, ...), or from bio splitting in block
core.

Unfortunately bio_will_gap() just figures out the last bvec via
'bi_io_vec[prev->bi_vcnt - 1]' directly, and this way is obviously
wrong.

This patch introduces two helpers for getting the first and last
bvec of one bio for fixing the issue.

Cc: stable@vger.kernel.org
Reported-by: Sagi Grimberg <sagig@dev.mellanox.co.il>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-03 14:42:49 -07:00
Benjamin Poirier
1837b2e2bc mld, igmp: Fix reserved tailroom calculation
The current reserved_tailroom calculation fails to take hlen and tlen into
account.

skb:
[__hlen__|__data____________|__tlen___|__extra__]
^                                               ^
head                                            skb_end_offset

In this representation, hlen + data + tlen is the size passed to alloc_skb.
"extra" is the extra space made available in __alloc_skb because of
rounding up by kmalloc. We can reorder the representation like so:

[__hlen__|__data____________|__extra__|__tlen___]
^                                               ^
head                                            skb_end_offset

The maximum space available for ip headers and payload without
fragmentation is min(mtu, data + extra). Therefore,
reserved_tailroom
= data + extra + tlen - min(mtu, data + extra)
= skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen)
= skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen)

Compare the second line to the current expression:
reserved_tailroom = skb_end_offset - min(mtu, skb_end_offset)
and we can see that hlen and tlen are not taken into account.

The min() in the third line can be expanded into:
if mtu < skb_tailroom - tlen:
	reserved_tailroom = skb_tailroom - mtu
else:
	reserved_tailroom = tlen

Depending on hlen, tlen, mtu and the number of multicast address records,
the current code may output skbs that have less tailroom than
dev->needed_tailroom or it may output more skbs than needed because not all
space available is used.

Fixes: 4c672e4b ("ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs")
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-03 15:41:07 -05:00
Gabriel Fernandez
88f8b1bb41 stmmac: Fix 'eth0: No PHY found' regression
This patch manages the case when you have an Ethernet MAC with
a "fixed link", and not connected to a normal MDIO-managed PHY device.

The test of phy_bus_name was not helpful because it was never affected
and replaced by the mdio test node.

Signed-off-by: Gabriel Fernandez <gabriel.fernandez@linaro.org>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-03 15:02:05 -05:00
Tariq Toukan
bdfc028de1 net/mlx5e: Fix ethtool RX hash func configuration change
We should modify TIRs explicitly to apply the new RSS configuration.
The light ndo close/open calls do not "refresh" them.

Fixes: 2d75b2bc8a ('net/mlx5e: Add ethtool RSS configuration options')
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02 14:37:25 -05:00
Linus Torvalds
f691b77b1f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull d_inode/d_flags race fix from Al Viro.

I love this fix.  Not only does it fix the race in the dentry type
handling, it entirely gets rid of the nasty and subtle memory ordering
rules for d_type and d_inode, and replaces them with the basic dentry
locking rules (sequence numbers under RCU, d_lock elsewhere).

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  use ->d_seq to get coherency between ->d_inode and ->d_flags
2016-03-01 15:30:45 -08:00
Al Viro
a528aca7f3 use ->d_seq to get coherency between ->d_inode and ->d_flags
Games with ordering and barriers are way too brittle.  Just
bump ->d_seq before and after updating ->d_inode and ->d_flags
type bits, so that verifying ->d_seq would guarantee they are
coherent.

Cc: stable@vger.kernel.org # v3.13+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-02-29 12:16:43 -05:00
Linus Torvalds
1b9540ce03 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
 "A rather largish series of 12 patches addressing a maze of race
  conditions in the perf core code from Peter Zijlstra"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Robustify task_function_call()
  perf: Fix scaling vs. perf_install_in_context()
  perf: Fix scaling vs. perf_event_enable()
  perf: Fix scaling vs. perf_event_enable_on_exec()
  perf: Fix ctx time tracking by introducing EVENT_TIME
  perf: Cure event->pending_disable race
  perf: Fix race between event install and jump_labels
  perf: Fix cloning
  perf: Only update context time when active
  perf: Allow perf_release() with !event->ctx
  perf: Do not double free
  perf: Close install vs. exit race
2016-02-28 07:52:00 -08:00
Linus Torvalds
691429e13d Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
 "10 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  dax: move writeback calls into the filesystems
  dax: give DAX clearing code correct bdev
  ext4: online defrag not supported with DAX
  ext2, ext4: only set S_DAX for regular inodes
  block: disable block device DAX by default
  ocfs2: unlock inode if deleting inode from orphan fails
  mm: ASLR: use get_random_long()
  drivers: char: random: add get_random_long()
  mm: numa: quickly fail allocations for NUMA balancing on full nodes
  mm: thp: fix SMP race condition between THP page fault and MADV_DONTNEED
2016-02-27 12:46:16 -08:00
Linus Torvalds
a9f8094aae Merge tag 'pci-v4.5-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
 "Enumeration:
    Revert x86 pcibios_alloc_irq() to fix regression (Bjorn Helgaas)

  Marvell MVEBU host bridge driver:
    Restrict build to 32-bit ARM (Thierry Reding)"

* tag 'pci-v4.5-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: mvebu: Restrict build to 32-bit ARM
  Revert "PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()"
  Revert "PCI: Add helpers to manage pci_dev->irq and pci_dev->irq_managed"
  Revert "x86/PCI: Don't alloc pcibios-irq when MSI is enabled"
2016-02-27 12:33:42 -08:00
Ross Zwisler
7f6d5b529b dax: move writeback calls into the filesystems
Previously calls to dax_writeback_mapping_range() for all DAX filesystems
(ext2, ext4 & xfs) were centralized in filemap_write_and_wait_range().

dax_writeback_mapping_range() needs a struct block_device, and it used
to get that from inode->i_sb->s_bdev.  This is correct for normal inodes
mounted on ext2, ext4 and XFS filesystems, but is incorrect for DAX raw
block devices and for XFS real-time files.

Instead, call dax_writeback_mapping_range() directly from the filesystem
->writepages function so that it can supply us with a valid block
device.  This also fixes DAX code to properly flush caches in response
to sync(2).

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-27 10:28:52 -08:00
Ross Zwisler
20a90f5899 dax: give DAX clearing code correct bdev
dax_clear_blocks() needs a valid struct block_device and previously it
was using inode->i_sb->s_bdev in all cases.  This is correct for normal
inodes on mounted ext2, ext4 and XFS filesystems, but is incorrect for
DAX raw block devices and for XFS real-time devices.

Instead, rename dax_clear_blocks() to dax_clear_sectors(), and change
its arguments to take a bdev and a sector instead of an inode and a
block.  This better reflects what the function does, and it allows the
filesystem and raw block device code to pass in an appropriate struct
block_device.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-27 10:28:52 -08:00
Daniel Cashman
ec9ee4acd9 drivers: char: random: add get_random_long()
Commit d07e22597d ("mm: mmap: add new /proc tunable for mmap_base
ASLR") added the ability to choose from a range of values to use for
entropy count in generating the random offset to the mmap_base address.

The maximum value on this range was set to 32 bits for 64-bit x86
systems, but this value could be increased further, requiring more than
the 32 bits of randomness provided by get_random_int(), as is already
possible for arm64.  Add a new function: get_random_long() which more
naturally fits with the mmap usage of get_random_int() but operates
exactly the same as get_random_int().

Also, fix the shifting constant in mmap_rnd() to be an unsigned long so
that values greater than 31 bits generate an appropriate mask without
overflow.  This is especially important on x86, as its shift instruction
uses a 5-bit mask for the shift operand, which meant that any value for
mmap_rnd_bits over 31 acts as a no-op and effectively disables mmap_base
randomization.

Finally, replace calls to get_random_int() with get_random_long() where
appropriate.

This patch (of 2):

Add get_random_long().

Signed-off-by: Daniel Cashman <dcashman@android.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: David S. Miller <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nick Kralevich <nnk@google.com>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Mark Salyzyn <salyzyn@android.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-27 10:28:52 -08:00
Linus Torvalds
3d7b365490 Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:

 - Two fixes for compatibility with the ACPI 6.1 specification.

   Without these fixes multi-interface DIMMs will fail to be probed, and
   address range scrub commands to find memory errors will give results
   that the kernel will mis-interpret.  For multi-interface DIMMs Linux
   will accept either the original 6.0 implementation or 6.1.

   For address range scrub we'll only support 6.1 since ACPI formalized
   this DSM differently than the original example [1] implemented in
   v4.2.  The expectation is that production systems will only ever ship
   the ACPI 6.1 address range scrub command definition.

 - The wider async address range scrub work targeting 4.6 discovered
   that the original synchronous implementation in 4.5 is not sizing its
   return buffer correctly.

 - Arnd caught that my recent fix to the size of the pfn_t flags missed
   updating the flags variable used in the pmem driver.

 - Toshi found that we mishandle the memremap() return value in
   devm_memremap().

* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  nvdimm: use 'u64' for pfn flags
  devm_memremap: Fix error value when memremap failed
  nfit: update address range scrub commands to the acpi 6.1 format
  libnvdimm, tools/testing/nvdimm: fix 'ars_status' output buffer sizing
  nfit: fix multi-interface dimm handling, acpi6.1 compatibility
2016-02-25 18:54:53 -08:00
Linus Torvalds
1ebe3839e6 Merge tag 'for-v4.5-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply
Pull power supply fixes from Sebastian Reichel:
 "Add a regression fix for changed sysfs path of bq27xxx_battery and
  update MAINTAINERS file"

* tag 'for-v4.5-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
  power: bq27xxx_battery: Restore device name
  MAINTAINERS: update bq27xxx driver
2016-02-25 18:42:08 -08:00
Harvey Hunt
4ee34ea3a1 libata: Align ata_device's id on a cacheline
The id buffer in ata_device is a DMA target, but it isn't explicitly
cacheline aligned. Due to this, adjacent fields can be overwritten with
stale data from memory on non coherent architectures. As a result, the
kernel is sometimes unable to communicate with an ATA device.

Fix this by ensuring that the id buffer is cacheline aligned.

This issue is similar to that fixed by Commit 84bda12af3
("libata: align ap->sector_buf").

Signed-off-by: Harvey Hunt <harvey.hunt@imgtec.com>
Cc: linux-kernel@vger.kernel.org
Cc: <stable@vger.kernel.org> # 2.6.18
Signed-off-by: Tejun Heo <tj@kernel.org>
2016-02-25 16:37:06 -05:00
Qais Yousef
bb11cff327 MIPS: Make smp CMP, CPS and MT use the new generic IPI functions
This commit does several things to avoid breaking bisectability.

	1- Remove IPI init code from irqchip/mips-gic
	2- Implement the new irqchip->send_ipi() in irqchip/mips-gic
	3- Select GENERIC_IRQ_IPI Kconfig symbol for MIPS_GIC
	4- Change MIPS SMP to use the generic IPI implementation

Only the SMP variants that use GIC were converted as it's the only irqchip that
will have the support for generic IPI for now.

Signed-off-by: Qais Yousef <qais.yousef@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: <jason@lakedaemon.net>
Cc: <marc.zyngier@arm.com>
Cc: <jiang.liu@linux.intel.com>
Cc: <linux-mips@linux-mips.org>
Cc: <lisa.parratt@imgtec.com>
Cc: Qais Yousef <qsyousef@gmail.com>
Link: http://lkml.kernel.org/r/1449580830-23652-18-git-send-email-qais.yousef@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25 10:56:58 +01:00
Qais Yousef
3b8e29a82d genirq: Implement ipi_send_mask/single()
Add APIs to send IPIs from driver and arch code.

We have different functions because we allow architecture code to cache the
irq descriptor to avoid lookups. Driver code has to use the irq number and is
subject to more restrictive checks.

[ tglx: Polish the implementation ]

Signed-off-by: Qais Yousef <qais.yousef@imgtec.com>
Cc: <jason@lakedaemon.net>
Cc: <marc.zyngier@arm.com>
Cc: <jiang.liu@linux.intel.com>
Cc: <ralf@linux-mips.org>
Cc: <linux-mips@linux-mips.org>
Cc: <lisa.parratt@imgtec.com>
Cc: Qais Yousef <qsyousef@gmail.com>
Link: http://lkml.kernel.org/r/1449580830-23652-12-git-send-email-qais.yousef@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25 10:56:57 +01:00
Qais Yousef
34dc1ae101 genirq: Add send_ipi callbacks to irq_chip
Introduce the new callbacks which can be used by the core code to implement a
generic IPI send mechanism.

Signed-off-by: Qais Yousef <qais.yousef@imgtec.com>
Cc: <jason@lakedaemon.net>
Cc: <marc.zyngier@arm.com>
Cc: <jiang.liu@linux.intel.com>
Cc: <ralf@linux-mips.org>
Cc: <linux-mips@linux-mips.org>
Cc: <lisa.parratt@imgtec.com>
Cc: Qais Yousef <qsyousef@gmail.com>
Link: http://lkml.kernel.org/r/1449580830-23652-11-git-send-email-qais.yousef@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25 10:56:57 +01:00
Qais Yousef
f9bce791ae genirq: Add a new function to get IPI reverse mapping
When dealing with coprocessors we need to find out the actual hwirqs values to
pass on to the firmware so that it knows what it needs to use to receive IPIs
from and send IPIs to Linux cpus.

[ tglx: Fixed the single hwirq IPI case. The hardware irq number does not
  	change due to the cpu number ]

Signed-off-by: Qais Yousef <qais.yousef@imgtec.com>
Cc: <jason@lakedaemon.net>
Cc: <marc.zyngier@arm.com>
Cc: <jiang.liu@linux.intel.com>
Cc: <ralf@linux-mips.org>
Cc: <linux-mips@linux-mips.org>
Cc: <lisa.parratt@imgtec.com>
Cc: Qais Yousef <qsyousef@gmail.com>
Link: http://lkml.kernel.org/r/1449580830-23652-10-git-send-email-qais.yousef@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25 10:56:56 +01:00
Qais Yousef
d17bf24e69 genirq: Add a new generic IPI reservation code to irq core
Add a generic mechanism to dynamically allocate an IPI. Depending on the
underlying implementation this creates either a single Linux irq or a
consective range of Linux irqs. The Linux irq is used later to send IPIs to
other CPUs.

[ tglx: Massaged the code and removed the 'consecutive mask' restriction for
  	the single IRQ case ]

Signed-off-by: Qais Yousef <qais.yousef@imgtec.com>
Cc: <jason@lakedaemon.net>
Cc: <marc.zyngier@arm.com>
Cc: <jiang.liu@linux.intel.com>
Cc: <ralf@linux-mips.org>
Cc: <linux-mips@linux-mips.org>
Cc: <lisa.parratt@imgtec.com>
Cc: Qais Yousef <qsyousef@gmail.com>
Link: http://lkml.kernel.org/r/1449580830-23652-9-git-send-email-qais.yousef@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25 10:56:56 +01:00
Qais Yousef
ac0a0cd266 genirq: Make irq_domain_alloc_descs() non static
We will need to use this function to implement irq_reserve_ipi() later. So
make it non static and move the prototype to irqdomain.h to allow using it
outside irqdomain.c

Signed-off-by: Qais Yousef <qais.yousef@imgtec.com>
Cc: <jason@lakedaemon.net>
Cc: <marc.zyngier@arm.com>
Cc: <jiang.liu@linux.intel.com>
Cc: <ralf@linux-mips.org>
Cc: <linux-mips@linux-mips.org>
Cc: <lisa.parratt@imgtec.com>
Cc: Qais Yousef <qsyousef@gmail.com>
Link: http://lkml.kernel.org/r/1449580830-23652-8-git-send-email-qais.yousef@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25 10:56:56 +01:00
Qais Yousef
f256c9a0c5 genirq: Add ipi_offset to irq_common_data
IPIs are always assumed to be consecutively allocated, hence virqs and hwirqs
can be inferred by using CPU id as an offset. But the first cpu doesn't always
have to start at offset 0. ipi_offset stores the position of the first cpu so
that we can easily calculate the virq or hwirq of an IPI associated with a
specific cpu.

Signed-off-by: Qais Yousef <qais.yousef@imgtec.com>
Cc: <jason@lakedaemon.net>
Cc: <marc.zyngier@arm.com>
Cc: <jiang.liu@linux.intel.com>
Cc: <ralf@linux-mips.org>
Cc: <linux-mips@linux-mips.org>
Cc: <lisa.parratt@imgtec.com>
Cc: Qais Yousef <qsyousef@gmail.com>
Link: http://lkml.kernel.org/r/1449580830-23652-6-git-send-email-qais.yousef@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25 10:56:55 +01:00
Qais Yousef
955bfe5912 genirq: Add an extra comment about the use of affinity in irq_common_data
Affinity will have dual meaning depends on the type of the irq. If it is
a normal irq, it'll have the standard affinity meaning.

If it is an IPI, it will hold the mask of the cpus to which an IPI can be
sent.

Signed-off-by: Qais Yousef <qais.yousef@imgtec.com>
Cc: <jason@lakedaemon.net>
Cc: <marc.zyngier@arm.com>
Cc: <jiang.liu@linux.intel.com>
Cc: <ralf@linux-mips.org>
Cc: <linux-mips@linux-mips.org>
Cc: <lisa.parratt@imgtec.com>
Cc: Qais Yousef <qsyousef@gmail.com>
Link: http://lkml.kernel.org/r/1449580830-23652-7-git-send-email-qais.yousef@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25 10:56:55 +01:00
Qais Yousef
29d5c8db26 genirq: Add DOMAIN_BUS_IPI
We need a way to search and match IPI domains.

Using the new enum we can use irq_find_matching_host() to do that.

Signed-off-by: Qais Yousef <qais.yousef@imgtec.com>
Cc: <jason@lakedaemon.net>
Cc: <marc.zyngier@arm.com>
Cc: <jiang.liu@linux.intel.com>
Cc: <ralf@linux-mips.org>
Cc: <linux-mips@linux-mips.org>
Cc: <lisa.parratt@imgtec.com>
Cc: Qais Yousef <qsyousef@gmail.com>
Link: http://lkml.kernel.org/r/1449580830-23652-3-git-send-email-qais.yousef@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25 10:56:55 +01:00
Qais Yousef
0abefbaab4 genirq: Add new IPI irqdomain flags
These flags will be used to identify an IPI domain. We have two flavours of
IPI implementations:

IRQ_DOMAIN_FLAG_IPI_PER_CPU: Each CPU has its own virq and hwirq
IRQ_DOMAIN_FLAG_IPI_SINGLE : A single virq and hwirq for all CPUs

Signed-off-by: Qais Yousef <qais.yousef@imgtec.com>
Cc: <jason@lakedaemon.net>
Cc: <marc.zyngier@arm.com>
Cc: <jiang.liu@linux.intel.com>
Cc: <ralf@linux-mips.org>
Cc: <linux-mips@linux-mips.org>
Cc: <lisa.parratt@imgtec.com>
Cc: Qais Yousef <qsyousef@gmail.com>
Link: http://lkml.kernel.org/r/1449580830-23652-2-git-send-email-qais.yousef@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25 10:56:55 +01:00
Peter Zijlstra
9107c89e26 perf: Fix race between event install and jump_labels
perf_install_in_context() relies upon the context switch hooks to have
scheduled in events when the IPI misses its target -- after all, if
the task has moved from the CPU (or wasn't running at all), it will
have to context switch to run elsewhere.

This however doesn't appear to be happening.

It is possible for the IPI to not happen (task wasn't running) only to
later observe the task running with an inactive context.

The only possible explanation is that the context switch hooks are not
called. Therefore put in a sync_sched() after toggling the jump_label
to guarantee all CPUs will have them enabled before we install an
event.

A simple if (0->1) sync_sched() will not in fact work, because any
further increment can race and complete before the sync_sched().
Therefore we must jump through some hoops.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dvyukov@google.com
Cc: eranian@google.com
Cc: oleg@redhat.com
Cc: panand@redhat.com
Cc: sasha.levin@oracle.com
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160224174947.980211985@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-25 08:42:34 +01:00
Peter Zijlstra
a69b0ca4ac perf: Fix cloning
Alexander reported that when the 'original' context gets destroyed, no
new clones happen.

This can happen irrespective of the ctx switch optimization, any task
can die, even the parent, and we want to continue monitoring the task
hierarchy until we either close the event or no tasks are left in the
hierarchy.

perf_event_init_context() will attempt to pin the 'parent' context
during clone(). At that point current is the parent, and since current
cannot have exited while executing clone(), its context cannot have
passed through perf_event_exit_task_context(). Therefore
perf_pin_task_context() cannot observe ctx->task == TASK_TOMBSTONE.

However, since inherit_event() does:

	if (parent_event->parent)
		parent_event = parent_event->parent;

it looks at the 'original' event when it does: is_orphaned_event().
This can return true if the context that contains the this event has
passed through perf_event_exit_task_context(). And thus we'll fail to
clone the perf context.

Fix this by adding a new state: STATE_DEAD, which is set by
perf_release() to indicate that the filedesc (or kernel reference) is
dead and there are no observers for our data left.

Only for STATE_DEAD will is_orphaned_event() be true and inhibit
cloning.

STATE_EXIT is otherwise preserved such that is_event_hup() remains
functional and will report when the observed task hierarchy becomes
empty.

Reported-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dvyukov@google.com
Cc: eranian@google.com
Cc: oleg@redhat.com
Cc: panand@redhat.com
Cc: sasha.levin@oracle.com
Cc: vince@deater.net
Fixes: c6e5b73242 ("perf: Synchronously clean up child events")
Link: http://lkml.kernel.org/r/20160224174947.919845295@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-25 08:42:33 +01:00
Dan Williams
4577b06655 nfit: update address range scrub commands to the acpi 6.1 format
The original format of these commands from the "NVDIMM DSM Interface
Example" [1] are superseded by the ACPI 6.1 definition of the "NVDIMM Root
Device _DSMs" [2].

[1]: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
[2]: http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf
     "9.20.7 NVDIMM Root Device _DSMs"

Changes include:
1/ New 'restart' fields in ars_status, unfortunately these are
   implemented in the middle of the existing definition so this change
   is not backwards compatible.  The expectation is that shipping
   platforms will only ever support the ACPI 6.1 definition.

2/ New status values for ars_start ('busy') and ars_status ('overflow').

Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Linda Knippers <linda.knippers@hpe.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-02-23 17:17:20 -08:00
Linus Torvalds
420eb6d7ef Merge tag 'nfs-for-4.5-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
 "Stable bugfixes:
   - Fix nfs_size_to_loff_t
   - NFSv4: Fix a dentry leak on alias use

  Other bugfixes:
   - Don't schedule a layoutreturn if the layout segment can be freed
     immediately.
   - Always set NFS_LAYOUT_RETURN_REQUESTED with lo->plh_return_iomode
   - rpcrdma_bc_receive_call() should init rq_private_buf.len
   - fix stateid handling for the NFS v4.2 operations
   - pnfs/blocklayout: fix a memeory leak when using,vmalloc_to_page
   - fix panic in gss_pipe_downcall() in fips mode
   - Fix a race between layoutget and pnfs_destroy_layout
   - Fix a race between layoutget and bulk recalls"

* tag 'nfs-for-4.5-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFSv4.x/pnfs: Fix a race between layoutget and bulk recalls
  NFSv4.x/pnfs: Fix a race between layoutget and pnfs_destroy_layout
  auth_gss: fix panic in gss_pipe_downcall() in fips mode
  pnfs/blocklayout: fix a memeory leak when using,vmalloc_to_page
  nfs4: fix stateid handling for the NFS v4.2 operations
  NFSv4: Fix a dentry leak on alias use
  xprtrdma: rpcrdma_bc_receive_call() should init rq_private_buf.len
  pNFS: Always set NFS_LAYOUT_RETURN_REQUESTED with lo->plh_return_iomode
  pNFS: Fix pnfs_mark_matching_lsegs_return()
  nfs: fix nfs_size_to_loff_t
2016-02-23 16:39:21 -08:00
Linus Torvalds
dea08e6044 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "Looks like a lot, but mostly driver fixes scattered all over as usual.

  Of note:

   1) Add conditional sched in nf conntrack in cleanup to avoid NMI
      watchdogs.  From Florian Westphal.

   2) Fix deadlock in nfnetlink cttimeout, also from Floarian.

   3) Fix handling of slaves in bonding ARP monitor validation, from Jay
      Vosburgh.

   4) Callers of ip_cmsg_send() are responsible for freeing IP options,
      some were not doing so.  Fix from Eric Dumazet.

   5) Fix per-cpu bugs in mvneta driver, from Gregory CLEMENT.

   6) Fix vlan handling in mv88e6xxx DSA driver, from Vivien Didelot.

   7) bcm7xxx PHY driver bug fixes from Florian Fainelli.

   8) Avoid unaligned accesses to protocol headers wrt.  GRE, from
      Alexander Duyck.

   9) SKB leaks and other problems in arc_emac driver, from Alexander
      Kochetkov.

  10) tcp_v4_inbound_md5_hash() releases listener socket instead of
      request socket on error path, oops.  Fix from Eric Dumazet.

  11) Missing socket release in pppoe_rcv_core() that seems to have
      existed basically forever.  From Guillaume Nault.

  12) Missing slave_dev unregister in dsa_slave_create() error path,
      from Florian Fainelli.

  13) crypto_alloc_hash() never returns NULL, fix return value check in
      __tcp_alloc_md5sig_pool.  From Insu Yun.

  14) Properly expire exception route entries in ipv4, from Xin Long.

  15) Fix races in tcp/dccp listener socket dismantle, from Eric
      Dumazet.

  16) Don't set IFF_TX_SKB_SHARING in vxlan, geneve, or GRE, it's not
      legal.  These drivers modify the SKB on transmit.  From Jiri Benc.

  17) Fix regression in the initialziation of netdev->tx_queue_len.
      From Phil Sutter.

  18) Missing unlock in tipc_nl_add_bc_link() error path, from Insu Yun.

  19) SCTP port hash sizing does not properly ensure that table is a
      power of two in size.  From Neil Horman.

  20) Fix initializing of software copy of MAC address in fmvj18x_cs
      driver, from Ken Kawasaki"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (129 commits)
  bnx2x: Fix 84833 phy command handler
  bnx2x: Fix led setting for 84858 phy.
  bnx2x: Correct 84858 PHY fw version
  bnx2x: Fix 84833 RX CRC
  bnx2x: Fix link-forcing for KR2
  net: ethernet: davicom: fix devicetree irq resource
  fmvj18x_cs: fix incorrect indexing of dev->dev_addr[] when copying the MAC address
  Driver: Vmxnet3: Update Rx ring 2 max size
  net: netcp: rework the code for get/set sw_data in dma desc
  soc: ti: knav_dma: rename pad in struct knav_dma_desc to sw_data
  net: ti: netcp: restore get/set_pad_info() functionality
  MAINTAINERS: Drop myself as xen netback maintainer
  sctp: Fix port hash table size computation
  can: ems_usb: Fix possible tx overflow
  Bluetooth: hci_core: Avoid mixing up req_complete and req_complete_skb
  net: bcmgenet: Fix internal PHY link state
  af_unix: Don't use continue to re-execute unix_stream_read_generic loop
  unix_diag: fix incorrect sign extension in unix_lookup_by_ino
  bnxt_en: Failure to update PHY is not fatal condition.
  bnxt_en: Remove unnecessary call to update PHY settings.
  ...
2016-02-22 12:18:07 -08:00
Karicheri, Muralidharan
b1cb86ae0e soc: ti: knav_dma: rename pad in struct knav_dma_desc to sw_data
Rename the pad to sw_data as per description of this field in the hardware
spec(refer sprugr9 from www.ti.com). Latest version of the document is
at http://www.ti.com/lit/ug/sprugr9h/sprugr9h.pdf and section 3.1
Host Packet Descriptor describes this field.

Define and use a constant for the size of sw_data field similar to
other fields in the struct for desc and document the sw_data field
in the header. As the sw_data is not touched by hw, it's type can be
changed to u32.

Rename the helpers to match with the updated dma desc field sw_data.

Cc: Wingman Kwok <w-kwok2@ti.com>
Cc: Mugunthan V N <mugunthanvnm@ti.com>
CC: Arnd Bergmann <arnd@arndb.de>
CC: Grygorii Strashko <grygorii.strashko@ti.com>
CC: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-21 22:03:15 -05:00
Ivaylo Dimitrov
9aafabc7fe power: bq27xxx_battery: Restore device name
Patch <703df6c09795> ("power: bq27xxx_battery: Reorganize I2C
into a module") has removed the device name numbering from
bq27xxx_battery_i2c_probe. Fix that by restoring the code.

Fixes: 703df6c097
Signed-off-by: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com>
Reviewed-by: Pali Rohár <pali.rohar@gmail.com>
Tested-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Sebastian Reichel <sre@kernel.org>
2016-02-21 20:49:34 +01:00
Linus Torvalds
0389075ecf Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "This is unusually large, partly due to the EFI fixes that prevent
  accidental deletion of EFI variables through efivarfs that may brick
  machines.  These fixes are somewhat involved to maintain compatibility
  with existing install methods and other usage modes, while trying to
  turn off the 'rm -rf' bricking vector.

  Other fixes are for large page ioremap()s and for non-temporal
  user-memcpy()s"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Fix vmalloc_fault() to handle large pages properly
  hpet: Drop stale URLs
  x86/uaccess/64: Handle the caching of 4-byte nocache copies properly in __copy_user_nocache()
  x86/uaccess/64: Make the __copy_user_nocache() assembly code more readable
  lib/ucs2_string: Correct ucs2 -> utf8 conversion
  efi: Add pstore variables to the deletion whitelist
  efi: Make efivarfs entries immutable by default
  efi: Make our variable validation list include the guid
  efi: Do variable name validation tests in utf8
  efi: Use ucs2_as_utf8 in efivarfs instead of open coding a bad version
  lib/ucs2_string: Add ucs2 -> utf8 helper functions
2016-02-20 09:32:40 -08:00
Dan Williams
747ffe11b4 libnvdimm, tools/testing/nvdimm: fix 'ars_status' output buffer sizing
Use the output length specified in the command to size the receive
buffer rather than the arbitrary 4K limit.

This bug was hiding the fact that the ndctl implementation of
ndctl_bus_cmd_new_ars_status() was not specifying an output buffer size.

Cc: <stable@vger.kernel.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-02-19 15:21:52 -08:00
Linus Torvalds
87d9ac712b Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
 "10 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: slab: free kmem_cache_node after destroy sysfs file
  ipc/shm: handle removed segments gracefully in shm_mmap()
  MAINTAINERS: update Kselftest Framework mailing list
  devm_memremap_release(): fix memremap'd addr handling
  mm/hugetlb.c: fix incorrect proc nr_hugepages value
  mm, x86: fix pte_page() crash in gup_pte_range()
  fsnotify: turn fsnotify reaper thread into a workqueue job
  Revert "fsnotify: destroy marks with call_srcu instead of dedicated thread"
  mm: fix regression in remap_file_pages() emulation
  thp, dax: do not try to withdraw pgtable from non-anon VMA
2016-02-19 13:36:00 -08:00
Nikolay Aleksandrov
cfdd28beb3 net: make netdev_for_each_lower_dev safe for device removal
When I used netdev_for_each_lower_dev in commit bad5316232 ("vrf:
remove slave queue and private slave struct") I thought that it acts
like netdev_for_each_lower_private and can be used to remove the current
device from the list while walking, but unfortunately it acts more like
netdev_for_each_lower_private_rcu and doesn't allow it. The difference
is where the "iter" points to, right now it points to the current element
and that makes it impossible to remove it. Change the logic to be
similar to netdev_for_each_lower_private and make it point to the "next"
element so we can safely delete the current one. VRF is the only such
user right now, there's no change for the read-only users.

Here's what can happen now:
[98423.249858] general protection fault: 0000 [#1] SMP
[98423.250175] Modules linked in: vrf bridge(O) stp llc nfsd auth_rpcgss
oid_registry nfs_acl nfs lockd grace sunrpc crct10dif_pclmul
crc32_pclmul crc32c_intel ghash_clmulni_intel jitterentropy_rng
sha256_generic hmac drbg ppdev aesni_intel aes_x86_64 glue_helper lrw
gf128mul ablk_helper cryptd evdev serio_raw pcspkr virtio_balloon
parport_pc parport i2c_piix4 i2c_core virtio_console acpi_cpufreq button
9pnet_virtio 9p 9pnet fscache ipv6 autofs4 ext4 crc16 mbcache jbd2 sg
virtio_blk virtio_net sr_mod cdrom e1000 ata_generic ehci_pci uhci_hcd
ehci_hcd usbcore usb_common virtio_pci ata_piix libata floppy
virtio_ring virtio scsi_mod [last unloaded: bridge]
[98423.255040] CPU: 1 PID: 14173 Comm: ip Tainted: G           O
4.5.0-rc2+ #81
[98423.255386] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.8.1-20150318_183358- 04/01/2014
[98423.255777] task: ffff8800547f5540 ti: ffff88003428c000 task.ti:
ffff88003428c000
[98423.256123] RIP: 0010:[<ffffffff81514f3e>]  [<ffffffff81514f3e>]
netdev_lower_get_next+0x1e/0x30
[98423.256534] RSP: 0018:ffff88003428f940  EFLAGS: 00010207
[98423.256766] RAX: 0002000100000004 RBX: ffff880054ff9000 RCX:
0000000000000000
[98423.257039] RDX: ffff88003428f8b8 RSI: ffff88003428f950 RDI:
ffff880054ff90c0
[98423.257287] RBP: ffff88003428f940 R08: 0000000000000000 R09:
0000000000000000
[98423.257537] R10: 0000000000000001 R11: 0000000000000000 R12:
ffff88003428f9e0
[98423.257802] R13: ffff880054a5fd00 R14: ffff88003428f970 R15:
0000000000000001
[98423.258055] FS:  00007f3d76881700(0000) GS:ffff88005d000000(0000)
knlGS:0000000000000000
[98423.258418] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[98423.258650] CR2: 00007ffe5951ffa8 CR3: 0000000052077000 CR4:
00000000000406e0
[98423.258902] Stack:
[98423.259075]  ffff88003428f960 ffffffffa0442636 0002000100000004
ffff880054ff9000
[98423.259647]  ffff88003428f9b0 ffffffff81518205 ffff880054ff9000
ffff88003428f978
[98423.260208]  ffff88003428f978 ffff88003428f9e0 ffff88003428f9e0
ffff880035b35f00
[98423.260739] Call Trace:
[98423.260920]  [<ffffffffa0442636>] vrf_dev_uninit+0x76/0xa0 [vrf]
[98423.261156]  [<ffffffff81518205>]
rollback_registered_many+0x205/0x390
[98423.261401]  [<ffffffff815183ec>] unregister_netdevice_many+0x1c/0x70
[98423.261641]  [<ffffffff8153223c>] rtnl_delete_link+0x3c/0x50
[98423.271557]  [<ffffffff815335bb>] rtnl_dellink+0xcb/0x1d0
[98423.271800]  [<ffffffff811cd7da>] ? __inc_zone_state+0x4a/0x90
[98423.272049]  [<ffffffff815337b4>] rtnetlink_rcv_msg+0x84/0x200
[98423.272279]  [<ffffffff810cfe7d>] ? trace_hardirqs_on+0xd/0x10
[98423.272513]  [<ffffffff8153370b>] ? rtnetlink_rcv+0x1b/0x40
[98423.272755]  [<ffffffff81533730>] ? rtnetlink_rcv+0x40/0x40
[98423.272983]  [<ffffffff8155d6e7>] netlink_rcv_skb+0x97/0xb0
[98423.273209]  [<ffffffff8153371a>] rtnetlink_rcv+0x2a/0x40
[98423.273476]  [<ffffffff8155ce8b>] netlink_unicast+0x11b/0x1a0
[98423.273710]  [<ffffffff8155d2f1>] netlink_sendmsg+0x3e1/0x610
[98423.273947]  [<ffffffff814fbc98>] sock_sendmsg+0x38/0x70
[98423.274175]  [<ffffffff814fc253>] ___sys_sendmsg+0x2e3/0x2f0
[98423.274416]  [<ffffffff810d841e>] ? do_raw_spin_unlock+0xbe/0x140
[98423.274658]  [<ffffffff811e1bec>] ? handle_mm_fault+0x26c/0x2210
[98423.274894]  [<ffffffff811e19cd>] ? handle_mm_fault+0x4d/0x2210
[98423.275130]  [<ffffffff81269611>] ? __fget_light+0x91/0xb0
[98423.275365]  [<ffffffff814fcd42>] __sys_sendmsg+0x42/0x80
[98423.275595]  [<ffffffff814fcd92>] SyS_sendmsg+0x12/0x20
[98423.275827]  [<ffffffff81611bb6>] entry_SYSCALL_64_fastpath+0x16/0x7a
[98423.276073] Code: c3 31 c0 5d c3 0f 1f 84 00 00 00 00 00 66 66 66 66
90 48 8b 06 55 48 81 c7 c0 00 00 00 48 89 e5 48 8b 00 48 39 f8 74 09 48
89 06 <48> 8b 40 e8 5d c3 31 c0 5d c3 0f 1f 84 00 00 00 00 00 66 66 66
[98423.279639] RIP  [<ffffffff81514f3e>] netdev_lower_get_next+0x1e/0x30
[98423.279920]  RSP <ffff88003428f940>

CC: David Ahern <dsa@cumulusnetworks.com>
CC: David S. Miller <davem@davemloft.net>
CC: Roopa Prabhu <roopa@cumulusnetworks.com>
CC: Vlad Yasevich <vyasevic@redhat.com>
Fixes: bad5316232 ("vrf: remove slave queue and private slave struct")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: David Ahern <dsa@cumulusnetworks.com>
Tested-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-19 15:29:26 -05:00
Linus Torvalds
705d43dbe1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching
Pull livepatching fixes from Jiri Kosina:

 - regression (from 4.4) fix for ordering issue, introduced by an
   earlier ftrace change, that broke live patching of modules.

   The fix replaces the ftrace module notifier by direct call in order
   to make the ordering guaranteed and well-defined.  The patch, from
   Jessica Yu, has been acked both by Steven and Rusty

 - error message fix from Miroslav Benes

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
  ftrace/module: remove ftrace module notifier
  livepatch: change the error message in asm/livepatch.h header files
2016-02-18 16:34:15 -08:00
Jeff Layton
13d34ac6e5 Revert "fsnotify: destroy marks with call_srcu instead of dedicated thread"
This reverts commit c510eff6be ("fsnotify: destroy marks with
call_srcu instead of dedicated thread").

Eryu reported that he was seeing some OOM kills kick in when running a
testcase that adds and removes inotify marks on a file in a tight loop.

The above commit changed the code to use call_srcu to clean up the
marks.  While that does (in principle) work, the srcu callback job is
limited to cleaning up entries in small batches and only once per jiffy.
It's easily possible to overwhelm that machinery with too many call_srcu
callbacks, and Eryu's reproduer did just that.

There's also another potential problem with using call_srcu here.  While
you can obviously sleep while holding the srcu_read_lock, the callbacks
run under local_bh_disable, so you can't sleep there.

It's possible when putting the last reference to the fsnotify_mark that
we'll end up putting a chain of references including the fsnotify_group,
uid, and associated keys.  While I don't see any obvious ways that that
could occurs, it's probably still best to avoid using call_srcu here
after all.

This patch reverts the above patch.  A later patch will take a different
approach to eliminated the dedicated thread here.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Reported-by: Eryu Guan <guaneryu@gmail.com>
Tested-by: Eryu Guan <guaneryu@gmail.com>
Cc: Jan Kara <jack@suse.com>
Cc: Eric Paris <eparis@parisplace.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-18 16:23:24 -08:00