Commit Graph

54697 Commits

Author SHA1 Message Date
Saeed Mahameed
258545449b net/mlx5e: IPoIB, Xmit flow
Implement mlx5e's IPoIB SKB transmit using the helper functions provided
by mlx5e ethernet tx flow, the only difference in the code between
mlx5e_xmit and mlx5i_xmit is that IPoIB has some extra fields to fill
(UD datagram segment) in the TX descriptor (WQE) and it doesn't need to
have any vlan handling.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17 11:08:31 -04:00
Erez Shitrit
b3ba51498b net/mlx5: Refactor create flow table method to accept underlay QP
IB flow tables need the underlay qp to perform flow steering.
Here we change the API of the flow tables creation to accept the
underlay QP number as a parameter in order to support IB (IPoIB) flow
steering.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17 11:08:29 -04:00
Erez Shitrit
500a3d0ded net/mlx5: Add IPoIB enhanced offloads bits to mlx5_ifc
New capability bit: ipoib_enhanced_offloads, indicates new ability for UD
QP to do RSS and enhanced IPoIB offloads and acceleration.

Add underlay_qpn to the TIS and flow_table objects In order to support
SET_ROOT command, to connect between IPoIB QPs and flow steering tables.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17 11:08:29 -04:00
David S. Miller
6b6cbc1471 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts were simply overlapping changes.  In the net/ipv4/route.c
case the code had simply moved around a little bit and the same fix
was made in both 'net' and 'net-next'.

In the net/sched/sch_generic.c case a fix in 'net' happened at
the same time that a new argument was added to qdisc_hash_add().

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-15 21:16:30 -04:00
Linus Torvalds
7e703eccf0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "Things seem to be settling down as far as networking is concerned,
  let's hope this trend continues...

   1) Add iov_iter_revert() and use it to fix the behavior of
      skb_copy_datagram_msg() et al., from Al Viro.

   2) Fix the protocol used in the synthetic SKB we cons up for the
      purposes of doing a simulated route lookup for RTM_GETROUTE
      requests. From Florian Larysch.

   3) Don't add noop_qdisc to the per-device qdisc hashes, from Cong
      Wang.

   4) Don't call netdev_change_features with the team lock held, from
      Xin Long.

   5) Revert TCP F-RTO extension to catch more spurious timeouts because
      it interacts very badly with some middle-boxes. From Yuchung
      Cheng.

   6) Fix the loss of error values in l2tp {s,g}etsockopt calls, from
      Guillaume Nault.

   7) ctnetlink uses bit positions where it should be using bit masks,
      fix from Liping Zhang.

   8) Missing RCU locking in netfilter helper code, from Gao Feng.

   9) Avoid double frees and use-after-frees in tcp_disconnect(), from
      Eric Dumazet.

  10) Don't do a changelink before we register the netdevice in
      bridging, from Ido Schimmel.

  11) Lock the ipv6 device address list properly, from Rabin Vincent"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (29 commits)
  netfilter: ipt_CLUSTERIP: Fix wrong conntrack netns refcnt usage
  netfilter: nft_hash: do not dump the auto generated seed
  drivers: net: usb: qmi_wwan: add QMI_QUIRK_SET_DTR for Telit PID 0x1201
  ipv6: Fix idev->addr_list corruption
  net: xdp: don't export dev_change_xdp_fd()
  bridge: netlink: register netdevice before executing changelink
  bridge: implement missing ndo_uninit()
  bpf: reference may_access_skb() from __bpf_prog_run()
  tcp: clear saved_syn in tcp_disconnect()
  netfilter: nf_ct_expect: use proper RCU list traversal/update APIs
  netfilter: ctnetlink: skip dumping expect when nfct_help(ct) is NULL
  netfilter: make it safer during the inet6_dev->addr_list traversal
  netfilter: ctnetlink: make it safer when checking the ct helper name
  netfilter: helper: Add the rcu lock when call __nf_conntrack_helper_find
  netfilter: ctnetlink: using bit to represent the ct event
  netfilter: xt_TCPMSS: add more sanity tests on tcph->doff
  net: tcp: Increase TCP_MIB_OUTRSTS even though fail to alloc skb
  l2tp: don't mask errors in pppol2tp_getsockopt()
  l2tp: don't mask errors in pppol2tp_setsockopt()
  tcp: restrict F-RTO to work-around broken middle-boxes
  ...
2017-04-14 17:38:24 -07:00
Linus Torvalds
7873933385 Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio fixes from Michael S. Tsirkin:
 "virtio oops fixes

  The virtio pci rework using shared interrupts caused a lot of issues.
  We tried to fix them but run out of time. Revert for now, and revisit
  the issue for the next kernel.

  Luckily we are able to do this without loosing automatic interrupt
  NUMA affinity which was the main motivator for the rework"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio-pci: Remove affinity hint before freeing the interrupt
  Revert "virtio_pci: remove struct virtio_pci_vq_info"
  Revert "virtio_pci: use shared interrupts for virtqueues"
  Revert "virtio_pci: don't duplicate the msix_enable flag in struct pci_dev"
  Revert "virtio_pci: simplify MSI-X setup"
  Revert "virtio_pci: fix out of bound access for msix_names"
  MAINTAINERS: fix virtio file pattern
  virtio_console: fix uninitialized variable use
  virtio_net: clear MTU when out of range
  virtio: allow drivers to validate features
  virtio_net: enable big packets for large MTU values
2017-04-14 08:49:39 -07:00
Kirill A. Shutemov
c0c379e293 mm: drop unused pmdp_huge_get_and_clear_notify()
Dave noticed that after fixing MADV_DONTNEED vs numa balancing race the
last pmdp_huge_get_and_clear_notify() user is gone.

Let's drop the helper.

Link: http://lkml.kernel.org/r/20170306112047.24809-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-13 18:24:21 -07:00
Johannes Berg
ba0dc5f6e0 netlink: allow sending extended ACK with cookie on success
Now that we have extended error reporting and a new message format for
netlink ACK messages, also extend this to be able to return arbitrary
cookie data on success.

This will allow, for example, nl80211 to not send an extra message for
cookies identifying newly created objects, but return those directly
in the ACK message.

The cookie data size is currently limited to 20 bytes (since Jamal
talked about using SHA1 for identifiers.)

Thanks to Jamal Hadi Salim for bringing up this idea during the
discussions.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 13:58:21 -04:00
Johannes Berg
2d4bc93368 netlink: extended ACK reporting
Add the base infrastructure and UAPI for netlink extended ACK
reporting. All "manual" calls to netlink_ack() pass NULL for now and
thus don't get extended ACK reporting.

Big thanks goes to Pablo Neira Ayuso for not only bringing up the
whole topic at netconf (again) but also coming up with the nlattr
passing trick and various other ideas.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 13:58:20 -04:00
Alexey Dobriyan
d92be7a41e net: make struct net_device::min_header_len 8-bit
This field is never big enough to warrant 16-bitness.

8-bit accesses enjoy shorted encoding on i386/x86_64 than 16-bit
accesses:

	add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-10 (-10)
	function                                     old     new   delta
	loopback_setup                               169     164      -5
	ether_setup                                  148     143      -5

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-12 13:59:21 -04:00
Alexey Dobriyan
5b3dc2f37d net: neigh: make ->hh_len 32-bit
Using 16-bit ->hh_len doesn't save any memory, save some .text instead:

	add/remove: 0/0 grow/shrink: 1/6 up/down: 2/-19 (-17)
	function                                     old     new   delta
	neigh_update                                2312    2314      +2
	fwnet_header_cache                           199     197      -2
	eth_header_cache                             101      99      -2
	ip6_finish_output2                          2371    2368      -3
	vrf_finish_output6                          1522    1518      -4
	vrf_finish_output                           1413    1409      -4
	ip_finish_output2                           1627    1623      -4

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-12 13:59:21 -04:00
Linus Torvalds
06ea4c38bc Merge branch 'for-4.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
 "This contains fixes for two long standing subtle bugs:

   - kthread_bind() on a new kthread binds it to specific CPUs and
     prevents userland from messing with the affinity or cgroup
     membership. Unfortunately, for cgroup membership, there's a window
     between kthread creation and kthread_bind*() invocation where the
     kthread can be moved into a non-root cgroup by userland.

     Depending on what controllers are in effect, this can assign the
     kthread unexpected attributes. For example, in the reported case,
     workqueue workers ended up in a non-root cpuset cgroups and had
     their CPU affinities overridden. This broke workqueue invariants
     and led to workqueue stalls.

     Fixed by closing the window between kthread creation and
     kthread_bind() as suggested by Oleg.

   - There was a bug in cgroup mount path which could allow two
     competing mount attempts to attach the same cgroup_root to two
     different superblocks.

     This was caused by mishandling return value from kernfs_pin_sb().

     Fixed"

* 'for-4.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: avoid attaching a cgroup root to two different superblocks
  cgroup, kthread: close race window where new kthreads can be migrated to non-root cgroups
2017-04-11 23:38:16 -07:00
Johannes Berg
40077e0cf6 bpf: remove struct bpf_map_type_list
There's no need to have struct bpf_map_type_list since
it just contains a list_head, the type, and the ops
pointer. Since the types are densely packed and not
actually dynamically registered, it's much easier and
smaller to have an array of type->ops pointer. Also
initialize this array statically to remove code needed
to initialize it.

In order to save duplicating the list, move it to the
types header file added by the previous patch and
include it in the same fashion.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-11 14:38:43 -04:00
Johannes Berg
be9370a7d8 bpf: remove struct bpf_prog_type_list
There's no need to have struct bpf_prog_type_list since
it just contains a list_head, the type, and the ops
pointer. Since the types are densely packed and not
actually dynamically registered, it's much easier and
smaller to have an array of type->ops pointer. Also
initialize this array statically to remove code needed
to initialize it.

In order to save duplicating the list, move it to a new
header file and include it in the places needing it.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-11 14:38:43 -04:00
Christoph Hellwig
4244de1c64 PCI: remove pci_enable_msix
Unused now that all callers switched to pci_alloc_irq_vectors.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-11 11:16:03 -04:00
David S. Miller
bf74b20d00 Revert "rtnl: Add support for netdev event to link messages"
This reverts commit def12888c1.

As per discussion between Roopa Prabhu and David Ahern, it is
advisable that we instead have the code collect the setlink triggered
events into a bitmask emitted in the IFLA_EVENT netlink attribute.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-09 14:45:21 -07:00
Linus Torvalds
2a610b8aa8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull VFS fixes from Al Viro:
 "statx followup fixes and a fix for stack-smashing on alpha"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  alpha: fix stack smashing in old_adjtimex(2)
  statx: Include a mask for stx_attributes in struct statx
  statx: Reserve the top bit of the mask for future struct expansion
  xfs: report crtime and attribute flags to statx
  ext4: Add statx support
  statx: optimize copy of struct statx to userspace
  statx: remove incorrect part of vfs_statx() comment
  statx: reject unknown flags when using NULL path
  Documentation/filesystems: fix documentation for ->getattr()
2017-04-09 08:26:21 -07:00
Steffen Klassert
7f564528a4 skbuff: Extend gso_type to unsigned int.
All available gso_type flags are currently in use, so
extend gso_type from 'unsigned short' to 'unsigned int'
to be able to add further flags.

We reorder the struct skb_shared_info to use
two bytes of the four byte hole before dataref.
All fields before dataref are cleared, i.e.
four bytes more than before the change.

The remaining two byte hole is moved to the
beginning of the structure, this protects us
from immediate overwites on out of bound writes
to the sk_buff head.

Structure layout on x86-64 before the change:

struct skb_shared_info {
	unsigned char              nr_frags;             /*     0     1 */
	__u8                       tx_flags;             /*     1     1 */
	short unsigned int         gso_size;             /*     2     2 */
	short unsigned int         gso_segs;             /*     4     2 */
	short unsigned int         gso_type;             /*     6     2 */
	struct sk_buff *           frag_list;            /*     8     8 */
	struct skb_shared_hwtstamps hwtstamps;           /*    16     8 */
	u32                        tskey;                /*    24     4 */
	__be32                     ip6_frag_id;          /*    28     4 */
	atomic_t                   dataref;              /*    32     4 */

	/* XXX 4 bytes hole, try to pack */

	void *                     destructor_arg;       /*    40     8 */
	skb_frag_t                 frags[17];            /*    48   272 */
	/* --- cacheline 5 boundary (320 bytes) --- */

	/* size: 320, cachelines: 5, members: 12 */
	/* sum members: 316, holes: 1, sum holes: 4 */
};

Structure layout on x86-64 after the change:

struct skb_shared_info {
	short unsigned int         _unused;              /*     0     2 */
	unsigned char              nr_frags;             /*     2     1 */
	__u8                       tx_flags;             /*     3     1 */
	short unsigned int         gso_size;             /*     4     2 */
	short unsigned int         gso_segs;             /*     6     2 */
	struct sk_buff *           frag_list;            /*     8     8 */
	struct skb_shared_hwtstamps hwtstamps;           /*    16     8 */
	unsigned int               gso_type;             /*    24     4 */
	u32                        tskey;                /*    28     4 */
	__be32                     ip6_frag_id;          /*    32     4 */
	atomic_t                   dataref;              /*    36     4 */
	void *                     destructor_arg;       /*    40     8 */
	skb_frag_t                 frags[17];            /*    48   272 */
	/* --- cacheline 5 boundary (320 bytes) --- */

	/* size: 320, cachelines: 5, members: 13 */
};

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-08 12:32:07 -07:00
Linus Torvalds
78d91a75b4 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "Here's a pull request for 4.11-rc, fixing a set of issues mostly
  centered around the new scheduling framework. These have been brewing
  for a while, but split up into what we absolutely need in 4.11, and
  what we can defer until 4.12. These are well tested, on both single
  queue and multiqueue setups, and with and without shared tags. They
  fix several hangs that have happened in testing.

  This is obviously larger than I would have preferred at this point in
  time, but I don't think we can shave much off this and still get the
  desired results.

  In detail, this pull request contains:

   - a set of five fixes for NVMe, mostly from Christoph and one from
     Roland.

   - a series from Bart, fixing issues with dm-mq and SCSI shared tags
     and scheduling. Note that one of those patches commit messages may
     read like an optimization, but it is in fact an important fix for
     queue restarts in particular.

   - a series from Omar, most importantly fixing a hang with multiple
     hardware queues when we fail to get a driver tag. Another important
     fix in there is for resizing hardware queues, which nbd does when
     handling multiple sockets for one connection.

   - fixing an imbalance in putting the ctx for hctx request allocations
     from Minchan"

* 'for-linus' of git://git.kernel.dk/linux-block:
  blk-mq: Restart a single queue if tag sets are shared
  dm rq: Avoid that request processing stalls sporadically
  scsi: Avoid that SCSI queues get stuck
  blk-mq: Introduce blk_mq_delay_run_hw_queue()
  blk-mq: remap queues when adding/removing hardware queues
  blk-mq-sched: fix crash in switch error path
  blk-mq-sched: set up scheduler tags when bringing up new queues
  blk-mq-sched: refactor scheduler initialization
  blk-mq: use the right hctx when getting a driver tag fails
  nvmet: fix byte swap in nvmet_parse_io_cmd
  nvmet: fix byte swap in nvmet_execute_write_zeroes
  nvmet: add missing byte swap in nvmet_get_smart_log
  nvme: add missing byte swap in nvme_setup_discard
  nvme: Correct NVMF enum values to match NVMe-oF rev 1.0
  block: do not put mq context in blk_mq_alloc_request_hctx
2017-04-08 11:56:58 -07:00
Linus Torvalds
c3df1c7c36 Merge tag 'pinctrl-v4.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fix from Linus Walleij:
 "This late fix for pin control is hopefully the last I send this cycle.

  The problem was detected early in the v4.11 release cycle and there
  has been some back and forth on how to solve it. Sadly the proper fix
  arrives late, but at least not too late.

  An issue was detected with pin control on the Freescale i.MX after the
  refactorings for more general group and function handling.

  We now have the proper fix for this"

* tag 'pinctrl-v4.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: core: Fix pinctrl_register_and_init() with pinctrl_enable()
2017-04-08 11:43:38 -07:00
Linus Torvalds
542380a208 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Radim Krčmář:
 "ARM:
   - Fix a problem with GICv3 userspace save/restore
   - Clarify GICv2 userspace save/restore ABI
   - Be more careful in clearing GIC LRs
   - Add missing synchronization primitive to our MMU handling code

  PPC:
   - Check for a NULL return from kzalloc

  s390:
   - Prevent translation exception errors on valid page tables for the
     instruction-exection-protection support

  x86:
   - Fix Page-Modification Logging when running a nested guest"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: PPC: Book3S HV: Check for kmalloc errors in ioctl
  KVM: nVMX: initialize PML fields in vmcs02
  KVM: nVMX: do not leak PML full vmexit to L1
  KVM: arm/arm64: vgic: Fix GICC_PMR uaccess on GICv3 and clarify ABI
  KVM: arm64: Ensure LRs are clear when they should be
  kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd
  KVM: s390: remove change-recording override support
  arm/arm64: KVM: Take mmap_sem in kvm_arch_prepare_memory_region
  arm/arm64: KVM: Take mmap_sem in stage2_unmap_vm
2017-04-08 01:39:43 -07:00
Bart Van Assche
6d8c6c0f97 blk-mq: Restart a single queue if tag sets are shared
To improve scalability, if hardware queues are shared, restart
a single hardware queue in round-robin fashion. Rename
blk_mq_sched_restart_queues() to reflect the new semantics.
Remove blk_mq_sched_mark_restart_queue() because this function
has no callers. Remove flag QUEUE_FLAG_RESTART because this
patch removes the code that uses this flag.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-07 12:40:09 -06:00
Bart Van Assche
7587a5ae7e blk-mq: Introduce blk_mq_delay_run_hw_queue()
Introduce a function that runs a hardware queue unconditionally
after a delay. Note: there is already a function that stops and
restarts a hardware queue after a delay, namely blk_mq_delay_queue().

This function will be used in the next patch in this series.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Long Li <longli@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-07 12:27:06 -06:00
Omar Sandoval
54d5329d42 blk-mq-sched: fix crash in switch error path
In elevator_switch(), if blk_mq_init_sched() fails, we attempt to fall
back to the original scheduler. However, at this point, we've already
torn down the original scheduler's tags, so this causes a crash. Doing
the fallback like the legacy elevator path is much harder for mq, so fix
it by just falling back to none, instead.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-07 08:56:48 -06:00
Michael S. Tsirkin
404123c2db virtio: allow drivers to validate features
Some drivers can't support all features in all configurations.  At the
moment we blindly set FEATURES_OK and later FAILED.  Support this better
by adding a callback drivers can use to do some early checks.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-04-07 16:38:59 +03:00
Tony Lindgren
6118714275 pinctrl: core: Fix pinctrl_register_and_init() with pinctrl_enable()
Recent pinctrl changes to allow dynamic allocation of pins exposed one
more issue with the pinctrl pins claimed early by the controller itself.
This caused a regression for IMX6 pinctrl hogs.

Before enabling the pin controller driver we need to wait until it has
been properly initialized, then claim the hogs, and only then enable it.

To fix the regression, split the code into pinctrl_claim_hogs() and
pinctrl_enable(). And then let's require that pinctrl_enable() is always
called by the pin controller driver when ready after calling
pinctrl_register_and_init().

Depends-on: 950b0d91dc ("pinctrl: core: Fix regression caused by delayed
work for hogs")
Fixes: df61b366af26 ("pinctrl: core: Use delayed work for hogs")
Fixes: e566fc11ea ("pinctrl: imx: use generic pinctrl helpers for
managing groups")
Cc: Haojian Zhuang <haojian.zhuang@linaro.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Mika Penttilä <mika.penttila@nextfour.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Nishanth Menon <nm@ti.com>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Stefan Agner <stefan@agner.ch>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Gary Bisson <gary.bisson@boundarydevices.com>
Tested-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2017-04-07 01:08:08 +02:00
Guy Ergas
102722fc68 net/mlx5e: Add support for RXFCS feature flag
Add support for rx-fcs flag from ethtool.
In case this flag is set, update all RQs to scatter the FCS data into
the packet.

Signed-off-by: Guy Ergas <guye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-04-07 01:20:59 +03:00
Mintz, Yuval
08737a3fa3 qed: Inform qedi the number of possible CQs
Now that management firmware is capable of telling us the number of CQs
available for a given PF, qed needs to communicate the number to qedi
so it would know have many to use.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 14:26:31 -07:00
R. Parameswaran
113c307593 New kernel function to get IP overhead on a socket.
A new function, kernel_sock_ip_overhead(), is provided
to calculate the cumulative overhead imposed by the IP
Header and IP options, if any, on a socket's payload.
The new function returns an overhead of zero for sockets
that do not belong to the IPv4 or IPv6 address families.
This is used in the L2TP code path to compute the
total outer IP overhead on the L2TP tunnel socket when
calculating the default MTU for Ethernet pseudowires.

Signed-off-by: R. Parameswaran <rparames@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 13:43:31 -07:00
David S. Miller
0e4c0ee580 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-04-06 11:57:04 -07:00
David S. Miller
6f14f443d3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Mostly simple cases of overlapping changes (adding code nearby,
a function whose name changes, for example).

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 08:24:51 -07:00
David S. Miller
18148f09c0 Merge tag 'linux-can-next-for-4.13-20170404' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next
Marc Kleine-Budde says:

====================
pull-request: can-next 2017-03-03

this is a pull request of 5 patches for net-next/master.

There are two patches by Yegor Yefremov which convert the ti_hecc
driver into a DT only driver, as there is no in-tree user of the old
platform driver interface anymore. The next patch by Mario Kicherer
adds network namespace support to the can subsystem. The last two
patches by Akshay Bhat add support for the holt_hi311x SPI CAN driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 09:56:22 -07:00
Linus Torvalds
aeb4a57681 Merge tag 'mfd-fixes-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
Pull MFD bug fix from Lee Jones:
 "Increase buffer size om cros-ec to allow for SPI messages"

* tag 'mfd-fixes-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
  mfd: cros-ec: Fix host command buffer size
2017-04-05 09:04:26 -07:00
Vlad Yasevich
def12888c1 rtnl: Add support for netdev event to link messages
When netdev events happen, a rtnetlink_event() handler will send
messages for every event in it's white list.  These messages contain
current information about a particular device, but they do not include
the iformation about which event just happened.  The consumer of
the message has to try to infer this information.  In some cases
(ex: NETDEV_NOTIFY_PEERS), that is not possible.

This patch adds a new extension to RTM_NEWLINK message called IFLA_EVENT
that would have an encoding of the which event triggered this
message.  This would allow the the message consumer to easily determine
if it is interested in a particular event or not.

Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 08:14:14 -07:00
Radim Krčmář
6fd6410311 Merge tag 'kvm-arm-for-v4.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm
From: Christoffer Dall <cdall@linaro.org>

KVM/ARM Fixes for v4.11-rc6

Fixes include:
 - Fix a problem with GICv3 userspace save/restore
 - Clarify GICv2 userspace save/restore ABI
 - Be more careful in clearing GIC LRs
 - Add missing synchronization primitive to our MMU handling code
2017-04-05 16:27:47 +02:00
Vic Yang
b2376407f9 mfd: cros-ec: Fix host command buffer size
For SPI, we can get up to 32 additional bytes for response preamble.
The current overhead (2 bytes) may cause problems when we try to receive
a big response. Update it to 32 bytes.

Without this fix we could see a kernel BUG when we receive a big response
from the Chrome EC when is connected via SPI.

Signed-off-by: Vic Yang <victoryang@google.com>
Tested-by: Enric Balletbo i Serra <enric.balletbo.collabora.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2017-04-05 13:30:07 +01:00
Mario Kicherer
8e8cda6d73 can: initial support for network namespaces
This patch adds initial support for network namespaces. The changes only
enable support in the CAN raw, proc and af_can code. GW and BCM still
have their checks that ensure that they are used only from the main
namespace.

The patch boils down to moving the global structures, i.e. the global
filter list and their /proc stats, into a per-namespace structure and passing
around the corresponding "struct net" in a lot of different places.

Changes since v1:
 - rebased on current HEAD (2bfe01e)
 - fixed overlong line

Signed-off-by: Mario Kicherer <dev@kicherer.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2017-04-04 17:35:58 +02:00
Yegor Yefremov
dabf54dd1c can: ti_hecc: Convert TI HECC driver to DT only driver
This patch converts TI HECC driver to DT only driver. This results in
removing ti_hecc.h containing now obsolete platform data.

Former transceiver_switch callback function will be now modelled via
regulator API.

Signed-off-by: Anton Glukhov <anton.a.glukhov@gmail.com>
Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2017-04-04 16:55:26 +02:00
Christoffer Dall
6d56111c92 KVM: arm/arm64: vgic: Fix GICC_PMR uaccess on GICv3 and clarify ABI
As an oversight, for GICv2, we accidentally export the GICC_PMR register
in the format of the GICH_VMCR.VMPriMask field in the lower 5 bits of a
word, meaning that userspace must always use the lower 5 bits to
communicate with the KVM device and must shift the value left by 3
places to obtain the actual priority mask level.

Since GICv3 supports the full 8 bits of priority masking in the ICH_VMCR,
we have to fix the value we export when emulating a GICv2 on top of a
hardware GICv3 and exporting the emulated GICv2 state to userspace.

Take the chance to clarify this aspect of the ABI.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-04-04 14:33:59 +02:00
Ram Amrani
f9dc4d1f0d qed: Manage with less memory regions for RoCE
It's possible some configurations would prevent driver from utilizing
all the Memory Regions due to a lack of ILT lines.
In such a case, calculate how many memory regions would have to be
dropped due to limit, and manage without those.

Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 19:16:37 -07:00
Greg Ungerer
c8b5d129ee net: usbnet: support 64bit stats
Add support for the net stats64 counters to the usbnet core. With that
in place put the hooks into every usbnet driver to use it.

This is a strait forward addition of 64bit counters for RX and TX packet
and byte counts. It is done in the same style as for the other net drivers
that support stats64. Note that the other stats fields remain as 32bit
sized values (error counts, etc).

The motivation to add this is that it is not particularly difficult to
get the RX and TX byte counts to wrap on 32bit platforms.

Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 19:09:40 -07:00
David Howells
3209f68b3c statx: Include a mask for stx_attributes in struct statx
Include a mask in struct stat to indicate which bits of stx_attributes the
filesystem actually supports.

This would also be useful if we add another system call that allows you to
do a 'bulk attribute set' and pass in a statx struct with the masks
appropriately set to say what you want to set.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-04-03 01:06:00 -04:00
Linus Torvalds
128c434a70 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Thomas Gleixner:
 "This update provides:

   - make the scheduler clock switch to unstable mode smooth so the
     timestamps stay at microseconds granularity instead of switching to
     tick granularity.

   - unbreak perf test tsc by taking the new offset into account which
     was added in order to proveide better sched clock continuity

   - switching sched clock to unstable mode runs all clock related
     computations which affect the sched clock output itself from a work
     queue. In case of preemption sched clock uses half updated data and
     provides wrong timestamps. Keep the math in the protected context
     and delegate only the static key switch to workqueue context.

   - remove a duplicate header include"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/headers: Remove duplicate #include <linux/sched/debug.h> line
  sched/clock: Fix broken stable to unstable transfer
  sched/clock, x86/perf: Fix "perf test tsc"
  sched/clock: Fix clear_sched_clock_stable() preempt wobbly
2017-04-02 09:25:10 -07:00
Linus Torvalds
4a6808f347 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
 "Two small fixes for the new CLKEVT_OF infrastructure"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  vmlinux.lds: Add __clkevt_of_table to kernel
  clockevents: Fix syntax error in clkevt-of macro
2017-04-02 09:22:03 -07:00
Al Viro
27c0e3748e [iov_iter] new privimitive: iov_iter_revert()
opposite to iov_iter_advance(); the caller is responsible for never
using it to move back past the initial position.

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-04-02 12:10:47 -04:00
Roland Dreier
bf17aa36c0 nvme: Correct NVMF enum values to match NVMe-oF rev 1.0
The enum values for QPTYPE, PRTYPE and CMS are off by 1 from the
values defined in figure 42 of the NVM Express over Fabrics 1.0:

    http://www.nvmexpress.org/wp-content/uploads/NVMe_over_Fabrics_1_0_Gold_20160605-1.pdf

Fix our enums to match the final spec.

Signed-off-by: Roland Dreier <roland@purestorage.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2017-04-02 10:24:07 +03:00
Paolo Abeni
3d8417d79e udp: use sk_protocol instead of pcflag to detect udplite sockets
In the udp_sock struct, the 'forward_deficit' and 'pcflag' fields
share the same cacheline. While the first is dirtied by
udp_recvmsg, the latter is read, possibly several times, by the
bottom half processing to discriminate between udp and udplite
sockets.

With this patch, sk->sk_protocol is used to check is the socket is
really an udplite one, avoiding some cache misses per
packet and improving the performance under udp_flood with
small packet up to 10%.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 20:11:36 -07:00
Alexei Starovoitov
1cf1cae963 bpf: introduce BPF_PROG_TEST_RUN command
development and testing of networking bpf programs is quite cumbersome.
Despite availability of user space bpf interpreters the kernel is
the ultimate authority and execution environment.
Current test frameworks for TC include creation of netns, veth,
qdiscs and use of various packet generators just to test functionality
of a bpf program. XDP testing is even more complicated, since
qemu needs to be started with gro/gso disabled and precise queue
configuration, transferring of xdp program from host into guest,
attaching to virtio/eth0 and generating traffic from the host
while capturing the results from the guest.

Moreover analyzing performance bottlenecks in XDP program is
impossible in virtio environment, since cost of running the program
is tiny comparing to the overhead of virtio packet processing,
so performance testing can only be done on physical nic
with another server generating traffic.

Furthermore ongoing changes to user space control plane of production
applications cannot be run on the test servers leaving bpf programs
stubbed out for testing.

Last but not least, the upstream llvm changes are validated by the bpf
backend testsuite which has no ability to test the code generated.

To improve this situation introduce BPF_PROG_TEST_RUN command
to test and performance benchmark bpf programs.

Joint work with Daniel Borkmann.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 12:45:57 -07:00
Mark Rutland
b0845ce583 kasan: report only the first error by default
Disable kasan after the first report.  There are several reasons for
this:

 - Single bug quite often has multiple invalid memory accesses causing
   storm in the dmesg.

 - Write OOB access might corrupt metadata so the next report will print
   bogus alloc/free stacktraces.

 - Reports after the first easily could be not bugs by itself but just
   side effects of the first one.

Given that multiple reports usually only do harm, it makes sense to
disable kasan after the first one.  If user wants to see all the
reports, the boot-time parameter kasan_multi_shot must be used.

[aryabinin@virtuozzo.com: wrote changelog and doc, added missing include]
Link: http://lkml.kernel.org/r/20170323154416.30257-1-aryabinin@virtuozzo.com
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-03-31 17:13:30 -07:00
Johannes Weiner
553af430e7 mm: rmap: fix huge file mmap accounting in the memcg stats
Huge pages are accounted as single units in the memcg's "file_mapped"
counter.  Account the correct number of base pages, like we do in the
corresponding node counter.

Link: http://lkml.kernel.org/r/20170322005111.3156-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: <stable@vger.kernel.org>	[4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-03-31 17:13:30 -07:00