Commit Graph

649779 Commits

Author SHA1 Message Date
Aaron Conole
679972f3be netfilter: convert while loops to for loops
This is to facilitate converting from a singly-linked list to an array
of elements.

Signed-off-by: Aaron Conole <aconole@bytheb.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-12-06 21:42:16 +01:00
Aaron Conole
d415b9eb76 netfilter: decouple nf_hook_entry and nf_hook_ops
During nfhook traversal we only need a very small subset of
nf_hook_ops members.

We need:
- next element
- hook function to call
- hook function priv argument

Bridge netfilter also needs 'thresh'; can be obtained via ->orig_ops.

nf_hook_entry struct is now 32 bytes on x86_64.

A followup patch will turn the run-time list into an array that only
stores hook functions plus their priv arguments, eliminating the ->next
element.

Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Aaron Conole <aconole@bytheb.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-12-06 21:42:16 +01:00
Aaron Conole
0aa8c57a04 netfilter: introduce accessor functions for hook entries
This allows easier future refactoring.

Signed-off-by: Aaron Conole <aconole@bytheb.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-12-06 21:42:15 +01:00
Florian Westphal
834184b1f3 netfilter: defrag: only register defrag functionality if needed
nf_defrag modules for ipv4 and ipv6 export an empty stub function.
Any module that needs the defragmentation hooks registered simply 'calls'
this empty function to create a phony module dependency -- modprobe will
then load the defrag module too.

This extends netfilter ipv4/ipv6 defragmentation modules to delay the hook
registration until the functionality is requested within a network namespace
instead of module load time for all namespaces.

Hooks are only un-registered on module unload or when a namespace that used
such defrag functionality exits.

We have to use struct net for this as the register hooks can be called
before netns initialization here from the ipv4/ipv6 conntrack module
init path.

There is no unregister functionality support, defrag will always be
active once it was requested inside a net namespace.

The reason is that defrag has impact on nft and iptables rulesets
(without defrag we might see framents).

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-12-06 21:42:00 +01:00
Duc Dang
c5d4603961 PCI: Add MCFG quirks for X-Gene host controller
PCIe controllers in X-Gene SoCs are not ECAM compliant: software needs to
configure additional controller's register to address device at
bus:dev:function.

Add a quirk to discover controller MMIO register space and configure
controller registers to select and address the target secondary device.

The quirk will only be applied for X-Gene PCIe MCFG table with
OEM revison 1, 2, 3 or 4 (PCIe controller v1 and v2 on X-Gene SoCs).

Tested-by: Jon Masters <jcm@redhat.com>
Signed-off-by: Duc Dang <dhdang@apm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-06 13:45:50 -06:00
Tomasz Nowicki
648d93fc77 PCI: Add MCFG quirks for Cavium ThunderX pass1.x host controller
ThunderX pass1.x requires to emulate the EA headers for on-chip devices
hence it has to use custom pci_thunder_ecam_ops for accessing PCI config
space (pci-thunder-ecam.c). Add new entries to MCFG quirk array where it
can be applied while probing ACPI based PCI host controller.

ThunderX pass1.x is using the same way for accessing off-chip devices
(so-called PEM) as silicon pass-2.x so we need to add PEM quirk entries
too.

Quirk is considered for ThunderX silicon pass1.x only which is identified
via MCFG revision 2.

ThunderX pass 1.x requires the following accessors:

  NUMA node 0 PCI segments  0- 3: pci_thunder_ecam_ops (MCFG quirk)
  NUMA node 0 PCI segments  4- 9: thunder_pem_ecam_ops (MCFG quirk)
  NUMA node 1 PCI segments 10-13: pci_thunder_ecam_ops (MCFG quirk)
  NUMA node 1 PCI segments 14-19: thunder_pem_ecam_ops (MCFG quirk)

[bhelgaas: change Makefile/ifdefs so quirk doesn't depend on
CONFIG_PCI_HOST_THUNDER_ECAM]
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-06 13:45:50 -06:00
Tomasz Nowicki
44f22bd91e PCI: Add MCFG quirks for Cavium ThunderX pass2.x host controller
ThunderX PCIe controller to off-chip devices (so-called PEM) is not fully
compliant with ECAM standard. It uses non-standard configuration space
accessors (see thunder_pem_ecam_ops) and custom configuration space
granulation (see bus_shift = 24). In order to access configuration space
and probe PEM as ACPI-based PCI host controller we need to add MCFG quirk
infrastructure. This involves:
1. A new thunder_pem_acpi_init() init function to locate PEM-specific
   register ranges using ACPI.
2. Export PEM thunder_pem_ecam_ops structure so it is visible to MCFG quirk
   code.
3. New quirk entries for each PEM segment. Each contains platform IDs,
   mentioned thunder_pem_ecam_ops and CFG resources.

Quirk is considered for ThunderX silicon pass2.x only which is identified
via MCFG revision 1.

ThunderX pass 2.x requires the following accessors:

  NUMA Node 0 PCI segments  0- 3: pci_generic_ecam_ops (ECAM-compliant)
  NUMA Node 0 PCI segments  4- 9: thunder_pem_ecam_ops (MCFG quirk)
  NUMA Node 1 PCI segments 10-13: pci_generic_ecam_ops (ECAM-compliant)
  NUMA Node 1 PCI segments 14-19: thunder_pem_ecam_ops (MCFG quirk)

[bhelgaas: adapt to use acpi_get_rc_resources(), update Makefile/ifdefs so
quirk doesn't depend on CONFIG_PCI_HOST_THUNDER_PEM]
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-06 13:45:49 -06:00
Bjorn Helgaas
0d414268fb PCI: thunder-pem: Factor out resource lookup
Pull the register resource lookup out of thunder_pem_init() so we can
easily add a corresponding lookup using ACPI.  No functional change
intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-06 13:45:49 -06:00
Dongdong Liu
5f00f1a017 PCI: Add MCFG quirks for HiSilicon Hip05/06/07 host controllers
The PCIe controller in Hip05/Hip06/Hip07 SoCs is not completely
ECAM-compliant.  It is non-ECAM only for the RC bus config space; for any
other bus underneath the root bus it does support ECAM access.

Add specific quirks for PCI config space accessors.  This involves:
1. New initialization call hisi_pcie_init() to obtain RC base
addresses from PNP0C02 at the root of the ACPI namespace (under \_SB).
2. New entry in common quirk array.

[bhelgaas: move to pcie-hisi.c and change Makefile/ifdefs so quirk doesn't
depend on CONFIG_PCI_HISI]
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-06 13:45:49 -06:00
Christopher Covington
2ca5b8ddc6 PCI: Add MCFG quirks for Qualcomm QDF2432 host controller
The Qualcomm Technologies QDF2432 SoC does not support accesses smaller
than 32 bits to the PCI configuration space.  Register the appropriate
quirk.

[bhelgaas: add QCOM_ECAM32 macro, ifdef for ACPI and PCI_QUIRKS]
Signed-off-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-06 13:45:49 -06:00
Dongdong Liu
169de969c0 PCI/ACPI: Provide acpi_get_rc_resources() for ARM64 platform
The acpi_get_rc_resources() is used to get the RC register address that can
not be described in MCFG.  It takes the _HID & segment to look for and
outputs the RC address resource.  Use PNP0C02 devices to describe such RC
address resource.  Use _UID to match segment to tell which root bus the
PNP0C02 resource belongs to.

[bhelgaas: add dev argument, wrap in #ifdef CONFIG_PCI_QUIRKS]
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-06 13:45:49 -06:00
Tomasz Nowicki
5b69b85ba1 PCI/ACPI: Check for platform-specific MCFG quirks
The PCIe spec (r3.0, sec 7.2.2) specifies an "Enhanced Configuration Access
Mechanism" (ECAM) for memory-mapped access to configuration space.  ECAM is
required for PCIe systems unless there's a standard firmware interface for
config access.

In the absence of a firmware interface, we use pci_generic_ecam_ops, and on
ACPI systems, we discover the ECAM space via the MCFG table and/or the _CBA
method.

Unfortunately some systems provide MCFG but don't implement ECAM according
to spec, so we need a mechanism for quirks to make those systems work.

Add an MCFG quirk mechanism to override the config accessor functions
and/or the memory-mapped address space.

A quirk is selected if it matches all of the following:

  - OEM ID
  - OEM Table ID
  - OEM Revision
  - PCI segment (from _SEG)
  - PCI bus number range (from _CRS, wildcard allowed)

If the quirk specifies config accessor functions or a memory-mapped address
range, these override the defaults.

[bhelgaas: changelog, reorder quirk matching, fix oem_revision typo per
Duc, add under #ifdef CONFIG_PCI_QUIRKS]
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Signed-off-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-06 13:45:48 -06:00
Tomasz Nowicki
13983eb89d PCI/ACPI: Extend pci_mcfg_lookup() to return ECAM config accessors
pci_mcfg_lookup() is the external interface to the generic MCFG code.
Previously it merely looked up the ECAM base address for a given domain and
bus range.  We want a way to add MCFG quirks, some of which may require
special config accessors and adjustments to the ECAM address range.

Extend pci_mcfg_lookup() so it can return a pointer to a pci_ecam_ops
structure and a struct resource for the ECAM address space.  For now, it
always returns &pci_generic_ecam_ops (the standard accessor) and the
resource described by the MCFG.

No functional changes intended.

[bhelgaas: changelog]
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-06 13:45:48 -06:00
Bjorn Helgaas
8fd4391ee7 arm64: PCI: Exclude ACPI "consumer" resources from host bridge windows
On x86 and ia64, we have treated all ACPI _CRS resources of PNP0A03 host
bridge devices as "producers", i.e., as host bridge windows.  That's partly
because some x86 BIOSes improperly used "consumer" descriptors to describe
windows and partly because Linux didn't have good support for handling
consumer and producer descriptors differently.

One result is that x86 BIOSes describe host bridge "consumer" resources in
the _CRS of a PNP0C02 device, not the PNP0A03 device itself.  On arm64 we
don't have a legacy of firmware that has this consumer/producer confusion,
so we can handle PNP0A03 "consumer" descriptors as host bridge registers
instead of windows.

Exclude non-window ("consumer") resources from the list of host bridge
windows.  This allows the use of "consumer" PNP0A03 descriptors for bridge
register space.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-06 13:45:48 -06:00
Tomasz Nowicki
093d24a204 arm64: PCI: Manage controller-specific data on per-controller basis
Currently we use one shared global acpi_pci_root_ops structure to keep
controller-specific ops. We pass its pointer to acpi_pci_root_create() and
associate it with a host bridge instance for good.  Such a design implies
serious drawback. Any potential manipulation on the single system-wide
acpi_pci_root_ops leads to kernel crash. The structure content is not
really changing even across multiple host bridges creation; thus it was not
an issue so far.

In preparation for adding ECAM quirks mechanism (where controller-specific
PCI ops may be different for each host bridge) allocate new
acpi_pci_root_ops and fill in with data for each bridge. Now it is safe to
have different controller-specific info. As a consequence free
acpi_pci_root_ops when host bridge is released.

No functional changes in this patch.

Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2016-12-06 13:45:48 -06:00
Bjorn Helgaas
08b1c19606 arm64: PCI: Search ACPI namespace to ensure ECAM space is reserved
The static MCFG table tells us the base of ECAM space, but it does not
reserve the space -- the reservation should be done via a device in the
ACPI namespace whose _CRS includes the ECAM region.

Use acpi_resource_consumer() to check whether the ECAM space is reserved by
an ACPI namespace device.  If it is, emit a message showing which device
reserves it.  If not, emit a "[Firmware Bug]" warning.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2016-12-06 13:45:48 -06:00
Bjorn Helgaas
dfd1972c2b arm64: PCI: Add local struct device pointers
Use a local "struct device *dev" for brevity.  No functional change
intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2016-12-06 13:45:48 -06:00
Kirti Wankhede
2b8bb1d771 vfio iommu type1: Fix size argument to vfio_find_dma() in pin_pages/unpin_pages
Passing zero for the size to vfio_find_dma() isn't compatible with
matching the start address of an existing vfio_dma. Doing so triggers a
corner case. In vfio_find_dma(), when the start address is equal to
dma->iova and size is 0, check for the end of search range makes it to
take wrong side of RB-tree. That fails the search even though the address
is present in mapped dma ranges.
In functions pin_pages and unpin_pages, the iova which is being searched
is base address of page to be pinned or unpinned. So here size should be
set to PAGE_SIZE, as argument to vfio_find_dma().

Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Neo Jia <cjia@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-12-06 12:35:53 -07:00
Kirti Wankhede
7c03f42846 vfio iommu type1: Fix size argument to vfio_find_dma() during DMA UNMAP.
Passing zero for the size to vfio_find_dma() isn't compatible with
matching the start address of an existing vfio_dma. Doing so triggers a
corner case. In vfio_find_dma(), when the start address is equal to
dma->iova and size is 0, check for the end of search range makes it to
take wrong side of RB-tree. That fails the search even though the address
is present in mapped dma ranges. Due to this, in vfio_dma_do_unmap(),
while checking boundary conditions, size should be set to 1 for verifying
start address of unmap range.
vfio_find_dma() is also used to verify last address in unmap range with
size = 0, but in that case address to be searched is calculated with
start + size - 1 and so it works correctly.

Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Neo Jia <cjia@nvidia.com>
[aw: changelog tweak]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-12-06 12:28:04 -07:00
Fabian Frederick
3eb15f2828 sunrpc: use DEFINE_SPINLOCK()
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-12-06 14:18:30 -05:00
Gustavo Padovan
db444e1344 drm/atomic: doc: remove old comment about nonblocking commits
We now support nonblocking commits on drm_atomic_helper_commit()
so the comment is not valid anymore.

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1480946626-30917-1-git-send-email-gustavo@padovan.org
2016-12-06 16:28:30 -02:00
Linus Torvalds
bc3913a537 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull sparc fix from David Miller:
 "A use-before-NULL-check from Dan Carpenter"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  dbri: move dereference after check for NULL
2016-12-06 09:24:11 -08:00
Greg Kroah-Hartman
0af72df267 staging: slicoss: remove the staging driver
A "real" driver for this hardware has now landed in the networking tree,
so remove this old staging driver so that we don't have multiple drivers
for the same hardware, and so people don't waste their time trying to
clean up this old code.

Cc: Lior Dotan <liodot@gmail.com>
Cc: Christopher Harrer <charrer@alacritech.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-12-06 18:18:30 +01:00
Dan Carpenter
163117e8d4 dbri: move dereference after check for NULL
We accidentally introduced a dereference before the NULL check in
xmit_descs() as part of silencing a GCC warning.

Fixes: 16f46050e7 ("dbri: Fix compiler warning")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 12:18:22 -05:00
Linus Torvalds
da1b466fa4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) When dcbnl_cee_fill() fails to be able to push a new netlink
    attribute, it return 0 instead of an error code. From Pan Bian.

 2) Two suffix handling fixes to FIB trie code, from Alexander Duyck.

 3) bnxt_hwrm_stat_ctx_alloc() goes through all the trouble of setting
    and maintaining a return code 'rc' but fails to actually return it.
    Also from Pan Bian.

 4) ping socket ICMP handler needs to validate ICMP header length, from
    Kees Cook.

 5) caif_sktinit_module() has this interesting logic:

        int err = sock_register(...);
        if (!err)
                return err;
        return 0;

    Just return sock_register()'s return value directly which is the
    only possible correct thing to do.

 6) Two bnx2x driver fixes from Yuval Mintz, return a reasonable
    estimate from get_ringparam() ethtool op when interface is down and
    avoid trying to use UDP port based tunneling on 577xx chips.

 7) Fix ep93xx_eth crash on module unload from Florian Fainelli.

 8) Missing uapi exports, from Stephen Hemminger.

 9) Don't schedule work from sk_destruct(), because the socket will be
    freed upon return from that function. From Herbert Xu.

10) Buggy drivers, of which we know there is at least one, can send a
    huge packet into the TCP stack but forget to set the gso_size in the
    SKB, which causes all kinds of problems.

    Correct this when it happens, and emit a one-time warning with the
    device name included so that it can be diagnosed more easily.

    From Marcelo Ricardo Leitner.

11) virtio-net does DMA off the stack causes hiccups with VMAP_STACK,
    fix from Andy Lutomirski.

12) Fix fec driver compilation with CONFIG_M5272, from Nikita
    Yushchenko.

13) mlx5 fixes from Kamal Heib, Saeed Mahameed, and Mohamad Haj Yahia.
    (erroneously flushing queues on error, module parameter validation,
    etc)

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (34 commits)
  net/mlx5e: Change the SQ/RQ operational state to positive logic
  net/mlx5e: Don't flush SQ on error
  net/mlx5e: Don't notify HW when filling the edge of ICO SQ
  net/mlx5: Fix query ISSI flow
  net/mlx5: Remove duplicate pci dev name print
  net/mlx5: Verify module parameters
  net: fec: fix compile with CONFIG_M5272
  be2net: Add DEVSEC privilege to SET_HSW_CONFIG command.
  virtio-net: Fix DMA-from-the-stack in virtnet_set_mac_address()
  tcp: warn on bogus MSS and try to amend it
  uapi glibc compat: fix outer guard of net device flags enum
  net: stmmac: clear reset value of snps, wr_osr_lmt/snps, rd_osr_lmt before writing
  netlink: Do not schedule work from sk_destruct
  uapi: export nf_log.h
  uapi: export tc_skbmod.h
  net: ep93xx_eth: Do not crash unloading module
  bnx2x: Prevent tunnel config for 577xx
  bnx2x: Correct ringparam estimate when DOWN
  isdn: hisax: set error code on failure
  net: bnx2x: fix improper return value
  ...
2016-12-06 09:06:51 -08:00
Linus Torvalds
10d20bd25e shmem: fix shm fallocate() list corruption
The shmem hole punching with fallocate(FALLOC_FL_PUNCH_HOLE) does not
want to race with generating new pages by faulting them in.

However, the wait-queue used to delay the page faulting has a serious
problem: the wait queue head (in shmem_fallocate()) is allocated on the
stack, and the code expects that "wake_up_all()" will make sure that all
the queue entries are gone before the stack frame is de-allocated.

And that is not at all necessarily the case.

Yes, a normal wake-up sequence will remove the wait-queue entry that
caused the wakeup (see "autoremove_wake_function()"), but the key
wording there is "that caused the wakeup".  When there are multiple
possible wakeup sources, the wait queue entry may well stay around.

And _particularly_ in a page fault path, we may be faulting in new pages
from user space while we also have other things going on, and there may
well be other pending wakeups.

So despite the "wake_up_all()", it's not at all guaranteed that all list
entries are removed from the wait queue head on the stack.

Fix this by introducing a new wakeup function that removes the list
entry unconditionally, even if the target process had already woken up
for other reasons.  Use that "synchronous" function to set up the
waiters in shmem_fault().

This problem has never been seen in the wild afaik, but Dave Jones has
reported it on and off while running trinity.  We thought we fixed the
stack corruption with the blk-mq rq_list locking fix (commit
7fe311302f: "blk-mq: update hardware and software queues for sleeping
alloc"), but it turns out there was _another_ stack corruptor hiding
in the trinity runs.

Vegard Nossum (also running trinity) was able to trigger this one fairly
consistently, and made us look once again at the shmem code due to the
faults often being in that area.

Reported-and-tested-by: Vegard Nossum <vegard.nossum@oracle.com>.
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-06 08:59:05 -08:00
David S. Miller
32f16e142d Merge branch 'mlx5-fixes'
Saeed Mahameed says:

====================
Mellanox 100G mlx5 fixes 2016-12-04

Some bug fixes for mlx5 core and mlx5e driver.

v1->v2:
 - replace "uint" with "unsigned int"
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:44:45 -05:00
Mohamad Haj Yahia
c0f1147d14 net/mlx5e: Change the SQ/RQ operational state to positive logic
When using the negative logic (i.e. FLUSH state), after the RQ/SQ reopen
we will have a time interval that the RQ/SQ is not really ready and the
state indicates that its not in FLUSH state because the initial SQ/RQ struct
memory starts as zeros.
Now we changed the state to indicate if the SQ/RQ is opened and we will
set the READY state after finishing preparing all the SQ/RQ resources.

Fixes: 6e8dd6d6f4 ("net/mlx5e: Don't wait for SQ completions on close")
Fixes: f2fde18c52 ("net/mlx5e: Don't wait for RQ completions on close")
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:44:45 -05:00
Saeed Mahameed
3c8591d593 net/mlx5e: Don't flush SQ on error
We are doing SQ descriptors cleanup in driver.

Fixes: 6e8dd6d6f4 ("net/mlx5e: Don't wait for SQ completions on close")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:44:44 -05:00
Saeed Mahameed
b8335d91b4 net/mlx5e: Don't notify HW when filling the edge of ICO SQ
We are going to do this a couple of steps ahead anyway.

Fixes: d3c9bc2743 ("net/mlx5e: Added ICO SQs")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:44:44 -05:00
Kamal Heib
f9c14e4674 net/mlx5: Fix query ISSI flow
In old FWs query ISSI command is not supported and for some of those FWs
it might fail with status other than "MLX5_CMD_STAT_BAD_OP_ERR".

In such case instead of failing the driver load, we will treat any FW
status other than 0 for Query ISSI FW command as ISSI not supported and
assume ISSI=0 (most basic driver/FW interface).

In case of driver syndrom (query ISSI failure by driver) we will fail
driver load.

Fixes: f62b8bb8f2 ('net/mlx5: Extend mlx5_core to support ConnectX-4
Ethernet functionality')
Signed-off-by: Kamal Heib <kamalh@mellanox.com>

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:44:44 -05:00
Kamal Heib
9e5b2fc1d3 net/mlx5: Remove duplicate pci dev name print
Remove duplicate pci dev name printing from mlx5_core_warn/dbg.

Fixes: 5a7883989b ('net/mlx5_core: Improve mlx5 messages')
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:44:43 -05:00
Kamal Heib
f663ad9862 net/mlx5: Verify module parameters
Verify the mlx5_core module parameters by making sure that they are in
the expected range and if they aren't restore them to their default
values.

Fixes: 9603b61de1 ('mlx5: Move pci device handling from mlx5_ib to mlx5_core')
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:44:43 -05:00
Salil
862b3d2090 net: hns: Fix to conditionally convey RX checksum flag to stack
This patch introduces the RX checksum function to check the
status of the hardware calculated checksum and its error and
appropriately convey status to the upper stack in skb->ip_summed
field.

In hardware, we only support checksum for the following
protocols:
1) IPv4,
2) TCP(over IPv4 or IPv6),
3) UDP(over IPv4 or IPv6),
4) SCTP(over IPv4 or IPv6)
but we support many L3(IPv4, IPv6, MPLS, PPPoE etc) and
L4(TCP, UDP, GRE, SCTP, IGMP, ICMP etc.) protocols.

Hardware limitation:
Our present hardware RX Descriptor lacks L3/L4 checksum
"Status & Error" bit (which usually can be used to indicate whether
checksum was calculated by the hardware and if there was any error
encountered during checksum calculation).

Software workaround:
We do get info within the RX descriptor about the kind of
L3/L4 protocol coming in the packet and the error status. These
errors might not just be checksum errors but could be related to
version, length of IPv4, UDP, TCP etc.
Because there is no-way of knowing if it is a L3/L4 error due
to bad checksum or any other L3/L4 error, we will not (cannot)
convey hardware checksum status(CHECKSUM_UNNECESSARY) for such
cases to upper stack and will not maintain the RX L3/L4 checksum
counters as well.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:41:57 -05:00
Nikita Yushchenko
f85de66663 net: fec: fix compile with CONFIG_M5272
Commit 80cca775cd ("net: fec: cache statistics while device is down")
introduced unconditional statistics-related actions.

However, when driver is compiled with CONFIG_M5272, staticsics-related
definitions do not exist, which results into build errors.

Fix that by adding explicit handling of !defined(CONFIG_M5272) case.

Fixes: 80cca775cd ("net: fec: cache statistics while device is down")
Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:40:15 -05:00
Venkat Duvvuru
d14584d919 be2net: Add DEVSEC privilege to SET_HSW_CONFIG command.
OPCODE_COMMON_GET_FN_PRIVILEGES is returning only DEVSEC
privilege (Unrestricted Administrative Privilege) for Lancer NIC functions.
So, driver is failing SET_HSW_CONFIG command, as DEVSEC privilege was not
set in the privilege bitmap. This patch fixes the problem by setting DEVSEC
privilege in SET_HSW_CONFIG’s privilege bitmap.

Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:39:41 -05:00
Andy Lutomirski
e37e2ff350 virtio-net: Fix DMA-from-the-stack in virtnet_set_mac_address()
With CONFIG_VMAP_STACK=y, virtnet_set_mac_address() can be passed a
pointer to the stack and it will OOPS.  Copy the address to the heap
to prevent the crash.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Laura Abbott <labbott@redhat.com>
Reported-by: zbyszek@in.waw.pl
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:38:43 -05:00
Ivan Khoronzhuk
48e0a83ece net: ethernet: ti: cpsw: fix early budget split
The budget split function requires the phy speed to be known.
While ndo open a phy speed identification is postponed till the
moment link is up. Hence, move it to appropriate callback, when link
is up.

Reported-by: Grygorii Strashko <grygorii.strashko@ti.com>
Fixes: 8feb0a1965 ("net: ethernet: ti: cpsw: split tx budget according between channels")
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:37:21 -05:00
Florian Westphal
343dfaa198 Revert "dctcp: update cwnd on congestion event"
Neal Cardwell says:
 If I am reading the code correctly, then I would have two concerns:
 1) Has that been tested? That seems like an extremely dramatic
    decrease in cwnd. For example, if the cwnd is 80, and there are 40
    ACKs, and half the ACKs are ECE marked, then my back-of-the-envelope
    calculations seem to suggest that after just 11 ACKs the cwnd would be
    down to a minimal value of 2 [..]
 2) That seems to contradict another passage in the draft [..] where it
    sazs:
       Just as specified in [RFC3168], DCTCP does not react to congestion
       indications more than once for every window of data.

Neal is right.  Fortunately we don't have to complicate this by testing
vs. current rtt estimate, we can just revert the patch.

Normal stack already handles this for us: receiving ACKs with ECE
set causes a call to tcp_enter_cwr(), from there on the ssthresh gets
adjusted and prr will take care of cwnd adjustment.

Fixes: 4780566784 ("dctcp: update cwnd on congestion event")
Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:34:24 -05:00
David S. Miller
b1df0f5cea Merge branch 'mv88e6xxx-rework-reset-and-PPU-code'
Vivien Didelot says:

====================
net: dsa: mv88e6xxx: rework reset and PPU code

Old Marvell chips (like 88E6060) don't have a PHY Polling Unit (PPU).

Next chips (like 88E6185) have a PPU, which has exclusive access to the
PHY registers, thus must be disabled before access.

Newer chips (like 88E6352) have an indirect mechanism to access the PHY
registers whenever, thus loose control over the PPU (always enabled).

Here's a summary:

Model | PPU? | Has PPU ctrl?  | PPU state readable? | PHY access
----- | ---- | -------------- | ------------------- | ----------
 6060 | no   | no             | no                  | direct
 6185 | yes  | yes, PPUEn bit | yes, PPUState 2-bit | direct w/ PPU dis.
 6352 | yes  | no             | yes, PPUState 1-bit | indirect
 6390 | yes  | no             | yes, InitState bit  | indirect

Depending on the PPU control, a switch may have to restart the PPU when
resetting the switch. Once the switch is reset, we must wait for the PPU
state to be active polling again before accessing the registers.

For that purpose, add new operations to the chips to enable/disable the
PPU, and execute software reset. With these new ops in place, rework the
switch reset code and finally get rid of the MV88E6XXX_FLAG_PPU* flags.

Changes in v3:
  - consider 6097 as 6352 (no PPU ops and use mv88e6352_g1_reset).

Changes in v2:
  - wait in ppu/reset ops so that ppu_polling is not needed anymore.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:32:29 -05:00
Vivien Didelot
a199d8b695 net: dsa: mv88e6xxx: add PPU operations
Some Marvell chips can enable/disable the PPU on demand. This is needed
to access the PHY registers when there is no indirection mechanism.

Add two new ppu_enable and ppu_disable ops to describe this and finally
get rid of the MV88E6XXX_FLAG_PPU* flags.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:32:28 -05:00
Vivien Didelot
17e708baf7 net: dsa: mv88e6xxx: add a soft reset operation
Marvell chips have different way to issue a software reset.

Old chips (such as 88E6060) have a reset bit in an ATU control register.

Newer chips moved this bit in a Global control register. Chips with
controllable PPU should reset the PPU when resetting the switch.

Add a new reset operation to implement these differences and introduce a
mv88e6xxx_software_reset() helper to wrap it conveniently.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:32:28 -05:00
Vivien Didelot
309eca6db9 net: dsa: mv88e6xxx: add helper to hardware reset
Add an helper to toggle the eventual GPIO connected to the reset pin.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:32:28 -05:00
Vivien Didelot
4ac4b5a623 net: dsa: mv88e6xxx: add helper to disable ports
Before resetting a switch, the ports should be set to the Disabled state
and the transmit queues should be drained.

Add an helper to explicit that.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:32:28 -05:00
Joerg Roedel
1465f48146 Merge branches 'arm/mediatek', 'arm/smmu', 'x86/amd', 's390', 'core' and 'arm/exynos' into next 2016-12-06 17:32:16 +01:00
David S. Miller
9f9ffdffe9 Merge branch 'Alacritech-SLIC-driver'
Lino Sanfilippo says:

====================
Gigabit ethernet driver for Alacritechs SLIC devices (v4)

this is the forth version of the slicoss gigabit ethernet driver (which is a
rework of the driver from Alacritech which can currently be found under
drivers/staging/slicoss). The driver is supposed to support Mojave, Oasis and
Kalahari cards, for both copper and fiber.

If this code is accepted the staging version can be removed.

The driver has been tested on a SEN2104ET adapter (4 Port PCIe copper).

v4:
- fix wrong driver name in Kconfig file (reported by Rami Rosen)
- remove unused variable from driver struct (reported by Rami Rosen)
- return "err" instead of 0 in slic_load_rcvseq_firmware() (reported by Rami Rosen)
- Fix typos in constants, comments and error message (reported by Markus Böhme)
- fix various warnings concerning signedness (reported by Markus Böhme)
- improve line formatting (reported by Markus Böhme)
- add comment describing the need for SLIC_MAX_TX_COMPLETIONS (suggested by Florian Fainelli)
- do not zero out complete rx descriptor (suggested by Florian Fainelli)
- add missing write barrier (reported by Florian Fainelli)
- remove unneeded assignment of net_device to skb (reported by Florian Fainelli)
- use napi_complete_done() instead of napi_complete (suggested by Florian Fainelli)
- use napi_schedule_irqoff() instead of napi_schedule (suggested by Florian Fainelli)
- do not map error returned by slic_init() to -ENOMEM
- do proper dma syncs before and after rx descriptor status is set to 0
- if after dma sync for CPU rx descriptor is not used return it to HW by means of dma sync for device

v3:
- dont add defines to pci_ids.h but instead put it into the drivers header file
(requested by Greg Kroah-Hartman)

v2:
- remove unusual padding in statistic strings (suggested by Andrew Lunn)
- for mdio register and bit names use defines from mii.h instead of own ones
  (suggested by Andrew Lunn)
- remove unused defines
- ensure PCI flush at two more places
- use mmiowb before lock to prevent mmio writes leaking out of lock
- fix some typos in comments
- add copyright and GPL header
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:24:29 -05:00
Lino Sanfilippo
b956702738 MAINTAINERS: add entry for slicoss ethernet driver
Add myself as maintainer for the slicoss ethernet driver.

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:24:28 -05:00
Lino Sanfilippo
60c140df15 net: ethernet: slicoss: add slicoss gigabit ethernet driver
Add driver for Alacritech gigabit ethernet cards with SLIC (session-layer
interface control) technology. The driver provides basic support without
SLIC for the following devices:

- Mojave cards (single port PCI Gigabit) both copper and fiber
- Oasis cards (single and dual port PCI-x Gigabit) copper and fiber
- Kalahari cards (dual and quad port PCI-e Gigabit) copper and fiber

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:24:28 -05:00
Jiri Olsa
8ac1eb7bab perf tools: Move perf build related variables under non fixdep leg
Because there's no need for them in fixdep build.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1481030331-31944-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-06 13:23:03 -03:00
Jiri Olsa
abb26210a3 perf tools: Force fixdep compilation at the start of the build
The fixdep tool needs to be built before everything else, because it fixes
every object dependency file.

We handle this currently by making all objects to depend on fixdep, which is
error prone and is easily forgotten when new object is added.

Instead of this, this patch force fixdep tool to be built as the first target
in the separate make session. This way we don't need to handle extra fixdep
dependencies and we are certain there's no fixdep race with any parallel make
job.

Committer notes:

Testing it:

Before:

  $ rm -rf /tmp/build/perf/ ; mkdir -p /tmp/build/perf ; make -k O=/tmp/build/perf -C tools/perf install-bin
  make: Entering directory '/home/acme/git/linux/tools/perf'
    BUILD:   Doing 'make -j4' parallel build

  Auto-detecting system features:
  ...                         dwarf: [ on  ]
  ...            dwarf_getlocations: [ on  ]
  ...                         glibc: [ on  ]
  ...                          gtk2: [ on  ]
  ...                      libaudit: [ on  ]
  ...                        libbfd: [ on  ]
  ...                        libelf: [ on  ]
  ...                       libnuma: [ on  ]
  ...        numa_num_possible_cpus: [ on  ]
  ...                       libperl: [ on  ]
  ...                     libpython: [ on  ]
  ...                      libslang: [ on  ]
  ...                     libcrypto: [ on  ]
  ...                     libunwind: [ on  ]
  ...            libdw-dwarf-unwind: [ on  ]
  ...                          zlib: [ on  ]
  ...                          lzma: [ on  ]
  ...                     get_cpuid: [ on  ]
  ...                           bpf: [ on  ]

    GEN      /tmp/build/perf/common-cmds.h
    HOSTCC   /tmp/build/perf/fixdep.o
    HOSTLD   /tmp/build/perf/fixdep-in.o
    LINK     /tmp/build/perf/fixdep
    MKDIR    /tmp/build/perf/pmu-events/
    HOSTCC   /tmp/build/perf/pmu-events/json.o
    MKDIR    /tmp/build/perf/pmu-events/
    HOSTCC   /tmp/build/perf/pmu-events/jsmn.o
    HOSTCC   /tmp/build/perf/pmu-events/jevents.o
    HOSTLD   /tmp/build/perf/pmu-events/jevents-in.o
    PERF_VERSION = 4.9.rc8.g868cd5
    CC       /tmp/build/perf/perf-read-vdso32
  <SNIP>

After:

  $ rm -rf /tmp/build/perf/ ; mkdir -p /tmp/build/perf ; make -k O=/tmp/build/perf -C tools/perf install-bin
  make: Entering directory '/home/acme/git/linux/tools/perf'
    BUILD:   Doing 'make -j4' parallel build
    HOSTCC   /tmp/build/perf/fixdep.o
    HOSTLD   /tmp/build/perf/fixdep-in.o
    LINK     /tmp/build/perf/fixdep

  Auto-detecting system features:
  ...                         dwarf: [ on  ]
  ...            dwarf_getlocations: [ on  ]
  ...                         glibc: [ on  ]
  ...                          gtk2: [ on  ]
  ...                      libaudit: [ on  ]
  ...                        libbfd: [ on  ]
  ...                        libelf: [ on  ]
  ...                       libnuma: [ on  ]
  ...        numa_num_possible_cpus: [ on  ]
  ...                       libperl: [ on  ]
  ...                     libpython: [ on  ]
  ...                      libslang: [ on  ]
  ...                     libcrypto: [ on  ]
  ...                     libunwind: [ on  ]
  ...            libdw-dwarf-unwind: [ on  ]
  ...                          zlib: [ on  ]
  ...                          lzma: [ on  ]
  ...                     get_cpuid: [ on  ]
  ...                           bpf: [ on  ]

    GEN      /tmp/build/perf/common-cmds.h
    MKDIR    /tmp/build/perf/fd/
    CC       /tmp/build/perf/fd/array.o
    LD       /tmp/build/perf/fd/libapi-in.o
    MKDIR    /tmp/build/perf/fs/
    CC       /tmp/build/perf/event-parse.o
    CC       /tmp/build/perf/fs/fs.o
    PERF_VERSION = 4.9.rc8.g57a92f
    CC       /tmp/build/perf/event-plugin.o
    MKDIR    /tmp/build/perf/fs/
    CC       /tmp/build/perf/fs/tracing_path.o
  <SNIP>

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1481030331-31944-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-06 13:23:02 -03:00