Commit Graph

20574 Commits

Author SHA1 Message Date
Ding Tianhong
f4986d250a Revert commit 1a8b6d76dc ("net:add one common config...")
The new flag PCI_DEV_FLAGS_NO_RELAXED_ORDERING has been added
to indicate that Relaxed Ordering Attributes (RO) should not
be used for Transaction Layer Packets (TLP) targeted toward
these affected Root Port, it will clear the bit4 in the PCIe
Device Control register, so the PCIe device drivers could
query PCIe configuration space to determine if it can send
TLPs to Root Port with the Relaxed Ordering Attributes set.

With this new flag  we don't need the config ARCH_WANT_RELAX_ORDER
to control the Relaxed Ordering Attributes for the ixgbe drivers
just like the commit 1a8b6d76dc ("net:add one common config...") did,
so revert this commit.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-09 07:43:06 -07:00
Sabrina Dubroca
a39221ce96 ixgbe: fix masking of bits read from IXGBE_VXLANCTRL register
In ixgbe_clear_udp_tunnel_port(), we read the IXGBE_VXLANCTRL register
and then try to mask some bits out of the value, using the logical
instead of bitwise and operator.

Fixes: a21d0822ff ("ixgbe: add support for geneve Rx offload")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-09 07:43:06 -07:00
Mark D Rustad
e0f06bba96 ixgbe: Return error when getting PHY address if PHY access is not supported
In cases where PHY register access is not supported, don't mislead
a caller into thinking that it is supported by returning a PHY
address. Instead, return -EOPNOTSUPP when PHY access is not
supported.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-09 07:43:06 -07:00
Christos Gkekas
c49c777f9c qed: Delete redundant check on dcb_app priority
dcb_app priority is unsigned thus checking whether it is less than zero
is redundant.

Signed-off-by: Christos Gkekas <chris.gekas@gmail.com>
Acked-By: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-08 21:21:02 -07:00
Christos Gkekas
c778c32118 net: ethernet: stmmac: Clean up dead code
Many macros in dwmac-ipq806x are unused and should be removed.
Moreover gmac->id is an unsigned variable and therefore checking
whether it is less than zero is redundant.

Signed-off-by: Christos Gkekas <chris.gekas@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-08 21:19:07 -07:00
David S. Miller
51a0c00c6b Merge tag 'mlx5-updates-2017-10-06' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Saeed Mahameed says:

====================
Mellanox, mlx5 updates 2017-10-06

This series includes some shared code updates for kernel 4.15 to both
net-next and rdma-next trees.

The series includes mlx5 low level flow steering updates and optimizations
to support firmware command parallelism for flow steering requests from
Maor Gottlieb and two other small fixes from Matan and Maor.

One fix from Matan adds error handling for when the destination
list of the flow steering rule is full.

Maor introduced a patch to avoid NULL pointer dereference on steering cleanup.

Then Some refactoring patches needed by the series for code sharing purposes.
and split the Flow Table Entry (FTE) and Flow Group (FG) creation code to two parts:
    1) Object allocation - allocate the steering node and initialize
    its resources.

    2) The firmware command execution.

This change will give us the ability to take write lock on the
parent node (e.g. FG for FTE creating) only on the software data struct allocation
and creation part of the procedure where the synchronization is really required,
and will allow us to execute multiple firmware commands simultaneously and overcome the
firmware bottleneck.

Refactor the locking scheme of the mlx5 core flow steering as follows:

1) Replace the mutex lock with readers-writers semaphore and take
    the write lock only when necessary (e.g. allocating a new flow
    table entry index or adding a node to the parent's children list).
    When we try to find a suitable child in the parent's children list
    (e.g. search for flow group with the same match_criteria of the rule)
    then we only take the read lock.

2) Add versioning mechanism - each steering entity (FT, FG, FTE, DST)
    will have an incremental version. The version is increased when the
    entity is changed (e.g. when a new FTE was added to FG - the FG's
    version is increased).
    Versioning is used in order to determine if the last traverse of an
    entity's children is valid or a rescan under write lock is required.

Last patch adds FGs and FTEs memory pool, It is useful because these objects
are not small and could be allocated/deallocated many times.

This support improves the insertion rate of steering rules
from ~5k/sec to ~40k/sec.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-08 21:07:11 -07:00
Ido Schimmel
9b63ef88d3 mlxsw: spectrum: Propagate extack further for bridge enslavements
The code that actually takes care of bridge offload introduces a few
more non-trivial constraints with regards to bridge enslavements.
Propagate extack there to indicate the reason.

$ ip link add link enp1s0np1 name enp1s0np1.10 type vlan id 10
$ ip link add link enp1s0np1 name enp1s0np1.20 type vlan id 20
$ ip link add name br0 type bridge
$ ip link set dev enp1s0np1.10 master br0
$ ip link set dev enp1s0np1.20 master br0
Error: spectrum: Can not bridge VLAN uppers of the same port.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-08 10:07:21 -07:00
Ido Schimmel
c1f2c6d025 mlxsw: spectrum: Add extack for VLAN enslavements
Similar to physical ports, enslavement of VLAN devices can also fail.
Use extack to indicate why the enslavement failed.

$ ip link add link enp1s0np1 name enp1s0np1.10 type vlan id 10
$ ip link add name bond0 type bond mode 802.3ad
$ ip link set dev enp1s0np1.10 master bond0
Error: spectrum: VLAN devices only support bridge and VRF uppers.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-08 10:07:21 -07:00
Ido Schimmel
a69518cf0b mlxsw: spectrum_router: Avoid expensive lookup during route removal
In commit fc922bb0dd ("mlxsw: spectrum_router: Use one LPM tree for
all virtual routers") I increased the scale of supported VRFs by having
all of them share the same LPM tree.

In order to avoid look-ups for prefix lengths that don't exist, each
route removal would trigger an aggregation across all the active virtual
routers to see which prefix lengths are in use and which aren't and
structure the tree accordingly.

With the way the data structures are currently laid out, this is a very
expensive operation. When preformed repeatedly - due to the invocation
of the abort mechanism - and with enough VRFs, this can result in a hung
task.

For now, avoid this optimization until it can be properly re-added in
net-next.

Fixes: fc922bb0dd ("mlxsw: spectrum_router: Use one LPM tree for all virtual routers")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: David Ahern <dsa@cumulusnetworks.com>
Tested-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-08 10:05:27 -07:00
Jonathan Toppins
0d7b70e836 bnxt_en: don't consider building bnxt_tc.o if option not enabled
Instead of zeroing out bnxt_tc.c with a #ifdef foo, instead don't compile
the file when the option is not enabled. Now make and the preprocessor do
not have to waste time compiling a no-op.

Signed-off-by: Jonathan Toppins <jtoppins@redhat.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-07 03:00:26 +01:00
David S. Miller
f5333f80c3 Merge branch '40GbE' of ra.kernel.org:/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2017-10-06

This series contains updates to i40e and i40evf only.

Rami fixes a typo in the code comments.

Mitch adds an ethtool private flag to control source pruning to resolve an
issue where our default behavior is to enable source pruning which breaks ARP
monitoring in channel bonding.  Fixes a couple of register definitions, which
were incorrect.

Jake fixes an issue with multiple logical CPUs per core (simultaneous
multithreading - SMT) and how we set an affinity hint based on the v_idx of
that q_vector, which is an incremental value and might lead to multiple
offline CPUs being assigned to a q_vector.  Instead, we should only assign
hints for CPUs which are online, so look to use cpumask_local_spread().
Also fixed a VF VLAN tag stripping issue, where the flag created to change
this feature was seen as unchangeable.  Lastly, organized and re-numbered
the feature flags.

Alan re-enables PTP L4 for XL710 devices with firmware version 6.0 or
greater, now that the previous bug in the older firmware is fixed.
Implements the PCI error handlers for reset_prepare() and reset_done() to
allow us to handle function level resets.

Alice cleans up code that was added to the incorrect function during a
merge.

Filip adds a change to display an error message when a module is inserted
that does not meet the thermal requirements, Talking Heads "Burning Down
the House" comes to mind.  Also fixed a flow director filter issue where
a variable was not being cleared which stores the filter number to be
removed from the list when the firmware refused to add the requested
filter.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-06 23:29:38 +01:00
Bjorn Helgaas
d2746fe538 bnx2x: Use pci_ari_enabled() instead of local copy
Use pci_ari_enabled() from the PCI core instead of the identical local copy
bnx2x_ari_enabled().  No functional change intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-06 10:06:06 -07:00
Pieter Jansen van Vuuren
f8b7b0a6b1 nfp: add set tcp and udp header action flower offload
Previously we did not have offloading support for set TCP/UDP actions. This
patch enables TC flower offload of set TCP/UDP sport and dport actions.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-06 09:56:36 -07:00
Pieter Jansen van Vuuren
354b82bb32 nfp: add set ipv6 source and destination address
Previously we did not have offloading support for set IPv6 actions. This
patch enables TC flower offload of set IPv6 src and dst address actions.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-06 09:56:36 -07:00
Pieter Jansen van Vuuren
c0b1bd9a8b nfp: add set ipv4 header action flower offload
Previously we did not have offloading support for set IPv4 actions. This
patch enables TC flower offload of set IPv4 src and dst address actions.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-06 09:56:36 -07:00
Pieter Jansen van Vuuren
da83d8fe58 nfp: add set ethernet header action flower offload
Previously we did not have offloading support for set ethernet actions.
This patch enables TC flower offload of set ethernet actions.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-06 09:56:35 -07:00
Pieter Jansen van Vuuren
fc53b4a701 nfp: add IPv6 ttl and tos match offloading support
Previously matching on IPv6 ttl and tos fields were not offloaded. This
patch enables offloading IPv6 ttl and tos as match fields.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-06 09:56:35 -07:00
Pieter Jansen van Vuuren
a1e9203cc6 nfp: add IPv4 ttl and tos match offloading support
Previously matching on IPv4 ttl and tos fields were not offloaded. This
patch enables offloading IPv4 ttl and tos as match fields.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-06 09:56:35 -07:00
Pieter Jansen van Vuuren
bb055c198d nfp: add mpls match offloading support
Previously MPLS match offloading was not supported. This patch enables
MPLS match offloading support for label, bos and tc fields.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-06 09:56:35 -07:00
Jacob Keller
b74f571f59 i40e/i40evf: organize and re-number feature flags
Now that we've reduced the number of flags, organize similar flags
together and re-number them accordingly.

Since we don't yet have more than 32 flags, we'll use a u32 for both the
hw_features and flag field. Should we gain more flags in the future, we
may need to convert to a u64 or separate flags out into two fields.

One alternative approach considered, but not implemented here, was to
use an enumeration for the flag variables, and create a macro
I40E_FLAG() which used string concatenation to generate BIT_ULL values.
This has the advantage of making the actual bit values compile-time
dynamic so that we do not need to worry about matching the order to the
bit value. However, this does produce a high level of code churn, and
makes it more difficult to read a dumped flags value when debugging.

Change-ID: I8653fff69453cd547d6fe98d29dfa9d8710387d1
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-06 08:11:32 -07:00
Jacob Keller
a5340d933e i40e: ignore skb->xmit_more when deciding to set RS bit
Since commit 6a7fded776 ("i40e: Fix RS bit update in Tx path and
disable force WB workaround") we've tried to "optimize" setting the
RS bit based around skb->xmit_more. This same logic was refactored
in commit 1dc8b53879 ("i40e: Reorder logic for coalescing RS bits"),
but ultimately was not functionally changed.

Using skb->xmit_more in this way is incorrect, because in certain
circumstances we may see a large number of skbs in sequence with
xmit_more set. This leads to a performance loss as the hardware does not
writeback anything for those packets, which delays the time it takes for
us to respond to the stack transmit requests. This significantly impacts
UDP performance, especially when layered with multiple devices, such as
bonding, VLANs, and vnet setups.

This was not noticed until now because it is difficult to create a setup
which reproduces the issue. It was discovered in a UDP_STREAM test in
a VM, connected using a vnet device to a bridge, which is connected to
a bonded pair of X710 ports in active-backup mode with a VLAN. These
layered devices seem to compound the number of skbs transmitted at once
by the qdisc. Additionally, the problem can be masked by reducing the
ITR value.

Since the original commit does not provide strong justification for this
RS bit "optimization", revert to the previous behavior of setting the RS
bit every 4th packet.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-06 08:11:32 -07:00
Jacob Keller
0a3b4f702f i40evf: enable support for VF VLAN tag stripping control
A recent commit 809481484e5d ("i40e/i40evf: support for VF VLAN tag
stripping control") added support for VFs to negotiate the control of
VLAN tag stripping. This should have allowed VFs to disable the feature.
Unfortunately, the flag was set only in netdev->feature flags and not in
netdev->hw_features.

This ultimately causes the stack to assume that it cannot change the
flag, so it was unchangeable and marked as [fixed] in the ethtool -k
output.

Fix this by setting the feature in hw_features first, just as we do for
the PF code. This enables ethtool -K to disable the feature correctly,
and fully enables user control of the VLAN tag stripping feature.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-06 08:11:32 -07:00
Mariusz Stachura
052b93d0c2 i40e: do not enter PHY debug mode while setting LEDs behaviour
Previous implementation of LED set/get functions required to enter
PHY debug mode, in order to prevent access to it from FW and SW at
the same time. Reset of all ports was a unwanted side effect.

Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-06 08:11:32 -07:00
Alan Brady
19b7960b2d i40e: implement split PCI error reset handler
This patch implements the PCI error handler reset_prepare and reset_done.
This allows us to handle function level reset.  Without this patch we
are unable to perform and recover from an FLR correctly and this will cause
VFs to be unable to recover from an FLR on the PF.

Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-06 08:11:32 -07:00
Filip Sadowski
013df598d6 i40e: Properly maintain flow director filters list
When there is no space for more flow director filters and user requested to
add a new one it is rejected by firmware and automatically removed from the
filter list maintained by driver. This behaviour is correct. Afterwards
existing filter can be removed making free slot for the new one. This
however causes the newly added filter to be accepted by firmware but
removed from driver filter list resulting in not showing after issuing
'ethtool -n <dev_name>'.

This happened due to not clearing the variable pf->fd_inv which stores
filter number to be removed from the list when firmware refused to add the
requested filter. It caused the filter with this specific ID to be
constantly removed once it was added to the list although it has been
accepted by firmware and effectively applied to the NIC.
It was fixed by clearing pf->fd_inv variable after removal of the filter
from the list when it was rejected by firmware.

Signed-off-by: Filip Sadowski <filip.sadowski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-06 08:11:32 -07:00
Filip Sadowski
9a858178ef i40e: Display error message if module does not meet thermal requirements
This patch causes error message to be displayed when NIC detects
insertion of module that does not meet thermal requirements.

Signed-off-by: Filip Sadowski <filip.sadowski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-06 08:11:31 -07:00
Alice Michael
7f66182263 i40e: fix merge error
This patch removes some code that was accidentally added to
the wrong function with a merge error.  Fixes: c53934c6d1
("i40e: fix: do not sleep in netdev_ops")

Signed-off-by: Alice Michael <alice.michael@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-06 08:11:31 -07:00
Jesse Brandeburg
bd6cd4e6dd i40e/i40evf: use DECLARE_BITMAP for state
When using set_bit and friends, we should be using actual
bitmaps, and fix all the locations where we might access
it.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-06 08:11:31 -07:00
Mitch Williams
0a0d9af5bc i40e: fix incorrect register definition
This register was defined incorrectly. Fix the increment value to 8, and
replace the iterator with _i to make the definition consistent with
other statistics registers.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-06 08:11:31 -07:00
Mitch Williams
60518a0489 i40e: redfine I40E_PHY_TYPE_MAX
Since I40E_PHY_TYPE_MAX is used as an iterator, usually combined with
some sort of bit-shifting, it should only include actual PHY types and
not error cases. Move it up in the enum declaration so that loops only
iterate across valid PHY types.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-06 08:11:31 -07:00
Alan Brady
c3d26b75c2 i40e: re-enable PTP L4 capabilities for XL710 if FW >6.0
Starting with XL710 FW 5.3 PTP L4 was disabled for XL710 due to a bug.  The
bug has since been resolved in XL710 FW >6.0 and PTP L4 can now be
re-enabled on those devices with updated firmware.

Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-06 08:11:31 -07:00
Jacob Keller
be664cbefc i40e/i40evf: spread CPU affinity hints across online CPUs only
Currently, when setting up the IRQ for a q_vector, we set an affinity
hint based on the v_idx of that q_vector. Meaning a loop iterates on
v_idx, which is an incremental value, and the cpumask is created based
on this value.

This is a problem in systems with multiple logical CPUs per core (like in
simultaneous multithreading (SMT) scenarios). If we disable some logical
CPUs, by turning SMT off for example, we will end up with a sparse
cpu_online_mask, i.e., only the first CPU in a core is online, and
incremental filling in q_vector cpumask might lead to multiple offline
CPUs being assigned to q_vectors.

Example: if we have a system with 8 cores each one containing 8 logical
CPUs (SMT == 8 in this case), we have 64 CPUs in total. But if SMT is
disabled, only the 1st CPU in each core remains online, so the
cpu_online_mask in this case would have only 8 bits set, in a sparse way.

In general case, when SMT is off the cpu_online_mask has only C bits set:
0, 1*N, 2*N, ..., C*(N-1)  where
C == # of cores;
N == # of logical CPUs per core.
In our example, only bits 0, 8, 16, 24, 32, 40, 48, 56 would be set.

Instead, we should only assign hints for CPUs which are online. Even
better, the kernel already provides a function, cpumask_local_spread()
which takes an index and returns a CPU, spreading the interrupts across
local NUMA nodes first, and then remote ones if necessary.

Since we generally have a 1:1 mapping between vectors and CPUs, there
is no real advantage to spreading vectors to local CPUs first. In order
to avoid mismatch of the default XPS hints, we'll pass -1 so that it
spreads across all CPUs without regard to the node locality.

Note that we don't need to change the q_vector->affinity_mask as this is
initialized to cpu_possible_mask, until an actual affinity is set and
then notified back to us.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-06 08:11:31 -07:00
Mitch Williams
64615b5418 i40e: add private flag to control source pruning
By default, our devices do source pruning, that is, they drop receive
packets that have the source MAC matching one of the receive filters.
Unfortunately, this breaks ARP monitoring in channel bonding, as the
bonding driver expects devices to receive ARPs containing their own
source address.

Add an ethtool private flag to control this feature.

Also, remove the netif_running() check when we process our private
flags. It's OK to reset when the device is closed and in most cases we
need the reset the apply these changes.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-06 08:11:31 -07:00
Rami Rosen
ec2f25d203 i40e: fix a typo in i40e_pf documentation
This patch fixes a typo in i40e_pf object documentation; num_req_vfs
refers to the number of VFs requested for the PF.

Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-06 08:11:31 -07:00
Colin Ian King
d009313c99 net: qcom/emac: make function emac_isr static
The function emac_isr is local to the source and does not need to
be in global scope, so make it static.

Cleans up sparse warnings:
symbol 'emac_isr' was not declared. Should it be static?

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05 21:27:02 -07:00
David S. Miller
53954cf8c5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Just simple overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05 18:19:22 -07:00
David Ahern
e58376e1df mlxsw: spectrum: Add extack messages for enslave failures
mlxsw fails device enslavement for a number of reasons. Use the extack
facility to return an error message to the user stating why the enslave
is failing.

Messages are prefixed with "spectrum" so users know it is a constraint
imposed by the hardware driver. For example:
    $ ip li add br0.11 link br0 type vlan id 11
    $ ip li set swp11 master br0
    Error: spectrum: Enslaving a port to a device that already has an upper device is not supported.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04 21:39:34 -07:00
David Ahern
42ab19ee90 net: Add extack to upper device linking
Add extack arg to netdev_upper_dev_link and netdev_master_upper_dev_link

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04 21:39:33 -07:00
Colin Ian King
ebf6b13142 cxgb4vf: make a couple of functions static
The functions t4vf_link_down_rc_str and t4vf_handle_get_port_info are
local to the source and do not need to be in global scope, so make
them static.

Cleans up sparse warnings:
symbol 't4vf_link_down_rc_str' was not declared. Should it be static?
symbol 't4vf_handle_get_port_info' was not declared. Should it be static?

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04 10:31:55 -07:00
Simon Horman
4d86d38186 ravb: RX checksum offload
Add support for RX checksum offload. This is enabled by default and
may be disabled and re-enabled using ethtool:

 # ethtool -K eth0 rx off
 # ethtool -K eth0 rx on

The RAVB provides a simple checksumming scheme which appears to be
completely compatible with CHECKSUM_COMPLETE: sum of all packet data after
the L2 header is appended to packet data; this may be trivially read by the
driver and used to update the skb accordingly.

In terms of performance throughput is close to gigabit line-rate both with
and without RX checksum offload enabled. Perf output, however, appears to
indicate that significantly less time is spent in do_csum(). This is as
expected.

Test results with RX checksum offload enabled:
 # /usr/bin/perf_3.16 record -o /run/perf.data -a netperf -t TCP_MAERTS -H 10.4.3.162
 MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.4.3.162 () port 0 AF_INET : demo
 enable_enobufs failed: getprotobyname
 Recv   Send    Send
 Socket Socket  Message  Elapsed
 Size   Size    Size     Time     Throughput
 bytes  bytes   bytes    secs.    10^6bits/sec

  87380  16384  16384    10.00     937.54

 Summary of output of perf report:
    18.28%      ksoftirqd/0  [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
    10.34%      ksoftirqd/0  [kernel.kallsyms]  [k] __pi_memcpy
     9.83%      ksoftirqd/0  [kernel.kallsyms]  [k] ravb_poll
     7.89%      ksoftirqd/0  [kernel.kallsyms]  [k] skb_put
     4.01%      ksoftirqd/0  [kernel.kallsyms]  [k] dev_gro_receive
     3.37%          netperf  [kernel.kallsyms]  [k] __arch_copy_to_user
     3.17%          swapper  [kernel.kallsyms]  [k] arch_cpu_idle
     2.55%          swapper  [kernel.kallsyms]  [k] tick_nohz_idle_enter
     2.04%      ksoftirqd/0  [kernel.kallsyms]  [k] __pi___inval_dcache_area
     2.03%          swapper  [kernel.kallsyms]  [k] _raw_spin_unlock_irq
     1.96%      ksoftirqd/0  [kernel.kallsyms]  [k] __netdev_alloc_skb
     1.59%      ksoftirqd/0  [kernel.kallsyms]  [k] __slab_alloc.isra.83

Test results without RX checksum offload enabled:
 # /usr/bin/perf_3.16 record -o /run/perf.data -a netperf -t TCP_MAERTS -H 10.4.3.162
 MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.4.3.162 () port 0 AF_INET : demo
 enable_enobufs failed: getprotobyname
 Recv   Send    Send
 Socket Socket  Message  Elapsed
 Size   Size    Size     Time     Throughput
 bytes  bytes   bytes    secs.    10^6bits/sec

  87380  16384  16384    10.00     940.20

 Summary of output of perf report:
    17.10%    ksoftirqd/0  [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
    10.99%    ksoftirqd/0  [kernel.kallsyms]  [k] __pi_memcpy
     8.87%    ksoftirqd/0  [kernel.kallsyms]  [k] ravb_poll
     8.16%    ksoftirqd/0  [kernel.kallsyms]  [k] skb_put
     7.42%    ksoftirqd/0  [kernel.kallsyms]  [k] do_csum
     3.91%    ksoftirqd/0  [kernel.kallsyms]  [k] dev_gro_receive
     2.31%        swapper  [kernel.kallsyms]  [k] arch_cpu_idle
     2.16%    ksoftirqd/0  [kernel.kallsyms]  [k] __pi___inval_dcache_area
     2.14%    ksoftirqd/0  [kernel.kallsyms]  [k] __netdev_alloc_skb
     1.93%        netperf  [kernel.kallsyms]  [k] __arch_copy_to_user
     1.79%        swapper  [kernel.kallsyms]  [k] tick_nohz_idle_enter
     1.63%    ksoftirqd/0  [kernel.kallsyms]  [k] __slab_alloc.isra.83

Above results collected on an R-Car Gen 3 Salvator-X/r8a7796 ES1.0.
Also tested on a R-Car Gen 3 Salvator-X/r8a7795 ES1.0.

By inspection this also appears to be compatible with the ravb found
on R-Car Gen 2 SoCs, however, this patch is currently untested on such
hardware.

Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04 10:26:05 -07:00
Ganesh Goudar
acd669a8f6 cxgb4: add new T6 pci device id's
Add 0x6085 T6 device id.

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-03 21:42:39 -07:00
David S. Miller
af14827fa3 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2017-10-03

This series contains updates to fm10k only.

Jake provides majority of the changes in this series, starting with using
fm10k_prepare_for_reset() if we lose PCIe link.  Before we would detach
the device and close the netdev, which left a lot of items still active,
such as the Tx/Rx resources.  This could cause problems where register
reads would return potentially invalid values and would result in unknown
driver behavior, so call fm10k_prepare_for_reset() much like we do for
suspend/resume cycles.  This will attempt to shutdown as much as possible
to prevent possible issues.  Then replaced the PCI specific legacy power
management hooks with the new generic power management hooks for both
suspend and hibernate.  Introduced a workqueue item which monitors a
queue of MAC and VLAN requests since a large number of MAC address or
VLAN updates at once can overload the mailbox with too many messages at
once.  Fixed a cppcheck warning by properly declaring the min_rate and
max_rate variables in the declaration and definition for .ndo_set_vf_bw,
rather than using "unused" for the minimum rates.

Joe Perches fixes the backward logic when using net_ratelimit().
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-03 16:25:05 -07:00
David Wu
05946876f0 net: stmmac: dwmac-rk: Add RK3128 GMAC support
Add constants and callback functions for the dwmac on rk3128 soc.
As can be seen, the base structure is the same, only registers
and the bits in them moved slightly.

Signed-off-by: David Wu <david.wu@rock-chips.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-03 15:39:56 -07:00
Dan Carpenter
b5c7d4e54c mlxsw: spectrum: Add missing error code on allocation failure
We accidentally return success if the kmalloc_array() call fails.

Fixes: 0e14c7777a ("mlxsw: spectrum: Add the multicast routing hardware logic")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-03 10:26:58 -07:00
Dan Carpenter
b508e0b6e4 mlxsw: spectrum: Fix check for IS_ERR() instead of NULL
mlxsw_afa_block_create() doesn't return error pointers, it returns NULL
on error.

Fixes: 0e14c7777a ("mlxsw: spectrum: Add the multicast routing hardware logic")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-03 10:26:58 -07:00
Yotam Gigi
f60c254998 mlxsw: spectrum: mr: Support trap-and-forward routes
Add the support of trap-and-forward route action in the multicast routing
offloading logic. A route will be set to trap-and-forward action if one (or
more) of its output interfaces is not offload-able, i.e. does not have a
valid Spectrum RIF.

This way, a route with mixed output VIFs list, which contains both
offload-able and un-offload-able devices can go through partial offloading
in hardware, and the rest will be done in the kernel ipmr module.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-03 10:06:30 -07:00
Yotam Gigi
607feadef8 mlxsw: spectrum: mr_tcam: Add trap-and-forward multicast route
In addition to the current multicast route actions, which include trap
route action and a forward route action, add the trap-and-forward multicast
route action, and implement it in the multicast routing hardware logic.

To implement that, add a trap-and-forward ACL action as the last action in
the route flexible action set. The used trap is the ACL2 trap, which marks
the packets with offload_mr_forward_mark, to prevent the packet from being
forwarded again by the kernel.

Note: At that stage the offloading logic does not support trap-and-forward
multicast routes. This patch adds the support only in the hardware logic.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-03 10:06:30 -07:00
Yotam Gigi
a0040c8c93 mlxsw: spectrum: Add trap for multicast trap-and-forward routes
When a multicast route is configured with trap-and-forward action, the
packets should be marked with skb->offload_mr_fwd_mark, in order to prevent
the packets from being forwarded again by the kernel ipmr module.

Due to this, it is not possible to use the already existing multicast trap
(MLXSW_TRAP_ID_ACL1) as the packet should be marked differently. Add the
MLXSW_TRAP_ID_ACL2 which is for trap-and-forward multicast routes, and set
the offload_mr_fwd_mark skb field in its handler.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-03 10:06:30 -07:00
Yotam Gigi
2678724355 mlxsw: acl: Introduce ACL trap and forward action
Use trap/discard flex action to implement trap and forward. The action will
later be used for multicast routing, as the multicast routing mechanism is
done using ACL flexible actions in Spectrum hardware. Using that action, it
will be possible to implement a trap-and-forward route.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-03 10:06:30 -07:00
Arjun Vynipadath
a047fbae23 cxgb4: Update comment for min_mtu
We have lost a comment for minimum mtu value set for netdevice with
'commit d894be57ca ("ethernet: use net core MTU range checking in
more drivers"). Updating it accordingly.

Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-03 09:55:41 -07:00