Commit Graph

74996 Commits

Author SHA1 Message Date
Michael Chan
845adfe40c bnxt_en: Improve valid bit checking in firmware response message.
When firmware sends a DMA response to the driver, the last byte of the
message will be set to 1 to indicate that the whole response is valid.
The driver waits for the message to be valid before reading the message.

The firmware spec allows these response messages to increase in
length by adding new fields to the end of these messages.  The
older spec's valid location may become a new field in a newer
spec.  To guarantee compatibility, the driver should zero the valid
byte before interpreting the entire message so that any new fields not
implemented by the older spec will be read as zero.

For messages that are forwarded to VFs, we need to set the length
and re-instate the valid bit so the VF will see the valid response.
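
A minimal sketch of the resulting handling, with hypothetical structure
and field names rather than the actual bnxt_en HWRM definitions:

#include <linux/delay.h>
#include <linux/errno.h>
#include <linux/io.h>
#include <linux/types.h>

#define FW_RESP_VALID_LOOPS	10000

/* Hypothetical response handling; the valid byte is always the last
 * byte of the (variable-length) message.
 */
static int wait_fw_resp_valid(u8 *resp, u16 len)
{
	u8 *valid = resp + len - 1;
	int i;

	for (i = 0; i < FW_RESP_VALID_LOOPS; i++) {
		if (READ_ONCE(*valid)) {
			/* Observe the rest of the DMA'ed message only
			 * after the valid byte is seen.
			 */
			dma_rmb();
			/* Zero the valid byte before parsing, so a field
			 * that a newer spec defines at this offset reads
			 * as zero under the older spec.
			 */
			*valid = 0;
			return 0;
		}
		udelay(1);
	}
	return -ETIMEDOUT;
}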

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 23:24:19 -04:00
Michael Chan
596f9d55fe bnxt_en: Improve resource accounting for SRIOV.
When VFs are created, the current code subtracts the maximum VF
resources from the PF's pool.  This underestimates the resources
remaining in the PF pool.  Instead, we should subtract the minimum
VF resources.  The VF minimum resources are guaranteed to the VFs
and only these should be subtracted from the PF's pool.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 23:24:19 -04:00
Michael Chan
db4723b3cd bnxt_en: Check max_tx_scheduler_inputs value from firmware.
When checking for the maximum pre-set TX channels for ethtool -l, we
need to check the current max_tx_scheduler_inputs parameter from firmware.
This parameter specifies the max input for the internal QoS nodes currently
available to this function.  The function's TX rings will be capped by this
parameter.  By adding this logic, we provide a more accurate pre-set max
TX channels to the user.
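
Conceptually the cap is a simple min (a hedged sketch; the names are
illustrative, not the bnxt_en fields):

#include <linux/kernel.h>
#include <linux/types.h>

static u32 preset_max_tx(u32 hw_max_tx_rings, u32 max_tx_scheduler_inputs)
{
	/* A function's TX rings cannot exceed the QoS node inputs
	 * currently available to it.
	 */
	return min(hw_max_tx_rings, max_tx_scheduler_inputs);
}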

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 23:24:19 -04:00
Vasundhara Volam
00db3cba35 bnxt_en: Add extended port statistics support
Gather periodic extended port statistics, if the device is a PF and
the link is up.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 23:24:19 -04:00
Vasundhara Volam
699efed00d bnxt_en: Include additional hardware port statistics in ethtool -S.
Include additional hardware port statistics in ethtool -S, which
are useful for debugging.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 23:24:19 -04:00
Vasundhara Volam
746df13964 bnxt_en: Add support for ndo_set_vf_trust
Trusted VFs are allowed to modify their MAC address, even when the PF
has assigned one.
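
The ndo_set_vf_trust hook takes (dev, vf, setting); a hedged sketch of
a driver-side implementation, with foo_* names being hypothetical:

#include <linux/netdevice.h>

struct foo_vf_info {
	bool trusted;
};

struct foo_priv {
	int num_vfs;
	struct foo_vf_info *vf;
};

static int foo_set_vf_trust(struct net_device *dev, int vf, bool setting)
{
	struct foo_priv *priv = netdev_priv(dev);

	if (vf >= priv->num_vfs)
		return -EINVAL;

	/* A trusted VF may later override a PF-assigned MAC address. */
	priv->vf[vf].trusted = setting;
	return 0;
}

static const struct net_device_ops foo_netdev_ops = {
	.ndo_set_vf_trust	= foo_set_vf_trust,
	/* ... remaining callbacks ... */
};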

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 23:24:19 -04:00
Scott Branden
2373d8d6a7 bnxt_en: fix clear flags in ethtool reset handling
Clear the flags when the reset command is processed successfully for
the specified components.

Fixes: 6502ad5963 ("bnxt_en: Add ETH_RESET_AP support")
Signed-off-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 23:24:19 -04:00
Michael Chan
abe93ad2e0 bnxt_en: Use a dedicated VNIC mode for RDMA.
If the RDMA driver is registered, use a new VNIC mode that allows
RDMA traffic to be seen on the netdev in promiscuous mode.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 23:24:19 -04:00
Michael Chan
1d3ef13dd4 bnxt_en: Adjust default rings for multi-port NICs.
Change the default ring logic to select a default of up to 8 rings per
port, as long as default rings x NIC ports <= total CPUs.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 23:24:19 -04:00
Michael Chan
d4f52de02f bnxt_en: Update firmware interface to 1.9.1.15.
Minor changes, such as new extended port statistics.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 23:24:19 -04:00
David S. Miller
8bde261e53 Merge tag 'mlx5-updates-2018-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:

====================
mlx5-updates-2018-03-30

This series contains updates to the mlx5 core and mlx5e netdev drivers.
The main highlight of this series is the RX optimizations for the
striding RQ path, introduced by Tariq.

The first four patches are trivial misc cleanups:
 - Spelling mistake fix
 - Dead code removal
 - Warning messages

RX optimizations for striding RQ:

1) RX refactoring, cleanups and micro optimizations
   - MTU calculation simplifications; obsoletes some WQEs-to-packets translation
     functions and helps delete ~60 LOC.
   - Do not busy-wait on a pending UMR completion.
   - Post the new values of the UMR WQE inline, instead of using a data pointer.
   - Use pre-initialized structures to save calculations in the datapath.

2) Use linear SKB in Striding RQ via build_skb(). Using a linear SKB has many advantages:
    - Saves a memcpy of the headers.
    - No page-boundary checks in datapath.
    - No filler CQEs.
    - Significantly smaller CQ.
    - SKB data resides contiguously in the linear part, rather than being
      split into a small linear part and a large fragment.
      This saves datapath cycles in driver and improves utilization
      of SKB fragments in GRO.
    - The fragments of a resulting GRO SKB follow the IP forwarding
      assumption of equal-size fragments.

    Implementation details:
    HW writes the packets to the beginning of a stride,
    i.e. does not keep headroom. To overcome this we make sure we can
    extend backwards and use the last bytes of stride i-1.
    Extra care is needed for stride 0 as it has no preceding stride.
    We make sure headroom bytes are available by shifting the buffer
    pointer passed to HW by headroom bytes.

    This configuration now becomes default, whenever capable.
    Of course, this implies turning LRO off.

    Performance testing:
    ConnectX-5, single core, single RX ring, default MTU.

    UDP packet rate, early drop in TC layer:

    --------------------------------------------
    | pkt size | before    | after     | ratio |
    --------------------------------------------
    | 1500byte | 4.65 Mpps | 5.96 Mpps | 1.28x |
    |  500byte | 5.23 Mpps | 5.97 Mpps | 1.14x |
    |   64byte | 5.94 Mpps | 5.96 Mpps | 1.00x |
    --------------------------------------------

    TCP streams: ~20% gain

3) Support XDP over Striding RQ:
    Now that linear SKB is supported over Striding RQ,
    we can support XDP by setting stride size to PAGE_SIZE
    and headroom to XDP_PACKET_HEADROOM.

    Striding RQ is capable of a higher packet-rate than
    conventional RQ.

    Performance testing:
    ConnectX-5, 24 rings, default MTU.
    CQE compression ON (to reduce completions BW in PCI).

    XDP_DROP packet rate:
    --------------------------------------------------
    | pkt size | XDP rate   | 100GbE linerate | pct% |
    --------------------------------------------------
    |   64byte | 126.2 Mpps |      148.0 Mpps |  85% |
    |  128byte |  80.0 Mpps |       84.8 Mpps |  94% |
    |  256byte |  42.7 Mpps |       42.7 Mpps | 100% |
    |  512byte |  23.4 Mpps |       23.4 Mpps | 100% |
    --------------------------------------------------

4) Remove mlx5 page_ref bulking in Striding RQ and use page_ref_inc only when needed.
   Without this bulking, we have:
    - no atomic ops on WQE allocation or free
    - one atomic op per SKB
    - In the default MTU configuration (1500, stride size is 2K),
      the non-bulking method executes 2 atomic ops, as before.
    - For larger MTUs with a stride size of 4K, the non-bulking method
      executes only a single op.
    - For XDP (stride size of 4K, no SKBs), the non-bulking method has
      no atomic ops per packet at all.

    Performance testing:
    ConnectX-5, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz.

    Single core packet rate (64 bytes).

    Early drop in TC: no degradation.

    XDP_DROP:
    before: 14,270,188 pps
    after:  20,503,603 pps, 43% improvement.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 22:31:43 -04:00
Haiyang Zhang
3be9b5fdc6 hv_netvsc: Clean up extra parameter from rndis_filter_receive_data()
The variables msg and data have the same value. This patch removes
the extra one.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 22:27:45 -04:00
Joe Perches
49b44aa23e ethernet: hisilicon: hns: hns_dsaf_mac: Use generic eth_broadcast_addr
Rather than use an on-stack array to copy a broadcast address, use
the generic eth_broadcast_addr function to save a trivial amount of
object code.
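
The pattern, roughly (eth_broadcast_addr() is the etherdevice.h helper
that memsets the address to 0xff):

#include <linux/etherdevice.h>

static void set_bcast_example(u8 *mac_addr)
{
	/* Before: an on-stack array copied into place, e.g.
	 *   u8 bcast[ETH_ALEN] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };
	 *   memcpy(mac_addr, bcast, ETH_ALEN);
	 * After: the generic helper.
	 */
	eth_broadcast_addr(mac_addr);
}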

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 22:26:43 -04:00
Kirill Tkhai
554873e517 net: Do not take net_rwsem in __rtnl_link_unregister()
This function calls call_netdevice_notifier(), which may also
take net_rwsem, so we can't use net_rwsem here.

This patch makes callers of this function take pernet_ops_rwsem,
like register_netdevice_notifier() does. This will protect
the modifications of net_namespace_list, and allows notifiers
to take it (they won't have to care about context).

Since __rtnl_link_unregister() is used on module load
and unload (which are not frequent operations), this looks
better to me than making all call_netdevice_notifier() calls
always execute in "protected net_namespace_list" context.

Also, this fixes the problem we dealt with in 328fbe747a
"Close race between {un, }register_netdevice_notifier and ...",
and guarantees __rtnl_link_unregister() does not skip an
exiting net.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 22:24:58 -04:00
Wei Yongjun
c679f6a26d net: hns3: remove unnecessary pci_set_drvdata() and devm_kfree()
There is no need for explicit calls of devm_kfree(), as the allocated
memory will be freed during the driver's detach.

The driver core clears the driver data to NULL after device_release.
Thus, there is no need to manually clear the device driver data to NULL.

So remove the unnecessary pci_set_drvdata() and devm_kfree().

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 22:22:25 -04:00
David Ahern
ef81710258 netdevsim: Change nsim_devlink_setup to return error to caller
Change nsim_devlink_setup to return any error back to the caller and
update nsim_init to handle it.

Requested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 22:22:10 -04:00
Vadim Lomovtsev
37c3347eb2 net: thunderx: add ndo_set_rx_mode callback implementation for VF
ndo_set_rx_mode() is called from atomic context, which causes message
response timeouts during VF-to-PF communication via MSI-X.
To get rid of that, we copy the passed mc list, parse the flags, and
queue the handling of the kernel request onto an ordered workqueue.
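
A hedged sketch of the deferral pattern (illustrative names, not the
actual thunderx code): snapshot what is needed inside the atomic
callback, then talk to the PF from an ordered workqueue:

#include <linux/netdevice.h>
#include <linux/slab.h>
#include <linux/workqueue.h>

struct xcast_request {
	struct work_struct work;
	struct net_device *netdev;
	unsigned int flags;		/* snapshot of netdev->flags */
	/* a copy of the mc list would go here as well */
};

static struct workqueue_struct *xcast_wq; /* alloc_ordered_workqueue() at probe */

static void xcast_work_handler(struct work_struct *work)
{
	struct xcast_request *req = container_of(work, struct xcast_request, work);

	/* Non-atomic context: safe to send mailbox messages to the PF
	 * and wait for the MSI-X signalled responses.
	 */
	kfree(req);
}

static void foo_set_rx_mode(struct net_device *netdev)
{
	struct xcast_request *req;

	/* Atomic context: no sleeping allowed, so only snapshot and defer. */
	req = kzalloc(sizeof(*req), GFP_ATOMIC);
	if (!req)
		return;

	req->netdev = netdev;
	req->flags = netdev->flags;
	INIT_WORK(&req->work, xcast_work_handler);
	queue_work(xcast_wq, &req->work);
}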

Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 22:18:17 -04:00
Vadim Lomovtsev
1b6d55f239 net: thunderx: add workqueue control structures for handle ndo_set_rx_mode request
The kernel calls the ndo_set_rx_mode() callback from atomic context, which
causes messaging timeouts between VF and PF (as they’re implemented via
MSI-X). So in order to handle ndo_set_rx_mode() we need to move the
processing out of atomic context.

This commit implements necessary workqueue related structures to let VF
queue kernel request processing in non-atomic context later.

Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 22:18:17 -04:00
Vadim Lomovtsev
aba4a2633b net: thunderx: add XCAST messages handlers for PF
This commit adds message handling for the ndo_set_rx_mode()
callback on the PF side.

Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 22:18:17 -04:00
Vadim Lomovtsev
0b849f58f2 net: thunderx: add new messages for handle ndo_set_rx_mode callback
The kernel calls the ndo_set_rx_mode() callback, supplying it with all the
necessary info, such as device state flags, the multicast MAC address list
and so on.
Since we have only 128 bits to communicate with the PF, we need to issue
several requests to the PF, each carrying one small/short operation, based
on the input data.

So this commit implements the following PF message codes, along with new
data structures for them:
NIC_MBOX_MSG_RESET_XCAST to flush all filters configured for this
                          particular network interface (VF)
NIC_MBOX_MSG_ADD_MCAST   to add new MAC address to DMAC filter registers
                          for this particular network interface (VF)
NIC_MBOX_MSG_SET_XCAST   to apply filtering configuration to filter control
                          register
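
A hedged sketch of what such definitions could look like (codes, values
and layout here are illustrative only, not the actual thunderx nic.h
mailbox definitions):

#include <linux/types.h>

enum {
	NIC_MBOX_MSG_RESET_XCAST = 0xF1, /* flush all filters for this VF */
	NIC_MBOX_MSG_ADD_MCAST   = 0xF2, /* add MAC to DMAC filter registers */
	NIC_MBOX_MSG_SET_XCAST   = 0xF3, /* apply filter control configuration */
};

/* The whole message must fit in the 128-bit mailbox, hence one
 * small/short operation per request.
 */
struct xcast_msg {
	u8 msg;		/* one of the codes above */
	u8 vf_id;
	u8 mode;	/* for SET_XCAST: accept all / reject all / filter */
	u8 mac[6];	/* for ADD_MCAST */
} __packed;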

Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 22:18:17 -04:00
Vadim Lomovtsev
ceb9ea21cc net: thunderx: add multicast filter management support
The ThunderX NIC can be partitioned into up to 128 VFs, which are then
presented to the system. Each VF is mapped to a BGX:LMAC pair, and each
VF is configured by the kernel individually. Eventually, a bunch of VFs
could be mapped onto the same BGX:LMAC pair, causing several multicast
filtering configuration requests to the LMAC with the same MAC addresses.

This commit adds ThunderX NIC BGX filtering manipulation routines.

Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 22:18:17 -04:00
Vadim Lomovtsev
3a34ecfd9d net: thunderx: add MAC address filter tracking for LMAC
The ThunderX NIC has two Ethernet interfaces (BGX), each of which can
have up to four logical MACs configured. Each BGX has 32 filters to be
configured for filtering ingress packets. The number of filters
available to a particular LMAC ranges from 8 (if four LMACs are
configured per BGX) up to 32 (if only one LMAC is configured per BGX).

At the same time, the NIC can present up to 128 VFs to the OS as network
interfaces, each of which the kernel will configure with a set of MAC
addresses for filtering. So, to prevent duplicates in the BGX filter
registers coming from different network interfaces, it is required to
cache and track all filter configuration requests prior to applying
them to the BGX filter registers.

This commit updates the LMAC structures with control fields for
allocating/releasing the filter tracking list, along with implementing
per-LMAC DMAC array allocation/release.
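
A hedged sketch of such per-LMAC tracking state (illustrative; the real
layout lives in the thunder_bgx structures):

#include <linux/types.h>

struct lmac_dmac_entry {
	u64  dmac;		/* tracked MAC address */
	bool in_use;
};

struct lmac_sketch {
	struct lmac_dmac_entry *dmacs;	/* allocated per LMAC */
	u8 dmacs_count;			/* 8..32, depending on LMACs per BGX */
	u8 dmacs_used;			/* entries currently configured */
};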

Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 22:18:16 -04:00
Vadim Lomovtsev
f8ad1f3f07 net: thunderx: move filter register related macro into proper place
The ThunderX NIC has a set of registers which allows configuring the
filter policy for ingress packets. There are three possible regimes
for filtering multicasts, broadcasts and unicasts: accept all, reject all,
and accept only what the filter allows.

The current implementation has an enum with all of them and two generic
macros for enabling filtering at all (CAM_ACCEPT) and enabling/disabling
broadcast packets, which also should be corrected in order to represent
the register bits properly. All these values are private to the driver and
there is no need to ‘publish’ them via a header file.

This commit moves the filtering register manipulation values from the
header file into the source, with explicit assignment of the exact
register values to be used when configuring the registers.

Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 22:18:16 -04:00
Martin Blumenstingl
7676693c68 net: stmmac: dwmac-meson8b: Add support for the Meson8m2 SoC
The Meson8m2 SoC uses a similar (potentially even identical) register
layout to that of the Meson8b and GXBB SoCs for the dwmac glue.
Add a new compatible string and update the module description to
indicate support for these SoCs.

Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 22:17:24 -04:00
Tariq Toukan
ab966d7e4f net/mlx5e: RX, Recycle buffer of UMR WQEs
Upon a new UMR post, check if the WQE buffer contains
a previous UMR WQE. If so, modify the dynamic fields
instead of a whole WQE overwrite. This saves a memcpy.

In the current setting, after 2 WQ cycles (12 UMR posts),
this will always be the case.

No degradation sensed.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-30 16:55:07 -07:00
Tariq Toukan
b8a98a4cf3 net/mlx5e: Keep single pre-initialized UMR WQE per RQ
All UMR WQEs of an RQ share many common fields. We use
pre-initialized structures to save calculations in datapath.
One field (xlt_offset) was the only reason we saved a pre-initialized
copy per WQE index.
Here we remove its initialization (move its calculation to datapath),
and reduce the number of copies to one-per-RQ.

A very small datapath calculation is added; it occurs once per MPWQE
(i.e. once every 256KB), but it reduces memory consumption and gives
better cache utilization.

Performance testing:
Tested packet rate, no degradation sensed.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-30 16:55:07 -07:00
Tariq Toukan
9f9e9cd50e net/mlx5e: Remove page_ref bulking in Striding RQ
When many packets reside on the same page, the bulking of
page_ref modifications reduces the total number of atomic
operations executed.

Besides the necessary 2 operations on page alloc/free, we
have the following extra ops per page:
- one on WQE allocation (bump refcnt to maximum possible),
- zero ops for SKBs,
- one on WQE free,
a constant of two operations in total, no matter how many
packets/SKBs actually populate the page.

Without this bulking, we have:
- no ops on WQE allocation or free,
- one op per SKB,

Comparing the two methods when PAGE_SIZE is 4K:
- As mentioned above, bulking method always executes 2 operations,
  not more, but not less.
- In the default MTU configuration (1500, stride size is 2K),
  the non-bulking method executes 2 ops as well.
- For larger MTUs with stride size of 4K, non-bulking method
  executes only a single op.
- For XDP (stride size of 4K, no SKBs), non-bulking method
  executes no ops at all!

Hence, to optimize the flows with linear SKB and XDP over Striding RQ,
we remove the page_ref bulking method here.
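
A hedged sketch contrasting the two accounting schemes
(page_ref_add()/page_ref_sub()/page_ref_inc() are the generic page
refcount helpers; the surrounding logic is illustrative):

#include <linux/page_ref.h>

/* Bulking (removed): bump the refcount to the maximum up front and
 * give back the unused references on WQE free -- always 2 atomics,
 * no matter how many SKBs the page actually served.
 */
static void wqe_alloc_bulked(struct page *page, int max_pkts_per_page)
{
	page_ref_add(page, max_pkts_per_page);	/* atomic #1 */
}

static void wqe_free_bulked(struct page *page, int unused_refs)
{
	page_ref_sub(page, unused_refs);	/* atomic #2 */
}

/* Non-bulking (this series): one reference per SKB actually built;
 * zero extra atomics for XDP, which builds no SKBs.
 */
static void skb_built_on_page(struct page *page)
{
	page_ref_inc(page);
}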

Performance testing:
ConnectX-5, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz.

Single core packet rate (64 bytes).

Early drop in TC: no degradation.

XDP_DROP:
before: 14,270,188 pps
after:  20,503,603 pps, 43% improvement.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-30 16:55:06 -07:00
Tariq Toukan
22f4539881 net/mlx5e: Support XDP over Striding RQ
Add XDP support over Striding RQ.
Now that linear SKB is supported over Striding RQ,
we can support XDP by setting stride size to PAGE_SIZE
and headroom to XDP_PACKET_HEADROOM.
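
The parameter choice, sketched (field names are illustrative, not the
mlx5e structures):

#include <linux/bpf.h>	/* XDP_PACKET_HEADROOM */
#include <linux/mm.h>	/* PAGE_SIZE */

struct rq_params_sketch {
	u32 stride_size;
	u32 headroom;
};

static void rq_params_for_xdp(struct rq_params_sketch *p)
{
	/* One packet per page-sized stride, so the XDP program can own
	 * the page, with the headroom XDP expects in front of the data.
	 */
	p->stride_size = PAGE_SIZE;
	p->headroom    = XDP_PACKET_HEADROOM;
}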

Upon an MPWQE free, do not release pages that are in flight for
XDP xmit; they will be released upon completion.

Striding RQ is capable of a higher packet-rate than
conventional RQ.
A performance gain is expected for all cases that had
a HW packet-rate bottleneck. This is the case whenever
using many flows that distribute to many cores.

Performance testing:
ConnectX-5, 24 rings, default MTU.
CQE compression ON (to reduce completions BW in PCI).

XDP_DROP packet rate:
--------------------------------------------------
| pkt size | XDP rate   | 100GbE linerate | pct% |
--------------------------------------------------
|   64byte | 126.2 Mpps |      148.0 Mpps |  85% |
|  128byte |  80.0 Mpps |       84.8 Mpps |  94% |
|  256byte |  42.7 Mpps |       42.7 Mpps | 100% |
|  512byte |  23.4 Mpps |       23.4 Mpps | 100% |
--------------------------------------------------

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-30 16:55:06 -07:00
Tariq Toukan
121e892754 net/mlx5e: Refactor RQ XDP_TX indication
Make the xdp_xmit indication available for Striding RQ
by taking it out of the type-specific union.
This refactor is a preparation for a downstream patch that
adds XDP support over Striding RQ.
In addition, use a bitmap instead of a boolean for possible
future flags.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-30 16:55:05 -07:00
Tariq Toukan
619a8f2a42 net/mlx5e: Use linear SKB in Striding RQ
The current Striding RQ HW feature utilizes the RX buffers so that
there is no wasted room between the strides. This maximises
the memory utilization.
This prevents the use of build_skb() (which requires headroom
and tailroom), and demands a memcpy of the packet headers into
the skb linear part.

In this patch, whenever a set of conditions holds, we apply
an RQ configuration that allows combining the use of linear SKB
on top of a Striding RQ.

To use build_skb() with Striding RQ, the following must hold:
1. packet does not cross a page boundary.
2. there is enough headroom and tailroom surrounding the packet.

We can satisfy 1 and 2 by configuring:
	stride size = MTU + headroom + tailroom.

This is possible only when:
a. (MTU + headroom + tailroom) does not exceed PAGE_SIZE.
b. HW LRO is turned off.

Using linear SKB has many advantages:
- Saves a memcpy of the headers.
- No page-boundary checks in datapath.
- No filler CQEs.
- Significantly smaller CQ.
- SKB data resides contiguously in the linear part, rather than being
  split into a small linear part and a large fragment.
  This saves datapath cycles in driver and improves utilization
  of SKB fragments in GRO.
- The fragments of a resulting GRO SKB follow the IP forwarding
  assumption of equal-size fragments.

Some implementation details:
HW writes the packets to the beginning of a stride,
i.e. does not keep headroom. To overcome this we make sure we can
extend backwards and use the last bytes of stride i-1.
Extra care is needed for stride 0 as it has no preceding stride.
We make sure headroom bytes are available by shifting the buffer
pointer passed to HW by headroom bytes.

This configuration now becomes default, whenever capable.
Of course, this implies turning LRO off.

Performance testing:
ConnectX-5, single core, single RX ring, default MTU.

UDP packet rate, early drop in TC layer:

--------------------------------------------
| pkt size | before    | after     | ratio |
--------------------------------------------
| 1500byte | 4.65 Mpps | 5.96 Mpps | 1.28x |
|  500byte | 5.23 Mpps | 5.97 Mpps | 1.14x |
|   64byte | 5.94 Mpps | 5.96 Mpps | 1.00x |
--------------------------------------------

TCP streams: ~20% gain

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-30 16:54:49 -07:00
Tariq Toukan
ea3886cab7 net/mlx5e: Use inline MTTs in UMR WQEs
When modifying the page mapping of a HW memory region
(via a UMR post), post the new values inlined in the WQE,
instead of using a data pointer.

This is a micro-optimization, inline UMR WQEs of different
rings scale better in HW.

In addition, this obsoletes a few control flows and helps
delete ~50 LOC.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-30 16:16:17 -07:00
Tariq Toukan
e4d86a4a58 net/mlx5e: Do not busy-wait for UMR completion in Striding RQ
Do not busy-wait on a pending UMR completion. Under high HW load,
busy-waiting on a delayed completion would fully utilize the CPU core
and mistakenly indicate a SW bottleneck.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-30 16:16:17 -07:00
Tariq Toukan
18187fb2c3 net/mlx5e: Code movements in RX UMR WQE post
Gather the process of a UMR WQE post into one function,
in preparation for a downstream patch that inlines
the WQE data.
No functional change here.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-30 16:16:17 -07:00
Tariq Toukan
73281b78a3 net/mlx5e: Derive Striding RQ size from MTU
In Striding RQ, each WQE serves multiple packets
(hence called Multi-Packet WQE, MPWQE).
The size of an MPWQE is constant (currently 256KB).

Upon a ringparam set operation, we calculate the number of
MPWQEs per RQ. For this, we first need to determine the
number of packets that can reside within a single MPWQE.
In this patch we use the actual MTU size instead of ETH_DATA_LEN
for this calculation.

This implies that a change in MTU might require a change
in Striding RQ ring size.
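
In rough terms the calculation is (a hedged sketch; the constant and
names are illustrative):

#include <linux/kernel.h>	/* DIV_ROUND_UP */
#include <linux/log2.h>		/* roundup_pow_of_two */

#define MPWQE_SIZE	(256 * 1024)	/* constant MPWQE byte size */

static u32 num_mpwqes(u32 wanted_pkts, u32 mtu, u32 headroom, u32 tailroom)
{
	/* Packets per MPWQE now follow the real MTU, not ETH_DATA_LEN. */
	u32 stride_size = roundup_pow_of_two(mtu + headroom + tailroom);
	u32 pkts_per_mpwqe = MPWQE_SIZE / stride_size;

	return DIV_ROUND_UP(wanted_pkts, pkts_per_mpwqe);
}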

In addition, this obsoletes some WQEs-to-packets translation
functions and helps delete ~60 LOC.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-30 16:16:17 -07:00
Tariq Toukan
472a1e44b3 net/mlx5e: Save MTU in channels params
Knowing the MTU is required for RQ creation flow.
By our design, the channels creation flow is totally isolated
from priv/netdev, and can be completed with access to the
channels params and mdev.
Adding the MTU to the channels params helps preserve that.
In addition, we save it in RQ to make its access faster in
datapath checks.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-30 16:16:17 -07:00
Talat Batheesh
533788988c net/mlx5e: IPoIB, Fix spelling mistake
Fix spelling mistake in debug message text.
"dettaching" -> "detaching"

Signed-off-by: Talat Batheesh <talatb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-30 16:16:17 -07:00
Alaa Hleihel
6c75062823 net/mlx5: Change teardown with force mode failure message to warning
With ConnectX-4, we expect the force teardown to fail in case DC was
enabled; therefore, change the message from error to warning.

Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-30 16:16:17 -07:00
Saeed Mahameed
b2d3907c23 net/mlx5: Eliminate query xsrq dead code
1. This function is not used anywhere in the mlx5 driver.
2. It has a memcpy statement that makes no sense and produces a build
warning with gcc8:

drivers/net/ethernet/mellanox/mlx5/core/transobj.c: In function 'mlx5_core_query_xsrq':
drivers/net/ethernet/mellanox/mlx5/core/transobj.c:347:3: error: 'memcpy' source argument is the same as destination [-Werror=restrict]

Fixes: 01949d0109 ("net/mlx5_core: Enable XRCs and SRQs when using ISSI > 0")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-30 16:16:17 -07:00
Saeed Mahameed
7b2117bb8f net/mlx5e: Use eq ptr from cq
Instead of looking for the EQ of the CQ, remove that redundant code and
use the eq pointer stored in the cq struct.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-30 16:16:17 -07:00
Yelena Krivosheev
e81b5e01c1 net: mvneta: fix enable of all initialized RXQs
In mvneta_port_up() we enable the relevant RX and TX port queues by
writing a queue bitmap to the appropriate register.

q_map must be ZERO at the beginning of this process.
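
The fix amounts to building the bitmap from zero, roughly (a sketch
with placeholder types and accessors, not the real mvneta ones):

#include <linux/bitops.h>
#include <linux/types.h>

#define RXQ_CMD_REG	0x2680	/* illustrative register offset */

struct sketch_rxq  { bool initialized; };
struct sketch_port { int num_rxqs; struct sketch_rxq *rxqs; };

static void mvreg_write(struct sketch_port *pp, u32 reg, u32 val)
{
	/* placeholder for the real MMIO accessor */
}

static void port_up_sketch(struct sketch_port *pp)
{
	u32 q_map = 0;	/* must start from zero, not from a stale value */
	int queue;

	for (queue = 0; queue < pp->num_rxqs; queue++)
		if (pp->rxqs[queue].initialized)
			q_map |= BIT(queue);

	mvreg_write(pp, RXQ_CMD_REG, q_map);
}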

Signed-off-by: Yelena Krivosheev <yelena@marvell.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Acked-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30 14:27:47 -04:00
David Ahern
82dd0d2a9a vrf: Fix use after free and double free in vrf_finish_output
Miguel reported an skb use-after-free / double free in vrf_finish_output
when neigh_output returns an error. The vrf driver should return right
after the call to neigh_output, as neigh_output takes over the skb on
the error path as well.

This patch is a simplified version of Miguel's patch, which was written
for 4.9 and updated to the top of the tree.
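
The shape of the fix, hedged and simplified from the actual
vrf_finish_output flow (neigh_output() here is the 4.16-era
two-argument helper):

#include <net/neighbour.h>

static int finish_output_sketch(struct neighbour *neigh, struct sk_buff *skb)
{
	int ret;

	/* neigh_output() consumes the skb on the error path too, so the
	 * caller must return immediately; falling through to a path
	 * that frees the skb again caused the use-after-free.
	 */
	ret = neigh_output(neigh, skb);
	return ret;
}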

Fixes: 8f58336d3f ("net: Add ethernet header for pass through VRF device")
Signed-off-by: Miguel Fadon Perlines <mfadon@teldat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30 14:20:23 -04:00
Raghu Vatsavayi
ccdd0b4c35 liquidio: prevent rx queues from getting stalled
This commit fixes RX traffic issues seen when stress testing the driver
with continuous ifconfig up/down under very high traffic conditions.

The reason for the issue is that, in the existing liquidio_stop function,
NAPI is disabled even before the actual FW/HW interface is brought down
via send_rx_ctrl_cmd(lio, 0). Between the NAPI disable and the actual
interface down in firmware, the firmware continuously enqueues rx traffic
to the host. When an interrupt arrives for the new packets, the host irq
handler fails to schedule NAPI, as NAPI is already disabled.

After "ifconfig <iface> up", the host re-enables NAPI but cannot schedule
it until it receives another Rx interrupt. The host never receives an Rx
interrupt, as it never cleared the Rx interrupt it received during the
interface down operation. The NIC Rx interrupt gets cleared only when the
host processes the queue and clears the queue counts. The above anomaly
leads to other issues, like packet overflow in FW/HW queues and
backpressure.

Fix:
This commit fixes this issue by disabling NAPI only after informing
firmware to stop queueing packets to host via send_rx_ctrl_cmd(lio, 0).
send_rx_ctrl_cmd is not visible in the patch as it is already there in the
code. The DOWN command also waits for any pending packets to be processed
by NAPI so that the deadlock will not occur.
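
The ordering change, sketched with simplified placeholder types (the
real driver iterates per-queue NAPI instances and uses its own
send_rx_ctrl_cmd()):

#include <linux/netdevice.h>

struct lio_sketch {
	struct napi_struct napi;
};

static void send_rx_ctrl_stop(struct lio_sketch *lio)
{
	/* placeholder: FW DOWN command; also waits for pending packets
	 * to be processed by NAPI.
	 */
}

static void stop_sketch(struct lio_sketch *lio)
{
	/* First tell firmware to stop queueing RX traffic, so no RX
	 * interrupt can be left pending...
	 */
	send_rx_ctrl_stop(lio);

	/* ...and only then disable NAPI. Previously this ran first, and
	 * an interrupt arriving in the window could not schedule NAPI.
	 */
	napi_disable(&lio->napi);
}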

Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com>
Acked-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30 14:16:19 -04:00
David S. Miller
6f14f49ce5 Merge branch 'ieee802154-for-davem-2018-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan-next
Stefan Schmidt says:

====================
pull-request: ieee802154-next 2018-03-29

An update from ieee802154 for *net-next*

Colin fixed an unused variable in the new mcr20a driver.
Harry fixed an uninitialised data read in the debugfs interface of the
ca8210 driver.

If there are any issues, or you think these are too late for -rc1 (both
can also go into -rc2 as they are simple fixes), let me know.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30 13:00:11 -04:00
Jose Abreu
8bf993a587 net: stmmac: Add support for DWMAC5 and implement Safety Features
This adds initial support for DWMAC5 and implements the Automotive Safety
Package, which is available from core version 5.10.

The Automotive Safety Package (also called Safety Features) provides
error protection in the core by implementing ECC protection in memories,
on-chip data path parity protection, FSM parity and timeout protection,
and Application/CSR interface timeout protection.

In case of an uncorrectable error we call stmmac_global_err() and
reconfigure the whole core.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30 12:32:00 -04:00
Jose Abreu
34877a15f7 net: stmmac: Rework and fix TX Timeout code
Currently the TX Timeout handler does not behave as expected and leads to
an unrecoverable state. Rework the current implementation of TX Timeout
handling to actually perform a complete reset of the driver state and IP.

We use deferred work to launch a task which will be responsible for
resetting the system.
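
A hedged sketch of the deferred-work pattern (the actual stmmac code
differs in detail; the tmo_* names are illustrative):

#include <linux/netdevice.h>
#include <linux/rtnetlink.h>
#include <linux/workqueue.h>

struct tmo_priv {
	struct net_device *dev;
	struct work_struct service_task;
};

static void tmo_service_task(struct work_struct *work)
{
	struct tmo_priv *priv = container_of(work, struct tmo_priv,
					     service_task);

	/* Process context: stop the datapath, reset the IP, reprogram
	 * it, and restart the queues.
	 */
	rtnl_lock();
	/* ... full reset of priv->dev goes here ... */
	rtnl_unlock();
}

/* ndo_tx_timeout runs where a full reset is not possible, so it only
 * kicks the deferred work.
 */
static void tmo_tx_timeout(struct net_device *dev)
{
	struct tmo_priv *priv = netdev_priv(dev);

	schedule_work(&priv->service_task);
}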

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30 12:31:59 -04:00
Jisheng Zhang
02281a3525 net: mvneta: remove duplicate *_coal assignment
The style of the rx/tx queue's *_coal member assignment is:

static void foo_coal_set(...)
{
	set the coal in hw;
	update queue's foo_coal member; [1]
}

Elsewhere, we call foo_coal_set(pp, queue->foo_coal), so the assignment
in [1] above is duplicated and can be removed.

Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30 12:27:25 -04:00
Mike Looijmans
aa076e3d22 net: macb: Try to retrieve MAC address from nvmem provider
Call of_get_nvmem_mac_address() to fetch the MAC address from an nvmem
cell, if one is provided in the device tree. This allows the address to
be stored in an I2C EEPROM device for example.
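
The probe-time pattern, roughly (hedged: of_get_nvmem_mac_address() is
the of_net.h helper introduced alongside this change; the fallback
handling is simplified):

#include <linux/etherdevice.h>
#include <linux/of_net.h>

static void sketch_get_mac(struct device_node *np, struct net_device *dev)
{
	/* Try the nvmem cell referenced in the device tree first, e.g.
	 * an I2C EEPROM exposed as an nvmem provider.
	 */
	if (of_get_nvmem_mac_address(np, dev->dev_addr) == 0)
		return;

	/* Fall back to a random address if nothing usable was found. */
	eth_hw_addr_random(dev);
}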

Signed-off-by: Mike Looijmans <mike.looijmans@topic.nl>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30 10:40:18 -04:00
John Hurley
29a5dcae27 nfp: flower: offload phys port MTU change
Trigger a port mod message to request an MTU change on the NIC when any
physical port representor is assigned a new MTU value. The driver waits
10 msec for an ack that the FW has set the MTU. If no ack is received the
request is rejected and an appropriate warning flagged.

Rather than maintain an MTU queue per repr, one is maintained per app.
Because the MTU ndo is protected by the rtnl lock, there can never be
contention here. Portmod messages from the NIC are also protected by
rtnl, so we first check if the portmod is an ack and, if so, handle it
outside rtnl and the cmsg work queue.

Acks are detected by the marking of a bit in a portmod response. They are
then verified by checking the port number and MTU value expected by the
app. If the expected MTU is 0 then no acks are currently expected.

Also, ensure that the packet headroom reserved by the flower firmware is
considered when accepting an MTU change on any repr.
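
The 10 msec ack wait maps naturally onto a completion (a hedged sketch;
the real nfp code tracks the expected port and MTU per app):

#include <linux/completion.h>
#include <linux/errno.h>
#include <linux/jiffies.h>

struct mtu_ack_sketch {
	struct completion done;	/* completed by the portmod-ack handler */
	u32 exp_mtu;		/* 0 means no ack currently expected */
};

static int wait_mtu_ack(struct mtu_ack_sketch *ack)
{
	/* Reject the MTU change if firmware does not ack within 10 ms. */
	if (!wait_for_completion_timeout(&ack->done, msecs_to_jiffies(10)))
		return -ETIMEDOUT;
	return 0;
}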

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30 10:18:55 -04:00
John Hurley
167cebeffa nfp: modify app MTU setting callbacks
Rename the 'change_mtu' app callback to 'check_mtu'. This is called
whenever an MTU change is requested on a netdev. It can reject the
change but is not responsible for implementing it.

Introduce a new 'repr_change_mtu' app callback that is hit when the MTU
of a repr is to be changed. This is responsible for performing the MTU
change and verifying it.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30 10:18:54 -04:00
Russell King
e679c9c1db sfp/phylink: move module EEPROM ethtool access into netdev core ethtool
Provide a pointer to the SFP bus in struct net_device, so that the
ethtool module EEPROM methods can access the SFP directly, rather
than needing every user to provide a hook for it.
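
In effect the ethtool core can now do something like this (a hedged
sketch; sfp_get_module_info() is the sfp-bus accessor of that era):

#include <linux/ethtool.h>
#include <linux/netdevice.h>
#include <linux/sfp.h>

static int get_module_info_sketch(struct net_device *dev,
				  struct ethtool_modinfo *modinfo)
{
	/* Prefer the netdev's SFP bus when present, instead of
	 * requiring every driver to provide a hook.
	 */
	if (dev->sfp_bus)
		return sfp_get_module_info(dev->sfp_bus, modinfo);

	if (dev->ethtool_ops->get_module_info)
		return dev->ethtool_ops->get_module_info(dev, modinfo);

	return -EOPNOTSUPP;
}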

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30 10:11:06 -04:00