Commit Graph

25387 Commits

Author SHA1 Message Date
Anirudh Venkataramanan
45d3d428ea ice: Move common functions out of ice_main.c part 1/7
The functions that are used for PF VSI/netdev setup will also be used
for SR-IOV support. To allow reuse of these functions, move these
functions out of ice_main.c to ice_common.c/ice_lib.c

This move is done across multiple patches. Each patch moves a few
functions and may have minor adjustments. For example, a function that was
previously static in ice_main.c will be made non-static temporarily in
its new location to allow the driver to build cleanly. These adjustments
will be removed in subsequent patches where more code is moved out of
ice_main.c

In this particular patch, the following functions were moved out of
ice_main.c:
int ice_add_mac_to_list
ice_free_fltr_list
ice_stat_update40
ice_stat_update32
ice_update_eth_stats
ice_vsi_add_vlan
ice_vsi_kill_vlan
ice_vsi_manage_vlan_insertion
ice_vsi_manage_vlan_stripping

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-01 12:39:40 -07:00
Alaa Hleihel
59c9d35ea9 net/mlx5: Cache the system image guid
The system image guid is a read-only field which is used by the TC
offloads code to determine if two mlx5 devices belong to the same
ASIC while adding flows.

Read this once and save it on the core device rather than querying each
time an offloaded flow is added.

Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:47 -07:00
Or Gerlitz
b856df28f9 net/mlx5e: Allow reporting of checksum unnecessary
Currently we practically never report checksum unnecessary, because
for all IP packets we take the checksum complete path.

Enable non-default runs with reprorting checksum unnecessary, using
an ethtool private flag. This can be useful for performance evals
and other explorations.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:47 -07:00
Or Gerlitz
b820e6fb09 net/mlx5e: Enable reporting checksum unnecessary also for L3 packets
We can report checksum unnecessary also when the L3 checksum
flag on the cqe is set and there's no L4 header.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:46 -07:00
Gavi Teitz
f128f138cc net/mlx5e: Add ethtool control of ring params to VF representors
Added ethtool control to the representors for setting and querying
the ring params.

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
2018-10-01 11:32:46 -07:00
Gavi Teitz
84a0973386 net/mlx5e: Enable multi-queue and RSS for VF representors
Increased the amount of channels the representors can open to be the
amount of CPUs. The default amount opened remains one.

Used the standard NIC netdev functions to:
* Set RSS params when building the representors' params.
* Setup an indirect TIR and RQT for the representors upon
  initialization.
* Create a TTC flow table for the representors' indirect TIR (when
  creating the TTC table, mlx5e_set_ttc_basic_params() is not called,
  in order to avoid setting the inner_ttc param, which is not needed).

Added ethtool control to the representors for setting and querying
the amount of open channels. Additionally, included logic in the
representors' ethtool set channels handler which controls a
representor's vport rx rule, so that if there is one open channel
the rx rule steers traffic to the representor's direct TIR, whereas
if there is more than one channel, the rx rule steers traffic to the
new TTC flow table.

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:45 -07:00
Or Gerlitz
a5355de878 net/mlx5e: Expose ethtool rss key size / indirection table functions
Towards enabling RSS for the vport representors, expose the functions for
querying the rss hash key size and indirection table size via ethtool.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:45 -07:00
Gavi Teitz
3edc0159c0 net/mlx5e: Expose function for building RSS params
Towards enabling RSS for the vport representors, extract the
procedure for building a device's RSS params, and expose the
function.

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:44 -07:00
Or Gerlitz
46dc933cee net/mlx5e: Provide explicit directive if to create inner indirect tirs
Change the driver functions that deal with creating indirect tirs
to get a flag telling if inner ttc is desired.

A pre-step for enabling rss on the vport representors, where
inner ttc is not needed.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:44 -07:00
Gavi Teitz
c966f7d55d net/mlx5: E-Switch, Provide flow dest when creating vport rx rule
Currently the destination for the representor e-switch rx rule is
a TIR number. Towards changing that to potentially be a flow table,
as part of enabling RSS for representors, modify the signature of
the related e-switch API to get a flow destination.

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:43 -07:00
Gavi Teitz
092297e09a net/mlx5e: Extract creation of rep's default flow rule
Cleaning up the flow of the representors' rx initialization, towards
enabling RSS for the representors.

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:43 -07:00
Gavi Teitz
dabeb3b0d5 net/mlx5e: Enable stateless offloads for VF representor netdevs
Enabled checksum and TSO offloads for the representors, in
order to increase their performance, which is required to
increase the performance of flows that cannot be offloaded.

Checksum offloads contribute to a general acceleration of all
traffic (to around 150%), whereas the TSO offload contributes
to a prominent acceleration of the representor's TX for traffic
flows with larger than MTU sized packets (to around 200%). This
is the usual case for TCP streams, as the PF, which serves as
the uplink representor, and the VF representors employ GRO before
forwarding the packets to the representor.

GRO was enabled implicitly for the representors beforehand, and
is explicitly enabled here to ensure that the representors preserve
the performance boost it provides (of around 200%) when working in
tandem with the TSO offload by the forwardee, which is the standard
case as both the PF and the VF representors employ HW TSO.

The impact of these changes can be seen in the following
measurements taken on a setup of a VM over a VF, connected
to OVS via the VF representor, to an external host:

Before current changes:
                     TCP Throughput [Gb/s]
External host to VM         ~ 10.5
VM to external host         ~ 23.5

With just checksum offloads enabled:
                     TCP Throughput [Gb/s]
External host to VM         ~ 14.9
VM to external host         ~ 28.5

With the TSO offload also enabled:
                     TCP Throughput [Gb/s]
External host to VM         ~ 30.5

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:42 -07:00
Gavi Teitz
749359f4aa net/mlx5e: Change VF representors' RQ type
The representors' RQ size was not large enough for them to achieve
high enough performance, and therefore needed to be enlarged, while
suffering a minimum hit to its memory usage. To achieve this the
representors RQ size was increased, and its type was changed to be a
striding RQ if it is supported.

Towards that goal the following changes were made:

* Extracted the sequence for setting the standard netdev's RQ parmas
  into a function

* Replaced the sequence for setting the representor's RQ params with
  the standard sequence

The impact of this change can be seen in the following measurements
taken on a setup of a VM over a VF, connected to OVS via the VF
representor, to an external host:

Before current change:
                     TCP Throughput [Gb/s]
VM to external host         ~  7.2

With the current change (measured with a striding RQ):
                     TCP Throughput [Gb/s]
VM to external host         ~ 23.5

Each representor now consumes 2 [MB] of memory for its packet
buffers.

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:42 -07:00
Or Gerlitz
3a95e0ccaf net/mlx5e: Ethtool steering, Support masks for l3/l4 filters
Allow using partial masks for L3 addresses and L4 ports across
the place.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:41 -07:00
Jianbo Liu
cee2648762 net/mlx5e: Set vlan masks for all offloaded TC rules
In flow steering, if asked to, the hardware matches on the first ethertype
which is not vlan. It's possible to set a rule as follows, which is meant
to match on untagged packet, but will match on a vlan packet:
    tc filter add dev eth0 parent ffff: protocol ip flower ...

To avoid this for packets with single tag, we set vlan masks to tell
hardware to check the tags for every matched packet.

Fixes: 095b6cfd69 ('net/mlx5e: Add TC vlan match parsing')
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 10:58:00 -07:00
Eran Ben Elisha
11aa5800ed net/mlx5: E-Switch, Fix out of bound access when setting vport rate
The code that deals with eswitch vport bw guarantee was going beyond the
eswitch vport array limit, fix that.  This was pointed out by the kernel
address sanitizer (KASAN).

The error from KASAN log:
[2018-09-15 15:04:45] BUG: KASAN: slab-out-of-bounds in
mlx5_eswitch_set_vport_rate+0x8c1/0xae0 [mlx5_core]

Fixes: c9497c9890 ("net/mlx5: Add support for setting VF min rate")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 10:58:00 -07:00
Alaa Hleihel
4d8fcf216c net/mlx5e: Avoid unbounded peer devices when unpairing TC hairpin rules
If the peer device was already unbound, then do not attempt to modify
it's resources, otherwise we will crash on dereferencing non-existing
device.

Fixes: 5c65c564c9 ("net/mlx5e: Support offloading TC NIC hairpin flows")
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 10:58:00 -07:00
Hans de Goede
ac8bd9e13b r8169: Disable clk during suspend / resume
Disable the clk during suspend to save power. Note that tp->clk may be
NULL, the clk core functions handle this without problems.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Carlo Caione <carlo@endlessm.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29 11:47:57 -07:00
Shahed Shaikh
c333fa0c4f qlcnic: fix Tx descriptor corruption on 82xx devices
In regular NIC transmission flow, driver always configures MAC using
Tx queue zero descriptor as a part of MAC learning flow.
But with multi Tx queue supported NIC, regular transmission can occur on
any non-zero Tx queue and from that context it uses
Tx queue zero descriptor to configure MAC, at the same time TX queue
zero could be used by another CPU for regular transmission
which could lead to Tx queue zero descriptor corruption and cause FW
abort.

This patch fixes this in such a way that driver always configures
learned MAC address from the same Tx queue which is used for
regular transmission.

Fixes: 7e2cf4feba ("qlcnic: change driver hardware interface mechanism")
Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29 11:46:07 -07:00
David S. Miller
3ff6cde846 hns3: Another build fix.
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c: In function ‘hclge_get_sset_count’:
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c:496:31: error: ‘HNAE3_REVISION_ID_21’ undeclared (first use in this function); did you mean ‘FADT2_REVISION_ID’?
   if (hdev->pdev->revision >= HNAE3_REVISION_ID_21 ||
                               ^~~~~~~~~~~~~~~~~~~~
                               FADT2_REVISION_ID

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29 11:21:06 -07:00
David S. Miller
d401766585 hns3: Fix the build.
drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c: In function ‘hns3_self_test’:
drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c:278:15: error: ‘HNS3_SELF_TEST_TYPE_NUM’ undeclared (first use in this function); did you mean ‘HNS3_SELF_TEST_TPYE_NUM’?
  int st_param[HNS3_SELF_TEST_TYPE_NUM][2];
               ^~~~~~~~~~~~~~~~~~~~~~~
               HNS3_SELF_TEST_TPYE_NUM

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29 10:49:58 -07:00
Bruce Allan
3b6bf296c4 ice: fix changing of ring descriptor size (ethtool -G)
rx_mini_pending was set to an incorrect value. This was causing EINVAL to
always be returned to 'ethtool -G'. The driver does not support mini or
jumbo rings so the respective settings should be zero.

Also, change the valid range of the number of descriptors in the rings to
make the code simpler and easier for users to understand (this removes the
valid settings of 8 and 16). Add a system log message indicating when the
number is rounded-up from what the user specifies with the 'ethtool -G'
command (i.e. when it is not a multiple of 32), and update the log message
when a user-provided value is out of range to also indicate the stride.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28 11:16:36 -07:00
Anirudh Venkataramanan
7d86cf3840 ice: Update to capabilities admin queue command
This patch makes a couple of changes in the way the driver uses the
"get capabilities" command.

1. Get device capabilities in addition to function capabilities

2. Align to latest spec by using cap_count to determine size of the
   buffer in case of length error.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28 11:16:36 -07:00
Anirudh Venkataramanan
1886588fb6 ice: Query the Tx scheduler node before adding it
Query the Tx scheduler tree node information from FW before adding it to
the driver's software database. This will keep the node information current
in driver.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28 11:16:36 -07:00
Brett Creeley
8bc8d188cd ice: Update comment for ice_fltr_mgmt_list_entry
Previously the comment stated that VSI lists should be used when a
second VSI becomes a subscriber to the "VLAN address". VSI lists
are always used for VLAN membership, so replace "VLAN address" with
"MAC address". Also note that VLAN(s) always use VSI list rules.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28 11:16:36 -07:00
Jacob Keller
b2ccf317ed ice: update fw version check logic
We have MAX_FW_API_VER_BRANCH, MAX_FW_API_VER_MAJOR, and
MAX_FW_API_VER_MINOR that we use in ice_controlq.h to test when a
firmware version is newer than expected. This is currently tested by
comparing each field separately. Thus, we compare the branch field
against the MAX_FW_API_VER_BRANCH, and so forth.

This means that currently, if we suppose that the max firmware version
is defined as 0.2.1, i.e.

Then firmware 0.1.3 will fail to load. This is because the minor version
3 is greater than the max minor version 1.

This is not intuitive, because of the notion that increasing the major
firmware version to 2 should mean any firmware version with a major
version is less than 2 should be considered older than 2...

In order to allow both 0.2.1 and 0.1.3 to load, you would have to define
the "max" firmware version as 0.2.3.. It is possible that such
a firmware version doesn't even exist yet!

Fix this by replacing the current logic with an updated check that
behaves as follows:

First, we check the major version. If it is greater than the expected
version, then we prevent driver load. Additionally, a warning message is
logged to indicate to the system administrator that they need to update
their driver. This is now the only case where the driver will refuse to
load.

Second, if the major version is less than the expected version, we log
an information message indicating the NVM should be updated.

Third, if the major version is exact, we'll then check the minor
version. If the minor version is more than two versions less than
expected, we log an information message indicating the NVM should be
updated. If it is more than two versions greater than the expected
version, we log an information message that the driver should be
updated.

To support this, the ice_aq_ver_check function needs its signature
updated to pass the HW structure. Since we now pass this structure,
there is no need to pass the firmware API versions separately.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28 11:16:36 -07:00
Bruce Allan
e4a0e1ee94 ice: update branding strings and supported device ids
Update branding strings and remove device ids 0x1594 and 0x1595.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28 11:16:36 -07:00
Bruce Allan
95a525bee0 ice: replace unnecessary memcpy with direct assignment
Direct assignment is preferred over a memcpy()

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28 11:16:36 -07:00
Jacob Keller
c913b73cd0 ice: use [sr]q.count when checking if queue is initialized
When shutting down the controlqs, we check if they are initialized
before we shut them down and destroy the lock. This is important, as it
prevents attempts to access the lock of an already shutdown queue.

Unfortunately, we checked rq.head and sq.head as the value to determine
if the queue was initialized. This doesn't work, because head is not
reset when the queue is shutdown. In some flows, the adminq will have
already been shut down prior to calling ice_shutdown_all_ctrlqs. This
can result in a crash due to attempting to access the already destroyed
mutex.

Fix this by using rq.count and sq.count instead. Indeed, ice_shutdown_sq
and ice_shutdown_rq already indicate that this is the value we should be
using to determine of the queue was initialized.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28 11:16:36 -07:00
Colin Ian King
6a42b5128d qed: fix spelling mistake "b_cb_registred" -> "b_cb_registered"
Trivial fix to spelling mistake struct field name, rename it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:15:11 -07:00
Eric Dumazet
0c3b9d1b37 ibmvnic: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

ibmvnic uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

ibmvnic_netpoll_controller() was completely wrong anyway,
as it was scheduling NAPI to service RX queues (instead of TX),
so I doubt netpoll ever worked on this driver.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:29 -07:00
Eric Dumazet
a4f570be65 sfc-falcon: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

sfc-falcon uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com>
Cc: Edward Cree <ecree@solarflare.com>
Cc: Bert Kenward <bkenward@solarflare.com>
Acked-By: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:29 -07:00
Eric Dumazet
9447a10ff6 sfc: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

sfc uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Edward Cree <ecree@solarflare.com>
Cc: Bert Kenward <bkenward@solarflare.com>
Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com>
Acked-By: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:29 -07:00
Eric Dumazet
21627982e4 net: ena: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

ena uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Netanel Belgazal <netanel@amazon.com>
Cc: Saeed Bishara <saeedb@amazon.com>
Cc: Zorik Machulsky <zorik@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:29 -07:00
Eric Dumazet
3548fcf7d8 qlogic: netxen: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

netxen uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Manish Chopra <manish.chopra@cavium.com>
Cc: Rahul Verma <rahul.verma@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:29 -07:00
Eric Dumazet
81b059b218 qlcnic: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

qlcnic uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Harish Patil <harish.patil@cavium.com>
Cc: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:29 -07:00
Eric Dumazet
4bd2c03be7 net: hns: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

hns uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yisen Zhuang <yisen.zhuang@huawei.com>
Cc: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:28 -07:00
Eric Dumazet
226a2dd62c ehea: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

ehea uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Douglas Miller <dougmill@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:28 -07:00
Eric Dumazet
e71fb423e0 hinic: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

hinic uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Note that hinic_netpoll() was incorrectly scheduling NAPI
on both RX and TX queues.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:28 -07:00
David S. Miller
ec72001d38 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2018-09-27

This series contains fixes to the ice driver only.

Jake fixes a potential crash due to attempting to access the mutex which
is already destroyed.  Fix this by using rq.count and sq.count to
determine if the queue was initialized.  Fixed the current logic for
checking the firmware version to properly handle situations when
firmware major/minor versions differ and when the branch version
differs.

Bruce replaces a memcpy() with a direct assignment, which is preferred.
Also updated the branding strings and device ids supported by the
driver.  Fixed the "ethtool -G" command in the driver, which was always
returning EINVAL when changing the descriptor ring size.

Brett update and clarified code comments.

Anirudh updates the driver to ensure we query the firmware for the
transmit scheduler node information before adding it to the driver
database, to ensure we have the current information.  Also update the
"get capabilities" command to get device and function capabilities.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:09:02 -07:00
Sudarsana Reddy Kalluru
5f672090e4 qed: Fix shmem structure inconsistency between driver and the mfw.
The structure shared between driver and the management FW (mfw) differ in
sizes. This would lead to issues when driver try to access the structure
members which are not-aligned with the mfw copy e.g., data_ptr usage in the
case of mfw_tlv request.
Align the driver structure with mfw copy, add reserved field(s) to driver
structure for the members not used by the driver.

Fixes: dd006921d6 ("qed: Add MFW interfaces for TLV request support.)
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
2018-09-28 10:39:44 -07:00
Huazhong Tan
e4fd75022c net: hns3: Fix loss of coal configuration while doing reset
The user's coal configuration will be lost after reset, so the tx_coal
and rx_coal fields are added to the struct hns_nic_priv to save the coal
configuration and used to restore the user's configuration after the reset
is complete.

Fixes: bb6b94a896 ("net: hns3: Add reset interface implementation in client")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:42 -07:00
Huazhong Tan
0d43bf45f4 net: hns3: Modify hns3_get_max_available_channels
The current hns3_get_max_available_channels returns the total number
of queues for the device, which makes ethtool -L set the number of queues
per channel queues incorrectly, so hns3_get_max_available_channels should
return the maximum available number of queues per channel, depending on
the total number of queues allocated and the hardware configurations.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:42 -07:00
Huazhong Tan
fe5eb04318 net: hns3: Change return type of hclge_tm_schd_info_update()
hclge_tm_schd_info_update should return an error when num_tc is greater
than alloc_tqps.

This patch changes the return type of hnae3_register_ae_algo from void
to int.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:42 -07:00
Yunsheng Lin
93d8daf460 net: hns3: Fix for netdev not up problem when setting mtu
Currently hns3_nic_change_mtu will try to down the netdev before
setting mtu, and it does not up the netdev when the setting fails,
which causes netdev not up problem.

This patch fixes it by not returning when the setting fails.

Fixes: a8e8b7ff35 ("net: hns3: Add support to change MTU in HNS3 hardware")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:42 -07:00
Yunsheng Lin
996ff91840 net: hns3: Fix for packet buffer setting bug
The hardware expects a unit of 128 bytes when setting
packet buffer. When calculating the packet buffer size,
hclge_rx_buffer_calc does not round up the size as a unit
of 128 byte, which may casue packet lost problem when stress
testing.

This patch fixes it by rounding up packet size when calculating.

Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:42 -07:00
Fuyun Liang
4dc13b9668 net: hns3: Add serdes parallel inner loopback support
This patch adds serdes parallel inner loopback support for self test.

Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:41 -07:00
Fuyun Liang
eb66d50352 net: hns3: Rename mac loopback to app loopback
In fact, our implementation of mac loopback is the implementation of app
loopback now. Current name is wrong. This patch renames mac loopback to
app loopback.

Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:41 -07:00
Fuyun Liang
a7b687b354 net: hns3: Rename loop mode
Our loop mode includes mac loop, serdes loop and phy loop. Not all of them
are related with mac. This patch corrects their names.

Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:41 -07:00
Fuyun Liang
cd2086bf49 net: hns3: Set extra mac address of pause param for HW
The extra mac address of pause param is used to do double check
for pause frame. This patch set it to HW. If we do not do that,
pfc pause frame will be transferred protocol stack when normal
flow control mode is enabled.

Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:41 -07:00