There's no value in having an anonymous struct for holding
a few fields, remove it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
In new devices, access to periphery is forbidden. Send instead
host command to start and stop debugging.
Memory allocation is written in context info, but in case we
need to update it there is a dedicated command. Add definitions,
currently unused, of the new command.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Move the restart FW debug code to a function. This avoids code
duplication and lays the infra to support the new start and stop
host commands in some future devices.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
The SP length in the ADD_STA command is an actual number of
frames, and not the SP len as it appears in the WME IE.
Fix that comment. The actual code is fine.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Jason writes:
"Second RDMA rc pull request
- Fix a long standing race bug when destroying comp_event file descriptors
- srp, hfi1, bnxt_re: Various driver crashes from missing validation
and other cases
- Fixes for regressions in patches merged this window in the gid
cache, devx, ucma and uapi."
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/core: Set right entry state before releasing reference
IB/mlx5: Destroy the DEVX object upon error flow
IB/uverbs: Free uapi on destroy
RDMA/bnxt_re: Fix system crash during RDMA resource initialization
IB/hfi1: Fix destroy_qp hang after a link down
IB/hfi1: Fix context recovery when PBC has an UnsupportedVL
IB/hfi1: Invalid user input can result in crash
IB/hfi1: Fix SL array bounds check
RDMA/uverbs: Fix validity check for modify QP
IB/srp: Avoid that sg_reset -d ${srp_device} triggers an infinite loop
ucma: fix a use-after-free in ucma_resolve_ip()
RDMA/uverbs: Atomically flush and mark closed the comp event queue
cxgb4: fix abort_req_rss6 struct
rx_mini_pending was set to an incorrect value. This was causing EINVAL to
always be returned to 'ethtool -G'. The driver does not support mini or
jumbo rings so the respective settings should be zero.
Also, change the valid range of the number of descriptors in the rings to
make the code simpler and easier for users to understand (this removes the
valid settings of 8 and 16). Add a system log message indicating when the
number is rounded-up from what the user specifies with the 'ethtool -G'
command (i.e. when it is not a multiple of 32), and update the log message
when a user-provided value is out of range to also indicate the stride.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch makes a couple of changes in the way the driver uses the
"get capabilities" command.
1. Get device capabilities in addition to function capabilities
2. Align to latest spec by using cap_count to determine size of the
buffer in case of length error.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The combination of defined constants are used to present the
state of IRQ so the magic numbers has been replaced.
This is a simple coding style change which should have no impact on
runtime code execution.
Signed-off-by: Xue Liu <liuxuenetmail@gmail.com>
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Query the Tx scheduler tree node information from FW before adding it to
the driver's software database. This will keep the node information current
in driver.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Previously the comment stated that VSI lists should be used when a
second VSI becomes a subscriber to the "VLAN address". VSI lists
are always used for VLAN membership, so replace "VLAN address" with
"MAC address". Also note that VLAN(s) always use VSI list rules.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We have MAX_FW_API_VER_BRANCH, MAX_FW_API_VER_MAJOR, and
MAX_FW_API_VER_MINOR that we use in ice_controlq.h to test when a
firmware version is newer than expected. This is currently tested by
comparing each field separately. Thus, we compare the branch field
against the MAX_FW_API_VER_BRANCH, and so forth.
This means that currently, if we suppose that the max firmware version
is defined as 0.2.1, i.e.
Then firmware 0.1.3 will fail to load. This is because the minor version
3 is greater than the max minor version 1.
This is not intuitive, because of the notion that increasing the major
firmware version to 2 should mean any firmware version with a major
version is less than 2 should be considered older than 2...
In order to allow both 0.2.1 and 0.1.3 to load, you would have to define
the "max" firmware version as 0.2.3.. It is possible that such
a firmware version doesn't even exist yet!
Fix this by replacing the current logic with an updated check that
behaves as follows:
First, we check the major version. If it is greater than the expected
version, then we prevent driver load. Additionally, a warning message is
logged to indicate to the system administrator that they need to update
their driver. This is now the only case where the driver will refuse to
load.
Second, if the major version is less than the expected version, we log
an information message indicating the NVM should be updated.
Third, if the major version is exact, we'll then check the minor
version. If the minor version is more than two versions less than
expected, we log an information message indicating the NVM should be
updated. If it is more than two versions greater than the expected
version, we log an information message that the driver should be
updated.
To support this, the ice_aq_ver_check function needs its signature
updated to pass the HW structure. Since we now pass this structure,
there is no need to pass the firmware API versions separately.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When shutting down the controlqs, we check if they are initialized
before we shut them down and destroy the lock. This is important, as it
prevents attempts to access the lock of an already shutdown queue.
Unfortunately, we checked rq.head and sq.head as the value to determine
if the queue was initialized. This doesn't work, because head is not
reset when the queue is shutdown. In some flows, the adminq will have
already been shut down prior to calling ice_shutdown_all_ctrlqs. This
can result in a crash due to attempting to access the already destroyed
mutex.
Fix this by using rq.count and sq.count instead. Indeed, ice_shutdown_sq
and ice_shutdown_rq already indicate that this is the value we should be
using to determine of the queue was initialized.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
debugfs_remove has taken the IS_ERR into account. Just
remove the unnecessary condition.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
The current netpoll implementation in the bnxt_en driver has problems
that may miss TX completion events. bnxt_poll_work() in effect is
only handling at most 1 TX packet before exiting. In addition,
there may be in flight TX completions that ->poll() may miss even
after we fix bnxt_poll_work() to handle all visible TX completions.
netpoll may not call ->poll() again and HW may not generate IRQ
because the driver does not ARM the IRQ when the budget (0 for netpoll)
is reached.
We fix it by handling all TX completions and to always ARM the IRQ
when we exit ->poll() with 0 budget.
Also, the logic to ACK the completion ring in case it is almost filled
with TX completions need to be adjusted to take care of the 0 budget
case, as discussed with Eric Dumazet <edumazet@google.com>
Reported-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Song Liu <songliubraving@fb.com>
Tested-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When add vxlan ttl inherit support, I forgot to fill it when dump
vlxan info. Fix it now.
Fixes: 72f6d71e49 ("vxlan: add ttl inherit support")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mvneta controller can handle speeds up to 2500Mbps on the SGMII
interface. This relies on serdes configuration, the lane must be
configured at 3.125Gbps and we can't use in-band autoneg at that speed.
The main issue when supporting that speed on this particular controller
is that the link partner can send ethernet frames with a shortened
preamble, which if not explicitly enabled in the controller will cause
unexpected behaviours.
This was tested on Armada 385, with the comphy configuration done in
bootloader.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A HWMON device is only registered is the SFP module supports the
diagnostic page and is complient to SFF8472. Don't unconditionally
unregister the hwmon device when the SFP module is remove, otherwise
we access data structures which don't exist.
Reported-by: Florian Fainelli <f.fainelli@gmail.com>
Fixes: 1323061a01 ("net: phy: sfp: Add HWMON support for module sensors")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clang warns when one enumerated type is implicitly converted to another.
drivers/net/ethernet/qlogic/qed/qed_iwarp.c:1713:25: warning: implicit
conversion from enumeration type 'enum tcp_ip_version' to different
enumeration type 'enum qed_tcp_ip_version' [-Wenum-conversion]
cm_info->ip_version = TCP_IPV4;
~ ^~~~~~~~
drivers/net/ethernet/qlogic/qed/qed_iwarp.c:1733:25: warning: implicit
conversion from enumeration type 'enum tcp_ip_version' to different
enumeration type 'enum qed_tcp_ip_version' [-Wenum-conversion]
cm_info->ip_version = TCP_IPV6;
~ ^~~~~~~~
2 warnings generated.
Use the appropriate values from the expected type, qed_tcp_ip_version:
TCP_IPV4 = QED_TCP_IPV4 = 0
TCP_IPV6 = QED_TCP_IPV6 = 1
Link: https://github.com/ClangBuiltLinux/linux/issues/125
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clang warns when a constant is used in a boolean context as it thinks a
bitwise operation may have been intended.
drivers/net/ethernet/qlogic/qed/qed_vf.c:415:27: warning: use of logical
'&&' with constant operand [-Wconstant-logical-operand]
if (!p_iov->b_pre_fp_hsi &&
^
drivers/net/ethernet/qlogic/qed/qed_vf.c:415:27: note: use '&' for a
bitwise operation
if (!p_iov->b_pre_fp_hsi &&
^~
&
drivers/net/ethernet/qlogic/qed/qed_vf.c:415:27: note: remove constant
to silence this warning
if (!p_iov->b_pre_fp_hsi &&
~^~
1 warning generated.
This has been here since commit 1fe614d10f ("qed: Relax VF firmware
requirements") and I am not entirely sure why since 0 isn't a special
case. Just remove the statement causing Clang to warn since it isn't
required.
Link: https://github.com/ClangBuiltLinux/linux/issues/126
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit b89f04c61e ("bonding: deliver link-local packets with
skb->dev set to link that packets arrived on") changed the behavior
of how link-local-multicast packets are processed. The change in
the behavior broke some legacy use cases where these packets are
expected to arrive on bonding master device also.
This patch passes the packet to the stack with the link it arrived
on as well as passes to the bonding-master device to preserve the
legacy use case.
Fixes: b89f04c61e ("bonding: deliver link-local packets with skb->dev set to link that packets arrived on")
Reported-by: Michal Soltys <soltys@ziu.info>
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clang warns when one enumerated type is implicitly converted to another.
drivers/net/ethernet/qlogic/qed/qed_roce.c:153:12: warning: implicit
conversion from enumeration type 'enum roce_mode' to different
enumeration type 'enum roce_flavor' [-Wenum-conversion]
flavor = ROCE_V2_IPV6;
~ ^~~~~~~~~~~~
drivers/net/ethernet/qlogic/qed/qed_roce.c:156:12: warning: implicit
conversion from enumeration type 'enum roce_mode' to different
enumeration type 'enum roce_flavor' [-Wenum-conversion]
flavor = MAX_ROCE_MODE;
~ ^~~~~~~~~~~~~
2 warnings generated.
Use the appropriate values from the expected type, roce_flavor:
ROCE_V2_IPV6 = RROCE_IPV6 = 2
MAX_ROCE_MODE = MAX_ROCE_FLAVOR = 3
While we're add it, ditch the local variable flavor, we can just return
the value directly from the switch statement.
Link: https://github.com/ClangBuiltLinux/linux/issues/125
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clang complains when one enumerated type is implicitly converted to
another.
drivers/net/ethernet/qlogic/qed/qed_vf.c:686:6: warning: implicit
conversion from enumeration type 'enum qed_tunn_mode' to different
enumeration type 'enum qed_tunn_clss' [-Wenum-conversion]
QED_MODE_L2GENEVE_TUNN,
^~~~~~~~~~~~~~~~~~~~~~
Update mask's parameter to expect qed_tunn_mode, which is what was
intended.
Link: https://github.com/ClangBuiltLinux/linux/issues/125
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clang warns when one enumerated type is implicitly converted to another.
drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:163:25: warning:
implicit conversion from enumeration type 'enum tunnel_clss' to
different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion]
p_tun->vxlan.tun_cls = type;
~ ^~~~
drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:165:26: warning:
implicit conversion from enumeration type 'enum tunnel_clss' to
different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion]
p_tun->l2_gre.tun_cls = type;
~ ^~~~
drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:167:26: warning:
implicit conversion from enumeration type 'enum tunnel_clss' to
different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion]
p_tun->ip_gre.tun_cls = type;
~ ^~~~
drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:169:29: warning:
implicit conversion from enumeration type 'enum tunnel_clss' to
different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion]
p_tun->l2_geneve.tun_cls = type;
~ ^~~~
drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:171:29: warning:
implicit conversion from enumeration type 'enum tunnel_clss' to
different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion]
p_tun->ip_geneve.tun_cls = type;
~ ^~~~
5 warnings generated.
Avoid this by changing type to an int.
Link: https://github.com/ClangBuiltLinux/linux/issues/125
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trivial fix to spelling mistake in ms_to_errno array of error messages
and remove confusing "not" from the error text since the error code
refers to an uninitialized error code.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Core of the problem is that phy_suspend() suspends the PHY when it
should not because of WoL. phy_suspend() checks for WoL already, but
this works only if the PHY driver handles WoL (what is rarely the case).
Typically WoL is handled by the MAC driver.
This patch uses new member wol_enabled of struct net_device as
additional criteria in the check when not to suspend the PHY because
of WoL.
Last but not least change phy_detach() to call phy_suspend() before
attached_dev is set to NULL. phy_suspend() accesses attached_dev
when checking whether the MAC driver activated WoL.
Fixes: f1e911d5d0 ("r8169: add basic phylib support")
Fixes: e8cfd9d6c7 ("net: phy: call state machine synchronously in phy_stop")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Actually there's nothing wrong with the two changes marked as "Fixes",
they just revealed a problem which has been existing before.
After having switched r8169 to phylib it was reported that WoL from
shutdown doesn't work any longer (WoL from suspend isn't affected).
Reason is that during shutdown phy_disconnect()->phy_detach()->
phy_suspend() is called.
A similar issue occurs when the phylib state machine calls
phy_suspend() when handling state PHY_HALTED.
Core of the problem is that phy_suspend() suspends the PHY when it
should not due to WoL. phy_suspend() checks for WoL already, but this
works only if the PHY driver handles WoL (what is rarely the case).
Typically WoL is handled by the MAC driver.
phylib knows about this and handles it in mdio_bus_phy_may_suspend(),
but that's used only when suspending the system, not in other cases
like shutdown.
Therefore factor out the relevant check from
mdio_bus_phy_may_suspend() to a new function phy_may_suspend() and
use it in phy_suspend().
Last but not least change phy_detach() to call phy_suspend() before
attached_dev is set to NULL. phy_suspend() accesses attached_dev
when checking whether the MAC driver activated WoL.
Fixes: f1e911d5d0 ("r8169: add basic phylib support")
Fixes: e8cfd9d6c7 ("net: phy: call state machine synchronously in phy_stop")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.
Found by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.
Found by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trival cleanup, list_move_tail will implement the same function that
list_del() + list_add_tail() will do. hence just replace them.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trival cleanup, list_move_tail will implement the same function that
list_del() + list_add_tail() will do. hence just replace them.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The allocation of hwsim radio identifiers uses a post-increment from 0,
so the first radio has idx 0. This idx is explicitly excluded from
multicast announcements ever since, but it is unclear why.
Drop that idx check and announce the first radio as well. This makes
userspace happy if it relies on these events.
Signed-off-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The asynchronous destruction from a work-queue of radios tagged with
destroy-on-close may race with the owning namespace about to exit,
resulting in potential use-after-free of that namespace.
Instead of using a work-queue, move radios about to destroy to a
temporary list, which can be worked on synchronously after releasing
the lock. This should be safe to do from the netlink socket notifier,
as the namespace is guaranteed to not get released.
Signed-off-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The cleanup of radios during namespace exit has recently been reworked
to directly delete a radio while temporarily releasing the spinlock,
fixing a race condition between the work-queue execution and namespace
exits. However, the temporary unlock allows unsafe modifications on the
iterated list, resulting in a potential crash when continuing the
iteration of additional radios.
Move radios about to destroy to a temporary list, and clean that up
after releasing the spinlock once iteration is complete.
Fixes: 8cfd36a0b5 ("mac80211_hwsim: fix use-after-free bug in hwsim_exit_net")
Signed-off-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Local variable 'autoneg' doesn't even exist:
drivers/net/phy/marvell.c: In function 'm88e1121_config_aneg':
drivers/net/phy/marvell.c:468:25: error: 'autoneg' undeclared (first use in this function); did you mean 'put_net'?
if (phydev->autoneg != autoneg || changed) {
^~~~~~~
Fixes: d6ab933647 ("net: phy: marvell: Avoid unnecessary soft reset")
Reported-by:Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver uses devm_ioremap_resource() which is only available when
CONFIG_HAS_IOMEM is set, make the driver depend on this config option.
User mode Linux does not have CONFIG_HAS_IOMEM set and the driver was
failing on this architecture.
Fixes: 14fceff477 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The BMCR.RESET bit on the Marvell PHYs has a special meaning in that
it commits the register writes into the HW for it to latch and be
configured appropriately. Doing software resets causes link drops, and
this is unnecessary disruption if nothing changed.
Determine from marvell_set_polarity()'s return code whether the register value
was changed and if it was, propagate that to the logic that hits the software
reset bit.
This avoids doing unnecessary soft reset if the PHY is configured in
the same state it was previously, this also eliminates the need for a
m88e1111_config_aneg() function since it now is the same as
marvell_config_aneg().
Tested-by: Wang, Dongsheng <dongsheng.wang@hxt-semitech.com>
Tested-by: Chris Healy <cphealy@gmail.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Clemens Gruber <clemens.gruber@pqgruber.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While consolidating the PHY reset in phy_init_hw() an unconditionaly
BMCR soft-reset I became quite trigger happy with those. This was later
on deactivated for the Generic PHY driver on the premise that a prior
software entity (e.g: bootloader) might have applied workarounds in
commit 0878fff1f4 ("net: phy: Do not perform software reset for
Generic PHY").
Since we have a hook to wire-up a soft_reset callback, just use that and
get rid of the call to genphy_soft_reset() entirely. This speeds up
initialization and link establishment for most PHYs out there that do
not require a reset.
Fixes: 87aa9f9c61 ("net: phy: consolidate PHY reset in phy_init_hw()")
Tested-by: Wang, Dongsheng <dongsheng.wang@hxt-semitech.com>
Tested-by: Chris Healy <cphealy@gmail.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Clemens Gruber <clemens.gruber@pqgruber.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
40GbE Intel Wired LAN Driver Updates 2018-09-25
This series contains updates to i40e and xsk.
Mariusz fixes an issue where the VF link state was not being updated
properly when the PF is down or up. Also cleaned up the promiscuous
configuration during a VF reset.
Patryk simplifies the code a bit to use the variables for PF and HW that
are declared, rather than using the VSI pointers. Cleaned up the
message length parameter to several virtchnl functions, since it was not
being used (or needed).
Harshitha fixes two potential race conditions when trying to change VF
settings by creating a helper function to validate that the VF is
enabled and that the VSI is set up.
Sergey corrects a double "link down" message by putting in a check for
whether or not the link is up or going down.
Björn addresses an AF_XDP zero-copy issue that buffers passed
from userspace to the kernel was leaked when the hardware descriptor
ring was torn down. A zero-copy capable driver picks buffers off the
fill ring and places them on the hardware receive ring to be completed at
a later point when DMA is complete. Similar on the transmit side; The
driver picks buffers off the transmit ring and places them on the
transmit hardware ring.
In the typical flow, the receive buffer will be placed onto an receive
ring (completed to the user), and the transmit buffer will be placed on
the completion ring to notify the user that the transfer is done.
However, if the driver needs to tear down the hardware rings for some
reason (interface goes down, reconfiguration and such), the userspace
buffers cannot be leaked. They have to be reused or completed back to
userspace.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When an AF_XDP UMEM is attached to any of the Rx rings, we disallow a
user to change the number of descriptors via e.g. "ethtool -G IFNAME".
Otherwise, the size of the stash/reuse queue can grow unbounded, which
would result in OOM or leaking userspace buffers.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Outstanding Rx descriptors are temporarily stored on a stash/reuse
queue. When/if the HW rings comes up again, entries from the stash are
used to re-populate the ring.
The latter required some restructuring of the allocation scheme for
the AF_XDP zero-copy implementation. There is now a fast, and a slow
allocation. The "fast allocation" is used from the fast-path and
obtains free buffers from the fill ring and the internal recycle
mechanism. The "slow allocation" is only used in ring setup, and
obtains buffers from the fill ring and the stash (if any).
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When the zero-copy enabled XDP Tx ring is torn down, due to
configuration changes, outstanding frames on the hardware descriptor
ring are queued on the completion ring.
The completion ring has a back-pressure mechanism that will guarantee
that there is sufficient space on the ring.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>