Make sure the queue structs exist before trying to tear
them down to make for safer error recovery.
Fixes: 0f3154e6bc ("ionic: Add Tx and Rx handling")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a little more cleanup when tearing down the queues.
Fixes: 1d062b7b6f ("ionic: Add basic adminq support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't worry if the rx filter add firmware request fails on
EEXIST, at least we know the filter is there. Same for
the delete request, at least we know it isn't there.
Fixes: 2a654540be ("ionic: Add Rx filter and rx_mode ndo support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't save the lif->dentry until we know we have
a good value.
Fixes: 1a58e19646 ("ionic: Add basic lif support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is possible (but unlikely) that FW was busy and missed a heartbeat
check but is still alive and will process the pending request, so don't
clean the dev_cmd in this case. This occasionally occurs when working
with a card that is supporting many devices and is trying to shut them
all down at once, but still wants to see that last LIF disable request.
Fixes: 97ca486592 ("ionic: add heartbeat check")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Short circuit the cleanup if we get a timeout error from
ionic_qcq_disable() so as to not have to wait too long
on shutdown when we already know the FW is not responding.
Fixes: 0f3154e6bc ("ionic: Add Tx and Rx handling")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Headers ionic_if.h and ionic_regs.h are licensed under three alternative
licenses and the used SPDX-License-Identifier expression makes
./scripts/spdxcheck.py complain:
drivers/net/ethernet/pensando/ionic/ionic_if.h: 1:52 Syntax error: OR
drivers/net/ethernet/pensando/ionic/ionic_regs.h: 1:52 Syntax error: OR
As OR is associative, it is irrelevant if the parentheses are put around
the first or the second OR-expression.
Simply add parentheses to make spdxcheck.py happy.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Acked-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't assume the receive buffer size is a power-of-2 number of pages.
Instead, define the receive buffer size independently, and then
compute the page order from that size when needed.
This fixes a build problem that arises when the ARM64_PAGE_SHIFT
config option is set to have a page size greater than 4KB. The
problem was identified by Linux Kernel Functional Testing.
The IPA code basically assumed the page size to be 4KB. A larger page
size caused the receive buffer size to become correspondingly larger
(32KB or 128KB for ARM64_16K_PAGES and ARM64_64K_PAGES, respectively).
The receive buffer size is used to compute an "aggregation byte limit"
value that gets programmed into the hardware, and the large page sizes
caused that limit value to be too big to fit in a 5 bit field. This
triggered a BUILD_BUG_ON() call in ipa_endpoint_validate_build().
This fix causes a lot of receive buffer memory to be wasted if
system is configured for page size greater than 4KB. But such a
misguided configuration will now build successfully.
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
the minimum value of skb len that hw supports is 32 rather than 17
Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
the second input parameter of wait_for_completion_timeout should
be jiffies instead of millisecond
Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
add read barrier in driver code to keep from reading other fileds
in dma memory which is writable for hw until we have verified the
memory is valid for driver
Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
should disable eq irq before freeing it, must clear event queue
depth in hw before freeing relevant memory to avoid illegal
memory access and update consumer idx to avoid invalid interrupt
Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
it's unreliable for fw to check whether IO is stopped, so driver
wait for enough time to ensure IO process is done in hw before
freeing resources
Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The completion usage in this driver is interesting:
- it uses a magic complete function which according to the comment was
implemented by invoking complete() four times in a row because
complete_all() was not exported at that time.
- it uses an open coded wait/poll which checks completion:done. Only one wait
side (device removal) uses the regular wait_for_completion() interface.
The rationale behind this is to prevent that wait_for_completion() consumes
completion::done which would prevent that all waiters are woken. This is not
necessary with complete_all() as that sets completion::done to UINT_MAX which
is left unmodified by the woken waiters.
Replace the magic complete function with complete_all() and convert the
open coded wait/poll to regular completion interfaces.
This changes the wait to exclusive wait mode. But that does not make any
difference because the wakers use complete_all() which ignores the
exclusive mode.
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lkml.kernel.org/r/20200321113241.150783464@linutronix.de
The devlink .info_get callback allows the driver to report detailed
version information. The following devlink versions are reported with
this initial implementation:
"fw.mgmt" -> The version of the firmware that controls PHY, link, etc
"fw.mgmt.api" -> API version of interface exposed over the AdminQ
"fw.mgmt.build" -> Unique build id of the source for the management fw
"fw.undi" -> Version of the Option ROM containing the UEFI driver
"fw.psid.api" -> Version of the NVM image format.
"fw.bundle_id" -> Unique identifier for the combined flash image.
"fw.app.name" -> The name of the active DDP package.
"fw.app" -> The version of the active DDP package.
With this, devlink dev info can report at least as much information as
is reported by ETHTOOL_GDRVINFO.
Compare the output from ethtool vs from devlink:
$ ethtool -i ens785s0
driver: ice
version: 0.8.1-k
firmware-version: 0.80 0x80002ec0 1.2581.0
expansion-rom-version:
bus-info: 0000:3b:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
$ devlink dev info pci/0000:3b:00.0
pci/0000:3b:00.0:
driver ice
serial number 00-01-ab-ff-ff-ca-05-68
versions:
running:
fw.mgmt 2.1.7
fw.mgmt.api 1.5
fw.mgmt.build 0x305d955f
fw.undi 1.2581.0
fw.psid.api 0.80
fw.bundle_id 0x80002ec0
fw.app.name ICE OS Default Package
fw.app 1.3.1.0
More pieces of information can be displayed, each version is kept
separate instead of munged together, and each version has an identifier
which comes with associated documentation.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The nfp driver uses ``fw.bundle_id`` to represent a unique identifier of the
entire firmware bundle.
A future change is going to introduce a similar notion in the ice
driver, so promote ``fw.bundle_id`` into a generic version now.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Begin implementing support for the devlink interface with the ice
driver.
The pf structure is currently memory managed through devres, via
a devm_alloc. To mimic this behavior, after allocating the devlink
pointer, use devm_add_action to add a teardown action for releasing the
devlink memory on exit.
The ice hardware is a multi-function PCIe device. Thus, each physical
function will get its own devlink instance. This means that each
function will be treated independently, with its own parameters and
configuration. This is done because the ice driver loads a separate
instance for each function.
Due to this, the implementation does not enable devlink to manage
device-wide resources or configuration, as each physical function will
be treated independently. This is done for simplicity, as managing
a devlink instance across multiple driver instances would significantly
increase the complexity for minimal gain.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The current implementation of .get_eeprom only enables reading from the
Shadow RAM portion of the NVM contents. Implement support for reading
the entire flash contents instead of only the initial portion contained
in the Shadow RAM.
A complete dump can take several seconds, but the ETHTOOL_GEEPROM ioctl
is capable of reading only a limited portion at a time by specifying the
offset and length to read.
In order to perform the reads directly, several functions are made non
static. Additionally, the unused ice_read_sr_buf_aq and ice_read_sr_buf
functions are removed.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When reading from the NVM using a flat address, it is useful to know the
upper bound on the size of the flash contents. This value is not stored
within the NVM.
We can determine the size by performing a bisection between upper and
lower bounds. It is known that the size cannot exceed 16 MB (offset of
0xFFFFFF).
Use a while loop to bisect the upper and lower bounds by reading one
byte at a time. On a failed read, lower the maximum bound. On
a successful read, increase the lower bound.
Save this as the flash_size in the ice_nvm_info structure that contains
data related to the NVM.
The size will be used in a future patch for implementing full NVM read
via ethtool's GEEPROM command.
The maximum possible size for the flash is bounded by the size limit for
the NVM AdminQ commands. Add a new macro, ICE_AQC_NVM_MAX_OFFSET, which
can be used to represent this upper bound.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The NVM version and Option ROM version information is stored within the
struct ice_nvm_ver_info structure. The data for the NVM is stored as
a 2byte value with the major and minor versions each using one byte from
the field. The Option ROM is stored as a 4byte value that contains
a major, build, and patch number.
Modify the code to immediately extract the version values and store them
in a new struct ice_orom_info. Remove the now unnecessary
ice_get_nvm_version function.
Update ice_ethtool.c to use the new fields directly from the structured
data.
This reduces complexity of the code that prints these versions in
ice_ethtool.c
Update the macro definitions and variable names to use the term "orom"
instead of "oem" for the Option ROM version. This helps increase the
clarity of the Option ROM version code.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The NVM contents are read via firmware by using the ice_aq_read_nvm
function. This function has a couple of limits:
1) The AdminQ commands can only take buffers sized up to 4Kb. Thus, any
larger read must be split into multiple reads.
2) when reading from the Shadow RAM, reads must not cross sector
boundaries. The sectors are also 4Kb in size.
Implement the ice_read_flat_nvm function to read portions of the NVM by
flat offset. That is, to read using offsets from the start of the NVM
rather than from a specific module.
This function will be able to read both from the NVM and from the Shadow
RAM. For simplicity NVM reads will always be broken up to not cross 4Kb
page boundaries, even though this is not required unless reading from
the Shadow RAM.
Use this new function as the implementation of ice_read_sr_word_aq.
The ice_read_sr_buf_aq function is not modified here. This is because
a following change will remove the only caller of that function in favor
of directly using ice_read_flat_nvm. Thus, there is little benefit to
changing it now only to remove it momentarily. At the same time, the
ice_read_sr_aq function will also be removed.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The ice_read_sr_aq function returns words in the Little Endian format.
Remove the need for __force and typecasting by using a local variable in
the ice_read_sr_word_aq function.
Additionally clarify explicitly that the ice_read_sr_aq function takes
storage for __le16 values instead of using u16.
Being explicit about the endianness of this data helps when using tools
like sparse to catch endian-related issues.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Johannes Berg says:
====================
Another set of changes:
* HE ranging (fine timing measurement) API support
* hwsim gets virtio support, for use with wmediumd,
to be able to simulate with multiple machines
* eapol-over-nl80211 improvements to exclude preauth
* IBSS reset support, to recover connections from
userspace
* and various others.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
SJA1105 switches R and S have one SerDes port with an 802.3z
quasi-compatible PCS, hardwired on port 4. The other ports are still
MII/RMII/RGMII. The PCS performs rate adaptation to lower link speeds;
the MAC on this port is hardwired at gigabit. Only full duplex is
supported.
The SGMII port can be configured as part of the static config tables, as
well as through a dedicated SPI address region for its pseudo-clause-22
registers. However it looks like the static configuration is not
able to change some out-of-reset values (like the value of MII_BMCR), so
at the end of the day, having code for it is utterly pointless. We are
just going to use the pseudo-C22 interface.
Because the PCS gets reset when the switch resets, we have to add even
more restoration logic to sja1105_static_config_reload, otherwise the
SGMII port breaks after operations such as enabling PTP timestamping
which require a switch reset.
>From PHYLINK perspective, the switch supports *only* SGMII (it doesn't
support 1000Base-X). It also doesn't expose access to the raw config
word for in-band AN in registers MII_ADV/MII_LPA.
It is able to work in the following modes:
- Forced speed
- SGMII in-band AN slave (speed received from PHY)
- SGMII in-band AN master (acting as a PHY)
The latter mode is not supported by this patch. It is even unclear to me
how that would be described. There is some code for it left in the
patch, but 'an_master' is always passed as false.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
When sja1105_init_mii_settings iterates over the port list, it prints
this message for disabled ports, because they don't have a valid
phy-mode:
[ 4.778702] sja1105 spi2.0: Unsupported PHY mode unknown!
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Suggested-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Driver reclaims descriptors in much smaller batches, even if hardware
indicates more to reclaim, during backpressure. So, fix the check to
restart the Txq during backpressure, by looking at how many
descriptors hardware had indicated to reclaim, and not on how many
descriptors that driver had actually reclaimed. Once the Txq is
restarted, driver will reclaim even more descriptors when Tx path
is entered again.
Fixes: d429005fdf ("cxgb4/cxgb4vf: Add support for SGE doorbell queue timer")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 7c3bebc3d8 ("cxgb4: request the TX CIDX updates to status page")
reverted back to getting Tx CIDX updates via DMA, instead of interrupts,
introduced by commit d429005fdf ("cxgb4/cxgb4vf: Add support for SGE
doorbell queue timer")
However, it missed reverting back several code changes where Tx CIDX
updates are not explicitly requested during backpressure when using
interrupt mode. These missed changes cause slow recovery during
backpressure because the corresponding interrupt no longer comes and
hence results in Tx throughput drop.
So, revert back these missed code changes, as well, which will allow
explicitly requesting Tx CIDX updates when backpressure happens.
This enables the corresponding interrupt with Tx CIDX update message
to get generated and hence speed up recovery and restore back
throughput.
Fixes: 7c3bebc3d8 ("cxgb4: request the TX CIDX updates to status page")
Fixes: d429005fdf ("cxgb4/cxgb4vf: Add support for SGE doorbell queue timer")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove atomic64_add from veth_xdp_xmit hotpath and rely on
xdp_xmit_err/xdp_tx_err counters
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce xdp_xmit counter in order to distinguish between XDP_TX and
ndo_xdp_xmit stats. Introduce the following ethtool counters:
- rx_xdp_tx
- rx_xdp_tx_errors
- tx_xdp_xmit
- tx_xdp_xmit_errors
- rx_xdp_redirect
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Distinguish between rx_drops and xdp_drops since the latter is already
reported in rx_packets. Report xdp_drops in ethtool statistics
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce xdp_tx, xdp_redirect and rx_drops counters in veth_stats data
structure. Move stats accounting in veth_poll. Remove xdp_xmit variable
in veth_xdp_rcv_one/veth_xdp_rcv_skb and rely on veth_stats counters.
This is a preliminary patch to align veth xdp statistics to mlx, intel
and marvell xdp implementation
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move xdp stats in veth_stats data structure. This is a preliminary patch
to align xdp statistics to mlx5, ixgbe and mvneta drivers
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for configuring the RGMII skew delays in Rx and
Tx. The Rx and Tx skews are set based on the interface mode. By default
their configuration is set to the default value in hardware (0.2ns);
this means the driver do not rely anymore on the bootloader
configuration.
Then based on the interface mode being used, a 2ns delay is added:
- RGMII_ID adds it for both Rx and Tx.
- RGMII_RXID adds it for Rx.
- RGMII_TXID adds it for Tx.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew reported:
After a number of network port link up/down changes, sometimes the switch
port gets stuck in a state where it thinks it is still transmitting packets
but the cpu port is not actually transmitting anymore. In this state you
will see a message on the console
"mtk_soc_eth 1e100000.ethernet eth0: transmit timed out" and the Tx counter
in ifconfig will be incrementing on virtual port, but not incrementing on
cpu port.
The issue is that MAC TX/RX status has no impact on the link status or
queue manager of the switch. So the queue manager just queues up packets
of a disabled port and sends out pause frames when the queue is full.
Change the LINK bit to reflect the link status.
Fixes: b8f126a8d5 ("net-next: dsa: add dsa support for Mediatek MT7530 switch")
Reported-by: Andrew Smith <andrew.smith@digi.com>
Signed-off-by: René van Dorst <opensource@vdorst.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Offload action skbedit priority when keyed to a flower classifier. The
skb->priority field in Linux is very generic, so only allow setting the
bottom 8 priorities and bounce anything else.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The QOS_ACTION is used for manipulating the QoS attributes of a packet.
Add the corresponding defines and helpers, in particular for the
switch_priority override.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
it will check the return value of dwmac_dma_reset() in the
stmmac_init_dma_engine() function and report an error if the
return value is not zero. so don't need check here.
Signed-off-by: Dejin Zheng <zhengdejin5@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit a5afc16780 ("net: phy: mscc: add support for VSC8584 PHY")
introduced a call to 'phy_write' storing its return value to a variable
called 'ret'. But 'ret' never was checked for a possible error being
returned, and hence was not used at all. Fix this by checking the return
value and exiting the function if an error was returned.
As this does not fix a known bug, this commit is mostly cosmetic and not
sent as a fix.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove .owner field if calls are used which set it automatically
Generated by: scripts/coccinelle/api/platform_no_drv_owner.cocci
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/net/ethernet/cavium/liquidio/lio_main.c: In function 'octeon_chip_specific_setup':
drivers/net/ethernet/cavium/liquidio/lio_main.c:1378:8: warning:
variable 's' set but not used [-Wunused-but-set-variable]
It's not used since commit b6334be64d ("net/liquidio: Delete driver version assignment")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During initialization the driver issues a software reset command and
then waits for the system status to change back to "ready" state.
However, before issuing the reset command the driver does not check that
the system is actually in "ready" state. On Spectrum-{1,2} systems this
was always the case as the hardware initialization time is very short.
On Spectrum-3 systems this is no longer the case. This results in the
software reset command timing-out and the driver failing to load:
[ 6.347591] mlxsw_spectrum3 0000:06:00.0: Cmd exec timed-out (opcode=40(ACCESS_REG),opcode_mod=0,in_mod=0)
[ 6.358382] mlxsw_spectrum3 0000:06:00.0: Reg cmd access failed (reg_id=9023(mrsr),type=write)
[ 6.368028] mlxsw_spectrum3 0000:06:00.0: cannot register bus device
[ 6.375274] mlxsw_spectrum3: probe of 0000:06:00.0 failed with error -110
Fix this by waiting for the system to become ready both before issuing
the reset command and afterwards. In case of failure, print the last
system status to aid in debugging.
Fixes: da382875c6 ("mlxsw: spectrum: Extend to support Spectrum-3 ASIC")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed says:
====================
mlx5-updates-2020-03-17
1) Compiler warnings and cleanup for the connection tracking series
2) Bug fixes for the connection tracking series
3) Fix devlink port register sequence
4) Last five patches in the series, By Eli cohen
Add the support for forwarding traffic between two eswitch uplink
representors (Hairpin for eswitch), using mlx5 termination tables
to change the direction of a packet in hw from RX to TX pipeline.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We precompute the static-static ECDH during configuration time, in order
to save an expensive computation later when receiving network packets.
However, not all ECDH computations yield a contributory result. Prior,
we were just not letting those peers be added to the interface. However,
this creates a strange inconsistency, since it was still possible to add
other weird points, like a valid public key plus a low-order point, and,
like points that result in zeros, a handshake would not complete. In
order to make the behavior more uniform and less surprising, simply
allow all peers to be added. Then, we'll error out later when doing the
crypto if there's an issue. This also adds more separation between the
crypto layer and the configuration layer.
Discussed-with: Mathias Hall-Andersen <mathias@hall-andersen.dk>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The situation in which we wind up hitting the default case here
indicates a major bug in earlier parsing code. It is not a usual thing
that should ever happen, which means a "friendly" message for it doesn't
make sense. Rather, replace this with a WARN_ON, just like we do earlier
in the file for a similar situation, so that somebody sends us a bug
report and we can fix it.
Reported-by: Fabian Freyer <fabianfreyer@radicallyopensecurity.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We carry out checks to the effect of:
if (skb->protocol != wg_examine_packet_protocol(skb))
goto err;
By having wg_skb_examine_untrusted_ip_hdr return 0 on failure, this
means that the check above still passes in the case where skb->protocol
is zero, which is possible to hit with AF_PACKET:
struct sockaddr_pkt saddr = { .spkt_device = "wg0" };
unsigned char buffer[5] = { 0 };
sendto(socket(AF_PACKET, SOCK_PACKET, /* skb->protocol = */ 0),
buffer, sizeof(buffer), 0, (const struct sockaddr *)&saddr, sizeof(saddr));
Additional checks mean that this isn't actually a problem in the code
base, but I could imagine it becoming a problem later if the function is
used more liberally.
I would prefer to fix this by having wg_examine_packet_protocol return a
32-bit ~0 value on failure, which will never match any value of
skb->protocol, which would simply change the generated code from a mov
to a movzx. However, sparse complains, and adding __force casts doesn't
seem like a good idea, so instead we just add a simple helper function
to check for the zero return value. Since wg_examine_packet_protocol
itself gets inlined, this winds up not adding an additional branch to
the generated code, since the 0 return value already happens in a
mergable branch.
Reported-by: Fabian Freyer <fabianfreyer@radicallyopensecurity.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>