Saeed Mahameed says:
====================
mlx5-updates-2020-08-03
This patchset introduces some updates to mlx5 driver.
1) Jakub converts mlx5 to use the new udp tunnel infrastructure.
Starting with a hack to allow drivers to request a static configuration
of the default vxlan port, and then a patch that converts mlx5.
2) Parav implements change_carrier ndo for VF eswitch representors,
to speedup link state control of representors netdevices.
3) Alex Vesker, makes a simple update to software steering to fix an issue
with push vlan action sequence
4) Leon removes a redundant dump stack on error flow.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We don't yet have a .sriov_configure() to create them, though.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Self-tests for event and interrupt reception and NVRAM.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
MAC stats work much the same as on EF10, with a periodic DMA to a region
specified via an MCDI.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bring down the TX and RX queues at ifdown, so that we can then fini the
EVQs (otherwise the MC would return EBUSY because they're still in use).
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Includes checksum offload and TSO, so declare those in our netdev features.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Several parts of the EF100 architecture are parameterised (to allow
varying capabilities on FPGAs according to resource constraints), and
these parameters are exposed to the driver through a TLV-encoded
region of the BAR.
For the most part we either don't care about these values at all or
just need to sanity-check them against the driver's assumptions, but
there are a number of TSO limits which we record so that we will be
able to check against them in the TX path when handling GSO skbs.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the future, EF100 is planned to have a credit-based scheme for
handling unsolicited events, which drivers will need to use in order
to function correctly. However, current EF100 hardware does not yet
generate unsolicited events and the credit scheme has not yet been
implemented in firmware. To prevent compatibility problems later if
the current driver is used with future firmware which does implement
it, we check for the corresponding capability flag (which that
future firmware will set), and if found, we refuse to probe.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Early in EF100 development there was a different format of event
descriptor; if the NIC is somehow running the very old firmware
which will use that format, fail the probe.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver calls napi_schedule_irqoff() from a context where, in RT,
hardirqs are not disabled, since the IRQ handler is force-threaded.
In the call path of this function, __raise_softirq_irqoff() is modifying
its per-CPU mask of pending softirqs that must be processed, using
or_softirq_pending(). The or_softirq_pending() function is not atomic,
but since interrupts are supposed to be disabled, nobody should be
preempting it, and the operation should be safe.
Nonetheless, when running with hardirqs on, as in the PREEMPT_RT case,
it isn't safe, and the pending softirqs mask can get corrupted,
resulting in softirqs being lost and never processed.
To have common code that works with PREEMPT_RT and with mainline Linux,
we can use plain napi_schedule() instead. The difference is that
napi_schedule() (via __napi_schedule) also calls local_irq_save, which
disables hardirqs if they aren't already. But, since they already are
disabled in non-RT, this means that in practice we don't see any
measurable difference in throughput or latency with this patch.
Signed-off-by: Jiafei Pan <Jiafei.Pan@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver calls napi_schedule_irqoff() from a context where, in RT,
hardirqs are not disabled, since the IRQ handler is force-threaded.
In the call path of this function, __raise_softirq_irqoff() is modifying
its per-CPU mask of pending softirqs that must be processed, using
or_softirq_pending(). The or_softirq_pending() function is not atomic,
but since interrupts are supposed to be disabled, nobody should be
preempting it, and the operation should be safe.
Nonetheless, when running with hardirqs on, as in the PREEMPT_RT case,
it isn't safe, and the pending softirqs mask can get corrupted,
resulting in softirqs being lost and never processed.
To have common code that works with PREEMPT_RT and with mainline Linux,
we can use plain napi_schedule() instead. The difference is that
napi_schedule() (via __napi_schedule) also calls local_irq_save, which
disables hardirqs if they aren't already. But, since they already are
disabled in non-RT, this means that in practice we don't see any
measurable difference in throughput or latency with this patch.
Signed-off-by: Jiafei Pan <Jiafei.Pan@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matching IPv6 traffic require allocating their own individual slots
in TCAM. So, fetch additional slots to insert IPv6 rules. Also, fetch
the cumulative stats of all the slots occupied by the Matchall rule.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When offloading action trap on a qevent, pass to_dev of NULL to the SPAN
module to trigger the mirror to the CPU port. Query the buffer drops
policer and use it for policing of the trapped traffic.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As previously explained, packets that are dropped due to buffer related
reasons (e.g., tail drop, early drop) can be mirrored to the CPU port.
These packets are then trapped with one of the "mirror session" traps
and their CQE includes the reason for which the packet was mirrored.
Register with devlink a new trap, early_drop, and initialize the
corresponding Rx listener with the appropriate mirror reason. Return an
error in case user tries to change the traps' action, as this is not
supported.
Since Spectrum-1 does not support these traps, the above is only done
for Spectrum-2 onwards.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Subsequent patches will need to register different traps for Spectrum-1
and Spectrum-2 onwards.
Enable that by invoking a per-ASIC operation during traps
initialization.
Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Subsequent patches will need to register different trap groups for
Spectrum-1 and Spectrum-2 onwards.
Enable that by invoking a per-ASIC operation during trap groups
initialization.
Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When unsetting policer base, the SPAN code currently uses refcount_dec().
However that function splats when the counter reaches zero, because
reaching zero without actually testing is in general indicative of a
missing cleanup. There is no cleanup to be done here, but nonetheless, use
refcount_dec_and_test() as required.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A later patch will refuse to set the action of certain traps in mlxsw
and also to change the policer binding of certain groups. Pass extack so
that failure could be communicated clearly to user space.
Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the latest net-next tree, if test suspend/resume after enabling
WOL, we get error as below:
[ 487.086365] dpm_run_callback(): mdio_bus_suspend+0x0/0x30 returns -16
[ 487.086375] PM: Device stmmac-0:00 failed to suspend: error -16
-16 means -EBUSY, this is because I didn't enable wakeup of the correct
device when implementing phy based WOL feature. To be honest, I caught
the issue when implementing phy based WOL and then fix it locally, but
forgot to amend the phy based wol patch. Today, I found the issue by
testing net-next tree.
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix memory allocation for ethernet address hash table.
The code was wrongly allocating an array for eth hash table which
is incorrect because this is the main structure for eth hash table
(struct eth_hash_t) that contains inside a number of elements.
Fixes: 57ba4c9b56 ("fsl/fman: Add FMan MAC support")
Signed-off-by: Florinel Iordache <florinel.iordache@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a safe check to avoid dereferencing null pointer
Fixes: 57ba4c9b56 ("fsl/fman: Add FMan MAC support")
Signed-off-by: Florinel Iordache <florinel.iordache@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The parameter 'priority' is incorrectly forced to zero which ultimately
induces logically dead code in the subsequent lines.
Fixes: 57ba4c9b56 ("fsl/fman: Add FMan MAC support")
Signed-off-by: Florinel Iordache <florinel.iordache@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Check before using returned value to avoid dereferencing null pointer.
Fixes: 18a6c85fcc ("fsl/fman: Add FMan Port Support")
Signed-off-by: Florinel Iordache <florinel.iordache@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Potentially overflowing expression (ts_freq << 16 and intgr << 16)
declared as type u32 (32-bit unsigned) is evaluated using 32-bit
arithmetic and then used in a context that expects an expression of
type u64 (64-bit unsigned) which ultimately is used as 16-bit
unsigned by typecasting to u16. Fixed by using an unsigned 32-bit
integer since the value is truncated anyway in the end.
Fixes: 414fd46e77 ("fsl/fman: Add FMan support")
Signed-off-by: Florinel Iordache <florinel.iordache@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Avoid a memset after a call to 'dma_alloc_coherent()'.
This is useless since
commit 518a2f1925 ("dma-mapping: zero memory returned from dma_alloc_*")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update the size used in 'dma_free_coherent()' in order to match the one
used in the corresponding 'dma_alloc_coherent()', in
'spider_net_init_chain()'.
Fixes: d4ed8f8d1f ("Spidernet DMA coalescing")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update the size used in 'dma_free_coherent()' in order to match the one
used in the corresponding 'dma_alloc_coherent()'.
Fixes: 369a782af0 ("net: sgi: ioc3-eth: ensure tx ring is 16k aligned.")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the case of invalid rule, a positive value EINVAL is returned here.
I think this is a typo error. It is necessary to return an error value.
Cc: Po Liu <Po.Liu@nxp.com>
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In function hw_atl_a0_hw_multicast_list_set(), when an invalid
request is encountered, a negative error code should be returned.
Fixes: bab6de8fd1 ("net: ethernet: aquantia: Atlantic A0 and B0 specific functions")
Cc: David VomLehn <vomlehn@texas.net>
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Nguyen says:
====================
100GbE Intel Wired LAN Driver Updates 2020-08-01
This series contains updates to the ice driver only.
Wei Yongjun marks power management functions with __maybe_unused.
Nick disables VLAN pruning in promiscuous mode and renames grst_delay to
grst_timeout.
Kiran modifies the check for linearization and corrects the vsi_id mask
value.
Vignesh replaces the use of flow profile locks to RSS profile locks for RSS
rule removal. Destroys flow profile lock on clearing XLT table and
clears extraction sequence entries.
Jesse adds some statistics and removes an unreported one.
Brett allows for 2 queue configuration for VFs.
Surabhi adds a check for failed allocation of an extraction sequence
table.
Tony updates the PTYPE lookup table and makes other trivial fixes.
Victor extends profile ID locks to be held until all references are
completed.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Use eth_zero_addr() to clear mac address instead of memset().
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use eth_zero_addr() to clear mac address instead of memset().
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit c8729cac2a ("cxgb4: add ethtool n-tuple filter insertion")
has removed checking control key for determining IP address types
for TC-FLOWER rules, which causes all the rules being inserted to
hardware to become IPv6 rule type always. So, add back the check
to select the correct IP address type to extract and hence fix the
correct rule type being inserted to hardware.
Also, ethtool_rx_flow_key doesn't have any control key and instead
directly sets the IPv4/IPv6 address keys. So, explicitly set the
IP address type for ethtool n-tuple filters to reuse the same code.
Fixes: c8729cac2a ("cxgb4: add ethtool n-tuple filter insertion")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The flag indicating the selftest to run is a bitmask. So, fix the
check. Also, the selftests will fail if adapter initialization has
not been completed yet. So, add appropriate check and bail sooner.
Fixes: 7235ffae3d ("cxgb4: add loopback ethtool self-test")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the capability to split the Tx queues onto their own
interrupts with their own napi contexts. This gives the
opportunity for more direct control of Tx interrupt
handling, such as CPU affinity and interrupt coalescing,
useful for some traffic loads.
v2: use ethtool -L, not a vendor specific priv-flag
v3: simplify logging, drop unnecessary "no-change" tests
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
We give the tx clean path its own budget and service routine in
order to give a little more leeway to be more aggressive, and
in preparation for coming changes. We've found this gives us
a little better performance in some packet processing scenarios
without hurting other scenarios.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
We really don't need to hit the Rx queue doorbell so many times,
we can wait to the end and cause a little less thrash.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Release skb memory in mvpp2_rx() if mvpp2_rx_refill routine fails
Fixes: b501585467 ("net: mvpp2: fix refilling BM pools in RX path")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allocate nic_info dynamically - n_entries is not constant.
Attach the tunnel offload info only to the uplink representor.
We expect the "main" netdev to be unregistered in switchdev
mode, and there to be only one uplink representor.
Drop the udp_tunnel_drop_rx_info() call, it was not there until
commit b3c2ed21c0 ("net/mlx5e: Fix VXLAN configuration restore after function reload")
so the device doesn't need it, and core should handle reloads and
reset just fine.
v2:
- don't drop the ndos on reprs, and register info on uplink repr.
v4:
- Move netdev tunnel structure handling to en_main.c
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The DR TX state machine supports the following order:
modify header, push vlan and encapsulation.
Instead fs_dr would pass:
push vlan, modify header and encapsulation.
The above caused the rule creation to fail on invalid action
sequence provided error.
Fixes: 6a48faeeca ("net/mlx5: Add direct rule fs_cmd implementation")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Currently PF and VF representor netdevice carrier is always controlled
by controlling the representor netdevice device state as up/down.
Representor netdevice state change undergoes one or more txq/rxq
destroy/create commands to firmware, skb and its rx buffer allocation,
health reporters creation and more.
Due to this limitation users do not have the ability to just change
the carrier of the non uplink representors without modifying the
device state.
In one use case when the eswitch physical port carrier is down/up,
user needs to update the VF link state to same as physical port
carrier.
Example of updating VF representor carrier state:
$ ip link set enp0s8f0npf0vf0 carrier off
$ ip link set enp0s8f0npf0vf0 carrier on
This enhancement results into VF link state change which is
represented by the VF representor netdevice carrier.
This enables users to modify the representor carrier without modifying
the representor netdevice state.
A simple test is run using [1] to calculate the time difference between
updating carrier vs updating device state (to update just the carrier)
with one VF to simulate 255 VFs.
Time taken to update the carrier using device up/down:
$ time ./calculate.sh dev enp0s8f0npf0vf0
real 0m30.913s
user 0m0.200s
sys 0m11.168s
Time taken to update just the carrier using carrier iproute2 command:
$ time ./calculate.sh carrier enp0s8f0npf0vf0
real 0m2.142s
user 0m0.160s
sys 0m2.021s
Test shows that its better to use carrier on/off user interface to notify
link up/down event to VF compare to device up/down interface, because
carrier user interface delivers the same event 15 times faster.
[1] https://github.com/paravmellanox/myscripts/blob/master/calculate_carrier_time.sh
Signed-off-by: Parav Pandit <parav@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
This is a collection of minor fixes including typos, white space, and
style. No functional changes.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
The profile ID map lock should be held till the caller completes
all references of that profile entries.
The current code releases the lock right after the match search.
This caused a driver issue when the profile map entries were
referenced after it was freed in other thread after the lock was
released earlier.
Signed-off-by: Victor Raj <victor.raj@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>