IRQ table should only exist for mlx5_core_dev for PF and VF only.
EQ table of mediated devices should hold a pointer to the IRQ table
of the parent PCI device.
Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Instead of requesting IRQ with eq creation, IRQs will be requested
before EQ table creation.
Instead of freeing the IRQs after EQ destroy, free IRQs after eq
table destroy.
Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Multiple EQs may share the same IRQ in subsequent patches.
Instead of calling the IRQ handler directly, the EQ will register
to an atomic chain notfier.
The Linux built-in shared IRQ is not used because it forces the caller
to disable the IRQ and clear affinity before free_irq() can be called.
This patch is the first step in the separation of IRQ and EQ logic.
Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Multiple EQs may share the same irq in subsequent patches.
To avoid starvation, a budget is set per EQ's interrupt handler.
Because of this change, it is no longer required to check that
MLX5_NUM_SPARE_EQE eqes were polled (to detect that arm is required).
It is guaranteed that MLX5_NUM_SPARE_EQE > budget, therefore the
handler will arm and exit the handler before all the entries in the
eq are polled.
In the scenario where the handler is out of budget and there are more
EQEs to poll, arming the EQ guarantees that the HW will send another
interrupt and the handler will be called again.
Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
For ECPF with eswitch manager privilege, query the host max VF count
by querying the device using query_functions command.
With this enhancement:
1. flow steering entries are created only for valid vports based on
the max VF count of the PF.
2. Driver only queries cap of valid vport.
Eswitch requires the max VFs when doing initialization, so do sr-iov
init before eswitch init.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Current function only returns host num of VFs, later patch requires
other params such as host maximum num of VFs.
Return the raw output so that caller can extract info as needed.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Unified representors creation in esw_functions_changed context
handler. Emulate the esw_function_changed event for FW/HW that
does not support this event.
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Firmware FLR happens sequentially, in some cases, like when destroying
a VM that had many VFs, may require waiting much longer than 10 seconds.
Increase the timeout to 2 minutes, and print a wait countdown status
every 20 seconds.
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Update driver version to match device specification.
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Let the compiler decide if the function should be inline in *.c files
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement the set_ringparam() function of the ethtool interface
to enable the changing of io queue sizes.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If there is not enough memory to allocate io queues the driver will
try to allocate smaller queues.
The backoff algorithm is as follows:
1. Try to allocate TX and RX and if successful.
1.1. return success
2. Divide by 2 the size of the larger of RX and TX queues (or both if their size is the same).
3. If TX or RX is smaller than 256
3.1. return failure.
4. else
4.1. go back to 1.
Also change the tx_queue_size, rx_queue_size field names in struct
adapter to requested_tx_queue_size and requested_rx_queue_size, and
use RX and TX queue 0 for actual queue sizes.
Explanation:
The original fields were useless as they were simply used to assign
values once from them to each of the queues in the adapter in ena_probe().
They could simply be deleted. However now that we have a backoff
feature, we have use for them. In case of backoff there is a difference
between the requested queue sizes and the actual sizes. Therefore there
is a need to save the requested queue size for future retries of queue
allocation (for example if allocation failed and then ifdown + ifup was
called we want to start the allocation from the original requested size of
the queues).
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently ethtool -g shows the same size for current and max queue
sizes.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new admin command to support different queue size for Tx/Rx
queues (the change also support different SQ/CQ sizes)
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement mqprio qdisc support by mapping traffic classes to
different hardware enqueue priorities. The maximum number of
supported traffic classes is an attribute of each DPNI object.
The traffic classes map to hardware priorities from highest (0)
to lowest (highest prio number). The skb priority information
received from the stack is used to select the hardware Tx queue
on which to enqueue the frame.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Bogdan Purcareata <bogdan.purcareata@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
DPNI objects can have multiple traffic classes, as reflected by
the num_tc attribute. Until now we ignored its value and only
used traffic class 0.
This patch adds support for multiple Tx traffic classes; we have
num_queues x num_tcs hardware queues available for each interface.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the code configuring xps on the netdev TX queues to a
separate function. A subsequent patch will need to call
this in another context as well.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add dependency to TI CPTS from Common CLK framework COMMON_CLK to fix
allyesconfig build for Powerpc:
drivers/net/ethernet/ti/cpts.c: In function 'cpts_of_mux_clk_setup':
drivers/net/ethernet/ti/cpts.c:567:2: error: implicit declaration of function 'of_clk_parent_fill'; did you mean 'of_clk_get_parent_name'? [-Werror=implicit-function-declaration]
of_clk_parent_fill(refclk_np, parent_names, num_parents);
^~~~~~~~~~~~~~~~~~
of_clk_get_parent_name
Fixes: a3047a81ba ("net: ethernet: ti: cpts: add support for ext rftclk selection")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When removing all VID filters, the mvpp2_prs_vid_entry_remove would be
called with the TCAM id incorrectly used as a VID, causing the wrong
TCAM entries to be invalidated.
Fix this by directly invalidating entries in the VID range.
Fixes: 56beda3db6 ("net: mvpp2: Add hardware offloading for VLAN filtering")
Suggested-by: Yuri Chipchev <yuric@marvell.com>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VID filtering is implemented in the Header Parser, with one range of 11
vids being assigned for each no-loopback port.
Make sure we use the per-port range when looking for existing entries in
the Parser.
Since we used a global range instead of a per-port one, this causes VIDs
to be removed from the whitelist from all ports of the same PPv2
instance.
Fixes: 56beda3db6 ("net: mvpp2: Add hardware offloading for VLAN filtering")
Suggested-by: Yuri Chipchev <yuric@marvell.com>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When PVID is removed from a bridge port, the Linux bridge drops both
untagged and prio-tagged packets. Align mlxsw with this behavior.
Fixes: 148f472da5 ("mlxsw: reg: Add the Switch Port Acceptable Frame Types register")
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to an issue on Spectrum-2, in front-panel ports split four ways, 2 out
of 32 port buffers cannot be used. To work around this, the next FW release
will mark them as unused, and will report correspondingly lower total
shared buffer size. mlxsw will pick up the new value through a query to
cap_total_buffer_size resource. However the initial size for shared buffer
pool 0 is hard-coded and therefore needs to be updated.
Thus reduce the pool size by 2.7 MiB (which corresponds to 2/32 of the
total size of 42 MiB), and round down to the whole number of cells.
Fixes: fe099bf682 ("mlxsw: spectrum_buffers: Add Spectrum-2 shared buffer configuration")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver tries to periodically refresh neighbours that are used to
reach nexthops. This is done by periodically calling neigh_event_send().
However, if the neighbour becomes dead, there is nothing we can do to
return it to a connected state and the above function call is basically
a NOP.
This results in the nexthop never being written to the device's
adjacency table and therefore never used to forward packets.
Fix this by dropping our reference from the dead neighbour and
associating the nexthop with a new neigbhour which we will try to
refresh.
Fixes: a7ff87acd9 ("mlxsw: spectrum_router: Implement next-hop routing")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Alex Veber <alexve@mellanox.com>
Tested-by: Alex Veber <alexve@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The same hash function and seed are used for both ECMP and LAG hash.
Therefore, when a LAG device is used as a nexthop device as part of an
ECMP group, hash polarization can occur and all the traffic will be
hashed to a single LAG slave.
Fix this by using a different seed for the LAG hash.
Fixes: fa73989f26 ("mlxsw: spectrum: Use a stable ECMP/LAG seed")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Alex Veber <alexve@mellanox.com>
Tested-by: Alex Veber <alexve@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During a port FDB dump operation, the mutex protecting the concurrent
access to the switch registers is currently held by the internal
mv88e6xxx_port_db_dump and mv88e6xxx_port_db_dump_fid helpers.
It must be held at the higher level in mv88e6xxx_port_fdb_dump which
is called directly by DSA through ds->ops->port_fdb_dump. Fix this.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The w5X00 chip provides an SPI to Ethernet inteface. This patch allows
platform devices to be defined through the device tree.
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
When TCP stream gets out of sync (driver stops receiving skbs
with expected TCP sequence numbers) request a TX resync from
the kernel.
We try to distinguish retransmissions from missed transmissions
by comparing the sequence number to expected - if it's further
than the expected one - we probably missed packets.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently only RX direction is ever resynced, however, TX may
also get out of sequence if packets get dropped on the way to
the driver. Rename the resync callback and add a direction
parameter.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some control messages must be sent from atomic context. The mailbox
takes sleeping locks and uses a waitqueue so add a "posted" version
of communication.
Trylock the semaphore and if that's successful kick of the device
communication. The device communication will be completed from
a workqueue, which will also release the semaphore.
If locks are taken queue the message and return. Schedule a
different workqueue to take the semaphore and run the communication.
Note that the there are currently no atomic users which would actually
need the return value, so all replies to posted messages are just
freed.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Firmware indicates when a packet has been decrypted by reusing the
currently unused BPF flag. Transfer this information into the skb
and provide a statistic of all decrypted segments.
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TLS offload code casts record number to a u64. The buffer
should be aligned to 8 bytes, but its actually a __be64, and
the rest of the TLS code treats it as big int. Make the
offload callbacks take a byte array, drivers can make the
choice to do the ugly cast if they want to.
Prepare for copying the record number onto the stack by
defining a constant for max size of the byte array.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit a07966447f ("geneve: ICMP error lookup handler") I wrongly
assumed buffers from icmp_socket_deliver() would be linear. This is not
the case: icmp_socket_deliver() only guarantees we have 8 bytes of linear
data.
Eric fixed this same issue for fou and fou6 in commits 26fc181e6c
("fou, fou6: do not assume linear skbs") and 5355ed6388 ("fou, fou6:
avoid uninit-value in gue_err() and gue6_err()").
Use pskb_may_pull() instead of checking skb->len, and take into account
the fact we later access the GENEVE header with udp_hdr(), so we also
need to sum skb_transport_header() here.
Reported-by: Guillaume Nault <gnault@redhat.com>
Fixes: a07966447f ("geneve: ICMP error lookup handler")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit c3a43b9fec ("vxlan: ICMP error lookup handler") I wrongly
assumed buffers from icmp_socket_deliver() would be linear. This is not
the case: icmp_socket_deliver() only guarantees we have 8 bytes of linear
data.
Eric fixed this same issue for fou and fou6 in commits 26fc181e6c
("fou, fou6: do not assume linear skbs") and 5355ed6388 ("fou, fou6:
avoid uninit-value in gue_err() and gue6_err()").
Use pskb_may_pull() instead of checking skb->len, and take into account
the fact we later access the VXLAN header with udp_hdr(), so we also
need to sum skb_transport_header() here.
Reported-by: Guillaume Nault <gnault@redhat.com>
Fixes: c3a43b9fec ("vxlan: ICMP error lookup handler")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simplify the code by removing struct rtl_cfg_info. Only info we need
per PCI ID is whether it supports GBit or not.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To prepare removal of struct rtl_cfg_info, set the coalesce
config based on the chip version number.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After the latest changes we don't need separate functions
rtl_hw_start_8168 and rtl_hw_start_8101 any longer. This allows us to
simplify the code. For this change we need to move rtl_hw_start() and
rtl_hw_start_8169(). rtl_hw_start_8169() is unchanged.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CPCMD_QUIRK_MASK isn't specific to certain chip versions. The vendor
driver applies this mask to all 8168 versions. Therefore remove QUIRK
from the mask name and apply it on all chip versions.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
So far several places in the code deal with setting the interrupt mask
for the respective chip versions. Improve this by having one function
for this only. In addition don't set RxFIFOOver for all 8101 chip
versions like in the vendor driver.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Page pods are used for direct data placement, this patch
enables eDRAM page pods if firmware supports this feature.
Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Besides the MIB counters, some other useful counters can be exposed to
the user. This commit adds support for :
- Per-port counters, that indicate FIFO drops and classifier drops,
- Per-rxq counters,
- Per-txq counters
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since we'll be adding support for other kind of internal counters, make
clear that the currently supported counters are the MIB counters.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When first configuring a port on PPv2, we want to clear the internal
counters so that we don't get values from previous boot stages.
However, we can't really clear these counters when resetting the MAC,
since there are valid reasons to do so while the port is being used,
such as when reconfiguring the interface mode with the PHY.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>