In qed_ll2_start_ooo() the ll2_info variable is uninitialized and then
passed to qed_ll2_acquire_connection() where it is copied into a new
memory space.
This shouldn't cause any issue as long as none of the copied memory is
ever read.
But the potential for a bug being introduced by reading this memory
is real.
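A minimal sketch of the kind of fix this calls for (the struct type name is inferred from the context and the exact upstream change may differ): zero-initialize the on-stack structure so that no uninitialized bytes can ever be copied.

  /* Sketch only: zero the structure at declaration (a memset() before
   * filling in the needed fields would work equally well).
   */
  struct qed_ll2_info ll2_info = {};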
Detected by CoverityScan, CID#1399632 ("Uninitialized scalar variable")
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The axi variable was not being freed upon device removal.
Allocating it with devm_kzalloc() ensures that it is properly freed.
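A minimal sketch of the managed-allocation pattern being adopted, assuming a platform-driver probe context (names are illustrative, not the literal diff):

  #include <linux/device.h>
  #include <linux/err.h>
  #include <linux/slab.h>

  /* devm_kzalloc() ties the allocation's lifetime to the device, so it
   * is released automatically on remove or on probe failure; no kfree()
   * in the error/removal paths is needed.
   */
  struct stmmac_axi *axi;

  axi = devm_kzalloc(&pdev->dev, sizeof(*axi), GFP_KERNEL);
  if (!axi)
      return ERR_PTR(-ENOMEM);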
Signed-off-by: Joao Pinto <jpinto@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use eth_hw_addr_random() to set a random dev_addr and update
addr_assign_type instead of open-coding it.
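For reference, a sketch of what the helper replaces (assuming the usual open-coded variant):

  #include <linux/etherdevice.h>

  static void example_set_random_mac(struct net_device *ndev)
  {
      /* Equivalent to:
       *     eth_random_addr(ndev->dev_addr);
       *     ndev->addr_assign_type = NET_ADDR_RANDOM;
       */
      eth_hw_addr_random(ndev);
  }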
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do not treat IPv6 frames with a zero UDP checksum as frames
with a bad checksum, and do not drop them.
Signed-off-by: Thanneeru Srinivasulu <tsrinivasulu@cavium.com>
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When booted with ACPI, random MAC addresses are assigned to node1
interfaces due to a mismatch of the bgx_id between the BGX driver
and the ACPI tables.
This patch fixes the issue by setting the maximum number of BGX devices
per node based on the platform/SoC instead of a macro, so that the
bgx_id is set appropriately.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When BGX/LMACs are in QSGMII mode, the mode info is not being printed
for some LMACs. This patch fixes that. With the changes already done to
no longer perform any serdes-to-lane mapping configuration calculation
in the kernel driver, we can also get rid of this logic.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ACPI support has been added to the ARM IOMMU driver in the 4.10 kernel,
and that has resulted in VNIC interfaces throwing translation
faults when the kernel is booted with ACPI, as the driver was not using
the DMA API. This patch fixes the issue by using the DMA API, which in
turn will create translation tables when the IOMMU is enabled.
Also, the VNIC doesn't have a separate receive buffer ring per receive
queue, so there is no 1:1 descriptor index matching between a CQE_RX
and the index in the buffer ring from where a buffer has been used for
DMA'ing. Unlike other NICs, here it's not possible to maintain DMA
address to virtual address mappings within the driver. This leaves us
no other choice but to use the IOMMU's IOVA address conversion API to
get a buffer's virtual address, which can then be given to the network
stack for processing.
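A hedged sketch of the IOVA-to-virtual-address conversion described above (the helper name and the decision of when to translate are illustrative, not necessarily the exact driver code):

  #include <linux/iommu.h>
  #include <linux/io.h>

  /* Convert the DMA (IOVA) address found in a CQE_RX back to a kernel
   * virtual address. Without an IOMMU domain the DMA address is already
   * a physical address.
   */
  static void *nicvf_buf_virt_addr(struct device *dev, dma_addr_t dma_addr)
  {
      struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
      phys_addr_t phys = domain ? iommu_iova_to_phys(domain, dma_addr)
                                : (phys_addr_t)dma_addr;

      return phys_to_virt(phys);
  }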
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce a Kconfig option, CONFIG_TIGON3_HWMON, which allows building
support for the thermal sensors reported by Tigon3 NICs in or out.
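A minimal sketch of the usual shape of such an option in the C code (function names assumed; the actual patch may differ): the real implementations are built only when the option is set, with empty inline stubs otherwise so callers need no #ifdefs of their own.

  #ifdef CONFIG_TIGON3_HWMON
  static void tg3_hwmon_open(struct tg3 *tp);    /* real implementation */
  static void tg3_hwmon_close(struct tg3 *tp);
  #else
  static inline void tg3_hwmon_open(struct tg3 *tp) { }
  static inline void tg3_hwmon_close(struct tg3 *tp) { }
  #endif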
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that the mvpp2 driver has been modified to accommodate the support
for PPv2.2, we can finally advertise this support by adding the
appropriate compatible string.
At the same time, we update the Kconfig description of the MVPP2 driver.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On PPv2.2, the streaming mappings can be anywhere in the first 40 bits
of the physical address space. However, for the coherent mappings, we
still need them to be in the first 32 bits of the address space,
because all BM pools share a single register to store the high 32 bits
of the BM pool address, which means all BM pools must be allocated in
the same 4GB memory area.
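In DMA API terms this amounts to using different streaming and coherent masks; a hedged sketch using the standard API calls (not necessarily the literal driver code):

  #include <linux/dma-mapping.h>

  static int mvpp22_set_dma_masks(struct device *dev)
  {
      int err;

      /* Streaming mappings may live anywhere in the first 40 bits. */
      err = dma_set_mask(dev, DMA_BIT_MASK(40));
      if (err)
          return err;

      /* Coherent (BM pool) memory must stay below 4GB, since all pools
       * share a single register for the high address bits.
       */
      return dma_set_coherent_mask(dev, DMA_BIT_MASK(32));
  }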
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PPv2.2 variant of the network controller needs an additional
clock, the "MG clock" in order for the IP block to operate
properly. This commit adds support for this additional clock to the
driver, reworking the error handling path as needed.
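A hedged sketch of the added clock handling (the clock name and error label are assumptions based on the description):

  #include <linux/clk.h>

  /* PPv2.2 only: the MG clock must be enabled for the IP to operate. */
  priv->mg_clk = devm_clk_get(&pdev->dev, "mg_clk");
  if (IS_ERR(priv->mg_clk)) {
      err = PTR_ERR(priv->mg_clk);
      goto err_gop_clk;       /* unwind the clocks enabled earlier */
  }

  err = clk_prepare_enable(priv->mg_clk);
  if (err < 0)
      goto err_gop_clk;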
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In PPv2.1, we have a maximum of 8 RXQs per port, with a default of 4
RXQs per port, and we were assigning RXQs 0->3 to the first port, 4->7
to the second port, 8->11 to the third port, etc.
In PPv2.2, we have a maximum of 32 RXQs per port, and we must allocate
RXQs from the range of 32 RXQs available for each port. So port 0 must
use RXQs in the range 0->31, port 1 in the range 32->63, etc.
This commit adapts the mvpp2 to this difference between PPv2.1 and
PPv2.2:
- The constant definition MVPP2_MAX_RXQ is replaced by a new field
'max_port_rxqs' in 'struct mvpp2', which stores the maximum number of
RXQs per port. This field is initialized during ->probe() depending
on the IP version.
- MVPP2_RXQ_TOTAL_NUM is removed, and instead we calculate the total
number of RXQs by multiplying the number of ports by the maximum number
of RXQs per port. It was only used in one place anyway.
- In mvpp2_port_probe(), the calculation of port->first_rxq is adjusted
to cope with the different allocation strategy between PPv2.1 and
PPv2.2. Due to this change, the 'next_first_rxq' argument of this
function is no longer needed and is removed.
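A hedged sketch of the resulting base-RXQ computation in mvpp2_port_probe() (names follow the commit text; the exact code may differ):

  /* PPv2.1: RXQs are handed out sequentially, rxq_number (default 4)
   * per port: port 0 -> 0..3, port 1 -> 4..7, ...
   * PPv2.2: each port owns a fixed window of max_port_rxqs (32) RXQs:
   * port 0 -> 0..31, port 1 -> 32..63, ...
   */
  if (priv->hw_version == MVPP21)
      port->first_rxq = port->id * rxq_number;
  else
      port->first_rxq = port->id * priv->max_port_rxqs;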
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit adjusts how the MVPP2_ISR_RXQ_GROUP_REG register is
configured, since it changed between PPv2.1 and PPv2.2.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PPv2.2 unit is connected to an AXI bus on Armada 7K/8K, so this
commit adds the necessary initialization of the AXI bridge.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit handles a few miscellaneous differences between PPv2.1 and
PPv2.2 in different areas, where code written for PPv2.1 doesn't apply to
PPv2.2 or needs to be adjusted (getting the MAC address, disabling PHY
polling, etc.).
Thanks to Russell King for providing the initial implementation of
mvpp22_port_mii_set().
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit adjusts the mvpp2 driver register mapping and access logic
to support PPv2.2, to handle a number of differences.
Due to how the registers are laid out in memory, the Device Tree binding
for the "reg" property is different:
- On PPv2.1, we had a first area for the packet processor
registers (common to all ports), and then one area per port.
- On PPv2.2, we have a first area for the packet processor
registers (common to all ports), and a second area for numerous other
registers, including a large number of per-port registers.
In addition, on PPv2.2, the area for the common registers is split into
so-called "address spaces" of 64 KB each. They give access to per-CPU
registers, where each CPU has its own copy of some registers. A few
other registers, which have a single copy, also need to be accessed from
those per-CPU windows if they are related to a per-CPU register. For
example:
- Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a
per-CPU register, so it must be accessed from the current CPU's register
window.
- Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and
a few others) will affect the TX queue that was selected by the
write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU
window as the write to the TXQ_NUM_REG.
Therefore, the ->base member of 'struct mvpp2' is replaced with a
->cpu_base[] array, each entry pointing to a mapping of the per-CPU
area. Since PPv2.1 doesn't have this concept of per-CPU windows, all
entries in ->cpu_base[] point to the same io-remapped area.
The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0];
they are used for registers for which the CPU window doesn't matter.
mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to
access the registers for which the CPU window does matter, which is why
they take a "cpu" as argument.
The driver is then changed to use mvpp2_percpu_read() and
mvpp2_percpu_write() where it matters.
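A hedged sketch of the new accessors and of a typical caller (assuming readl()/writel() based MMIO like the existing helpers; the caller shown is illustrative):

  static void mvpp2_percpu_write(struct mvpp2 *priv, int cpu,
                                 u32 offset, u32 data)
  {
      writel(data, priv->cpu_base[cpu] + offset);
  }

  static u32 mvpp2_percpu_read(struct mvpp2 *priv, int cpu, u32 offset)
  {
      return readl(priv->cpu_base[cpu] + offset);
  }

  /* Illustrative caller: both accesses must use the same CPU window. */
  static u32 example_txq_pending(struct mvpp2 *priv,
                                 struct mvpp2_tx_queue *txq)
  {
      int cpu = get_cpu();
      u32 pending;

      mvpp2_percpu_write(priv, cpu, MVPP2_TXQ_NUM_REG, txq->id);
      pending = mvpp2_percpu_read(priv, cpu, MVPP2_TXQ_PENDING_REG);
      put_cpu();

      return pending;
  }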
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In PPv2.2, the MVPP2_RXQ_DESC_ADDR_REG and MVPP2_TXQ_DESC_ADDR_REG
registers have a slightly different layout, because they need to contain
a 64-bit address for the RX and TX descriptor arrays. This commit
adjusts those functions accordingly.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit modifies the mvpp2_defaults_set() function to not do the
loopback and FIFO threshold initialization, which are not needed for
PPv2.2.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MVPP2_RXQ_CONFIG_REG register has a slightly different layout
between PPv2.1 and PPv2.2, so this commit adapts the functions modifying
this register to accommodate both the PPv2.1 and PPv2.2 cases.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit adjusts the allocation and freeing of BM pools to support
PPv2.2. This involves:
- Checking that the number of buffer pointers is a multiple of 16, as
required by the hardware.
- Adjusting the size of the DMA coherent area allocated for buffer
pointers. Indeed, PPv2.2 needs space for 2 pointers of 64-bits per
buffer, as opposed to 2 pointers of 32-bits per buffer in
PPv2.1. The size in bytes is now stored in a new field of the
mvpp2_bm_pool structure.
- On PPv2.2, getting the DMA address and cookie (used for the physical
address) of each buffer requires reading the
MVPP22_BM_ADDR_HIGH_ALLOC to get the high order bits of those
addresses. A new utility function mvpp2_bm_bufs_get_addrs() is
introduced to handle this.
- On PPv2.2, releasing a buffer requires writing the high order 32 bits
of the DMA address and cookie to MVPP22_BM_PHY_VIRT_HIGH_RLS_REG.
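A hedged sketch of the size check and allocation described above (field names follow the commit text):

  /* The 16-buffer granularity is a hardware requirement. */
  if (!IS_ALIGNED(size, 16))
      return -EINVAL;

  /* PPv2.1 stores two 32-bit pointers per buffer, PPv2.2 two 64-bit ones. */
  if (priv->hw_version == MVPP21)
      bm_pool->size_bytes = 2 * sizeof(u32) * size;
  else
      bm_pool->size_bytes = 2 * sizeof(u64) * size;

  bm_pool->virt_addr = dma_alloc_coherent(dev, bm_pool->size_bytes,
                                          &bm_pool->dma_addr, GFP_KERNEL);
  if (!bm_pool->virt_addr)
      return -ENOMEM;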
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit adds the definition of the PPv2.2 HW descriptors, adjusts
the mvpp2_tx_desc and mvpp2_rx_desc structures accordingly, and adapts
the accessors to work on both PPv2.1 and PPv2.2.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the format of the HW descriptors is different between PPv2.1 and
PPv2.2, this commit introduces an intermediate union which, for now,
contains only the PPv2.1 descriptors. The bulk of the driver code only
manipulates opaque mvpp2_tx_desc and mvpp2_rx_desc pointers, and the
descriptors can only be accessed and modified through the accessor
functions. A follow-up commit will add the descriptor definitions for
PPv2.2.
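A hedged sketch of the resulting opaque wrappers (the PPv2.2 members arrive in the follow-up commit):

  struct mvpp2_tx_desc {
      union {
          struct mvpp21_tx_desc pp21;
          /* struct mvpp22_tx_desc pp22;  -- added by a follow-up commit */
      };
  };

  struct mvpp2_rx_desc {
      union {
          struct mvpp21_rx_desc pp21;
          /* struct mvpp22_rx_desc pp22;  -- added by a follow-up commit */
      };
  };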
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for the introduction of PPv2.2 support in the mvpp2
driver, this commit adds a hw_version field to struct mvpp2, and uses
the .data field of the DT match table to fill it in.
Having the MVPP21 and MVPP22 definitions available will make it possible
to start adding the necessary conditional code to support PPv2.2.
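A hedged sketch of how the .data field can carry the version (the PPv2.2 compatible string is introduced by a later commit; the retrieval helper shown is illustrative):

  static const struct of_device_id mvpp2_match[] = {
      {
          .compatible = "marvell,armada-375-pp2",
          .data = (void *)MVPP21,
      },
      { /* sentinel */ }
  };
  MODULE_DEVICE_TABLE(of, mvpp2_match);

  /* Illustrative helper called from ->probe(): */
  static void mvpp2_get_hw_version(struct mvpp2 *priv,
                                   struct platform_device *pdev)
  {
      priv->hw_version = (unsigned long)of_device_get_match_data(&pdev->dev);
  }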
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PPv2.2 IP has a different TX and RX descriptor layout compared to
PPv2.1. In order to prepare for the introduction of PPv2.2 support in
mvpp2, this commit adds accessors for the different fields of the TX
and RX descriptors, and changes the code to use them.
For now, the mvpp2_port argument passed to the accessors is not used,
but it will be used in a follow-up commit to update the descriptor
according to the version of the IP being used.
Apart from the mechanical changes to use the newly introduced
accessors, a few other changes, needed to use the accessors, are made:
- The mvpp2_txq_inc_put() function now takes a mvpp2_port as first
argument, as it is needed to use the accessors.
- Similarly, the mvpp2_bm_cookie_build() gains a mvpp2_port first
argument, for the same reason.
- In mvpp2_rx_error(), instead of accessing the RX descriptor in each
case of the switch, we introduce a local variable to store the
packet size.
- In mvpp2_tx_frag_process() and mvpp2_tx() instead of accessing the
packet size from the TX descriptor, we use the actual value
available in the function, which is used to set the TX descriptor
packet size a few lines before.
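A hedged sketch of what such an accessor pair looks like at this stage (field names are approximate; the currently unused 'port' argument will later select between the PPv2.1 and PPv2.2 layouts):

  static void mvpp2_txdesc_dma_addr_set(struct mvpp2_port *port,
                                        struct mvpp2_tx_desc *tx_desc,
                                        dma_addr_t dma_addr)
  {
      tx_desc->buf_dma_addr = dma_addr;   /* single layout for now */
  }

  static size_t mvpp2_txdesc_size_get(struct mvpp2_port *port,
                                      struct mvpp2_tx_desc *tx_desc)
  {
      return tx_desc->data_size;
  }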
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The RX descriptors of the PPv2 hardware can store several pieces of
information, amongst which:
- the DMA address of the buffer in which the data has been received
- a "cookie" field, left to the use of the driver, and not used by the
hardware
In the current implementation, the "cookie" field is used to store the
virtual address of the buffer, so that in the receive completion path,
we can easily get the virtual address of the buffer that corresponds to
a completed RX descriptor.
On PPv2.1, used on 32-bit platforms, those two fields are 32-bit wide,
which is enough to store a DMA address in the first field, and a virtual
address in the second field.
On PPv2.2, used on 64-bit platforms, these two fields have been extended
to 40 bits. While 40 bits is enough to store a DMA address (as long as
the DMA mask is 40 bits or lower), it is not enough to store a virtual
address. Therefore, the "cookie" field can no longer be used to store
the virtual address of the buffer.
However, as Russell King pointed out, the RX buffers are always
allocated in the kernel linear mapping, and therefore using
phys_to_virt() on the physical address of the RX buffer is possible and
correct.
Therefore, this commit changes the driver to use the "cookie" field to
store the physical address instead of the virtual
address. phys_to_virt() is used in the receive completion path to
retrieve the virtual address from the physical address.
It is obviously important to realize that the DMA address and physical
address are two different things, which is why we store both in the RX
descriptors. While those addresses may be identical in some situations,
they remain two distinct concepts, and both addresses should be handled
separately.
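A hedged sketch of the two paths described above (the accessor names are illustrative):

  /* Refill path: record both the DMA address (for the hardware) and the
   * physical address (as the driver-owned cookie).
   */
  mvpp2_rxdesc_dma_addr_set(port, rx_desc, dma_addr);
  mvpp2_rxdesc_cookie_set(port, rx_desc, phys_addr);

  /* Completion path: the buffer lives in the kernel linear mapping, so
   * phys_to_virt() on the cookie recovers its virtual address.
   */
  phys_addr = mvpp2_rxdesc_cookie_get(port, rx_desc);
  data = phys_to_virt(phys_addr);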
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mvpp2_txq_pend_desc_num_get() function only selects a TX queue, and
reads the number of pending descriptors. It is used in only one place,
in mvpp2_txq_clean(), where the TX queue has already been selected by a
write to MVPP2_TXQ_NUM_REG.
Therefore, this function is useless, and the caller can simply read the
value of the MVPP2_TXQ_PENDING_REG register instead.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The "buffer header" functionality is a functionality used by the
hardware to split an incoming packets over multiple BM buffers if they
are not large enough. However, the mvpp2 driver guarantees that a pool
of BM buffers has buffers with a size large enough to store MTU-sized
packets. Therefore, this functionality is completely unused, and the
code can be removed, and we should never get a descriptor with bit
MVPP2_RXD_BUF_HDR set.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As indicated by Russell King, the mvpp2 driver currently uses "phys" or
"phys_addr" in a lot of places to store what really is a DMA address. This
commit clarifies this by using "dma" or "dma_addr" where appropriate.
This is especially important as we are going to introduce more changes
where the distinction between physical address and DMA address will be
key.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We should keep one way to build skbs, regardless of GRO being on or off.
Note that I made sure to defer as much as possible the point where we
need to pull data from the frame, so that any future prefetch() we might
add is more effective.
These skb attributes derive from the CQE or ring:
ip_summed, csum
hash
vlan offload
hwtstamps
queue_mapping
As a bonus, this patch removes mlx4 dependency on eth_get_headlen()
which is very often broken enough to give us headaches.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Testing a boolean in the fast path is not worth duplicating
the packet allocation code for the GRO on and GRO off cases.
If this proves to be a problem, we might later use a jump label.
The next patch will remove this duplicated code and ease code review.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to compute the frame virtual address at different points.
Do it once.
The following patch will use the new va address for validate_loopback().
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of fetching the dma address from rx_desc->data[0].addr,
prefer using frags[0].dma + frags[0].page_offset to avoid
a potential cache line miss.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This new counter tracks the number of pages that we allocated for one port.
lpaa24:~# ethtool -S eth0 | egrep 'rx_alloc_pages|rx_packets'
rx_packets: 306755183
rx_alloc_pages: 932897
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Same technique as some Intel drivers, for arches where PAGE_SIZE = 4096.
In most cases, pages are reused because they were consumed
before we could loop around the RX ring.
This brings back performance, and is even better:
a single TCP flow reaches 30Gbit on my hosts.
v2: added a full memset() in mlx4_en_free_frag(), as Tariq found it was
needed if we switch to a large MTU, since priv->log_rx_info can be changed
dynamically.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use of order-3 pages is problematic in some cases.
This patch might add three kinds of regression:
1) a CPU performance regression, but we will later add page
recycling and performance should come back.
2) The TCP receiver could grow its receive window slightly slower,
because the skb->len/skb->truesize ratio will decrease.
This is mostly ok; we prefer being conservative to not risk OOM,
and we can eventually tune TCP better in the future.
This is consistent with other drivers using 2048 bytes per ethernet frame.
3) Because we allocate one page per RX slot, we consume more
memory for the ring buffers. XDP already had this constraint anyway.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We will soon use order-0 pages, and frag truesize will more precisely
match real sizes.
In the new model, we prefer to use fragments of <= 2048 bytes, so that
we can use the page-recycle technique on PAGE_SIZE=4096 arches.
We will still pack as many frames as possible on arches with big
pages, like PowerPC.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using per frag storage for frag_prefix_size is really silly.
mlx4_en_complete_rx_desc() has all needed info already.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is really a port attribute; there is no need to duplicate it per
RX queue and per frag.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
No need to duplicate it for all queues and frags.
num_frags & log_rx_info become u8 to save space.
u8 accesses are a bit faster than u16 anyway.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new value to the oob_mode module parameter for
supporting AP certification.
All enabled values of oob_mode (>0) are intended only
for debugging and diagnostics.
Signed-off-by: Lior David <qca_liord@qca.qualcomm.com>
Signed-off-by: Maya Erez <qca_merez@qca.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
The driver always invokes cfg80211_disconnected() with locally_generated
set to false.
Fix this by reporting true whenever the disconnect is triggered from
upper layers (cfg80211) or from within the driver itself (reset,
deinit).
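A hedged sketch of the reporting call (all arguments other than locally_generated are illustrative):

  /* Disconnect initiated by cfg80211 or by the driver itself (reset,
   * deinit): report it as locally generated.
   */
  cfg80211_disconnected(ndev, reason_code, NULL, 0,
                        true /* locally_generated */, GFP_KERNEL);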
Signed-off-by: Dedy Lansky <qca_dlansky@qca.qualcomm.com>
Signed-off-by: Maya Erez <qca_merez@qca.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Upon connect timeout the driver invokes _wil6210_disconnect(), which
iterates over the sta array and disconnects each connected sta. In
practice, because the connection is still ongoing and the cid is not yet
allocated, the disconnect does not actually happen. This leaves the FW
in connecting state while the driver is in disconnected state.
To fix this, upon connect timeout, explicitly send WMI_DISCONNECT_CMDID
to FW to make sure it gets disconnected.
Signed-off-by: Dedy Lansky <qca_dlansky@qca.qualcomm.com>
Signed-off-by: Maya Erez <qca_merez@qca.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
When a flush is done, the pending events list is manipulated
without taking the proper spinlock, which could lead to
memory corruption if the list is concurrently manipulated by the wmi
worker or by the interrupt routine.
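A hedged sketch of the fix (the lock and list names follow wil6210 conventions but are assumptions here):

  struct pending_wmi_event *evt, *t;
  unsigned long flags;

  /* Hold the WMI event lock so neither the wmi worker nor the interrupt
   * routine can touch the list while it is being flushed.
   */
  spin_lock_irqsave(&wil->wmi_ev_lock, flags);
  list_for_each_entry_safe(evt, t, &wil->pending_wmi_ev, list) {
      list_del(&evt->list);
      kfree(evt);
  }
  spin_unlock_irqrestore(&wil->wmi_ev_lock, flags);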
Signed-off-by: Hamad Kadmany <qca_hkadmany@qca.qualcomm.com>
Signed-off-by: Maya Erez <qca_merez@qca.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
In a fast disconnect/connect sequence, cfg80211_connect_result() can
fail to find the bss object to which the driver is connecting. Detailed
sequence of events:
* Driver is connected in STA mode
* Disconnect request arrives from user space. Driver disconnects and
calls cfg80211_disconnected() which adds new event to the
cfg80211_wq worker thread
* Connect request arrives from user space. cfg80211_connect() stores
ssid/ssid_len and calls rdev_connect()
* __cfg80211_disconnected() runs in the worker thread and zeroes
wdev->ssid_len
* Connect succeeds. The driver calls cfg80211_connect_result(), which
fails to find the bss because wdev->ssid_len is zero
To overcome this, upon connect request, store the bss object in the
driver and upon connect completion pass it to the kernel using
cfg80211_connect_bss().
Signed-off-by: Dedy Lansky <qca_dlansky@qca.qualcomm.com>
Signed-off-by: Maya Erez <qca_merez@qca.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
After setting the HALP ICR bit, we keep it set until HALP unvote.
Masking HALP ICR should protect the driver from hitting the HALP ICR
over and over again. However, in case there is another MISC ICR
we will read the HALP ICR and issue a completion. This can lead to
a case where HALP voting is completed immediately, as the completion
is already set.
Reinitializing the HALP completion before the actual vote clears
previous completions and protects against such cases.
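A hedged sketch of the voting path (symbol names are assumptions based on the description):

  /* Clear any stale completion left over from a previous MISC ICR before
   * arming the HALP vote, so we only complete on the new HALP ICR.
   */
  reinit_completion(&wil->halp.comp);
  wil6210_set_halp(wil);
  wait_for_completion_timeout(&wil->halp.comp,
                              msecs_to_jiffies(WIL_HALP_MAX_TIMEOUT));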
Signed-off-by: Maya Erez <qca_merez@qca.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>