The XOFF received statistic registers are per priority based and not per
traffic class. The ixgbe driver was incorrectly considering them to be for
each traffic class; and then disabling the "Tx hang" check for the queues
that belonged to the particular traffic class that had received PFC frames.
The above logic worked fine in scenario where the user priority and traffic
class number matched e.g. priority 0 is mapped to traffic class 0 and so on.
But, when multiple user priorities are mapped to a single traffic class or
when user priorities and traffic class numbers do not line up; the ixgbe
driver may disable the "Tx hang" check for queues belonging to a traffic
class that did not receive PFC frames and keep the "Tx hang" check enabled
for the queues that did receive the PFC frames.
This patch corrects the above in the code by considering the statistics
on a per priority basis; then getting the traffic class the user priority
belongs to and disabling the "Tx hang" check for queues that belong
to that traffic class.
Signed-off-by: Neerav Parikh <Neerav.Parikh@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Marcus Dennis <marcusx.e.dennis@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
For some devices, the result of the flow control high watermark gets
truncated when programming it into the registers because of the mask used.
Switch the mask to 32-bit to prevent this from happening.
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
On i350 VF devices, VLAN tags will be byte-swapped in the receive
descriptor only when received packets are looped back from other
VFs. Check for this condition and swab the tag if needed.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch contains updates to firmware/hardware header files shared
between csiostor and cxgb4/cxgb4vf, and the resulting changes to the
cxgb4/cxgb4vf source files.
Signed-off-by: Naresh Kumar Inna <naresh@chelsio.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The Common Platform Time Sync function of the CPSW does not depend the
CPSW configuration option as it should. This patch fixes the issue by
adding the dependency.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Many new feauture have been introduced in the driver:
ethtool coalesce options, Rx HW watchdog... so this patch updates the
driver's version.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to get/set the tx/rx coalesce parameters
via ethtool interface.
Tests have been done on several platform with different GMAC chips w/o and w/
RX watchdog feature.
V2: reject coalesce settings that are not supported.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
GMAC devices newer than databook 3.40 has an embedded timer
that can be used for mitigating the number of interrupts.
So this patch adds this optimizations.
At any rate, the Rx watchdog can be disable (on bugged HW) by
passing from the platform the riwt_off field.
In this implementation the rx timer stored in the Reg9 is fixed
to the max value. This will be tuned by using ethtool.
V2: added a platform parameter to force to disable the rx-watchdog
for example on new core where it is bugged.
V3: do not disable NAPI when Rx watchdog is used.
V4: a new extra statistic field has been added to show the early
receive status in the interrupt handler.
This patch also adds an extra check to avoid to call
napi_schedule when the DMA_INTR_ENA_RIE bit is disabled in the
Interrupt Mask register.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a new schema used for mitigating the
number of transmit interrupts.
It is based on a SW timer and a threshold value.
The timer is used to periodically call the stmmac_tx_clean
function; the threshold is used for setting the IC (Interrupt
on Completion bit). The ISR will then invoke the poll method.
Also the patch improves some ethtool stat fields.
V2: review the logic to manage the IC bit in the TDESC
that was bugged because it didn't take care about the
fragments. Also fix the tx_count_frames that has not to be
limited to TX DMA ring. Thanks to Ben Hutchings.
V3: removed the spin_lock irqsave/restore as D. Miller suggested.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The TIMER option is not longer supported and this
code can be considered dead for this driver in
the new kernel series.
In fact, It was not updated at all and never used.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ConnectX-3 devices can use either 64- or 32-byte completion queue
entries (CQEs) and event queue entries (EQEs). Using 64-byte
EQEs/CQEs performs better because each entry is aligned to a complete
cacheline. This patch queries the HCA's capabilities, and if it
supports 64-byte CQEs and EQES the driver will configure the HW to
work in 64-byte mode.
The 32-byte vs 64-byte mode is global per HCA and not per CQ or EQ.
Since this mode is global, userspace (libmlx4) must be updated to work
with the configured CQE size, and guests using SR-IOV virtual
functions need to know both EQE and CQE size.
In case one of the 64-byte CQE/EQE capabilities is activated, the
patch makes sure that older guest drivers that use the QUERY_DEV_FUNC
command (e.g as done in mlx4_core of Linux 3.3..3.6) will notice that
they need an update to be able to work with the PPF. This is done by
changing the returned pf_context_behaviour not to be zero any more. In
case none of these capabilities is activated that value remains zero
and older guest drivers can run OK.
The SRIOV related flow is as follows
1. the PPF does the detection of the new capabilities using
QUERY_DEV_CAP command.
2. the PPF activates the new capabilities using INIT_HCA.
3. the VF detects if the PPF activated the capabilities using
QUERY_HCA, and if this is the case activates them for itself too.
Note that the VF detects that it must be aware to the new PF behaviour
using QUERY_FUNC_CAP. Steps 1 and 2 apply also for native mode.
User space notification is done through a new field introduced in
struct mlx4_ib_ucontext which holds device capabilities for which user
space must take action. This changes the binary interface so the ABI
towards libmlx4 exposed through uverbs is bumped from 3 to 4 but only
when **needed** i.e. only when the driver does use 64-byte CQEs or
future device capabilities which must be in sync by user space. This
practice allows to work with unmodified libmlx4 on older devices (e.g
A0, B0) which don't support 64-byte CQEs.
In order to keep existing systems functional when they update to a
newer kernel that contains these changes in VF and userspace ABI, a
module parameter enable_64b_cqe_eqe must be set to enable 64-byte
mode; the default is currently false.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Merge my own merge branch to get various fixes from there
and upstream, especially the hvc console tty refcouting fixes
which which testing is quite a bit harder...
This fixes (for me) a regression introduced by commit b01af457 ("8139cp:
set ring address before enabling receiver"). That commit configured the
descriptor ring addresses earlier in the initialisation sequence, in
order to avoid the possibility of triggering stray DMA before the
correct address had been set up.
Unfortunately, it seems that the hardware will scribble garbage into the
TxRingAddr registers when we enable "plus mode" Tx in the CpCmd
register. Observed on a Traverse Geos router board.
To deal with this, while not reintroducing the problem which led to the
original commit, we augment cp_start_hw() to write to the CpCmd register
*first*, then set the descriptor ring addresses, and then finally to
enable Rx and Tx in the original 8139 Cmd register. The datasheet
actually indicates that we should enable Tx/Rx in the Cmd register
*before* configuring the descriptor addresses, but that would appear to
re-introduce the problem that the offending commit b01af457 was trying
to solve. And this variant appears to work fine on real hardware.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@kernel.org [3.5+]
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit b26623dab7.
This reverts the revert, in net-next we'll try another scheme
to fix this bug using patches from David Woodhouse.
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/wireless/iwlwifi/pcie/tx.c
Minor iwlwifi conflict in TX queue disabling between 'net', which
removed a bogus warning, and 'net-next' which added some status
register poking code.
Signed-off-by: David S. Miller <davem@davemloft.net>
Add information to the DMA Configuration Register to
maximize system performance:
- rx/tx packet buffer full memory size
- allow possibility to use INCR16 if supported
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Acked-by: Joachim Eastwood <manabian@gmail.com
Signed-off-by: David S. Miller <davem@davemloft.net>
On BE2 chip, an interrupt being raised even when EQ is in un-armed state has
been observed a few times. This is not expected and has never been
observed on BE3/Lancer chips.
As a consequence, be_msix()::events_get() and be_poll()::events_get()
can race and notify an EQ wrongly causing a CEV UE. The other possible
side-effect would be traffic stalling because after notifying EQ,
napi_schedule() is ignored as NAPI is already running.
This patch fixes this issue by counting events only in be_poll().
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
fix bug where a register which was only meant to be read in 578xx/57712
devices causes a bogus error message to be logged when read from other
devices.
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change makes it so that only the first fragment in a series of fragments
will have the L4 header pulled. Previously we were always pulling the L4
header as well and in the case of UDP this can harm performance since only the
first fragment will have the header, the rest just contain data which should
be left in the paged portion of the packet.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Historically, we've been using the APME bit to determine whether a device
supports wake on a given port or not. However, this bit specifies the
default wake setting, rather than the wake support. Change the behavior so
that we use a flag to keep the capabilities separate from the enablement
while meeting customer requirements.
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Update the filters to be more consistent with what the driver wants to do.
For example, for devices that timestamp all packets, report that the filter
is set for timestamping all packets.
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There was a bitwise operation error in the fdb_add block
that was only allowing FDB types that were not permanent.
This was the opposite of the intent because the hardware
never ages out address these are the _only_ type of addrs
that should be allowed.
This was missed because until recently iproute2 did not
set any bit for this by default. And our test code to
manage FDB entries on embedded devices similarly did not
set these bits.
I am going to chalk this up as a bug and fix it now.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch enables ethtool to correctly identify flow control (pause
frame) auto negotiation, as well as disallow enabling it when it is not
supported. The ixgbe_device_supports_autoneg_fc function is exported and
used for this purpose.
There is also one minor cleanup of the device_supports_autoneg_fc by
removing an unnecessary return statement.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch removes the queuing that was previously done for L4 packets
as it is not needed. The filter does not provide functionality, and it
is possible that queue setup here could trample settings done else-where
in the driver. (for example it may use a queue which isn't setup.)
Setting of the queue is not required for hardware timestamping and could
have inadverdent side effects.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch removes a magic number that was used for the ETQF used for
filtering L2 ptp packets and replaces it with the supplied define that
previously existed. The intent is to clarify that this filter is already
set aside for L2 1588 work.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Reformats the output of the Tx/Rx descriptor dumps to more
appropriately align the output of the ixgbe_dump and improve readability.
Prevents empty Tx descriptors from being displayed to decrease the size
of the dump and make it more manageable.
Signed-off-by: Josh Hay <joshua.a.hay@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
mvneta_deinit() can be called from the ->probe() hook in the error
path, so it shouldn't be marked as __devexit. It fixes the following
section mismatch warning:
WARNING: vmlinux.o(.devinit.text+0x239c): Section mismatch in reference
from the function mvneta_probe() to the function .devexit.text:mvneta_deinit()
The function __devinit mvneta_probe() references
a function __devexit mvneta_deinit().
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Now that the Armada 370/XP platform has gained proper integration with
the clock framework, we add clk support in the Marvell Armada 370/XP
Ethernet driver.
Since the existing Device Tree binding that exposes a
'clock-frequency' property has never been exposed in any stable kernel
release, we take the freedom of removing this property to replace it
with the standard 'clocks' clock pointer property.
The Device Tree binding documentation is updated accordingly.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
As reported by checkpatch, the multiline comments for net/ and
drivers/net/ have a slightly different format than the one used in the
rest of the kernel, so we adjust our multiline comments accordingly.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
As reported by checkpatch, the multiline comments for net/ and
drivers/net/ have a slightly different format than the one used in the
rest of the kernel, so we adjust our multiline comment accordingly.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
As suggested by checkpatch, using <linux/delay.h> instead of
<asm/delay.h> is appropriate.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Commit 71c6c837 (drivers/net: fix tasklet misuse issue) introduced a
build failure in the xilinx driver. axienet_dma_err_handler isn't
declared before its use in axienet_open.
This patch provides the prototype before axienet_open.
Cc: Xiaotian Feng <dannyfeng@tencent.com>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Without this udev doesn't have a way to key the ne device to the platform
device.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Various drivers depend on INET because they used to select INET_LRO,
but they have all been converted to use GRO which has no such
dependency.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit fa37a9586f ('mlx4_en: Moving to
work with GRO') left behind the Kconfig depends/select, some dead
code and comments referring to LRO.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If ndo_validate_addr is set to the generic eth_validate_addr
function there is no point in calling is_valid_ether_addr
from driver ndo_open if ndo_open is not used elsewhere in
the driver.
With this change is_valid_ether_addr will be called from the
generic eth_validate_addr function. So there should be no change
in the actual behavior.
Signed-off-by: Joachim Eastwood <manabian@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move to circular buffers management macro and correct an error
with circular buffer initial condition.
Without this patch, the macb_tx_ring_avail() function was
not reporting the proper ring availability at startup:
macb macb: eth0: BUG! Tx Ring full when queue awake!
macb macb: eth0: tx_head = 0, tx_tail = 0
And hanginig forever...
I remove the macb_tx_ring_avail() function and use the
proven macros from circ_buf.h. CIRC_CNT() is used in the
"consumer" part of the driver: macb_tx_interrupt() to match
advice from Documentation/circular-buffers.txt.
Reported-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Tested-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
"Asynchronous" is misspelled in some comments. No code changes.
Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>