Interrupt state should be consistently guarded by the RTNL lock once
the net device is registered.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
I don't think these PM functions can race with userland net device
operations, but it's much easier to reason about locking if state is
consistently guarded by the same lock.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
STATE_INIT and STATE_FINI are equivalent and represent incompletely
initialised states; combine them as STATE_UNINIT.
Rename STATE_RUNNING to STATE_READY, to avoid confusion with
netif_running() and IFF_RUNNING.
The comments do not quite match current usage, but this will be
corrected in subsequent fixes.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
We only use tso_state::full_packet_space to calculate the IPv4 tot_len
or IPv6 payload_len, not to set tso_state::packet_space. Replace it
with an ip_base_len field holding the value of tot_len or payload_len
before including the TCP payload, which is much more useful when
constructing the new headers.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
TSO header buffers contain a control structure immediately followed by
the packet headers, and are kept on a free list when not in use. This
complicates buffer management and tends to result in cache read misses
when we recycle such buffers (particularly if DMA-coherent memory
requires caches to be disabled).
Replace the free list with a simple mapping by descriptor index. We
know that there is always a payload descriptor between any two
descriptors with TSO header buffers, so we can allocate only one
such buffer for each two descriptors.
While we're at it, use a standard error code for allocation failure,
not -1.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
We now have a definite upper bound on the number of descriptors per
skb; use that to stop the queue when the next packet might not fit.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Add a flags field to struct efx_tx_buffer, replacing the
continuation and map_single booleans.
Since a single descriptor cannot be both a TSO header and the last
descriptor for an skb, unionise efx_tx_buffer::{skb,tsoh} and add
flags for validity of these fields.
Clear all flags in free buffers (whereas previously the continuation
flag would be set).
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Using eth_hw_addr_random() to generate a random Ethernet address
(MAC) to be used by a net device and set addr_assign_type.
Not need to duplicating its implementation.
spatch with a semantic match is used to found this problem.
(http://coccinelle.lip6.fr/)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using eth_hw_addr_random() to generate a random Ethernet address
(MAC) to be used by a net device and set addr_assign_type.
Not need to duplicating its implementation.
spatch with a semantic match is used to found this problem.
(http://coccinelle.lip6.fr/)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to fsl_pq_mdio.c, this driver is for the 10G MDIO controller on
Freescale Frame Manager Ethernet controllers.
Signed-off-by: Timur Tabi <timur@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ETHTOOL_GRXCLSRULE returns filters for a TCP/IPv4 or UDP/IPv4 4-tuple
with source and destination swapped.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Use PCI Express Capability access functions to simplify cxgb3 driver.
[bhelgaas: split cxgb3 and cxgb4 into separate patches]
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use PCI Express Capability access functions to simplify tg3 driver.
[bhelgaas: split bnx2x and tg3 into separate patches]
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use PCI Express Capability access functions to simplify bnx2x driver.
[bhelgaas: split bnx2x and tg3 into separate patches]
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Previously, when we turned on the "Enable No Snoop Bit," we cleared all
the other Device Control bits, including error reporting enables,
Max_Payload_Size, Max_Read_Request_Size, etc. This patch preserves
all the other bits.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Introduce an inline function pci_pcie_type(dev) to extract PCIe
device type from pci_dev->pcie_flags_reg field, and prepare for
removing pci_dev->pcie_type.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This patch fixes the name of the macro used to mask the
mmc interrupt: erroneously it was used: MMC_DEFAUL_MASK.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Erroneously the DWMAC_CORE_3_40 was set to 34 instead of 0x34.
This can generate problems when run on old chips because
the driver assumes that there are the extra 16 regs available
for perfect filtering.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Gianni Antoniazzi <gianni.antoniazzi-ext@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
This series contains updates to ethtool.h, e1000, e1000e, and igb to
implement MDI/MDIx control.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking update from David Miller:
"A couple weeks of bug fixing in there. The largest chunk is all the
broken crap Amerigo Wang found in the netpoll layer."
1) netpoll and it's users has several serious bugs:
a) uses GFP_KERNEL with locks held
b) interfaces requiring interrupts disabled are called with them
enabled
c) and vice versa
d) VLAN tag demuxing, as per all other RX packet input paths, is not
applied
All from Amerigo Wang.
2) Hopefully cure the ipv4 mapped ipv6 address TCP early demux bugs for
good, from Neal Cardwell.
3) Unlike AF_UNIX, AF_PACKET sockets don't set a default credentials
when the user doesn't specify one explicitly during sendmsg().
Instead we attach an empty (zero) SCM credential block which is
definitely not what we want. Fix from Eric Dumazet.
4) IPv6 illegally invokes netdevice notifiers with RCU lock held, fix
from Ben Hutchings.
5) inet_csk_route_child_sock() checks wrong inet options pointer, fix
from Christoph Paasch.
6) When AF_PACKET is used for transmit, packet loopback doesn't behave
properly when a socket fanout is enabled, from Eric Leblond.
7) On bluetooth l2cap channel create failure, we leak the socket, from
Jaganath Kanakkassery.
8) Fix all the netprio file handling bugs found by Al Viro, from John
Fastabend.
9) Several error return and NULL deref bug fixes in networking drivers
from Julia Lawall.
10) A large smattering of struct padding et al. kernel memory leaks to
userspace found of Mathias Krause.
11) Conntrack expections in netfilter can access an uninitialized timer,
fix from Pablo Neira Ayuso.
12) Several netfilter SIP tracker bug fixes from Patrick McHardy.
13) IPSEC ipv6 routes are not initialized correctly all the time,
resulting in an OOPS in inet_putpeer(). Also from Patrick McHardy.
14) Bridging does rcu_dereference() outside of RCU protected area, from
Stephen Hemminger.
15) Fix routing cache removal performance regression when looking up
output routes that have a local destination. From Zheng Yan.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (87 commits)
af_netlink: force credentials passing [CVE-2012-3520]
ipv4: fix ip header ident selection in __ip_make_skb()
ipv4: Use newinet->inet_opt in inet_csk_route_child_sock()
tcp: fix possible socket refcount problem
net: tcp: move sk_rx_dst_set call after tcp_create_openreq_child()
net/core/dev.c: fix kernel-doc warning
netconsole: remove a redundant netconsole_target_put()
net: ipv6: fix oops in inet_putpeer()
net/stmmac: fix issue of clk_get for Loongson1B.
caif: Do not dereference NULL in chnl_recv_cb()
af_packet: don't emit packet on orig fanout group
drivers/net/irda: fix error return code
drivers/net/wan/dscc4.c: fix error return code
drivers/net/wimax/i2400m/fw.c: fix error return code
smsc75xx: add missing entry to MAINTAINERS
net: qmi_wwan: new devices: UML290 and K5006-Z
net: sh_eth: Add eth support for R8A7779 device
netdev/phy: skip disabled mdio-mux nodes
dt: introduce for_each_available_child_of_node, of_get_next_available_child
net: netprio: fix cgrp create and write priomap race
...
Initalizers for deferrable delayed_work are confused.
* __DEFERRED_WORK_INITIALIZER()
* DECLARE_DEFERRED_WORK()
* INIT_DELAYED_WORK_DEFERRABLE()
Rename them to
* __DEFERRABLE_WORK_INITIALIZER()
* DECLARE_DEFERRABLE_WORK()
* INIT_DEFERRABLE_WORK()
This patch doesn't cause any functional changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
This is the implementation for igb to allow forcing MDI state
via ethtool, allowing users to work around some improperly
behaving switches.
Forcing in this driver is for now only allowed when auto-neg is
enabled.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown aaron.f.brown@intel.com
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Some users report issues with link failing when connected to certain
switches. This gives the user the ability to control the MDI state
from the driver, allowing users to work around some improperly
behaving switches.
Forcing in this driver is for now only allowed when auto-neg is
enabled.
This is in regards to the related ethtool app patch and
bugzilla.kernel.org bug 11998
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: bruce.w.allan@intel.com
CC: n.poppelier@xs4all.nl
CC: bastien@durel.org
CC: jsveiga@it.eng.br
Tested-by: Aaron Brown aaron.f.brown@intel.com
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This is the implementation in e1000 to allow ethtool to force
MDI state, allowing users to work around some improperly
behaving switches.
Forcing in this driver is for now only allowed when auto-neg is enabled.
To use must have the matching version of ethtool app that supports
this functionality.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Tushar Dave <tushar.n.dave@intel.com>
Tested-by: Aaron Brown aaron.f.brown@intel.com
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In order for igb to support MDI setting support via
ethtool this code is needed to allow setting the MDI state
via software.
This is in regards to the related ethtool patch
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Aaron Brown aaron.f.brown@intel.com
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In order for e1000e to support MDI setting support via
ethtool this code is needed to allow setting the MDI state
via software.
This is in regards to the related ethtool patch and
fixes bugzilla.kernel.org bug 11998
Signed-off-by: Bruce W Allan <bruce.w.allan@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Aaron Brown aaron.f.brown@intel.com
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When getting clock, give a chance to the CPUs without DT support,
which use Common Clock Framework, such as Loongson1B.
Signed-off-by: Kelvin Cheung <keguang.zhang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull infiniband/rdma fixes from Roland Dreier:
"Grab bag of InfiniBand/RDMA fixes:
- IPoIB fixes for regressions introduced by path database conversion
- mlx4 fixes for bugs with large memory systems and regressions from
SR-IOV patches
- RDMA CM fix for passing bad event up to userspace
- Other minor fixes"
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/mlx4: Check iboe netdev pointer before dereferencing it
mlx4_core: Clean up buddy bitmap allocation
mlx4_core: Fix integer overflow issues around MTT table
mlx4_core: Allow large mlx4_buddy bitmaps
IB/srp: Fix a race condition
IB/qib: Fix error return code in qib_init_7322_variables()
IB: Fix typos in infiniband drivers
IB/ipoib: Fix RCU pointer dereference of wrong object
IB/ipoib: Add missing locking when CM object is deleted
RDMA/ucma.c: Fix for events with wrong context on iWARP
RDMA/ocrdma: Don't call vlan_dev_real_dev() for non-VLAN netdevs
IB/mlx4: Fix possible deadlock on sm_lock spinlock
This change updates the code related to configuring the transmit frame
checksum. Specifically I have updated the code so that we can only skip
inserting the checksum in the case that we are not performing some other
offload that will modify the frame data.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
This change moves the RSC code into the non-EOP descriptor handling
function. The main motivation behind this change is to help reduce the
overhead in the non-RSC case. Previously the non-RSC path code would
always be checking for append count even if RSC had been disabled. Now
this code is completely skipped in a single conditional check instead of
having to make two separate checks.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
This patch creates a function named ixgbe_fetch_rx_buffer. The sole
purpose of this function is to retrieve a single buffer off of the ring and
to place it in an skb.
The advantage to doing this is that it helps improve the readability since
I can decrease the indentation and for the code in this section.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
This change makes it so that if only the first 256 bytes of a buffer are
used we just copy the data out and leave the offset and page count
unchanged. There are multiple advantages to this. First it allows us to
reuse the page much more in the case of pages larger than 4K. It also
allows us to avoid some expensive atomic operations in the form of
get_page/put_page. In perf I have seen CPU utilization for put_page drop
from 3.5% to 1.8% as a result of this patch when doing small packet routing,
and packet rates increased by about 3%.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
This change creates a separate function for functionality similar to
pskb_pull_tail. The main motivation for moving it to a separate function
is so that later I can just skip this function in the case where we have
already copied the buffer into skb->head.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
This patch makes it so that we will always have ownership of the buffers by
the time we get to ixgbe_add_rx_frag. This is necessary as I am planning to
add a copy-break to ixgbe_add_rx_frag and in order for that to function
correctly we need the CPU to have ownership of the buffer.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
This change makes it so that we do not use double buffering if the page
size is larger than 4K. Instead we will simply walk through the page using
up to 3K per receive, and if we receive less than we only move the offset
by that amount. We will free the page when there is no longer any space
left that we can use instead of checking the page count to see if we can
cycle back to the start.
The main motivation behind this is to avoid the unnecessary truesize cost
for using a half page when most packets are 2K or smaller. With this new
approach the largest possible truesize for a page fragment will be 3K when
PAGE_SIZE is larger than 4K.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>