/spare/repo/netdev-2.6 branch 'master'
This commit is contained in:
352
Documentation/networking/cxgb.txt
Normal file
352
Documentation/networking/cxgb.txt
Normal file
@@ -0,0 +1,352 @@
|
||||
Chelsio N210 10Gb Ethernet Network Controller
|
||||
|
||||
Driver Release Notes for Linux
|
||||
|
||||
Version 2.1.1
|
||||
|
||||
June 20, 2005
|
||||
|
||||
CONTENTS
|
||||
========
|
||||
INTRODUCTION
|
||||
FEATURES
|
||||
PERFORMANCE
|
||||
DRIVER MESSAGES
|
||||
KNOWN ISSUES
|
||||
SUPPORT
|
||||
|
||||
|
||||
INTRODUCTION
|
||||
============
|
||||
|
||||
This document describes the Linux driver for Chelsio 10Gb Ethernet Network
|
||||
Controller. This driver supports the Chelsio N210 NIC and is backward
|
||||
compatible with the Chelsio N110 model 10Gb NICs.
|
||||
|
||||
|
||||
FEATURES
|
||||
========
|
||||
|
||||
Adaptive Interrupts (adaptive-rx)
|
||||
---------------------------------
|
||||
|
||||
This feature provides an adaptive algorithm that adjusts the interrupt
|
||||
coalescing parameters, allowing the driver to dynamically adapt the latency
|
||||
settings to achieve the highest performance during various types of network
|
||||
load.
|
||||
|
||||
The interface used to control this feature is ethtool. Please see the
|
||||
ethtool manpage for additional usage information.
|
||||
|
||||
By default, adaptive-rx is disabled.
|
||||
To enable adaptive-rx:
|
||||
|
||||
ethtool -C <interface> adaptive-rx on
|
||||
|
||||
To disable adaptive-rx, use ethtool:
|
||||
|
||||
ethtool -C <interface> adaptive-rx off
|
||||
|
||||
After disabling adaptive-rx, the timer latency value will be set to 50us.
|
||||
You may set the timer latency after disabling adaptive-rx:
|
||||
|
||||
ethtool -C <interface> rx-usecs <microseconds>
|
||||
|
||||
An example to set the timer latency value to 100us on eth0:
|
||||
|
||||
ethtool -C eth0 rx-usecs 100
|
||||
|
||||
You may also provide a timer latency value while disabling adpative-rx:
|
||||
|
||||
ethtool -C <interface> adaptive-rx off rx-usecs <microseconds>
|
||||
|
||||
If adaptive-rx is disabled and a timer latency value is specified, the timer
|
||||
will be set to the specified value until changed by the user or until
|
||||
adaptive-rx is enabled.
|
||||
|
||||
To view the status of the adaptive-rx and timer latency values:
|
||||
|
||||
ethtool -c <interface>
|
||||
|
||||
|
||||
TCP Segmentation Offloading (TSO) Support
|
||||
-----------------------------------------
|
||||
|
||||
This feature, also known as "large send", enables a system's protocol stack
|
||||
to offload portions of outbound TCP processing to a network interface card
|
||||
thereby reducing system CPU utilization and enhancing performance.
|
||||
|
||||
The interface used to control this feature is ethtool version 1.8 or higher.
|
||||
Please see the ethtool manpage for additional usage information.
|
||||
|
||||
By default, TSO is enabled.
|
||||
To disable TSO:
|
||||
|
||||
ethtool -K <interface> tso off
|
||||
|
||||
To enable TSO:
|
||||
|
||||
ethtool -K <interface> tso on
|
||||
|
||||
To view the status of TSO:
|
||||
|
||||
ethtool -k <interface>
|
||||
|
||||
|
||||
PERFORMANCE
|
||||
===========
|
||||
|
||||
The following information is provided as an example of how to change system
|
||||
parameters for "performance tuning" an what value to use. You may or may not
|
||||
want to change these system parameters, depending on your server/workstation
|
||||
application. Doing so is not warranted in any way by Chelsio Communications,
|
||||
and is done at "YOUR OWN RISK". Chelsio will not be held responsible for loss
|
||||
of data or damage to equipment.
|
||||
|
||||
Your distribution may have a different way of doing things, or you may prefer
|
||||
a different method. These commands are shown only to provide an example of
|
||||
what to do and are by no means definitive.
|
||||
|
||||
Making any of the following system changes will only last until you reboot
|
||||
your system. You may want to write a script that runs at boot-up which
|
||||
includes the optimal settings for your system.
|
||||
|
||||
Setting PCI Latency Timer:
|
||||
setpci -d 1425:* 0x0c.l=0x0000F800
|
||||
|
||||
Disabling TCP timestamp:
|
||||
sysctl -w net.ipv4.tcp_timestamps=0
|
||||
|
||||
Disabling SACK:
|
||||
sysctl -w net.ipv4.tcp_sack=0
|
||||
|
||||
Setting large number of incoming connection requests:
|
||||
sysctl -w net.ipv4.tcp_max_syn_backlog=3000
|
||||
|
||||
Setting maximum receive socket buffer size:
|
||||
sysctl -w net.core.rmem_max=1024000
|
||||
|
||||
Setting maximum send socket buffer size:
|
||||
sysctl -w net.core.wmem_max=1024000
|
||||
|
||||
Set smp_affinity (on a multiprocessor system) to a single CPU:
|
||||
echo 1 > /proc/irq/<interrupt_number>/smp_affinity
|
||||
|
||||
Setting default receive socket buffer size:
|
||||
sysctl -w net.core.rmem_default=524287
|
||||
|
||||
Setting default send socket buffer size:
|
||||
sysctl -w net.core.wmem_default=524287
|
||||
|
||||
Setting maximum option memory buffers:
|
||||
sysctl -w net.core.optmem_max=524287
|
||||
|
||||
Setting maximum backlog (# of unprocessed packets before kernel drops):
|
||||
sysctl -w net.core.netdev_max_backlog=300000
|
||||
|
||||
Setting TCP read buffers (min/default/max):
|
||||
sysctl -w net.ipv4.tcp_rmem="10000000 10000000 10000000"
|
||||
|
||||
Setting TCP write buffers (min/pressure/max):
|
||||
sysctl -w net.ipv4.tcp_wmem="10000000 10000000 10000000"
|
||||
|
||||
Setting TCP buffer space (min/pressure/max):
|
||||
sysctl -w net.ipv4.tcp_mem="10000000 10000000 10000000"
|
||||
|
||||
TCP window size for single connections:
|
||||
The receive buffer (RX_WINDOW) size must be at least as large as the
|
||||
Bandwidth-Delay Product of the communication link between the sender and
|
||||
receiver. Due to the variations of RTT, you may want to increase the buffer
|
||||
size up to 2 times the Bandwidth-Delay Product. Reference page 289 of
|
||||
"TCP/IP Illustrated, Volume 1, The Protocols" by W. Richard Stevens.
|
||||
At 10Gb speeds, use the following formula:
|
||||
RX_WINDOW >= 1.25MBytes * RTT(in milliseconds)
|
||||
Example for RTT with 100us: RX_WINDOW = (1,250,000 * 0.1) = 125,000
|
||||
RX_WINDOW sizes of 256KB - 512KB should be sufficient.
|
||||
Setting the min, max, and default receive buffer (RX_WINDOW) size:
|
||||
sysctl -w net.ipv4.tcp_rmem="<min> <default> <max>"
|
||||
|
||||
TCP window size for multiple connections:
|
||||
The receive buffer (RX_WINDOW) size may be calculated the same as single
|
||||
connections, but should be divided by the number of connections. The
|
||||
smaller window prevents congestion and facilitates better pacing,
|
||||
especially if/when MAC level flow control does not work well or when it is
|
||||
not supported on the machine. Experimentation may be necessary to attain
|
||||
the correct value. This method is provided as a starting point fot the
|
||||
correct receive buffer size.
|
||||
Setting the min, max, and default receive buffer (RX_WINDOW) size is
|
||||
performed in the same manner as single connection.
|
||||
|
||||
|
||||
DRIVER MESSAGES
|
||||
===============
|
||||
|
||||
The following messages are the most common messages logged by syslog. These
|
||||
may be found in /var/log/messages.
|
||||
|
||||
Driver up:
|
||||
Chelsio Network Driver - version 2.1.1
|
||||
|
||||
NIC detected:
|
||||
eth#: Chelsio N210 1x10GBaseX NIC (rev #), PCIX 133MHz/64-bit
|
||||
|
||||
Link up:
|
||||
eth#: link is up at 10 Gbps, full duplex
|
||||
|
||||
Link down:
|
||||
eth#: link is down
|
||||
|
||||
|
||||
KNOWN ISSUES
|
||||
============
|
||||
|
||||
These issues have been identified during testing. The following information
|
||||
is provided as a workaround to the problem. In some cases, this problem is
|
||||
inherent to Linux or to a particular Linux Distribution and/or hardware
|
||||
platform.
|
||||
|
||||
1. Large number of TCP retransmits on a multiprocessor (SMP) system.
|
||||
|
||||
On a system with multiple CPUs, the interrupt (IRQ) for the network
|
||||
controller may be bound to more than one CPU. This will cause TCP
|
||||
retransmits if the packet data were to be split across different CPUs
|
||||
and re-assembled in a different order than expected.
|
||||
|
||||
To eliminate the TCP retransmits, set smp_affinity on the particular
|
||||
interrupt to a single CPU. You can locate the interrupt (IRQ) used on
|
||||
the N110/N210 by using ifconfig:
|
||||
ifconfig <dev_name> | grep Interrupt
|
||||
Set the smp_affinity to a single CPU:
|
||||
echo 1 > /proc/irq/<interrupt_number>/smp_affinity
|
||||
|
||||
It is highly suggested that you do not run the irqbalance daemon on your
|
||||
system, as this will change any smp_affinity setting you have applied.
|
||||
The irqbalance daemon runs on a 10 second interval and binds interrupts
|
||||
to the least loaded CPU determined by the daemon. To disable this daemon:
|
||||
chkconfig --level 2345 irqbalance off
|
||||
|
||||
By default, some Linux distributions enable the kernel feature,
|
||||
irqbalance, which performs the same function as the daemon. To disable
|
||||
this feature, add the following line to your bootloader:
|
||||
noirqbalance
|
||||
|
||||
Example using the Grub bootloader:
|
||||
title Red Hat Enterprise Linux AS (2.4.21-27.ELsmp)
|
||||
root (hd0,0)
|
||||
kernel /vmlinuz-2.4.21-27.ELsmp ro root=/dev/hda3 noirqbalance
|
||||
initrd /initrd-2.4.21-27.ELsmp.img
|
||||
|
||||
2. After running insmod, the driver is loaded and the incorrect network
|
||||
interface is brought up without running ifup.
|
||||
|
||||
When using 2.4.x kernels, including RHEL kernels, the Linux kernel
|
||||
invokes a script named "hotplug". This script is primarily used to
|
||||
automatically bring up USB devices when they are plugged in, however,
|
||||
the script also attempts to automatically bring up a network interface
|
||||
after loading the kernel module. The hotplug script does this by scanning
|
||||
the ifcfg-eth# config files in /etc/sysconfig/network-scripts, looking
|
||||
for HWADDR=<mac_address>.
|
||||
|
||||
If the hotplug script does not find the HWADDRR within any of the
|
||||
ifcfg-eth# files, it will bring up the device with the next available
|
||||
interface name. If this interface is already configured for a different
|
||||
network card, your new interface will have incorrect IP address and
|
||||
network settings.
|
||||
|
||||
To solve this issue, you can add the HWADDR=<mac_address> key to the
|
||||
interface config file of your network controller.
|
||||
|
||||
To disable this "hotplug" feature, you may add the driver (module name)
|
||||
to the "blacklist" file located in /etc/hotplug. It has been noted that
|
||||
this does not work for network devices because the net.agent script
|
||||
does not use the blacklist file. Simply remove, or rename, the net.agent
|
||||
script located in /etc/hotplug to disable this feature.
|
||||
|
||||
3. Transport Protocol (TP) hangs when running heavy multi-connection traffic
|
||||
on an AMD Opteron system with HyperTransport PCI-X Tunnel chipset.
|
||||
|
||||
If your AMD Opteron system uses the AMD-8131 HyperTransport PCI-X Tunnel
|
||||
chipset, you may experience the "133-Mhz Mode Split Completion Data
|
||||
Corruption" bug identified by AMD while using a 133Mhz PCI-X card on the
|
||||
bus PCI-X bus.
|
||||
|
||||
AMD states, "Under highly specific conditions, the AMD-8131 PCI-X Tunnel
|
||||
can provide stale data via split completion cycles to a PCI-X card that
|
||||
is operating at 133 Mhz", causing data corruption.
|
||||
|
||||
AMD's provides three workarounds for this problem, however, Chelsio
|
||||
recommends the first option for best performance with this bug:
|
||||
|
||||
For 133Mhz secondary bus operation, limit the transaction length and
|
||||
the number of outstanding transactions, via BIOS configuration
|
||||
programming of the PCI-X card, to the following:
|
||||
|
||||
Data Length (bytes): 1k
|
||||
Total allowed outstanding transactions: 2
|
||||
|
||||
Please refer to AMD 8131-HT/PCI-X Errata 26310 Rev 3.08 August 2004,
|
||||
section 56, "133-MHz Mode Split Completion Data Corruption" for more
|
||||
details with this bug and workarounds suggested by AMD.
|
||||
|
||||
It may be possible to work outside AMD's recommended PCI-X settings, try
|
||||
increasing the Data Length to 2k bytes for increased performance. If you
|
||||
have issues with these settings, please revert to the "safe" settings
|
||||
and duplicate the problem before submitting a bug or asking for support.
|
||||
|
||||
NOTE: The default setting on most systems is 8 outstanding transactions
|
||||
and 2k bytes data length.
|
||||
|
||||
4. On multiprocessor systems, it has been noted that an application which
|
||||
is handling 10Gb networking can switch between CPUs causing degraded
|
||||
and/or unstable performance.
|
||||
|
||||
If running on an SMP system and taking performance measurements, it
|
||||
is suggested you either run the latest netperf-2.4.0+ or use a binding
|
||||
tool such as Tim Hockin's procstate utilities (runon)
|
||||
<http://www.hockin.org/~thockin/procstate/>.
|
||||
|
||||
Binding netserver and netperf (or other applications) to particular
|
||||
CPUs will have a significant difference in performance measurements.
|
||||
You may need to experiment which CPU to bind the application to in
|
||||
order to achieve the best performance for your system.
|
||||
|
||||
If you are developing an application designed for 10Gb networking,
|
||||
please keep in mind you may want to look at kernel functions
|
||||
sched_setaffinity & sched_getaffinity to bind your application.
|
||||
|
||||
If you are just running user-space applications such as ftp, telnet,
|
||||
etc., you may want to try the runon tool provided by Tim Hockin's
|
||||
procstate utility. You could also try binding the interface to a
|
||||
particular CPU: runon 0 ifup eth0
|
||||
|
||||
|
||||
SUPPORT
|
||||
=======
|
||||
|
||||
If you have problems with the software or hardware, please contact our
|
||||
customer support team via email at support@chelsio.com or check our website
|
||||
at http://www.chelsio.com
|
||||
|
||||
===============================================================================
|
||||
|
||||
Chelsio Communications
|
||||
370 San Aleso Ave.
|
||||
Suite 100
|
||||
Sunnyvale, CA 94085
|
||||
http://www.chelsio.com
|
||||
|
||||
This program is free software; you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License, version 2, as
|
||||
published by the Free Software Foundation.
|
||||
|
||||
You should have received a copy of the GNU General Public License along
|
||||
with this program; if not, write to the Free Software Foundation, Inc.,
|
||||
59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
|
||||
|
||||
THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
|
||||
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
|
||||
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
|
||||
|
||||
Copyright (c) 2003-2005 Chelsio Communications. All rights reserved.
|
||||
|
||||
===============================================================================
|
288
Documentation/networking/phy.txt
Normal file
288
Documentation/networking/phy.txt
Normal file
@@ -0,0 +1,288 @@
|
||||
|
||||
-------
|
||||
PHY Abstraction Layer
|
||||
(Updated 2005-07-21)
|
||||
|
||||
Purpose
|
||||
|
||||
Most network devices consist of set of registers which provide an interface
|
||||
to a MAC layer, which communicates with the physical connection through a
|
||||
PHY. The PHY concerns itself with negotiating link parameters with the link
|
||||
partner on the other side of the network connection (typically, an ethernet
|
||||
cable), and provides a register interface to allow drivers to determine what
|
||||
settings were chosen, and to configure what settings are allowed.
|
||||
|
||||
While these devices are distinct from the network devices, and conform to a
|
||||
standard layout for the registers, it has been common practice to integrate
|
||||
the PHY management code with the network driver. This has resulted in large
|
||||
amounts of redundant code. Also, on embedded systems with multiple (and
|
||||
sometimes quite different) ethernet controllers connected to the same
|
||||
management bus, it is difficult to ensure safe use of the bus.
|
||||
|
||||
Since the PHYs are devices, and the management busses through which they are
|
||||
accessed are, in fact, busses, the PHY Abstraction Layer treats them as such.
|
||||
In doing so, it has these goals:
|
||||
|
||||
1) Increase code-reuse
|
||||
2) Increase overall code-maintainability
|
||||
3) Speed development time for new network drivers, and for new systems
|
||||
|
||||
Basically, this layer is meant to provide an interface to PHY devices which
|
||||
allows network driver writers to write as little code as possible, while
|
||||
still providing a full feature set.
|
||||
|
||||
The MDIO bus
|
||||
|
||||
Most network devices are connected to a PHY by means of a management bus.
|
||||
Different devices use different busses (though some share common interfaces).
|
||||
In order to take advantage of the PAL, each bus interface needs to be
|
||||
registered as a distinct device.
|
||||
|
||||
1) read and write functions must be implemented. Their prototypes are:
|
||||
|
||||
int write(struct mii_bus *bus, int mii_id, int regnum, u16 value);
|
||||
int read(struct mii_bus *bus, int mii_id, int regnum);
|
||||
|
||||
mii_id is the address on the bus for the PHY, and regnum is the register
|
||||
number. These functions are guaranteed not to be called from interrupt
|
||||
time, so it is safe for them to block, waiting for an interrupt to signal
|
||||
the operation is complete
|
||||
|
||||
2) A reset function is necessary. This is used to return the bus to an
|
||||
initialized state.
|
||||
|
||||
3) A probe function is needed. This function should set up anything the bus
|
||||
driver needs, setup the mii_bus structure, and register with the PAL using
|
||||
mdiobus_register. Similarly, there's a remove function to undo all of
|
||||
that (use mdiobus_unregister).
|
||||
|
||||
4) Like any driver, the device_driver structure must be configured, and init
|
||||
exit functions are used to register the driver.
|
||||
|
||||
5) The bus must also be declared somewhere as a device, and registered.
|
||||
|
||||
As an example for how one driver implemented an mdio bus driver, see
|
||||
drivers/net/gianfar_mii.c and arch/ppc/syslib/mpc85xx_devices.c
|
||||
|
||||
Connecting to a PHY
|
||||
|
||||
Sometime during startup, the network driver needs to establish a connection
|
||||
between the PHY device, and the network device. At this time, the PHY's bus
|
||||
and drivers need to all have been loaded, so it is ready for the connection.
|
||||
At this point, there are several ways to connect to the PHY:
|
||||
|
||||
1) The PAL handles everything, and only calls the network driver when
|
||||
the link state changes, so it can react.
|
||||
|
||||
2) The PAL handles everything except interrupts (usually because the
|
||||
controller has the interrupt registers).
|
||||
|
||||
3) The PAL handles everything, but checks in with the driver every second,
|
||||
allowing the network driver to react first to any changes before the PAL
|
||||
does.
|
||||
|
||||
4) The PAL serves only as a library of functions, with the network device
|
||||
manually calling functions to update status, and configure the PHY
|
||||
|
||||
|
||||
Letting the PHY Abstraction Layer do Everything
|
||||
|
||||
If you choose option 1 (The hope is that every driver can, but to still be
|
||||
useful to drivers that can't), connecting to the PHY is simple:
|
||||
|
||||
First, you need a function to react to changes in the link state. This
|
||||
function follows this protocol:
|
||||
|
||||
static void adjust_link(struct net_device *dev);
|
||||
|
||||
Next, you need to know the device name of the PHY connected to this device.
|
||||
The name will look something like, "phy0:0", where the first number is the
|
||||
bus id, and the second is the PHY's address on that bus.
|
||||
|
||||
Now, to connect, just call this function:
|
||||
|
||||
phydev = phy_connect(dev, phy_name, &adjust_link, flags);
|
||||
|
||||
phydev is a pointer to the phy_device structure which represents the PHY. If
|
||||
phy_connect is successful, it will return the pointer. dev, here, is the
|
||||
pointer to your net_device. Once done, this function will have started the
|
||||
PHY's software state machine, and registered for the PHY's interrupt, if it
|
||||
has one. The phydev structure will be populated with information about the
|
||||
current state, though the PHY will not yet be truly operational at this
|
||||
point.
|
||||
|
||||
flags is a u32 which can optionally contain phy-specific flags.
|
||||
This is useful if the system has put hardware restrictions on
|
||||
the PHY/controller, of which the PHY needs to be aware.
|
||||
|
||||
Now just make sure that phydev->supported and phydev->advertising have any
|
||||
values pruned from them which don't make sense for your controller (a 10/100
|
||||
controller may be connected to a gigabit capable PHY, so you would need to
|
||||
mask off SUPPORTED_1000baseT*). See include/linux/ethtool.h for definitions
|
||||
for these bitfields. Note that you should not SET any bits, or the PHY may
|
||||
get put into an unsupported state.
|
||||
|
||||
Lastly, once the controller is ready to handle network traffic, you call
|
||||
phy_start(phydev). This tells the PAL that you are ready, and configures the
|
||||
PHY to connect to the network. If you want to handle your own interrupts,
|
||||
just set phydev->irq to PHY_IGNORE_INTERRUPT before you call phy_start.
|
||||
Similarly, if you don't want to use interrupts, set phydev->irq to PHY_POLL.
|
||||
|
||||
When you want to disconnect from the network (even if just briefly), you call
|
||||
phy_stop(phydev).
|
||||
|
||||
Keeping Close Tabs on the PAL
|
||||
|
||||
It is possible that the PAL's built-in state machine needs a little help to
|
||||
keep your network device and the PHY properly in sync. If so, you can
|
||||
register a helper function when connecting to the PHY, which will be called
|
||||
every second before the state machine reacts to any changes. To do this, you
|
||||
need to manually call phy_attach() and phy_prepare_link(), and then call
|
||||
phy_start_machine() with the second argument set to point to your special
|
||||
handler.
|
||||
|
||||
Currently there are no examples of how to use this functionality, and testing
|
||||
on it has been limited because the author does not have any drivers which use
|
||||
it (they all use option 1). So Caveat Emptor.
|
||||
|
||||
Doing it all yourself
|
||||
|
||||
There's a remote chance that the PAL's built-in state machine cannot track
|
||||
the complex interactions between the PHY and your network device. If this is
|
||||
so, you can simply call phy_attach(), and not call phy_start_machine or
|
||||
phy_prepare_link(). This will mean that phydev->state is entirely yours to
|
||||
handle (phy_start and phy_stop toggle between some of the states, so you
|
||||
might need to avoid them).
|
||||
|
||||
An effort has been made to make sure that useful functionality can be
|
||||
accessed without the state-machine running, and most of these functions are
|
||||
descended from functions which did not interact with a complex state-machine.
|
||||
However, again, no effort has been made so far to test running without the
|
||||
state machine, so tryer beware.
|
||||
|
||||
Here is a brief rundown of the functions:
|
||||
|
||||
int phy_read(struct phy_device *phydev, u16 regnum);
|
||||
int phy_write(struct phy_device *phydev, u16 regnum, u16 val);
|
||||
|
||||
Simple read/write primitives. They invoke the bus's read/write function
|
||||
pointers.
|
||||
|
||||
void phy_print_status(struct phy_device *phydev);
|
||||
|
||||
A convenience function to print out the PHY status neatly.
|
||||
|
||||
int phy_clear_interrupt(struct phy_device *phydev);
|
||||
int phy_config_interrupt(struct phy_device *phydev, u32 interrupts);
|
||||
|
||||
Clear the PHY's interrupt, and configure which ones are allowed,
|
||||
respectively. Currently only supports all on, or all off.
|
||||
|
||||
int phy_enable_interrupts(struct phy_device *phydev);
|
||||
int phy_disable_interrupts(struct phy_device *phydev);
|
||||
|
||||
Functions which enable/disable PHY interrupts, clearing them
|
||||
before and after, respectively.
|
||||
|
||||
int phy_start_interrupts(struct phy_device *phydev);
|
||||
int phy_stop_interrupts(struct phy_device *phydev);
|
||||
|
||||
Requests the IRQ for the PHY interrupts, then enables them for
|
||||
start, or disables then frees them for stop.
|
||||
|
||||
struct phy_device * phy_attach(struct net_device *dev, const char *phy_id,
|
||||
u32 flags);
|
||||
|
||||
Attaches a network device to a particular PHY, binding the PHY to a generic
|
||||
driver if none was found during bus initialization. Passes in
|
||||
any phy-specific flags as needed.
|
||||
|
||||
int phy_start_aneg(struct phy_device *phydev);
|
||||
|
||||
Using variables inside the phydev structure, either configures advertising
|
||||
and resets autonegotiation, or disables autonegotiation, and configures
|
||||
forced settings.
|
||||
|
||||
static inline int phy_read_status(struct phy_device *phydev);
|
||||
|
||||
Fills the phydev structure with up-to-date information about the current
|
||||
settings in the PHY.
|
||||
|
||||
void phy_sanitize_settings(struct phy_device *phydev)
|
||||
|
||||
Resolves differences between currently desired settings, and
|
||||
supported settings for the given PHY device. Does not make
|
||||
the changes in the hardware, though.
|
||||
|
||||
int phy_ethtool_sset(struct phy_device *phydev, struct ethtool_cmd *cmd);
|
||||
int phy_ethtool_gset(struct phy_device *phydev, struct ethtool_cmd *cmd);
|
||||
|
||||
Ethtool convenience functions.
|
||||
|
||||
int phy_mii_ioctl(struct phy_device *phydev,
|
||||
struct mii_ioctl_data *mii_data, int cmd);
|
||||
|
||||
The MII ioctl. Note that this function will completely screw up the state
|
||||
machine if you write registers like BMCR, BMSR, ADVERTISE, etc. Best to
|
||||
use this only to write registers which are not standard, and don't set off
|
||||
a renegotiation.
|
||||
|
||||
|
||||
PHY Device Drivers
|
||||
|
||||
With the PHY Abstraction Layer, adding support for new PHYs is
|
||||
quite easy. In some cases, no work is required at all! However,
|
||||
many PHYs require a little hand-holding to get up-and-running.
|
||||
|
||||
Generic PHY driver
|
||||
|
||||
If the desired PHY doesn't have any errata, quirks, or special
|
||||
features you want to support, then it may be best to not add
|
||||
support, and let the PHY Abstraction Layer's Generic PHY Driver
|
||||
do all of the work.
|
||||
|
||||
Writing a PHY driver
|
||||
|
||||
If you do need to write a PHY driver, the first thing to do is
|
||||
make sure it can be matched with an appropriate PHY device.
|
||||
This is done during bus initialization by reading the device's
|
||||
UID (stored in registers 2 and 3), then comparing it to each
|
||||
driver's phy_id field by ANDing it with each driver's
|
||||
phy_id_mask field. Also, it needs a name. Here's an example:
|
||||
|
||||
static struct phy_driver dm9161_driver = {
|
||||
.phy_id = 0x0181b880,
|
||||
.name = "Davicom DM9161E",
|
||||
.phy_id_mask = 0x0ffffff0,
|
||||
...
|
||||
}
|
||||
|
||||
Next, you need to specify what features (speed, duplex, autoneg,
|
||||
etc) your PHY device and driver support. Most PHYs support
|
||||
PHY_BASIC_FEATURES, but you can look in include/mii.h for other
|
||||
features.
|
||||
|
||||
Each driver consists of a number of function pointers:
|
||||
|
||||
config_init: configures PHY into a sane state after a reset.
|
||||
For instance, a Davicom PHY requires descrambling disabled.
|
||||
probe: Does any setup needed by the driver
|
||||
suspend/resume: power management
|
||||
config_aneg: Changes the speed/duplex/negotiation settings
|
||||
read_status: Reads the current speed/duplex/negotiation settings
|
||||
ack_interrupt: Clear a pending interrupt
|
||||
config_intr: Enable or disable interrupts
|
||||
remove: Does any driver take-down
|
||||
|
||||
Of these, only config_aneg and read_status are required to be
|
||||
assigned by the driver code. The rest are optional. Also, it is
|
||||
preferred to use the generic phy driver's versions of these two
|
||||
functions if at all possible: genphy_read_status and
|
||||
genphy_config_aneg. If this is not possible, it is likely that
|
||||
you only need to perform some actions before and after invoking
|
||||
these functions, and so your functions will wrap the generic
|
||||
ones.
|
||||
|
||||
Feel free to look at the Marvell, Cicada, and Davicom drivers in
|
||||
drivers/net/phy/ for examples (the lxt and qsemi drivers have
|
||||
not been tested as of this writing)
|
Reference in New Issue
Block a user