Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (867 commits)
  [SKY2]: status polling loop (post merge)
  [NET]: Fix NAPI completion handling in some drivers.
  [TCP]: Limit processing lost_retrans loop to work-to-do cases
  [TCP]: Fix lost_retrans loop vs fastpath problems
  [TCP]: No need to re-count fackets_out/sacked_out at RTO
  [TCP]: Extract tcp_match_queue_to_sack from sacktag code
  [TCP]: Kill almost unused variable pcount from sacktag
  [TCP]: Fix mark_head_lost to ignore R-bit when trying to mark L
  [TCP]: Add bytes_acked (ABC) clearing to FRTO too
  [IPv6]: Update setsockopt(IPV6_MULTICAST_IF) to support RFC 3493, try2
  [NETFILTER]: x_tables: add missing ip6t_modulename aliases
  [NETFILTER]: nf_conntrack_tcp: fix connection reopening
  [QETH]: fix qeth_main.c
  [NETLINK]: fib_frontend build fixes
  [IPv6]: Export userland ND options through netlink (RDNSS support)
  [9P]: build fix with !CONFIG_SYSCTL
  [NET]: Fix dev_put() and dev_hold() comments
  [NET]: make netlink user -> kernel interface synchronious
  [NET]: unify netlink kernel socket recognition
  [NET]: cleanup 3rd argument in netlink_sendskb
  ...

Fix up conflicts manually in Documentation/feature-removal-schedule.txt
and my new least favourite crap, the "mod_devicetable" support in the
files include/linux/mod_devicetable.h and scripts/mod/file2alias.c.

(The latter files seem to be explicitly _designed_ to get conflicts when
different subsystems work with them - that have an absolutely horrid
lack of subsystem separation!)

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
这个提交包含在:
Linus Torvalds
2007-10-11 19:40:14 -07:00
当前提交 038a5008b2
修改 1247 个文件,包含 204245 行新增44520 行删除

查看文件

@@ -240,17 +240,23 @@ X!Ilib/string.c
<sect1><title>Driver Support</title>
!Enet/core/dev.c
!Enet/ethernet/eth.c
!Enet/sched/sch_generic.c
!Iinclude/linux/etherdevice.h
!Iinclude/linux/netdevice.h
</sect1>
<sect1><title>PHY Support</title>
!Edrivers/net/phy/phy.c
!Idrivers/net/phy/phy.c
!Edrivers/net/phy/phy_device.c
!Idrivers/net/phy/phy_device.c
!Edrivers/net/phy/mdio_bus.c
!Idrivers/net/phy/mdio_bus.c
<!-- FIXME: Removed for now since no structured comments in source
X!Enet/core/wireless.c
-->
</sect1>
<!-- FIXME: Removed for now since no structured comments in source
<sect1><title>Wireless</title>
X!Enet/core/wireless.c
</sect1>
-->
<sect1><title>Synchronous PPP</title>
!Edrivers/net/wan/syncppp.c
</sect1>

查看文件

@@ -314,3 +314,16 @@ Why: The i386/x86_64 merge provides a symlink to the old bzImage
location so not yet updated user space tools, e.g. package
scripts, do not break.
Who: Thomas Gleixner <tglx@linutronix.de>
---------------------------
What: shaper network driver
When: January 2008
Files: drivers/net/shaper.c, include/linux/if_shaper.h
Why: This driver has been marked obsolete for many years.
It was only designed to work on lower speed links and has design
flaws that lead to machine crashes. The qdisc infrastructure in
2.4 or later kernels, provides richer features and is more robust.
Who: Stephen Hemminger <shemminger@linux-foundation.org>
---------------------------

查看文件

@@ -1,766 +0,0 @@
HISTORY:
February 16/2002 -- revision 0.2.1:
COR typo corrected
February 10/2002 -- revision 0.2:
some spell checking ;->
January 12/2002 -- revision 0.1
This is still work in progress so may change.
To keep up to date please watch this space.
Introduction to NAPI
====================
NAPI is a proven (www.cyberus.ca/~hadi/usenix-paper.tgz) technique
to improve network performance on Linux. For more details please
read that paper.
NAPI provides a "inherent mitigation" which is bound by system capacity
as can be seen from the following data collected by Robert on Gigabit
ethernet (e1000):
Psize Ipps Tput Rxint Txint Done Ndone
---------------------------------------------------------------
60 890000 409362 17 27622 7 6823
128 758150 464364 21 9301 10 7738
256 445632 774646 42 15507 21 12906
512 232666 994445 241292 19147 241192 1062
1024 119061 1000003 872519 19258 872511 0
1440 85193 1000003 946576 19505 946569 0
Legend:
"Ipps" stands for input packets per second.
"Tput" == packets out of total 1M that made it out.
"txint" == transmit completion interrupts seen
"Done" == The number of times that the poll() managed to pull all
packets out of the rx ring. Note from this that the lower the
load the more we could clean up the rxring
"Ndone" == is the converse of "Done". Note again, that the higher
the load the more times we couldn't clean up the rxring.
Observe that:
when the NIC receives 890Kpackets/sec only 17 rx interrupts are generated.
The system cant handle the processing at 1 interrupt/packet at that load level.
At lower rates on the other hand, rx interrupts go up and therefore the
interrupt/packet ratio goes up (as observable from that table). So there is
possibility that under low enough input, you get one poll call for each
input packet caused by a single interrupt each time. And if the system
cant handle interrupt per packet ratio of 1, then it will just have to
chug along ....
0) Prerequisites:
==================
A driver MAY continue using the old 2.4 technique for interfacing
to the network stack and not benefit from the NAPI changes.
NAPI additions to the kernel do not break backward compatibility.
NAPI, however, requires the following features to be available:
A) DMA ring or enough RAM to store packets in software devices.
B) Ability to turn off interrupts or maybe events that send packets up
the stack.
NAPI processes packet events in what is known as dev->poll() method.
Typically, only packet receive events are processed in dev->poll().
The rest of the events MAY be processed by the regular interrupt handler
to reduce processing latency (justified also because there are not that
many of them).
Note, however, NAPI does not enforce that dev->poll() only processes
receive events.
Tests with the tulip driver indicated slightly increased latency if
all of the interrupt handler is moved to dev->poll(). Also MII handling
gets a little trickier.
The example used in this document is to move the receive processing only
to dev->poll(); this is shown with the patch for the tulip driver.
For an example of code that moves all the interrupt driver to
dev->poll() look at the ported e1000 code.
There are caveats that might force you to go with moving everything to
dev->poll(). Different NICs work differently depending on their status/event
acknowledgement setup.
There are two types of event register ACK mechanisms.
I) what is known as Clear-on-read (COR).
when you read the status/event register, it clears everything!
The natsemi and sunbmac NICs are known to do this.
In this case your only choice is to move all to dev->poll()
II) Clear-on-write (COW)
i) you clear the status by writing a 1 in the bit-location you want.
These are the majority of the NICs and work the best with NAPI.
Put only receive events in dev->poll(); leave the rest in
the old interrupt handler.
ii) whatever you write in the status register clears every thing ;->
Cant seem to find any supported by Linux which do this. If
someone knows such a chip email us please.
Move all to dev->poll()
C) Ability to detect new work correctly.
NAPI works by shutting down event interrupts when there's work and
turning them on when there's none.
New packets might show up in the small window while interrupts were being
re-enabled (refer to appendix 2). A packet might sneak in during the period
we are enabling interrupts. We only get to know about such a packet when the
next new packet arrives and generates an interrupt.
Essentially, there is a small window of opportunity for a race condition
which for clarity we'll refer to as the "rotting packet".
This is a very important topic and appendix 2 is dedicated for more
discussion.
Locking rules and environmental guarantees
==========================================
-Guarantee: Only one CPU at any time can call dev->poll(); this is because
only one CPU can pick the initial interrupt and hence the initial
netif_rx_schedule(dev);
- The core layer invokes devices to send packets in a round robin format.
This implies receive is totally lockless because of the guarantee that only
one CPU is executing it.
- contention can only be the result of some other CPU accessing the rx
ring. This happens only in close() and suspend() (when these methods
try to clean the rx ring);
****guarantee: driver authors need not worry about this; synchronization
is taken care for them by the top net layer.
-local interrupts are enabled (if you dont move all to dev->poll()). For
example link/MII and txcomplete continue functioning just same old way.
This improves the latency of processing these events. It is also assumed that
the receive interrupt is the largest cause of noise. Note this might not
always be true.
[according to Manfred Spraul, the winbond insists on sending one
txmitcomplete interrupt for each packet (although this can be mitigated)].
For these broken drivers, move all to dev->poll().
For the rest of this text, we'll assume that dev->poll() only
processes receive events.
new methods introduce by NAPI
=============================
a) netif_rx_schedule(dev)
Called by an IRQ handler to schedule a poll for device
b) netif_rx_schedule_prep(dev)
puts the device in a state which allows for it to be added to the
CPU polling list if it is up and running. You can look at this as
the first half of netif_rx_schedule(dev) above; the second half
being c) below.
c) __netif_rx_schedule(dev)
Add device to the poll list for this CPU; assuming that _prep above
has already been called and returned 1.
d) netif_rx_reschedule(dev, undo)
Called to reschedule polling for device specifically for some
deficient hardware. Read Appendix 2 for more details.
e) netif_rx_complete(dev)
Remove interface from the CPU poll list: it must be in the poll list
on current cpu. This primitive is called by dev->poll(), when
it completes its work. The device cannot be out of poll list at this
call, if it is then clearly it is a BUG(). You'll know ;->
All of the above methods are used below, so keep reading for clarity.
Device driver changes to be made when porting NAPI
==================================================
Below we describe what kind of changes are required for NAPI to work.
1) introduction of dev->poll() method
=====================================
This is the method that is invoked by the network core when it requests
for new packets from the driver. A driver is allowed to send upto
dev->quota packets by the current CPU before yielding to the network
subsystem (so other devices can also get opportunity to send to the stack).
dev->poll() prototype looks as follows:
int my_poll(struct net_device *dev, int *budget)
budget is the remaining number of packets the network subsystem on the
current CPU can send up the stack before yielding to other system tasks.
*Each driver is responsible for decrementing budget by the total number of
packets sent.
Total number of packets cannot exceed dev->quota.
dev->poll() method is invoked by the top layer, the driver just sends if it
can to the stack the packet quantity requested.
more on dev->poll() below after the interrupt changes are explained.
2) registering dev->poll() method
===================================
dev->poll should be set in the dev->probe() method.
e.g:
dev->open = my_open;
.
.
/* two new additions */
/* first register my poll method */
dev->poll = my_poll;
/* next register my weight/quanta; can be overridden in /proc */
dev->weight = 16;
.
.
dev->stop = my_close;
3) scheduling dev->poll()
=============================
This involves modifying the interrupt handler and the code
path which takes the packet off the NIC and sends them to the
stack.
it's important at this point to introduce the classical D Becker
interrupt processor:
------------------
static irqreturn_t
netdevice_interrupt(int irq, void *dev_id, struct pt_regs *regs)
{
struct net_device *dev = (struct net_device *)dev_instance;
struct my_private *tp = (struct my_private *)dev->priv;
int work_count = my_work_count;
status = read_interrupt_status_reg();
if (status == 0)
return IRQ_NONE; /* Shared IRQ: not us */
if (status == 0xffff)
return IRQ_HANDLED; /* Hot unplug */
if (status & error)
do_some_error_handling()
do {
acknowledge_ints_ASAP();
if (status & link_interrupt) {
spin_lock(&tp->link_lock);
do_some_link_stat_stuff();
spin_lock(&tp->link_lock);
}
if (status & rx_interrupt) {
receive_packets(dev);
}
if (status & rx_nobufs) {
make_rx_buffs_avail();
}
if (status & tx_related) {
spin_lock(&tp->lock);
tx_ring_free(dev);
if (tx_died)
restart_tx();
spin_unlock(&tp->lock);
}
status = read_interrupt_status_reg();
} while (!(status & error) || more_work_to_be_done);
return IRQ_HANDLED;
}
----------------------------------------------------------------------
We now change this to what is shown below to NAPI-enable it:
----------------------------------------------------------------------
static irqreturn_t
netdevice_interrupt(int irq, void *dev_id, struct pt_regs *regs)
{
struct net_device *dev = (struct net_device *)dev_instance;
struct my_private *tp = (struct my_private *)dev->priv;
status = read_interrupt_status_reg();
if (status == 0)
return IRQ_NONE; /* Shared IRQ: not us */
if (status == 0xffff)
return IRQ_HANDLED; /* Hot unplug */
if (status & error)
do_some_error_handling();
do {
/************************ start note *********************************/
acknowledge_ints_ASAP(); // dont ack rx and rxnobuff here
/************************ end note *********************************/
if (status & link_interrupt) {
spin_lock(&tp->link_lock);
do_some_link_stat_stuff();
spin_unlock(&tp->link_lock);
}
/************************ start note *********************************/
if (status & rx_interrupt || (status & rx_nobuffs)) {
if (netif_rx_schedule_prep(dev)) {
/* disable interrupts caused
* by arriving packets */
disable_rx_and_rxnobuff_ints();
/* tell system we have work to be done. */
__netif_rx_schedule(dev);
} else {
printk("driver bug! interrupt while in poll\n");
/* FIX by disabling interrupts */
disable_rx_and_rxnobuff_ints();
}
}
/************************ end note note *********************************/
if (status & tx_related) {
spin_lock(&tp->lock);
tx_ring_free(dev);
if (tx_died)
restart_tx();
spin_unlock(&tp->lock);
}
status = read_interrupt_status_reg();
/************************ start note *********************************/
} while (!(status & error) || more_work_to_be_done(status));
/************************ end note note *********************************/
return IRQ_HANDLED;
}
---------------------------------------------------------------------
We note several things from above:
I) Any interrupt source which is caused by arriving packets is now
turned off when it occurs. Depending on the hardware, there could be
several reasons that arriving packets would cause interrupts; these are the
interrupt sources we wish to avoid. The two common ones are a) a packet
arriving (rxint) b) a packet arriving and finding no DMA buffers available
(rxnobuff) .
This means also acknowledge_ints_ASAP() will not clear the status
register for those two items above; clearing is done in the place where
proper work is done within NAPI; at the poll() and refill_rx_ring()
discussed further below.
netif_rx_schedule_prep() returns 1 if device is in running state and
gets successfully added to the core poll list. If we get a zero value
we can _almost_ assume are already added to the list (instead of not running.
Logic based on the fact that you shouldn't get interrupt if not running)
We rectify this by disabling rx and rxnobuf interrupts.
II) that receive_packets(dev) and make_rx_buffs_avail() may have disappeared.
These functionalities are still around actually......
infact, receive_packets(dev) is very close to my_poll() and
make_rx_buffs_avail() is invoked from my_poll()
4) converting receive_packets() to dev->poll()
===============================================
We need to convert the classical D Becker receive_packets(dev) to my_poll()
First the typical receive_packets() below:
-------------------------------------------------------------------
/* this is called by interrupt handler */
static void receive_packets (struct net_device *dev)
{
struct my_private *tp = (struct my_private *)dev->priv;
rx_ring = tp->rx_ring;
cur_rx = tp->cur_rx;
int entry = cur_rx % RX_RING_SIZE;
int received = 0;
int rx_work_limit = tp->dirty_rx + RX_RING_SIZE - tp->cur_rx;
while (rx_ring_not_empty) {
u32 rx_status;
unsigned int rx_size;
unsigned int pkt_size;
struct sk_buff *skb;
/* read size+status of next frame from DMA ring buffer */
/* the number 16 and 4 are just examples */
rx_status = le32_to_cpu (*(u32 *) (rx_ring + ring_offset));
rx_size = rx_status >> 16;
pkt_size = rx_size - 4;
/* process errors */
if ((rx_size > (MAX_ETH_FRAME_SIZE+4)) ||
(!(rx_status & RxStatusOK))) {
netdrv_rx_err (rx_status, dev, tp, ioaddr);
return;
}
if (--rx_work_limit < 0)
break;
/* grab a skb */
skb = dev_alloc_skb (pkt_size + 2);
if (skb) {
.
.
netif_rx (skb);
.
.
} else { /* OOM */
/*seems very driver specific ... some just pass
whatever is on the ring already. */
}
/* move to the next skb on the ring */
entry = (++tp->cur_rx) % RX_RING_SIZE;
received++ ;
}
/* store current ring pointer state */
tp->cur_rx = cur_rx;
/* Refill the Rx ring buffers if they are needed */
refill_rx_ring();
.
.
}
-------------------------------------------------------------------
We change it to a new one below; note the additional parameter in
the call.
-------------------------------------------------------------------
/* this is called by the network core */
static int my_poll (struct net_device *dev, int *budget)
{
struct my_private *tp = (struct my_private *)dev->priv;
rx_ring = tp->rx_ring;
cur_rx = tp->cur_rx;
int entry = cur_rx % RX_BUF_LEN;
/* maximum packets to send to the stack */
/************************ note note *********************************/
int rx_work_limit = dev->quota;
/************************ end note note *********************************/
do { // outer beginning loop starts here
clear_rx_status_register_bit();
while (rx_ring_not_empty) {
u32 rx_status;
unsigned int rx_size;
unsigned int pkt_size;
struct sk_buff *skb;
/* read size+status of next frame from DMA ring buffer */
/* the number 16 and 4 are just examples */
rx_status = le32_to_cpu (*(u32 *) (rx_ring + ring_offset));
rx_size = rx_status >> 16;
pkt_size = rx_size - 4;
/* process errors */
if ((rx_size > (MAX_ETH_FRAME_SIZE+4)) ||
(!(rx_status & RxStatusOK))) {
netdrv_rx_err (rx_status, dev, tp, ioaddr);
return 1;
}
/************************ note note *********************************/
if (--rx_work_limit < 0) { /* we got packets, but no quota */
/* store current ring pointer state */
tp->cur_rx = cur_rx;
/* Refill the Rx ring buffers if they are needed */
refill_rx_ring(dev);
goto not_done;
}
/********************** end note **********************************/
/* grab a skb */
skb = dev_alloc_skb (pkt_size + 2);
if (skb) {
.
.
/************************ note note *********************************/
netif_receive_skb (skb);
/********************** end note **********************************/
.
.
} else { /* OOM */
/*seems very driver specific ... common is just pass
whatever is on the ring already. */
}
/* move to the next skb on the ring */
entry = (++tp->cur_rx) % RX_RING_SIZE;
received++ ;
}
/* store current ring pointer state */
tp->cur_rx = cur_rx;
/* Refill the Rx ring buffers if they are needed */
refill_rx_ring(dev);
/* no packets on ring; but new ones can arrive since we last
checked */
status = read_interrupt_status_reg();
if (rx status is not set) {
/* If something arrives in this narrow window,
an interrupt will be generated */
goto done;
}
/* done! at least that's what it looks like ;->
if new packets came in after our last check on status bits
they'll be caught by the while check and we go back and clear them
since we havent exceeded our quota */
} while (rx_status_is_set);
done:
/************************ note note *********************************/
dev->quota -= received;
*budget -= received;
/* If RX ring is not full we are out of memory. */
if (tp->rx_buffers[tp->dirty_rx % RX_RING_SIZE].skb == NULL)
goto oom;
/* we are happy/done, no more packets on ring; put us back
to where we can start processing interrupts again */
netif_rx_complete(dev);
enable_rx_and_rxnobuf_ints();
/* The last op happens after poll completion. Which means the following:
* 1. it can race with disabling irqs in irq handler (which are done to
* schedule polls)
* 2. it can race with dis/enabling irqs in other poll threads
* 3. if an irq raised after the beginning of the outer beginning
* loop (marked in the code above), it will be immediately
* triggered here.
*
* Summarizing: the logic may result in some redundant irqs both
* due to races in masking and due to too late acking of already
* processed irqs. The good news: no events are ever lost.
*/
return 0; /* done */
not_done:
if (tp->cur_rx - tp->dirty_rx > RX_RING_SIZE/2 ||
tp->rx_buffers[tp->dirty_rx % RX_RING_SIZE].skb == NULL)
refill_rx_ring(dev);
if (!received) {
printk("received==0\n");
received = 1;
}
dev->quota -= received;
*budget -= received;
return 1; /* not_done */
oom:
/* Start timer, stop polling, but do not enable rx interrupts. */
start_poll_timer(dev);
return 0; /* we'll take it from here so tell core "done"*/
/************************ End note note *********************************/
}
-------------------------------------------------------------------
From above we note that:
0) rx_work_limit = dev->quota
1) refill_rx_ring() is in charge of clearing the bit for rxnobuff when
it does the work.
2) We have a done and not_done state.
3) instead of netif_rx() we call netif_receive_skb() to pass the skb.
4) we have a new way of handling oom condition
5) A new outer for (;;) loop has been added. This serves the purpose of
ensuring that if a new packet has come in, after we are all set and done,
and we have not exceeded our quota that we continue sending packets up.
-----------------------------------------------------------
Poll timer code will need to do the following:
a)
if (tp->cur_rx - tp->dirty_rx > RX_RING_SIZE/2 ||
tp->rx_buffers[tp->dirty_rx % RX_RING_SIZE].skb == NULL)
refill_rx_ring(dev);
/* If RX ring is not full we are still out of memory.
Restart the timer again. Else we re-add ourselves
to the master poll list.
*/
if (tp->rx_buffers[tp->dirty_rx % RX_RING_SIZE].skb == NULL)
restart_timer();
else netif_rx_schedule(dev); /* we are back on the poll list */
5) dev->close() and dev->suspend() issues
==========================================
The driver writer needn't worry about this; the top net layer takes
care of it.
6) Adding new Stats to /proc
=============================
In order to debug some of the new features, we introduce new stats
that need to be collected.
TODO: Fill this later.
APPENDIX 1: discussion on using ethernet HW FC
==============================================
Most chips with FC only send a pause packet when they run out of Rx buffers.
Since packets are pulled off the DMA ring by a softirq in NAPI,
if the system is slow in grabbing them and we have a high input
rate (faster than the system's capacity to remove packets), then theoretically
there will only be one rx interrupt for all packets during a given packetstorm.
Under low load, we might have a single interrupt per packet.
FC should be programmed to apply in the case when the system cant pull out
packets fast enough i.e send a pause only when you run out of rx buffers.
Note FC in itself is a good solution but we have found it to not be
much of a commodity feature (both in NICs and switches) and hence falls
under the same category as using NIC based mitigation. Also, experiments
indicate that it's much harder to resolve the resource allocation
issue (aka lazy receiving that NAPI offers) and hence quantify its usefulness
proved harder. In any case, FC works even better with NAPI but is not
necessary.
APPENDIX 2: the "rotting packet" race-window avoidance scheme
=============================================================
There are two types of associations seen here
1) status/int which honors level triggered IRQ
If a status bit for receive or rxnobuff is set and the corresponding
interrupt-enable bit is not on, then no interrupts will be generated. However,
as soon as the "interrupt-enable" bit is unmasked, an immediate interrupt is
generated. [assuming the status bit was not turned off].
Generally the concept of level triggered IRQs in association with a status and
interrupt-enable CSR register set is used to avoid the race.
If we take the example of the tulip:
"pending work" is indicated by the status bit(CSR5 in tulip).
the corresponding interrupt bit (CSR7 in tulip) might be turned off (but
the CSR5 will continue to be turned on with new packet arrivals even if
we clear it the first time)
Very important is the fact that if we turn on the interrupt bit on when
status is set that an immediate irq is triggered.
If we cleared the rx ring and proclaimed there was "no more work
to be done" and then went on to do a few other things; then when we enable
interrupts, there is a possibility that a new packet might sneak in during
this phase. It helps to look at the pseudo code for the tulip poll
routine:
--------------------------
do {
ACK;
while (ring_is_not_empty()) {
work-work-work
if quota is exceeded: exit, no touching irq status/mask
}
/* No packets, but new can arrive while we are doing this*/
CSR5 := read
if (CSR5 is not set) {
/* If something arrives in this narrow window here,
* where the comments are ;-> irq will be generated */
unmask irqs;
exit poll;
}
} while (rx_status_is_set);
------------------------
CSR5 bit of interest is only the rx status.
If you look at the last if statement:
you just finished grabbing all the packets from the rx ring .. you check if
status bit says there are more packets just in ... it says none; you then
enable rx interrupts again; if a new packet just came in during this check,
we are counting that CSR5 will be set in that small window of opportunity
and that by re-enabling interrupts, we would actually trigger an interrupt
to register the new packet for processing.
[The above description nay be very verbose, if you have better wording
that will make this more understandable, please suggest it.]
2) non-capable hardware
These do not generally respect level triggered IRQs. Normally,
irqs may be lost while being masked and the only way to leave poll is to do
a double check for new input after netif_rx_complete() is invoked
and re-enable polling (after seeing this new input).
Sample code:
---------
.
.
restart_poll:
while (ring_is_not_empty()) {
work-work-work
if quota is exceeded: exit, not touching irq status/mask
}
.
.
.
enable_rx_interrupts()
netif_rx_complete(dev);
if (ring_has_new_packet() && netif_rx_reschedule(dev, received)) {
disable_rx_and_rxnobufs()
goto restart_poll
} while (rx_status_is_set);
---------
Basically netif_rx_complete() removes us from the poll list, but because a
new packet which will never be caught due to the possibility of a race
might come in, we attempt to re-add ourselves to the poll list.
APPENDIX 3: Scheduling issues.
==============================
As seen NAPI moves processing to softirq level. Linux uses the ksoftirqd as the
general solution to schedule softirq's to run before next interrupt and by putting
them under scheduler control. Also this prevents consecutive softirq's from
monopolize the CPU. This also have the effect that the priority of ksoftirq needs
to be considered when running very CPU-intensive applications and networking to
get the proper balance of softirq/user balance. Increasing ksoftirq priority to 0
(eventually more) is reported cure problems with low network performance at high
CPU load.
Most used processes in a GIGE router:
USER PID %CPU %MEM SIZE RSS TTY STAT START TIME COMMAND
root 3 0.2 0.0 0 0 ? RWN Aug 15 602:00 (ksoftirqd_CPU0)
root 232 0.0 7.9 41400 40884 ? S Aug 15 74:12 gated
--------------------------------------------------------------------
relevant sites:
==================
ftp://robur.slu.se/pub/Linux/net-development/NAPI/
--------------------------------------------------------------------
TODO: Write net-skeleton.c driver.
-------------------------------------------------------------
Authors:
========
Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Jamal Hadi Salim <hadi@cyberus.ca>
Robert Olsson <Robert.Olsson@data.slu.se>
Acknowledgements:
================
People who made this document better:
Lennert Buytenhek <buytenh@gnu.org>
Andrew Morton <akpm@zip.com.au>
Manfred Spraul <manfred@colorfullife.com>
Donald Becker <becker@scyld.com>
Jeff Garzik <jgarzik@pobox.com>

查看文件

@@ -38,8 +38,13 @@ Socket options
DCCP_SOCKOPT_SERVICE sets the service. The specification mandates use of
service codes (RFC 4340, sec. 8.1.2); if this socket option is not set,
the socket will fall back to 0 (which means that no meaningful service code
is present). Connecting sockets set at most one service option; for
listening sockets, multiple service codes can be specified.
is present). On active sockets this is set before connect(); specifying more
than one code has no effect (all subsequent service codes are ignored). The
case is different for passive sockets, where multiple service codes (up to 32)
can be set before calling bind().
DCCP_SOCKOPT_GET_CUR_MPS is read-only and retrieves the current maximum packet
size (application payload size) in bytes, see RFC 4340, section 14.
DCCP_SOCKOPT_SEND_CSCOV and DCCP_SOCKOPT_RECV_CSCOV are used for setting the
partial checksum coverage (RFC 4340, sec. 9.2). The default is that checksums
@@ -50,12 +55,13 @@ be enabled at the receiver, too with suitable choice of CsCov.
DCCP_SOCKOPT_SEND_CSCOV sets the sender checksum coverage. Values in the
range 0..15 are acceptable. The default setting is 0 (full coverage),
values between 1..15 indicate partial coverage.
DCCP_SOCKOPT_SEND_CSCOV is for the receiver and has a different meaning: it
DCCP_SOCKOPT_RECV_CSCOV is for the receiver and has a different meaning: it
sets a threshold, where again values 0..15 are acceptable. The default
of 0 means that all packets with a partial coverage will be discarded.
Values in the range 1..15 indicate that packets with minimally such a
coverage value are also acceptable. The higher the number, the more
restrictive this setting (see [RFC 4340, sec. 9.2.1]).
restrictive this setting (see [RFC 4340, sec. 9.2.1]). Partial coverage
settings are inherited to the child socket after accept().
The following two options apply to CCID 3 exclusively and are getsockopt()-only.
In either case, a TFRC info struct (defined in <linux/tfrc.h>) is returned.
@@ -112,9 +118,14 @@ tx_qlen = 5
The size of the transmit buffer in packets. A value of 0 corresponds
to an unbounded transmit buffer.
sync_ratelimit = 125 ms
The timeout between subsequent DCCP-Sync packets sent in response to
sequence-invalid packets on the same socket (RFC 4340, 7.5.4). The unit
of this parameter is milliseconds; a value of 0 disables rate-limiting.
Notes
=====
DCCP does not travel through NAT successfully at present on many boxes. This is
because the checksum covers the psuedo-header as per TCP and UDP. Linux NAT
because the checksum covers the pseudo-header as per TCP and UDP. Linux NAT
support for DCCP has been added.

查看文件

@@ -1,52 +0,0 @@
The Digi International RightSwitch SE-X (dgrs) Device Driver
This is a Linux driver for the Digi International RightSwitch SE-X
EISA and PCI boards. These are 4 (EISA) or 6 (PCI) port Ethernet
switches and a NIC combined into a single board. This driver can
be compiled into the kernel statically or as a loadable module.
There is also a companion management tool, called "xrightswitch".
The management tool lets you watch the performance graphically,
as well as set the SNMP agent IP and IPX addresses, IEEE Spanning
Tree, and Aging time. These can also be set from the command line
when the driver is loaded. The driver command line options are:
debug=NNN Debug printing level
dma=0/1 Disable/Enable DMA on PCI card
spantree=0/1 Disable/Enable IEEE spanning tree
hashexpire=NNN Change address aging time (default 300 seconds)
ipaddr=A,B,C,D Set SNMP agent IP address i.e. 199,86,8,221
iptrap=A,B,C,D Set SNMP agent IP trap address i.e. 199,86,8,221
ipxnet=NNN Set SNMP agent IPX network number
nicmode=0/1 Disable/Enable multiple NIC mode
There is also a tool for setting up input and output packet filters
on each port, called "dgrsfilt".
Both the management tool and the filtering tool are available
separately from the following FTP site:
ftp://ftp.dgii.com/drivers/rightswitch/linux/
When nicmode=1, the board and driver operate as 4 or 6 individual
NIC ports (eth0...eth5) instead of as a switch. All switching
functions are disabled. In the future, the board firmware may include
a routing cache when in this mode.
Copyright 1995-1996 Digi International Inc.
This software may be used and distributed according to the terms
of the GNU General Public License, incorporated herein by reference.
For information on purchasing a RightSwitch SE-4 or SE-6
board, please contact Digi's sales department at 1-612-912-3444
or 1-800-DIGIBRD. Outside the U.S., please check our Web page at:
http://www.dgii.com
for sales offices worldwide. Tech support is also available through
the channels listed on the Web site, although as long as I am
employed on networking products at Digi I will be happy to provide
any bug fixes that may be needed.
-Rick Richardson, rick@dgii.com

查看文件

@@ -180,13 +180,20 @@ tcp_fin_timeout - INTEGER
to live longer. Cf. tcp_max_orphans.
tcp_frto - INTEGER
Enables F-RTO, an enhanced recovery algorithm for TCP retransmission
Enables Forward RTO-Recovery (F-RTO) defined in RFC4138.
F-RTO is an enhanced recovery algorithm for TCP retransmission
timeouts. It is particularly beneficial in wireless environments
where packet loss is typically due to random radio interference
rather than intermediate router congestion. If set to 1, basic
version is enabled. 2 enables SACK enhanced F-RTO, which is
EXPERIMENTAL. The basic version can be used also when SACK is
enabled for a flow through tcp_sack sysctl.
rather than intermediate router congestion. FRTO is sender-side
only modification. Therefore it does not require any support from
the peer, but in a typical case, however, where wireless link is
the local access link and most of the data flows downlink, the
faraway servers should have FRTO enabled to take advantage of it.
If set to 1, basic version is enabled. 2 enables SACK enhanced
F-RTO if flow uses SACK. The basic version can be used also when
SACK is in use though scenario(s) with it exists where FRTO
interacts badly with the packet counting of the SACK enabled TCP
flow.
tcp_frto_response - INTEGER
When F-RTO has detected that a TCP retransmission timeout was

查看文件

@@ -13,15 +13,35 @@ The radiotap format is discussed in
./Documentation/networking/radiotap-headers.txt.
Despite 13 radiotap argument types are currently defined, most only make sense
to appear on received packets. Currently three kinds of argument are used by
the injection code, although it knows to skip any other arguments that are
present (facilitating replay of captured radiotap headers directly):
to appear on received packets. The following information is parsed from the
radiotap headers and used to control injection:
- IEEE80211_RADIOTAP_RATE - u8 arg in 500kbps units (0x02 --> 1Mbps)
* IEEE80211_RADIOTAP_RATE
- IEEE80211_RADIOTAP_ANTENNA - u8 arg, 0x00 = ant1, 0x01 = ant2
rate in 500kbps units, automatic if invalid or not present
- IEEE80211_RADIOTAP_DBM_TX_POWER - u8 arg, dBm
* IEEE80211_RADIOTAP_ANTENNA
antenna to use, automatic if not present
* IEEE80211_RADIOTAP_DBM_TX_POWER
transmit power in dBm, automatic if not present
* IEEE80211_RADIOTAP_FLAGS
IEEE80211_RADIOTAP_F_FCS: FCS will be removed and recalculated
IEEE80211_RADIOTAP_F_WEP: frame will be encrypted if key available
IEEE80211_RADIOTAP_F_FRAG: frame will be fragmented if longer than the
current fragmentation threshold. Note that
this flag is only reliable when software
fragmentation is enabled)
The injection code can also skip all other currently defined radiotap fields
facilitating replay of captured radiotap headers directly.
Here is an example valid radiotap header defining these three parameters

查看文件

@@ -3,6 +3,10 @@ started by Ingo Molnar <mingo@redhat.com>, 2001.09.17
2.6 port and netpoll api by Matt Mackall <mpm@selenic.com>, Sep 9 2003
Please send bug reports to Matt Mackall <mpm@selenic.com>
and Satyam Sharma <satyam.sharma@gmail.com>
Introduction:
=============
This module logs kernel printk messages over UDP allowing debugging of
problem where disk logging fails and serial consoles are impractical.
@@ -13,6 +17,9 @@ the specified interface as soon as possible. While this doesn't allow
capture of early kernel panics, it does capture most of the boot
process.
Sender and receiver configuration:
==================================
It takes a string configuration parameter "netconsole" in the
following format:
@@ -34,21 +41,113 @@ Examples:
insmod netconsole netconsole=@/,@10.0.0.2/
It also supports logging to multiple remote agents by specifying
parameters for the multiple agents separated by semicolons and the
complete string enclosed in "quotes", thusly:
modprobe netconsole netconsole="@/,@10.0.0.2/;@/eth1,6892@10.0.0.3/"
Built-in netconsole starts immediately after the TCP stack is
initialized and attempts to bring up the supplied dev at the supplied
address.
The remote host can run either 'netcat -u -l -p <port>' or syslogd.
Dynamic reconfiguration:
========================
Dynamic reconfigurability is a useful addition to netconsole that enables
remote logging targets to be dynamically added, removed, or have their
parameters reconfigured at runtime from a configfs-based userspace interface.
[ Note that the parameters of netconsole targets that were specified/created
from the boot/module option are not exposed via this interface, and hence
cannot be modified dynamically. ]
To include this feature, select CONFIG_NETCONSOLE_DYNAMIC when building the
netconsole module (or kernel, if netconsole is built-in).
Some examples follow (where configfs is mounted at the /sys/kernel/config
mountpoint).
To add a remote logging target (target names can be arbitrary):
cd /sys/kernel/config/netconsole/
mkdir target1
Note that newly created targets have default parameter values (as mentioned
above) and are disabled by default -- they must first be enabled by writing
"1" to the "enabled" attribute (usually after setting parameters accordingly)
as described below.
To remove a target:
rmdir /sys/kernel/config/netconsole/othertarget/
The interface exposes these parameters of a netconsole target to userspace:
enabled Is this target currently enabled? (read-write)
dev_name Local network interface name (read-write)
local_port Source UDP port to use (read-write)
remote_port Remote agent's UDP port (read-write)
local_ip Source IP address to use (read-write)
remote_ip Remote agent's IP address (read-write)
local_mac Local interface's MAC address (read-only)
remote_mac Remote agent's MAC address (read-write)
The "enabled" attribute is also used to control whether the parameters of
a target can be updated or not -- you can modify the parameters of only
disabled targets (i.e. if "enabled" is 0).
To update a target's parameters:
cat enabled # check if enabled is 1
echo 0 > enabled # disable the target (if required)
echo eth2 > dev_name # set local interface
echo 10.0.0.4 > remote_ip # update some parameter
echo cb:a9:87:65:43:21 > remote_mac # update more parameters
echo 1 > enabled # enable target again
You can also update the local interface dynamically. This is especially
useful if you want to use interfaces that have newly come up (and may not
have existed when netconsole was loaded / initialized).
Miscellaneous notes:
====================
WARNING: the default target ethernet setting uses the broadcast
ethernet address to send packets, which can cause increased load on
other systems on the same ethernet segment.
TIP: some LAN switches may be configured to suppress ethernet broadcasts
so it is advised to explicitly specify the remote agents' MAC addresses
from the config parameters passed to netconsole.
TIP: to find out the MAC address of, say, 10.0.0.2, you may try using:
ping -c 1 10.0.0.2 ; /sbin/arp -n | grep 10.0.0.2
TIP: in case the remote logging agent is on a separate LAN subnet than
the sender, it is suggested to try specifying the MAC address of the
default gateway (you may use /sbin/route -n to find it out) as the
remote MAC address instead.
NOTE: the network device (eth1 in the above case) can run any kind
of other network traffic, netconsole is not intrusive. Netconsole
might cause slight delays in other traffic if the volume of kernel
messages is high, but should have no other impact.
NOTE: if you find that the remote logging agent is not receiving or
printing all messages from the sender, it is likely that you have set
the "console_loglevel" parameter (on the sender) to only send high
priority messages to the console. You can change this at runtime using:
dmesg -n 8
or by specifying "debug" on the kernel command line at boot, to send
all kernel messages to the console. A specific value for this parameter
can also be set using the "loglevel" kernel boot option. See the
dmesg(8) man page and Documentation/kernel-parameters.txt for details.
Netconsole was designed to be as instantaneous as possible, to
enable the logging of even the most critical kernel bugs. It works
from IRQ contexts as well, and does not enable interrupts while

查看文件

@@ -73,7 +73,8 @@ dev->hard_start_xmit:
has to lock by itself when needed. It is recommended to use a try lock
for this and return NETDEV_TX_LOCKED when the spin lock fails.
The locking there should also properly protect against
set_multicast_list.
set_multicast_list. Note that the use of NETIF_F_LLTX is deprecated.
Dont use it for new drivers.
Context: Process with BHs disabled or BH (timer),
will be called with interrupts disabled by netconsole.
@@ -95,9 +96,13 @@ dev->set_multicast_list:
Synchronization: netif_tx_lock spinlock.
Context: BHs disabled
dev->poll:
Synchronization: __LINK_STATE_RX_SCHED bit in dev->state. See
dev_close code and comments in net/core/dev.c for more info.
struct napi_struct synchronization rules
========================================
napi->poll:
Synchronization: NAPI_STATE_SCHED bit in napi->state. Device
driver's dev->close method will invoke napi_disable() on
all NAPI instances which will do a sleeping poll on the
NAPI_STATE_SCHED napi->state bit, waiting for all pending
NAPI activity to cease.
Context: softirq
will be called with interrupts disabled by netconsole.

查看文件

@@ -1824,6 +1824,162 @@ platforms are moved over to use the flattened-device-tree model.
fsl,has-rstcr;
};
h) 4xx/Axon EMAC ethernet nodes
The EMAC ethernet controller in IBM and AMCC 4xx chips, and also
the Axon bridge. To operate this needs to interact with a ths
special McMAL DMA controller, and sometimes an RGMII or ZMII
interface. In addition to the nodes and properties described
below, the node for the OPB bus on which the EMAC sits must have a
correct clock-frequency property.
i) The EMAC node itself
Required properties:
- device_type : "network"
- compatible : compatible list, contains 2 entries, first is
"ibm,emac-CHIP" where CHIP is the host ASIC (440gx,
405gp, Axon) and second is either "ibm,emac" or
"ibm,emac4". For Axon, thus, we have: "ibm,emac-axon",
"ibm,emac4"
- interrupts : <interrupt mapping for EMAC IRQ and WOL IRQ>
- interrupt-parent : optional, if needed for interrupt mapping
- reg : <registers mapping>
- local-mac-address : 6 bytes, MAC address
- mal-device : phandle of the associated McMAL node
- mal-tx-channel : 1 cell, index of the tx channel on McMAL associated
with this EMAC
- mal-rx-channel : 1 cell, index of the rx channel on McMAL associated
with this EMAC
- cell-index : 1 cell, hardware index of the EMAC cell on a given
ASIC (typically 0x0 and 0x1 for EMAC0 and EMAC1 on
each Axon chip)
- max-frame-size : 1 cell, maximum frame size supported in bytes
- rx-fifo-size : 1 cell, Rx fifo size in bytes for 10 and 100 Mb/sec
operations.
For Axon, 2048
- tx-fifo-size : 1 cell, Tx fifo size in bytes for 10 and 100 Mb/sec
operations.
For Axon, 2048.
- fifo-entry-size : 1 cell, size of a fifo entry (used to calculate
thresholds).
For Axon, 0x00000010
- mal-burst-size : 1 cell, MAL burst size (used to calculate thresholds)
in bytes.
For Axon, 0x00000100 (I think ...)
- phy-mode : string, mode of operations of the PHY interface.
Supported values are: "mii", "rmii", "smii", "rgmii",
"tbi", "gmii", rtbi", "sgmii".
For Axon on CAB, it is "rgmii"
- mdio-device : 1 cell, required iff using shared MDIO registers
(440EP). phandle of the EMAC to use to drive the
MDIO lines for the PHY used by this EMAC.
- zmii-device : 1 cell, required iff connected to a ZMII. phandle of
the ZMII device node
- zmii-channel : 1 cell, required iff connected to a ZMII. Which ZMII
channel or 0xffffffff if ZMII is only used for MDIO.
- rgmii-device : 1 cell, required iff connected to an RGMII. phandle
of the RGMII device node.
For Axon: phandle of plb5/plb4/opb/rgmii
- rgmii-channel : 1 cell, required iff connected to an RGMII. Which
RGMII channel is used by this EMAC.
Fox Axon: present, whatever value is appropriate for each
EMAC, that is the content of the current (bogus) "phy-port"
property.
Recommended properties:
- linux,network-index : This is the intended "index" of this
network device. This is used by the bootwrapper to interpret
MAC addresses passed by the firmware when no information other
than indices is available to associate an address with a device.
Optional properties:
- phy-address : 1 cell, optional, MDIO address of the PHY. If absent,
a search is performed.
- phy-map : 1 cell, optional, bitmap of addresses to probe the PHY
for, used if phy-address is absent. bit 0x00000001 is
MDIO address 0.
For Axon it can be absent, thouugh my current driver
doesn't handle phy-address yet so for now, keep
0x00ffffff in it.
- rx-fifo-size-gige : 1 cell, Rx fifo size in bytes for 1000 Mb/sec
operations (if absent the value is the same as
rx-fifo-size). For Axon, either absent or 2048.
- tx-fifo-size-gige : 1 cell, Tx fifo size in bytes for 1000 Mb/sec
operations (if absent the value is the same as
tx-fifo-size). For Axon, either absent or 2048.
- tah-device : 1 cell, optional. If connected to a TAH engine for
offload, phandle of the TAH device node.
- tah-channel : 1 cell, optional. If appropriate, channel used on the
TAH engine.
Example:
EMAC0: ethernet@40000800 {
linux,network-index = <0>;
device_type = "network";
compatible = "ibm,emac-440gp", "ibm,emac";
interrupt-parent = <&UIC1>;
interrupts = <1c 4 1d 4>;
reg = <40000800 70>;
local-mac-address = [00 04 AC E3 1B 1E];
mal-device = <&MAL0>;
mal-tx-channel = <0 1>;
mal-rx-channel = <0>;
cell-index = <0>;
max-frame-size = <5dc>;
rx-fifo-size = <1000>;
tx-fifo-size = <800>;
phy-mode = "rmii";
phy-map = <00000001>;
zmii-device = <&ZMII0>;
zmii-channel = <0>;
};
ii) McMAL node
Required properties:
- device_type : "dma-controller"
- compatible : compatible list, containing 2 entries, first is
"ibm,mcmal-CHIP" where CHIP is the host ASIC (like
emac) and the second is either "ibm,mcmal" or
"ibm,mcmal2".
For Axon, "ibm,mcmal-axon","ibm,mcmal2"
- interrupts : <interrupt mapping for the MAL interrupts sources:
5 sources: tx_eob, rx_eob, serr, txde, rxde>.
For Axon: This is _different_ from the current
firmware. We use the "delayed" interrupts for txeob
and rxeob. Thus we end up with mapping those 5 MPIC
interrupts, all level positive sensitive: 10, 11, 32,
33, 34 (in decimal)
- dcr-reg : < DCR registers range >
- dcr-parent : if needed for dcr-reg
- num-tx-chans : 1 cell, number of Tx channels
- num-rx-chans : 1 cell, number of Rx channels
iii) ZMII node
Required properties:
- compatible : compatible list, containing 2 entries, first is
"ibm,zmii-CHIP" where CHIP is the host ASIC (like
EMAC) and the second is "ibm,zmii".
For Axon, there is no ZMII node.
- reg : <registers mapping>
iv) RGMII node
Required properties:
- compatible : compatible list, containing 2 entries, first is
"ibm,rgmii-CHIP" where CHIP is the host ASIC (like
EMAC) and the second is "ibm,rgmii".
For Axon, "ibm,rgmii-axon","ibm,rgmii"
- reg : <registers mapping>
- revision : as provided by the RGMII new version register if
available.
For Axon: 0x0000012a
More devices will be defined as this spec matures.
VII - Specifying interrupt information for devices

89
Documentation/rfkill.txt 普通文件
查看文件

@@ -0,0 +1,89 @@
rfkill - RF switch subsystem support
====================================
1 Implementation details
2 Driver support
3 Userspace support
===============================================================================
1: Implementation details
The rfkill switch subsystem offers support for keys often found on laptops
to enable wireless devices like WiFi and Bluetooth.
This is done by providing the user 3 possibilities:
1 - The rfkill system handles all events; userspace is not aware of events.
2 - The rfkill system handles all events; userspace is informed about the events.
3 - The rfkill system does not handle events; userspace handles all events.
The buttons to enable and disable the wireless radios are important in
situations where the user is for example using his laptop on a location where
wireless radios _must_ be disabled (e.g. airplanes).
Because of this requirement, userspace support for the keys should not be
made mandatory. Because userspace might want to perform some additional smarter
tasks when the key is pressed, rfkill still provides userspace the possibility
to take over the task to handle the key events.
The system inside the kernel has been split into 2 separate sections:
1 - RFKILL
2 - RFKILL_INPUT
The first option enables rfkill support and will make sure userspace will
be notified of any events through the input device. It also creates several
sysfs entries which can be used by userspace. See section "Userspace support".
The second option provides an rfkill input handler. This handler will
listen to all rfkill key events and will toggle the radio accordingly.
With this option enabled userspace could either do nothing or simply
perform monitoring tasks.
====================================
2: Driver support
To build a driver with rfkill subsystem support, the driver should
depend on the Kconfig symbol RFKILL; it should _not_ depend on
RKFILL_INPUT.
Unless key events trigger an interrupt to which the driver listens, polling
will be required to determine the key state changes. For this the input
layer providers the input-polldev handler.
A driver should implement a few steps to correctly make use of the
rfkill subsystem. First for non-polling drivers:
- rfkill_allocate()
- input_allocate_device()
- rfkill_register()
- input_register_device()
For polling drivers:
- rfkill_allocate()
- input_allocate_polled_device()
- rfkill_register()
- input_register_polled_device()
When a key event has been detected, the correct event should be
sent over the input device which has been registered by the driver.
====================================
3: Userspace support
For each key an input device will be created which will send out the correct
key event when the rfkill key has been pressed.
The following sysfs entries will be created:
name: Name assigned by driver to this key (interface or driver name).
type: Name of the key type ("wlan", "bluetooth", etc).
state: Current state of the key. 1: On, 0: Off.
claim: 1: Userspace handles events, 0: Kernel handles events
Both the "state" and "claim" entries are also writable. For the "state" entry
this means that when 1 or 0 is written all radios, not yet in the requested
state, will be will be toggled accordingly.
For the "claim" entry writing 1 to it means that the kernel no longer handles
key events even though RFKILL_INPUT input was enabled. When "claim" has been
set to 0, userspace should make sure that it listens for the input events or
check the sysfs "state" entry regularly to correctly perform the required
tasks when the rkfill key is pressed.