125835 Commits

Author SHA1 Message Date
Jarkko Nikula
0be43050d4 ASoC: OMAP: Apply channel constrains to N810 machine driver
Prepare for the upcoming McBSP DAI update, which adds support for mono links, by
restricting the number of channels to 2 on the N810. This is needed because
tlv320aic3x claims channels_min = 1, and pure mono audio sent over I2S would be
played only from the left channel if both the CPU and codec DAIs claim to
support mono.

Signed-off-by: Jarkko Nikula <jarkko.nikula@nokia.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2008-11-25 15:20:55 +00:00
Takashi Iwai
b0e6481a9a ALSA: hda - Really fix bits value in proc output
The fix in 82894b6f6f resulted in zero
due to a wrong mask and bit shifts.  Now really fixed.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2008-11-25 16:07:01 +01:00
Julia Lawall
eff79aee91 arch/x86/kernel/pci-calgary_64.c: change simple_strtol to simple_strtoul
Impact: fix theoretical option string parsing overflow

Since bridge is unsigned, it would seem better to use simple_strtoul than
simple_strtol.

A simplified version of the semantic patch that makes this change is as
follows: (http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@r2@
long e;
position p;
@@

e = simple_strtol@p(...)

@@
position p != r2.p;
type T;
T e;
@@

e =
- simple_strtol@p
+ simple_strtoul
  (...)
// </smpl>
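
As a rough userspace illustration of the motivation (not part of the patch):
libc's strtol() and strtoul() stand in here for the kernel's simple_strtol()
and simple_strtoul(), whose overflow handling differs in detail. All helper
names are hypothetical. Parsing the largest unsigned long value through the
signed routine saturates at LONG_MAX, while the unsigned routine preserves it.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: fill an unsigned variable via the signed parser. */
static unsigned long parse_bridge_signed(const char *s)
{
    return (unsigned long)strtol(s, NULL, 0);
}

/* Hypothetical helper: use the unsigned parser instead. */
static unsigned long parse_bridge_unsigned(const char *s)
{
    return strtoul(s, NULL, 0);
}

/* Writes ULONG_MAX as a decimal string so the demo is word-size independent. */
static void format_ulong_max(char *buf, size_t len)
{
    snprintf(buf, len, "%lu", ULONG_MAX);
}
```

For small inputs such as "12" both helpers return the same value; they only
diverge once the input exceeds LONG_MAX.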

Signed-off-by: Julia Lawall <julia@diku.dk>
Cc: muli@il.ibm.com
Cc: jdmason@kudzu.us
Cc: discuss@x86-64.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-25 15:56:03 +01:00
Peter Zijlstra
ca109491f6 hrtimer: removing all ur callback modes
Impact: cleanup, move all hrtimer processing into hardirq context

This is an attempt at removing some of the hrtimer complexity by
reducing the number of callback modes to 1.

This means that all hrtimer callback functions will be run from HARD-irq
context.

I went through all the 30 odd hrtimer callback functions in the kernel
and saw only one that I'm not quite sure of, which is the one in
net/can/bcm.c - hence I'm CC-ing the folks responsible for that code.

Furthermore, the hrtimer core now calls callbacks directly with IRQs
disabled in case you try to enqueue an expired timer. If this timer is a
periodic timer (which should use hrtimer_forward() to advance its time)
then it might be possible to end up in an infinite recursive loop due to the
fact that hrtimer_forward() doesn't round up to the next timer
granularity, and therefore keeps on calling the callback - obviously
this needs a fix.

Aside from that, this seems to compile and actually boot on my dual core
test box - although I'm sure there are some bugs in, me not hitting any
makes me certain :-)

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-25 15:45:46 +01:00
Steven Rostedt
5cf02b7baf x86: use limited register constraint for setnz
Impact: build fix with certain compilers

GCC can decide to use %dil when "r" is used, which is not valid for
setnz.
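
A minimal sketch of the constraint in question, assuming GCC-style inline
assembly. The function name is hypothetical and this is not the code touched
by the patch. The "q" output constraint asks the compiler for a register
whose low byte can be addressed, which is what setnz needs; non-x86 builds
fall back to plain C.

```c
/*
 * Returns 1 if v is non-zero, 0 otherwise. On x86 the result is produced
 * by setnz into a register picked under the "q" constraint (a, b, c or d
 * in 32-bit mode).
 */
static unsigned char is_nonzero(unsigned long v)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned char ret;

    __asm__("test %1, %1\n\t"
            "setnz %0"
            : "=q" (ret)   /* byte-addressable register only */
            : "r" (v)
            : "cc");
    return ret;
#else
    /* Portable fallback for other compilers and architectures. */
    return v != 0;
#endif
}
```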

This bug was brought out by Stephen Rothwell's merging of the
branch tracer into linux-next.

[ Thanks to Uros Bizjak for recommending 'q' over 'Q' ]

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-25 15:38:03 +01:00
David Vrabel
5a4e1a795d uwb: clean up whci_wait_for() timeout error message
All callers of whci_wait_for() should get a consistent error message if a
timeout occurs.

Signed-off-by: David Vrabel <david.vrabel@csr.com>
2008-11-25 14:34:47 +00:00
David Vrabel
56968d0c1a wusb: whci-hcd shouldn't do ASL/PZL updates while channel is inactive
ASL/PZL updates while the WUSB channel is inactive (i.e., the PZL and
ASL are stopped) may not complete.  This causes hangs when removing the
whci-hcd module if a device is still connected (removing the device
does an endpoint_disable which results in an ASL update to remove the
qset).

If the WUSB channel is inactive the update can simply be skipped as the
WHC doesn't care about the state of the ASL/PZL.

Signed-off-by: David Vrabel <david.vrabel@csr.com>
2008-11-25 14:23:40 +00:00
Takashi Iwai
eefe93b995 Merge branch 'topic/fix/hda' into topic/hda
Conflicts:
	sound/pci/hda/patch_sigmatel.c
2008-11-25 15:20:57 +01:00
Takashi Iwai
661cd8fb52 ALSA: hda - Check model for Dell 92HD73xx laptops
Check the model type instead of PCI SSID for detection of the mic types
on Dell laptops with IDT 92HD73xx codecs.  In this way, a new laptop
can be tested via model module option.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2008-11-25 15:18:29 +01:00
Takashi Iwai
c65574abad ALSA: hda - mark Dell studio 1535 quirk
Fixed the quirk string for Dell studio 1535 (the product name wasn't
published at the time the patch was made).

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2008-11-25 15:17:08 +01:00
Takashi Iwai
95026623da ALSA: hda - No 'Headphone as Line-out' swich without line-outs
The STAC/IDT driver creates the "Headphone as Line-Out" switch even if there
are no line-out pins on the machine.  For devices with only headphones
and speaker-outs, this switch shouldn't be created.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2008-11-25 15:15:05 +01:00
Takashi Iwai
ee09543c86 ALSA: hda - Add quirk for MSI 7260 mobo
Added preset model=targa-dig for MSI 7260 mobo.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2008-11-25 15:03:38 +01:00
David Vrabel
65d76f3682 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 into for-upstream
2008-11-25 13:52:56 +00:00
Markus Bollinger
c0193f39f4 ALSA: pcxhr - add support for pcxhr stereo sound cards (mixer part)
- add support for pcxhr stereo cards mixer controls
- adjust tlv db scales to real dBu values
- fix bug with monitoring volume control pcxhr_monitor_vol_put
- do some cleanup

Signed-off-by: Markus Bollinger <bollinger@digigram.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2008-11-25 12:53:24 +01:00
David S. Miller
2f9889a20c Revert "hso: Fix crashes on close."
This reverts commit 4a3e818181.

On request from Alan Cox.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 03:53:09 -08:00
David S. Miller
ab153d84d9 Revert "hso: Fix free of mutexes still in use."
This reverts commit 52429eb216.

On request from Alan Cox.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 03:52:46 -08:00
David S. Miller
cd90ee1799 Revert "hso: Add TIOCM ioctl handling."
This reverts commit 7ea3a9ad9b.

On request from Alan Cox.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 03:52:17 -08:00
Markus Bollinger
7628700e08 ALSA: pcxhr - add support for pcxhr stereo sound cards (firmware support)
- Add support for pcxhr stereo cards and their firmware
- autorize sound cards without analog IO
- do some cleanup

Signed-off-by: Markus Bollinger <bollinger@digigram.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2008-11-25 12:28:06 +01:00
Markus Bollinger
9d948d2700 ALSA: pcxhr - add support for pcxhr stereo sound cards (core change)
- Add support for pcxhr stereo cards
- minor bugfixes: period and buffer size constraints
- fix PLL register values
- do some clean up

Signed-off-by: Markus Bollinger <bollinger@digigram.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2008-11-25 12:27:03 +01:00
Markus Bollinger
93bf5d8753 ALSA: pcxhr - add support for pcxhr stereo sound cards
- Add support for pcxhr stereo cards
- do some clean up

Signed-off-by: Markus Bollinger <bollinger@digigram.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2008-11-25 12:26:46 +01:00
Eric Leblond
9f40ac713c netfilter: nfmark IPV6 routing in OUTPUT, mangle, NFQUEUE
This patch lets nfmark be evaluated for the routing decision for OUTPUT
packets, in the mangle table, when a packet is processed in NFQUEUE. This
patch is an IPv6 port of Laurent Licour's IPv4 one.

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-11-25 12:18:11 +01:00
Eric Leblond
5f145e44ae netfilter: nfmark routing in OUTPUT, mangle, NFQUEUE
This patch lets nfmark be evaluated for the routing decision for OUTPUT
packets, in the mangle table, when a packet is processed in NFQUEUE.
Until now, only changes (made during NFQUEUE processing) to the src_addr,
dest_addr and tos fields could make netfilter reevaluate the routing.

From: Laurent Licour <laurent@licour.com>
Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-11-25 12:15:16 +01:00
Takashi Iwai
c6e4c66613 ALSA: hda - Assign unsol tags dynamically in patch_sigmatel.c
Since we need to handle many unsolicited events assigned to different
widgets, allocate the event dynamically using the existing events
array, and use the tag appropriately instead of combination of fixed
number and widget nid.  (Note that widget nid can be over 4 bits!)

Also, replaced the call of unsol_event handler with a dedicated
function to be more readable.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2008-11-25 11:58:19 +01:00
Takashi Iwai
0e19e7d2bf Merge branch 'topic/fix/hda' into topic/hda
Conflicts:
	sound/pci/hda/patch_sigmatel.c
2008-11-25 11:56:25 +01:00
Takashi Iwai
f73d35853e ALSA: hda - Fix AFG power management on IDT 92HD* codecs
The AFG pin power-mapping isn't properly set for the fixed I/O pins
on IDT 92HD* codecs.  This resulted in the low power mode after the
boot until any jack detection is executed, thus no output from the
speaker.

This patch fixes the power mapping for the fixed pins, and also fixes
the GPIO bits and digital I/O pin settings properly in stac92xx_ini().

Reference: Novell bnc#446025
	https://bugzilla.novell.com/show_bug.cgi?id=446025

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2008-11-25 11:53:50 +01:00
Takashi Iwai
82894b6f6f ALSA: hda - Fix proc pcm rate bits
Show only the relevant bits in the PCM rate bits as in the earlier version.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2008-11-25 11:42:54 +01:00
Hollis Blanchard
c30f8a6c6d KVM: ppc: stop leaking host memory on VM exit
When the VM exits, we must call put_page() for every page referenced in the
shadow TLB.

Without this patch, we usually leak 30-50 host pages (120 - 200 KiB with 4 KiB
pages). The maximum number of pages leaked is the size of our shadow TLB, 64
pages.

Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-11-25 12:02:48 +02:00
Takashi Iwai
9e97697666 ALSA: hda - Fix caching of SPDIF status bits
SPDIF status bits controls are written via snd_hda_codec_write()
without caching.  This causes a regression at resume that the bits
are lost.

Simply replacing it with the cached version fixes the problem.

Reference:
	http://lkml.org/lkml/2008/11/24/324

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2008-11-25 10:31:44 +01:00
Alexey Dobriyan
fb7e06748c xfrm: remove useless forward declarations
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 01:05:54 -08:00
Alexey Dobriyan
6daad37230 ah4/ah6: remove useless NULL assignments
struct will be kfreed in a moment, so...

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 01:05:09 -08:00
Alexander Duyck
69d728baf6 igb: loopback bits not correctly cleared from RCTL register
This change forces the bits to 0 by using an &= operation with an inverted
mask of all options instead of using an |= with a value of 0.
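
The difference can be sketched as follows. The bit positions and names below
are hypothetical placeholders, not the real RCTL register layout: OR-ing in a
zero-valued "no loopback" constant leaves previously set bits untouched,
whereas AND-ing with the inverted mask actually clears them.

```c
#include <stdint.h>

/* Hypothetical bit positions for illustration; not the real RCTL layout. */
#define DEMO_RCTL_LBM_MAC  (1u << 6)
#define DEMO_RCTL_LBM_TCVR (1u << 7)
#define DEMO_RCTL_LBM_NO   0u

/* Old approach: OR with 0 is a no-op, so stale loopback bits survive. */
static uint32_t clear_loopback_broken(uint32_t rctl)
{
    rctl |= DEMO_RCTL_LBM_NO;
    return rctl;
}

/* New approach: AND with the inverted mask forces the bits to 0. */
static uint32_t clear_loopback_fixed(uint32_t rctl)
{
    rctl &= ~(DEMO_RCTL_LBM_MAC | DEMO_RCTL_LBM_TCVR);
    return rctl;
}
```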

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 01:04:03 -08:00
Alexander Duyck
9b07f3d315 igb: remove unneeded bit refrence when enabling jumbo frames
There is a reference to a Buffer Size extension bit that is unneeded by
82575/82576 hardware.  Since it is not needed it should be removed from the
code.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 01:03:26 -08:00
Jeff Kirsher
7a6b6f515f DCB: fix kconfig option
Since the netlink option for DCB is necessary to actually be useful,
simplified the Kconfig option.  In addition, added useful help text for the
Kconfig option.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 01:02:08 -08:00
Trent Piepho
11c6dd2c72 phylib: Add Vitesse VSC8221 SGMII PHY
PHY is mostly compatible with the existing VSC8244 PHY.  The init sequence
is different and the interrupt mask lacks some bits present in the VSC8244.

Rather than making a copy of the existing VSC234x config_intr function and
changing one constant, I modify it to select the interrupt mask based on
which driver is calling it.  This lets it be used by both drivers.
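
The idea of one shared function choosing its mask per caller might look
roughly like this. The type names, enum values and mask constants are all
hypothetical and do not reflect the actual phylib structures or the real
Vitesse register values.

```c
#include <stdint.h>

/* Hypothetical model tags and mask values, chosen only for illustration. */
enum demo_phy_model { DEMO_VSC8244, DEMO_VSC8221 };

#define DEMO_VSC8244_IMASK 0xf35fu
#define DEMO_VSC8221_IMASK 0xa000u

struct demo_phy {
    enum demo_phy_model model;
    int interrupts_enabled;
};

/* Shared by both drivers: picks the interrupt mask from the calling model. */
static uint16_t demo_config_intr_mask(const struct demo_phy *phy)
{
    if (!phy->interrupts_enabled)
        return 0;

    return phy->model == DEMO_VSC8244 ? DEMO_VSC8244_IMASK
                                      : DEMO_VSC8221_IMASK;
}
```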

Signed-off-by: Trent Piepho <tpiepho@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 01:00:47 -08:00
Bernard Pidoux
244f46ae6e rose: zero length frame filtering in af_rose.c
Since changeset e79ad711a0 from mainline,
from David S. Miller,
empty packets can be transmitted on connected sockets for datagram protocols.

However, that change broke a high-level application using the ROSE network protocol with connected datagrams.

Bulletin Board Stations (BBS) forward bulletins to each other via the ROSE network using a forwarding protocol.
Now, if for some reason a buffer in the application software happens to be empty at a specific moment,
ROSE sends an empty packet via the unfiltered packet socket.
When received, this ROSE packet disturbs the BBS forwarding data exchange,
because the application's message forwarding protocol is waiting for something else.
We agree that more careful programming of the application protocol would avoid this situation, and we are
willing to debug it.
But, as an empty frame is of no use and has no meaning for the ROSE protocol,
we may consider filtering zero-length data both when sending and receiving socket data.

The proposed patch repaired BBS data exchange through the ROSE network, which had been broken since the 2.6.22.11 kernel.

Signed-off-by: Bernard Pidoux <f6bvp@amsat.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 00:56:20 -08:00
Harvey Harrison
411c41eea5 aoe: remove private mac address format function
Add %pm to omit the colons when printing a mac address.
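
For readers unfamiliar with the format, the output shape described above
(six bytes as hex with no colons) can be reproduced in userspace roughly as
follows. This is only an illustration with a hypothetical helper name, using
snprintf() rather than the kernel's printk format specifiers.

```c
#include <stdio.h>

/* Formats a 6-byte MAC address as 12 lowercase hex digits, no separators. */
static int format_mac_compact(char *buf, size_t len, const unsigned char mac[6])
{
    return snprintf(buf, len, "%02x%02x%02x%02x%02x%02x",
                    mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}
```

The caller needs a buffer of at least 13 bytes: 12 hex digits plus the
terminating NUL.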

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 00:40:37 -08:00
Denis Joseph Barrow
9c8f92aed1 hso: Hook up ->reset_resume
Made the usb_driver's reset_resume function point to hso_resume. This
fixes problems where a USB reset is done when the network interface
is left idle for a few minutes. Possibly reset_resume should
initialise the hardware more, but this works in the common case.

Signed-off-by: Denis Joseph Barrow <D.Barow@option.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 00:36:10 -08:00
Denis Joseph Barrow
7ea3a9ad9b hso: Add TIOCM ioctl handling.
Makes TIOCM ioctls for Data Carrier Detect & related functions
work like /drivers/serial/serial-core.c; potentially needed
for pppd & similar user programs.

Signed-off-by: Denis Joseph Barrow <D.Barow@option.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 00:35:26 -08:00
Denis Joseph Barrow
52429eb216 hso: Fix free of mutexes still in use.
A new structure, hso_mutex_table, had to be declared statically
& used, as the hso_device mutex (mutex_lock(&serial->parent->mutex) etc.)
is freed in hso_serial_open & hso_serial_close by kref_put while
the mutex is still in use.

This is a substantial change but should make the driver much more stable.

Signed-off-by: Denis Joseph Barrow <D.Barow@option.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 00:33:13 -08:00
Denis Joseph Barrow
89930b7b5e hso: Fix URB submission -EINVAL.
Added a check for IFF_UP in hso_resume. This should eliminate -EINVAL (-22)
errors caused by URBs being submitted twice, once by hso_resume
& once in hso_net_open, if suspend/resume USB power saving mode is enabled.

Signed-off-by: Denis Joseph Barrow <D.Barow@option.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 00:30:48 -08:00
Denis Joseph Barrow
4a3e818181 hso: Fix crashes on close.
Moved serial_open_count in hso_serial_open to
prevent crashes owing to the serial structure being made NULL
when hso_serial_close is called even though hso_serial_open
returned -ENODEV; Alan Cox pointed out that this happens.
Also put a sanity check in hso_serial_close
to check for a valid serial structure, which should prevent
the most reproducible crash in the driver, when the hso device
is disconnected while in use.

Signed-off-by: Denis Joseph Barrow <D.Barow@option.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 00:27:50 -08:00
Denis Joseph Barrow
bab04c3adb hso: Add new usb device id's.
Signed-off-by: Denis Joseph Barrow <D.Barow@option.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 00:26:12 -08:00
Stephen Hemminger
47fd5b8373 netdev: add HAVE_NET_DEVICE_OPS
As a concession to vendors who have to deal with one source for different
kernel versions, add a HAVE_NET_DEVICE_OPS so they don't end up hard
coding ifdef against kernel version.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 00:20:43 -08:00
Ingo Molnar
7807fafa52 lockdep: fix unused function warning in kernel/lockdep.c
Impact: fix build warning

this warning:

  kernel/lockdep.c:584: warning: ‘print_lock_dependencies’ defined but not used

triggers because print_lock_dependencies() is only used if both
CONFIG_TRACE_IRQFLAGS and CONFIG_PROVE_LOCKING are enabled.

But adding #ifdefs is not an option here - it would spread out to 4-5
other helper functions and uglify the file. So mark this function
as __used - it's static and the compiler can eliminate it just fine.
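
A small sketch of the pattern, assuming a GCC-compatible compiler. The macro
and function names are hypothetical stand-ins (the kernel spells the
attribute __used). The helper is only called when a config macro is defined,
yet marking it with the attribute avoids the "defined but not used" warning
without wrapping it in #ifdefs.

```c
/* Hypothetical macro name; the kernel's own spelling is __used. */
#define DEMO_USED __attribute__((used))

/* Only referenced when DEMO_PROVE_LOCKING is defined, yet never warns. */
static DEMO_USED int demo_print_dependencies(int depth)
{
    return depth + 1;
}

/* Caller whose use of the helper depends on the build configuration. */
static int demo_check(int depth)
{
#ifdef DEMO_PROVE_LOCKING
    return demo_print_dependencies(depth);
#else
    return depth;
#endif
}
```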

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-25 09:07:03 +01:00
Ingo Molnar
e951e4af2e x86: fix unused variable warning in arch/x86/kernel/hpet.c
Impact: fix build warning

this warning:

  arch/x86/kernel/hpet.c:36: warning: ‘hpet_num_timers’ defined but not used

Triggers because hpet_num_timers is unused in the !CONFIG_PCI_MSI case.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-25 09:03:43 +01:00
Ingo Molnar
14bfc987e3 tracing, tty: fix warnings caused by branch tracing and tty_kref_get()
Stephen Rothwell reported that this warning started triggering in
linux-next:

  In file included from init/main.c:27:
  include/linux/tty.h: In function ‘tty_kref_get’:
  include/linux/tty.h:330: warning: ‘______f’ is static but declared in inline function ‘tty_kref_get’ which is not static

Which gcc emits for 'extern inline' functions that nevertheless define
static variables. Change it to 'static inline', which is the norm
in the kernel anyway.
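
For illustration only, here is the 'static inline' shape with a static local
variable inside it. The function is a hypothetical stand-in for the
tracer-instrumented tty_kref_get(), with the counter playing the role of the
per-site data that the branch tracer adds.

```c
/*
 * With 'static inline', a static local variable is unproblematic: each
 * translation unit gets its own private copy of both the function and
 * the variable.
 */
static inline int demo_hit_count(void)
{
    static int hits;    /* persists across calls within this file */

    return ++hits;
}
```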

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-25 08:59:44 +01:00
Ilpo Järvinen
0ace285605 tcp: handle shift/merge of cloned skbs too
This caused me to get repeatably:

  tcpdump: pcap_loop: recvfrom: Bad address

Happens occasionally when I tcpdump my for-looped test xfers:
  while [ : ]; do echo -n "$(date '+%s.%N') "; ./sendfile; sleep 20; done

Rest of the relevant commands:
  ethtool -K eth0 tso off
  tc qdisc add dev eth0 root netem drop 4%
  tcpdump -n -s0 -i eth0 -w sacklog.all

Running net-next under kvm, connection goes to the same host
(basically just out of kvm). The connection itself works ok
and data gets sent without corruption even with a large
number of tests while tcpdump fails usually within less than
5 tests.

Whether it only happens because of this change or not, I
don't know for sure but it's the only thing with which
I've seen that error. The non-cloned variant works w/o it
for much longer time. I'm yet to debug where the error
actually comes from.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-24 21:30:21 -08:00
Ilpo Järvinen
111cc8b913 tcp: add some mibs to track collapsing
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-24 21:27:22 -08:00
Ilpo Järvinen
92ee76b6d9 tcp: Make shifting not clear the hints
The earlier version was just a very basic one which is "playing
safe" by always clearing the hints. However, clearing a hint
is an extremely costly operation with large windows, so it must be
avoided whenever possible; there is a way to achieve
not-clearing with shifting too.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-24 21:26:56 -08:00
Ilpo Järvinen
832d11c5cd tcp: Try to restore large SKBs while SACK processing
During SACK processing, most of the benefits of TSO are eaten by
the SACK blocks that one-by-one fragment SKBs to MSS sized chunks.
Then we're in trouble when cleanup work for them has to be done
when a large cumulative ACK comes. Try to return to the pre-split
state already while more and more SACK info gets discovered, by
combining newly discovered SACK areas with the previous skb if
that's SACKed as well.

This approach has a number of benefits:

1) The processing overhead is spread more equally over the RTT
2) Write queue has fewer skbs to process (affects everything
   which has to walk in the queue past the sacked areas)
3) Write queue is consistent the whole time, so no other parts
   of TCP have to be aware of this (this was not the case with
   some other approach that was, well, quite intrusive all
   around).
4) Clean_rtx_queue can release most of the pages using single
   put_page instead of previous PAGE_SIZE/mss+1 calls
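
To put a number on point 4, the expression PAGE_SIZE/mss+1 can be evaluated
for sample values. The function name is hypothetical, and the page size of
4096 bytes and MSS of 1460 bytes used in the note below are assumptions
chosen only as a worked example.

```c
/*
 * Number of put_page() calls needed for one page when it is referenced by
 * MSS-sized skbs, per the PAGE_SIZE/mss+1 expression above. After merging,
 * a single call suffices.
 */
static unsigned int demo_put_page_calls_split(unsigned int page_size,
                                              unsigned int mss)
{
    return page_size / mss + 1;
}
```

With the assumed values, demo_put_page_calls_split(4096, 1460) evaluates to
3, compared with a single put_page for the merged skb.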

In case a hole is fully filled by the new SACK block, we attempt
to combine the next skb too, which allows construction of skbs
that are even larger than what TSO split them to, and it handles
the hole-on-every-nth patterns that often occur during slow start
overshoot pretty nicely. Though for this to be really useful, a
retransmission would also have to get lost, since cumulative ACKs
advance one hole at a time in the most typical case.

TODO: handle upwards-only merging. That should be rather easy
when the segment is fully sacked, but I'm leaving that as a future
work item (it won't make a very large difference anyway since
this current approach already covers quite a lot of normal
cases).

I was earlier thinking of some sophisticated way of tracking
timestamps of the first and the last segment but later on
realized that it won't be that necessary at all to store the
timestamp of the last segment. The cases that can occur are
basically either:
  1) ambiguous => no sensible measurement can be taken anyway
  2) non-ambiguous is due to reordering => having the timestamp
     of the last segment there just skews things more than it
     does any good, since the ack got triggered by one of
     the holes (besides some subtle issues that would make
     determining the right hole/skb an even harder problem). Anyway,
     it has nothing to do with this change then.

I chose to route some abnormal looking cases with goto noop,
some could be handled differently (eg., by stopping the
walking at that skb but again). In general, they either
shouldn't happen at all or are rare enough to make no difference
in practice.

In theory this change (as a whole) could cause some macroscale
regression (global) because of cache misses that are taken over
the round-trip time, but it very likely gets better because of far
fewer (local) cache misses for other write queue walkers and the
big recovery-clearing cumulative ack.

It is worth noting that these benefits would be very easy to get also
without TSO/GSO being on as long as the data is in pages so that
we can merge them. Currently I won't let that happen because
DSACK splitting at a fragment would mess up pcounts due to
sk_can_gso in tcp_set_skb_tso_segs. Once DSACK fragments get
avoided, we have some conditions that can be made less strict.

TODO: I will probably have to convert the excessive pointer
passing to struct sacktag_state... :-)

My testing revealed that a considerable amount of skbs couldn't
be shifted because they were cloned (most likely still awaiting
tx reclaim)...

[The rest is considering future work instead since I got
repeatably EFAULT to tcpdump's recvfrom when I added
pskb_expand_head to deal with clones, so I separated that
into another, later patch]

...To counter that, I gave up on the fifth advantage:

5) When growing previous SACK block, less allocs for new skbs
   are done, basically a new alloc is needed only when new hole
   is detected and when the previous skb runs out of frags space

...which now only happens if reclaim is fast enough to dispose of
the clone before the SACK block comes in (the window is RTT long),
otherwise we'll have to alloc some.

With clones being handled I got these numbers (will be somewhat
worse without that), taken with fine-grained mibs:

                  TCPSackShifted 398
                   TCPSackMerged 877
            TCPSackShiftFallback 320
      TCPSACKCOLLAPSEFALLBACKGSO 0
  TCPSACKCOLLAPSEFALLBACKSKBBITS 0
  TCPSACKCOLLAPSEFALLBACKSKBDATA 0
    TCPSACKCOLLAPSEFALLBACKBELOW 0
    TCPSACKCOLLAPSEFALLBACKFIRST 1
 TCPSACKCOLLAPSEFALLBACKPREVBITS 318
      TCPSACKCOLLAPSEFALLBACKMSS 1
   TCPSACKCOLLAPSEFALLBACKNOHEAD 0
    TCPSACKCOLLAPSEFALLBACKSHIFT 0
          TCPSACKCOLLAPSENOOPSEQ 0
  TCPSACKCOLLAPSENOOPSMALLPCOUNT 0
     TCPSACKCOLLAPSENOOPSMALLLEN 0
             TCPSACKCOLLAPSEHOLE 12

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-24 21:20:15 -08:00