Commit Graph

30487 Commits

Author SHA1 Message Date
Tejun Heo
8d408b4be3 percpu: give more latitude to arch specific first chunk initialization
Impact: more latitude for first percpu chunk allocation

The first percpu chunk serves the kernel static percpu area and may or
may not contain extra room for further dynamic allocation.
Initialization of the first chunk needs to be done before normal
memory allocation service is up, so it has its own init path -
pcpu_setup_static().

It seems archs need more latitude while initializing the first chunk
for example to take advantage of large page mapping.  This patch makes
the following changes to allow this.

* Define PERCPU_DYNAMIC_RESERVE to give arch hint about how much space
  to reserve in the first chunk for further dynamic allocation.

* Rename pcpu_setup_static() to pcpu_setup_first_chunk().

* Make pcpu_setup_first_chunk() much more flexible by fetching page
  pointer by callback and adding optional @unit_size, @free_size and
  @base_addr arguments which allow archs to selectively part of chunk
  initialization to their likings.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-02-24 11:57:21 +09:00
Tejun Heo
c0c0a29379 vmalloc: add @align to vm_area_register_early()
Impact: allow larger alignment for early vmalloc area allocation

Some early vmalloc users might want larger alignment, for example, for
custom large page mapping.  Add @align to vm_area_register_early().
While at it, drop docbook comment on non-existent @size.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
2009-02-24 11:57:21 +09:00
Tejun Heo
2d0aae4169 bootmem: reorder interface functions and add a missing one
Impact: cleanup and addition of missing interface wrapper

The interface functions in bootmem.h was ordered in not so orderly
manner.  Reorder them such that

* functions allocating the same area group together -
  ie. alloc_bootmem group and alloc_bootmem_low group.

* functions w/o node parameter come before the ones w/ node parameter.

* nopanic variants are immediately below their panicky counterparts.

While at it, add alloc_bootmem_pages_node_nopanic() which was missing.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@saeurebad.de>
2009-02-24 11:57:21 +09:00
Tejun Heo
c132937556 bootmem: clean up arch-specific bootmem wrapping
Impact: cleaner and consistent bootmem wrapping

By setting CONFIG_HAVE_ARCH_BOOTMEM_NODE, archs can define
arch-specific wrappers for bootmem allocation.  However, this is done
a bit strangely in that only the high level convenience macros can be
changed while lower level, but still exported, interface functions
can't be wrapped.  This not only is messy but also leads to strange
situation where alloc_bootmem() does what the arch wants it to do but
the equivalent __alloc_bootmem() call doesn't although they should be
able to be used interchangeably.

This patch updates bootmem such that archs can override / wrap the
backend function - alloc_bootmem_core() instead of the highlevel
interface functions to allow simpler and consistent wrapping.  Also,
HAVE_ARCH_BOOTMEM_NODE is renamed to HAVE_ARCH_BOOTMEM.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@saeurebad.de>
2009-02-24 11:57:20 +09:00
Linus Torvalds
d38e84ee39 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  netns: fix double free at netns creation
  veth : add the set_mac_address capability
  sunlance: Beyond ARRAY_SIZE of ib->btx_ring
  sungem: another error printed one too early
  ISDN: fix sc/shmem printk format warning
  SMSC: timeout reaches -1
  smsc9420: handle magic field of ethtool_eeprom
  sundance: missing parentheses?
  smsc9420: fix another postfixed timeout
  wimax/i2400m: driver loads firmware v1.4 instead of v1.3
  vlan: Update skb->mac_header in __vlan_put_tag().
  cxgb3: Add support for PCI ID 0x35.
  tcp: remove obsoleted comment about different passes
  TG3: &&/|| confusion
  ATM: misplaced parentheses?
  net/mv643xx: don't disable the mib timer too early and lock properly
  net/mv643xx: use GFP_ATOMIC while atomic
  atl1c: Atheros L1C Gigabit Ethernet driver
  net: Kill skb_truesize_check(), it only catches false-positives.
  net: forcedeth: Fix wake-on-lan regression
2009-02-23 14:36:05 -08:00
David Rientjes
3b89d7d881 slub: move min_partial to struct kmem_cache
Although it allows for better cacheline use, it is unnecessary to save a
copy of the cache's min_partial value in each kmem_cache_node.

Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-02-23 12:05:41 +02:00
Andreas Mohr
ce71bfd1aa ALSA: ALS4000, slight mixer improvements
- add 8kHz / 20 kHz low-pass filter switch control
- add ALS4000 Mono capture route control
- add annotations to specs pages
- improve ALS4000 PM saved regs selection (remove SB dummy register,
  add missing ones)
- add some missing ALS4000 register defines
- constify two variables

Signed-off-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2009-02-23 07:47:52 +01:00
Eric W. Biederman
ce16c5337a netns: Remove net_alive
It turns out that net_alive is unnecessary, and the original problem
that led to it being added was simply that the icmp code thought
it was a network device and wound up being unable to handle packets
while there were still packets in the network namespace.

Now that icmp and tcp have been fixed to properly register themselves
this problem is no longer present and we have a stronger guarantee
that packets will not arrive in a network namespace then that provided
by net_alive in netif_receive_skb.  So remove net_alive allowing
packet reception run a little faster.

Additionally document the strong reason why network namespace cleanup
is safe so that if something happens again someone else will have
a chance of figuring it out.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-22 19:54:50 -08:00
Benjamin Herrenschmidt
35f88e6b06 Merge commit 'ftrace/function-graph' into next 2009-02-23 10:47:23 +11:00
Ingo Molnar
fc6fc7f1b1 Merge branch 'linus' into x86/apic
Conflicts:
	arch/x86/mach-default/setup.c

Semantic conflict resolution:
	arch/x86/kernel/setup.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-22 20:05:19 +01:00
Rafael J. Wysocki
770824bdc4 PM: Split up sysdev_[suspend|resume] from device_power_[down|up]
Move the sysdev_suspend/resume from the callee to the callers, with
no real change in semantics, so that we can rework the disabling of
interrupts during suspend/hibernation.

This is based on an earlier patch from Linus.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-22 10:33:44 -08:00
Alexander Clouter
9c3c133b1e hwrng: timeriomem - New driver
Some hardware platforms, the TS-7800[1] is one for example, can
supply the kernel with an entropy source, albeit a slow one for
TS-7800 users, by just reading a particular IO address.  This
source must not be read above a certain rate otherwise the quality
suffers.

The driver is then hooked into by calling
platform_device_(register|add|del) passing a structure similar to:
------
static struct timeriomem_rng_data ts78xx_ts_rng_data = {
        .address        = (u32 *__iomem) TS_RNG,
        .period         = 1000000, /* one second */
};

static struct platform_device ts78xx_ts_rng_device = {
        .name           = "timeriomem_rng",
        .id             = -1,
        .dev            = {
                .platform_data  = &ts78xx_ts_rng_data,
        },
        .num_resources  = 0,
};
------

[1] http://www.embeddedarm.com/products/board-detail.php?product=TS-7800

Signed-off-by: Alexander Clouter <alex@digriz.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-02-22 12:03:56 +08:00
Linus Torvalds
adfafefd10 Merge branch 'hibernate'
* hibernate:
  PM: Fix suspend_console and resume_console to use only one semaphore
  PM: Wait for console in resume
  PM: Fix pm_notifiers during user mode hibernation
  swsusp: clean up shrink_all_zones()
  swsusp: dont fiddle with swappiness
  PM: fix build for CONFIG_PM unset
  PM/hibernate: fix "swap breaks after hibernation failures"
  PM/resume: wait for device probing to finish
  Consolidate driver_probe_done() loops into one place
2009-02-21 14:17:26 -08:00
Arjan van de Ven
216773a787 Consolidate driver_probe_done() loops into one place
there's a few places that currently loop over driver_probe_done(), and
I'm about to add another one. This patch abstracts it into a helper
to reduce duplication.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Len Brown <lenb@kernel.org>
Acked-by: Greg KH <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-21 14:17:17 -08:00
Linus Torvalds
2ec77fc93c Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (26 commits)
  drm/radeon: update sarea copies of last_ variables on resume.
  drm/i915: Keep refs on the object over the lifetime of vmas for GTT mmap.
  drm/i915: take struct mutex around fb unref
  drm: Use spread spectrum when the bios tells us it's ok.
  drm: Collapse identical i8xx_clock() and i9xx_clock().
  drm: Bring PLL limits in sync with DDX values.
  drm: Add locking around cursor gem operations.
  drm: Propagate failure from setting crtc base.
  drm: Check for a NULL encoder when reverting on error path
  drm/i915: Cleanup the hws on ringbuffer constrution failure.
  drm/i915: Don't add panel_fixed_mode to the probed modes list at LVDS init.
  drm: Release user fbs in drm_release
  drm/i915: Unpin the fb on error during construction.
  drm/i915: Unpin the hws if we fail to kmap.
  drm/i915: Unpin the ringbuffer if we fail to ioremap it.
  drm/i915: unpin for an invalid memory domain.
  drm/i915: Release and unlock on mmap_gtt error path.
  drm/i915: Set framebuffer alignment based upon the fence constraints.
  drm: Do not leak a new reference for flink() on an existing name
  drm/i915: Fix potential AB-BA deadlock in i915_gem_execbuffer()
  ...
2009-02-20 18:04:53 -08:00
Mauro Carvalho Chehab
b6adea334c 8250: fix boot hang with serial console when using with Serial Over Lan port
Intel 8257x Ethernet boards have a feature called Serial Over Lan.

This feature works by emulating a serial port, and it is detected by
kernel as a normal 8250 port.  However, this emulation is not perfect, as
also noticed on changeset 7500b1f602.

Before this patch, the kernel were trying to check if the serial TX is
capable of work using IRQ's.

This were done with a code similar this:

        serial_outp(up, UART_IER, UART_IER_THRI);
        lsr = serial_in(up, UART_LSR);
        iir = serial_in(up, UART_IIR);
        serial_outp(up, UART_IER, 0);

        if (lsr & UART_LSR_TEMT && iir & UART_IIR_NO_INT)
		up->bugs |= UART_BUG_TXEN;

This works fine for other 8250 ports, but, on 8250-emulated SoL port, the
chip is a little lazy to down UART_IIR_NO_INT at UART_IIR register.

Due to that, UART_BUG_TXEN is sometimes enabled.  However, as TX IRQ keeps
working, and the TX polling is now enabled, the driver miss-interprets the
IRQ received later, hanging up the machine until a key is pressed at the
serial console.

This is the 6 version of this patch.  Previous versions were trying to
introduce a large enough delay between serial_outp and serial_in(up,
UART_IIR), but not taking forever.  However, the needed delay couldn't be
safely determined.

At the experimental tests, a delay of 1us solves most of the cases, but
still hangs sometimes.  Increasing the delay to 5us was better, but still
doesn't solve.  A very high delay of 50 ms seemed to work every time.

However, poking around with delays and pray for it to be enough doesn't
seem to be a good approach, even for a quirk.

So, instead of playing with random large arbitrary delays, let's just
disable UART_BUG_TXEN for all SoL ports.

[akpm@linux-foundation.org: fix warnings]
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-20 17:57:50 -08:00
Michael Buesch
01b24fee28 spi_bitbang: add more lowlevel function documentation
This adds more documentation of the lowlevel API to avoid future bugs.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-20 17:57:49 -08:00
Johannes Weiner
3ef0e5ba46 slab: introduce kzfree()
kzfree() is a wrapper for kfree() that additionally zeroes the underlying
memory before releasing it to the slab allocator.

Currently there is code which memset()s the memory region of an object
before releasing it back to the slab allocator to make sure
security-sensitive data are really zeroed out after use.

These callsites can then just use kzfree() which saves some code, makes
users greppable and allows for a stupid destructor that isn't necessarily
aware of the actual object size.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Matt Mackall <mpm@selenic.com>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-20 17:57:48 -08:00
Ingo Molnar
b18018126f x86, mm, kprobes: fault.c, simplify notify_page_fault()
Impact: cleanup

Remove an #ifdef from notify_page_fault(). The function still
compiles to nothing in the !CONFIG_KPROBES case.

Introduce kprobes_built_in() and kprobe_fault_handler() helpers
to allow this - they returns 0 if !CONFIG_KPROBES.

No code changed:

   text	   data	    bss	    dec	    hex	filename
   4618	     32	     24	   4674	   1242	fault.o.before
   4618	     32	     24	   4674	   1242	fault.o.after

Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-21 00:09:42 +01:00
Ingo Molnar
b814d41f09 x86, mm: fault.c, simplify kmmio_fault()
Impact: cleanup

Remove an #ifdef from kmmio_fault() - we can do this by
providing default implementations for is_kmmio_active()
and kmmio_handler(). The compiler optimizes it all away
in the !CONFIG_MMIOTRACE case.

Also, while at it, clean up mmiotrace.h a bit:

 - standard header guards
 - standard vertical spaces for structure definitions

No code changed (both with mmiotrace on and off in the config):

   text	   data	    bss	    dec	    hex	filename
   2947	     12	     12	   2971	    b9b	fault.o.before
   2947	     12	     12	   2971	    b9b	fault.o.after

Cc: Pekka Paalanen <pq@iki.fi>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-21 00:09:42 +01:00
Ingo Molnar
3b6f7b9beb Merge branch 'x86/urgent' into x86/core 2009-02-20 17:40:43 +01:00
Pekka Enberg
51735a7ca6 SLUB: Do not pass 8k objects through to the page allocator
Increase the maximum object size in SLUB so that 8k objects are not
passed through to the page allocator anymore. The network stack uses 8k
objects for performance critical operations.

The patch is motivated by a SLAB vs. SLUB regression in the netperf
benchmark. The problem is that the kfree(skb->head) call in
skb_release_data() that is subject to page allocator pass-through as the
size passed to __alloc_skb() is larger than 4 KB in this test.

As explained by Yanmin Zhang:

  I use 2.6.29-rc2 kernel to run netperf UDP-U-4k CPU_NUM client/server
  pair loopback testing on x86-64 machines. Comparing with SLUB, SLAB's
  result is about 2.3 times of SLUB's. After applying the reverting patch,
  the result difference between SLUB and SLAB becomes 1% which we might
  consider as fluctuation.

[ penberg@cs.helsinki.fi: fix oops in kmalloc() ]
Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Tested-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-02-20 12:25:47 +02:00
Christoph Lameter
ffadd4d0fe SLUB: Introduce and use SLUB_MAX_SIZE and SLUB_PAGE_SHIFT constants
As a preparational patch to bump up page allocator pass-through threshold,
introduce two new constants SLUB_MAX_SIZE and SLUB_PAGE_SHIFT and convert
mm/slub.c to use them.

Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Tested-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-02-20 12:22:44 +02:00
Adam Nielsen
268cb38e18 netfilter: x_tables: add LED trigger target
Kernel module providing implementation of LED netfilter target.  Each
instance of the target appears as a led-trigger device, which can be
associated with one or more LEDs in /sys/class/leds/

Signed-off-by: Adam Nielsen <a.nielsen@shikadi.net>
Acked-by: Richard Purdie <rpurdie@linux.intel.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-02-20 10:55:14 +01:00
Hagen Paul Pfeifer
e478075c6f netfilter: nf_conntrack: table max size should hold at least table size
Table size is defined as unsigned, wheres the table maximum size is
defined as a signed integer. The calculation of max is 8 or 4,
multiplied the table size. Therefore the max value is aligned to
unsigned.

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-02-20 10:47:09 +01:00
Stephen Hemminger
784544739a netfilter: iptables: lock free counters
The reader/writer lock in ip_tables is acquired in the critical path of
processing packets and is one of the reasons just loading iptables can cause
a 20% performance loss. The rwlock serves two functions:

1) it prevents changes to table state (xt_replace) while table is in use.
   This is now handled by doing rcu on the xt_table. When table is
   replaced, the new table(s) are put in and the old one table(s) are freed
   after RCU period.

2) it provides synchronization when accesing the counter values.
   This is now handled by swapping in new table_info entries for each cpu
   then summing the old values, and putting the result back onto one
   cpu.  On a busy system it may cause sampling to occur at different
   times on each cpu, but no packet/byte counts are lost in the process.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

Sucessfully tested on my dual quad core machine too, but iptables only (no ipv6 here)
BTW, my new "tbench 8" result is 2450 MB/s, (it was 2150 MB/s not so long ago)

Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-02-20 10:35:32 +01:00
Pablo Neira Ayuso
be0c22a46c netlink: add NETLINK_BROADCAST_ERROR socket option
This patch adds NETLINK_BROADCAST_ERROR which is a netlink
socket option that the listener can set to make netlink_broadcast()
return errors in the delivery to the caller. This option is useful
if the caller of netlink_broadcast() do something with the result
of the message delivery, like in ctnetlink where it drops a network
packet if the event delivery failed, this is used to enable reliable
logging and state-synchronization. If this socket option is not set,
netlink_broadcast() only reports ESRCH errors and silently ignore
ENOBUFS errors, which is what most netlink_broadcast() callers
should do.

This socket option is based on a suggestion from Patrick McHardy.
Patrick McHardy can exchange this patch for a beer from me ;).

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-20 01:01:08 -08:00
Santwona Behera
59089d8d16 ethtool: Add RX pkt classification interface
Signed-off-by: Santwona Behera <santwona.behera@sun.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-20 00:58:13 -08:00
Tejun Heo
fbf59bc9d7 percpu: implement new dynamic percpu allocator
Impact: new scalable dynamic percpu allocator which allows dynamic
        percpu areas to be accessed the same way as static ones

Implement scalable dynamic percpu allocator which can be used for both
static and dynamic percpu areas.  This will allow static and dynamic
areas to share faster direct access methods.  This feature is optional
and enabled only when CONFIG_HAVE_DYNAMIC_PER_CPU_AREA is defined by
arch.  Please read comment on top of mm/percpu.c for details.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
2009-02-20 16:29:08 +09:00
Tejun Heo
8fc4898500 vmalloc: add un/map_kernel_range_noflush()
Impact: two more public map/unmap functions

Implement map_kernel_range_noflush() and unmap_kernel_range_noflush().
These functions respectively map and unmap address range in kernel VM
area but doesn't do any vcache or tlb flushing.  These will be used by
new percpu allocator.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
2009-02-20 16:29:08 +09:00
Tejun Heo
f0aa661790 vmalloc: implement vm_area_register_early()
Impact: allow multiple early vm areas

There are places where kernel VM area needs to be allocated before
vmalloc is initialized.  This is done by allocating static vm_struct,
initializing several fields and linking it to vmlist and later vmalloc
initialization picking up these from vmlist.  This is currently done
manually and if there's more than one such areas, there's no defined
way to arbitrate who gets which address.

This patch implements vm_area_register_early(), which takes vm_area
struct with flags and size initialized, assigns address to it and puts
it on the vmlist.  This way, multiple early vm areas can determine
which addresses they should use.  The only current user - alpha mm
init - is converted to use it.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-02-20 16:29:08 +09:00
Tejun Heo
f2a8205c4e percpu: kill percpu_alloc() and friends
Impact: kill unused functions

percpu_alloc() and its friends never saw much action.  It was supposed
to replace the cpu-mask unaware __alloc_percpu() but it never happened
and in fact __percpu_alloc_mask() itself never really grew proper
up/down handling interface either (no exported interface for
populate/depopulate).

percpu allocation is about to go through major reimplementation and
there's no reason to carry this unused interface around.  Replace it
with __alloc_percpu() and free_percpu().

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-02-20 16:29:08 +09:00
Rusty Russell
313e458f81 alloc_percpu: add align argument to __alloc_percpu.
This prepares for a real __alloc_percpu, by adding an alignment argument.
Only one place uses __alloc_percpu directly, and that's for a string.

tj: af_inet also uses __alloc_percpu(), update it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Jens Axboe <axboe@kernel.dk>
2009-02-20 16:29:08 +09:00
Rusty Russell
b36128c830 alloc_percpu: change percpu_ptr to per_cpu_ptr
Impact: cleanup

There are two allocated per-cpu accessor macros with almost identical
spelling.  The original and far more popular is per_cpu_ptr (44
files), so change over the other 4 files.

tj: kill percpu_ptr() and update UP too

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: mingo@redhat.com
Cc: lenb@kernel.org
Cc: cpufreq@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-02-20 16:29:08 +09:00
Jesse Barnes
ab00b3e521 drm/i915: Keep refs on the object over the lifetime of vmas for GTT mmap.
This fixes potential fault at fault time if the object was unreferenced
while the mapping still existed.  Now, while the mmap_offset only lives
for the lifetime of the object, the object also stays alive while a vma
exists that needs it.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2009-02-20 12:21:13 +10:00
Chris Wilson
5c3b82e2b2 drm: Propagate failure from setting crtc base.
Check the error paths within intel_pipe_set_base() to first cleanup and
then report back the error.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@linux.ie>
2009-02-20 12:21:12 +10:00
Kristian Høgsberg
ea39f83516 drm: Release user fbs in drm_release
Avoids leaking fbs and associated buffers on release.

Signed-off-by: Kristian Høgsberg <krh@redhat.com>
Tested-by: Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@linux.ie>
2009-02-20 12:21:11 +10:00
Jarek Poplawski
e4dd61882e vlan: Update skb->mac_header in __vlan_put_tag().
After moving mac addresses in __vlan_put_tag() skb->mac_header needs
to be updated.

Reported-by: Karl Hiramoto <karl@hiramoto.org>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-18 23:31:11 -08:00
Huang Ying
25c38d3fb9 crypto: api - Use dedicated workqueue for crypto subsystem
Use dedicated workqueue for crypto subsystem

A dedicated workqueue named kcrypto_wq is created to be used by crypto
subsystem. The system shared keventd_wq is not suitable for
encryption/decryption, because of potential starvation problem.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-02-19 14:33:40 +08:00
Linus Torvalds
ba95fd47d1 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: fix deadlock in blk_abort_queue() for drivers that readd to timeout list
  block: fix booting from partitioned md array
  block: revert part of 18ce3751cc
  cciss: PCI power management reset for kexec
  paride/pg.c: xs(): &&/|| confusion
  fs/bio: bio_alloc_bioset: pass right object ptr to mempool_free
  block: fix bad definition of BIO_RW_SYNC
  bsg: Fix sense buffer bug in SG_IO
2009-02-18 18:33:04 -08:00
Bernhard Walle
97bef7dd05 Bernhard has moved
Since I don't work for SUSE any more and the bwalle@suse.de address is
invalid, correct it in the copyright headers and documentation.

Signed-off-by: Bernhard Walle <bernhard.walle@gmx.de>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:56 -08:00
Adam Lackorzynski
ffa7525c13 jsm: additional device support
I have a Digi Neo 8 PCI card (114f:00b1) Serial controller: Digi
International Digi Neo 8 (rev 05)

that works with the jsm driver after using the following patch.

Signed-off-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
Cc: Scott H Kilau <Scott_Kilau@digi.com>
Cc: Wendy Xiong <wendyx@us.ibm.com>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:55 -08:00
KAMEZAWA Hiroyuki
cc2559bccc mm: fix memmap init for handling memory hole
Now, early_pfn_in_nid(PFN, NID) may returns false if PFN is a hole.
and memmap initialization was not done. This was a trouble for
sparc boot.

To fix this, the PFN should be initialized and marked as PG_reserved.
This patch changes early_pfn_in_nid() return true if PFN is a hole.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reported-by: David Miller <davem@davemlloft.net>
Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <stable@kernel.org>		[2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:55 -08:00
KAMEZAWA Hiroyuki
f2dbcfa738 mm: clean up for early_pfn_to_nid()
What's happening is that the assertion in mm/page_alloc.c:move_freepages()
is triggering:

	BUG_ON(page_zone(start_page) != page_zone(end_page));

Once I knew this is what was happening, I added some annotations:

	if (unlikely(page_zone(start_page) != page_zone(end_page))) {
		printk(KERN_ERR "move_freepages: Bogus zones: "
		       "start_page[%p] end_page[%p] zone[%p]\n",
		       start_page, end_page, zone);
		printk(KERN_ERR "move_freepages: "
		       "start_zone[%p] end_zone[%p]\n",
		       page_zone(start_page), page_zone(end_page));
		printk(KERN_ERR "move_freepages: "
		       "start_pfn[0x%lx] end_pfn[0x%lx]\n",
		       page_to_pfn(start_page), page_to_pfn(end_page));
		printk(KERN_ERR "move_freepages: "
		       "start_nid[%d] end_nid[%d]\n",
		       page_to_nid(start_page), page_to_nid(end_page));
 ...

And here's what I got:

	move_freepages: Bogus zones: start_page[2207d0000] end_page[2207dffc0] zone[fffff8103effcb00]
	move_freepages: start_zone[fffff8103effcb00] end_zone[fffff8003fffeb00]
	move_freepages: start_pfn[0x81f600] end_pfn[0x81f7ff]
	move_freepages: start_nid[1] end_nid[0]

My memory layout on this box is:

[    0.000000] Zone PFN ranges:
[    0.000000]   Normal   0x00000000 -> 0x0081ff5d
[    0.000000] Movable zone start PFN for each node
[    0.000000] early_node_map[8] active PFN ranges
[    0.000000]     0: 0x00000000 -> 0x00020000
[    0.000000]     1: 0x00800000 -> 0x0081f7ff
[    0.000000]     1: 0x0081f800 -> 0x0081fe50
[    0.000000]     1: 0x0081fed1 -> 0x0081fed8
[    0.000000]     1: 0x0081feda -> 0x0081fedb
[    0.000000]     1: 0x0081fedd -> 0x0081fee5
[    0.000000]     1: 0x0081fee7 -> 0x0081ff51
[    0.000000]     1: 0x0081ff59 -> 0x0081ff5d

So it's a block move in that 0x81f600-->0x81f7ff region which triggers
the problem.

This patch:

Declaration of early_pfn_to_nid() is scattered over per-arch include
files, and it seems it's complicated to know when the declaration is used.
 I think it makes fix-for-memmap-init not easy.

This patch moves all declaration to include/linux/mm.h

After this,
  if !CONFIG_NODES_POPULATES_NODE_MAP && !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID
     -> Use static definition in include/linux/mm.h
  else if !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID
     -> Use generic definition in mm/page_alloc.c
  else
     -> per-arch back end function will be called.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reported-by: David Miller <davem@davemlloft.net>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <stable@kernel.org>		[2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:55 -08:00
Dan Williams
287d859222 atmel-mci: fix initialization of dma slave data
The conversion of atmel-mci to dma_request_channel missed the
initialization of the channel dma_slave information.  The filter_fn passed
to dma_request_channel is responsible for initializing the channel's
private data.  This implementation has the additional benefit of enabling
a generic client-channel data passing mechanism.

Reviewed-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:55 -08:00
Nick Piggin
1cf6e7d83b mm: task dirty accounting fix
YAMAMOTO-san noticed that task_dirty_inc doesn't seem to be called properly for
cases where set_page_dirty is not used to dirty a page (eg. mark_buffer_dirty).

Additionally, there is some inconsistency about when task_dirty_inc is
called.  It is used for dirty balancing, however it even gets called for
__set_page_dirty_no_writeback.

So rather than increment it in a set_page_dirty wrapper, move it down to
exactly where the dirty page accounting stats are incremented.

Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:54 -08:00
Davide Libenzi
610d18f412 timerfd: add flags check
As requested by Michael, add a missing check for valid flags in
timerfd_settime(), and make it return EINVAL in case some extra bits are
set.

Michael said:
If this is to be any use to userland apps that want to check flag
support (perhaps it is too late already), then the sooner we get it
into the kernel the better: 2.6.29 would be good; earlier stables as
well would be even better.

[akpm@linux-foundation.org: remove unused TFD_FLAGS_SET]
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: <stable@kernel.org>		[2.6.27.x, 2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:53 -08:00
Eric Biederman
8f19d47293 seq_file: properly cope with pread
Currently seq_read assumes that the offset passed to it is always the
offset it passed to user space.  In the case pread this assumption is
broken and we do the wrong thing when presented with pread.

To solve this I introduce an offset cache inside of struct seq_file so we
know where our logical file position is.  Then in seq_read if we try to
read from another offset we reset our data structures and attempt to go to
the offset user space wanted.

[akpm@linux-foundation.org: restore FMODE_PWRITE]
[pjt@google.com: seq_open needs its fmode opened up to take advantage of this]
Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Paul Turner <pjt@google.com>
Cc: <stable@kernel.org>		[2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:53 -08:00
Paul Turner
55ec82176e vfs: separate FMODE_PREAD/FMODE_PWRITE into separate flags
Separate FMODE_PREAD and FMODE_PWRITE into separate flags to reflect the
reality that the read and write paths may have independent restrictions.

A git grep verifies that these flags are always cleared together so this
new behavior will only apply to interfaces that change to clear flags
individually.

This is required for "seq_file: properly cope with pread", a post-2.6.25
regression fix.

[akpm@linux-foundation.org: add comment]
Signed-off-by: Paul Turner <pjt@google.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc:  Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@kernel.org>		[2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:53 -08:00
Benjamin Herrenschmidt
c296861291 vmalloc: add __get_vm_area_caller()
We have get_vm_area_caller() and __get_vm_area() but not
__get_vm_area_caller()

On powerpc, I use __get_vm_area() to separate the ranges of addresses
given to vmalloc vs.  ioremap (various good reasons for that) so in order
to be able to implement the new caller tracking in /proc/vmallocinfo, I
need a "_caller" variant of it.

(akpm: needed for ongoing powerpc development, so merge it early)

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:53 -08:00