Commit Graph

26692 Commits

Author SHA1 Message Date
Christoph Lameter
e10d59f2c3 mm: add comments to explain mm_struct fields
Add comments to explain the page statistics field in the mm_struct.

[akpm@linux-foundation.org: add missing ;]
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-31 17:30:46 -07:00
Christoph Lameter
bc3e53f682 mm: distinguish between mlocked and pinned pages
Some kernel components pin user space memory (infiniband and perf) (by
increasing the page count) and account that memory as "mlocked".

The difference between mlocking and pinning is:

A. mlocked pages are marked with PG_mlocked and are exempt from
   swapping. Page migration may move them around though.
   They are kept on a special LRU list.

B. Pinned pages cannot be moved because something needs to
   directly access physical memory. They may not be on any
   LRU list.

I recently saw an mlockalled process where mm->locked_vm became
bigger than the virtual size of the process (!) because some
memory was accounted for twice:

Once when the page was mlocked and once when the Infiniband
layer increased the refcount because it needt to pin the RDMA
memory.

This patch introduces a separate counter for pinned pages and
accounts them seperately.

Signed-off-by: Christoph Lameter <cl@linux.com>
Cc: Mike Marciniszyn <infinipath@qlogic.com>
Cc: Roland Dreier <roland@kernel.org>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-31 17:30:46 -07:00
David Rientjes
43362a4977 oom: fix race while temporarily setting current's oom_score_adj
test_set_oom_score_adj() was introduced in 72788c3856 ("oom: replace
PF_OOM_ORIGIN with toggling oom_score_adj") to temporarily elevate
current's oom_score_adj for ksm and swapoff without requiring an
additional per-process flag.

Using that function to both set oom_score_adj to OOM_SCORE_ADJ_MAX and
then reinstate the previous value is racy since it's possible that
userspace can set the value to something else itself before the old value
is reinstated.  That results in userspace setting current's oom_score_adj
to a different value and then the kernel immediately setting it back to
its previous value without notification.

To fix this, a new compare_swap_oom_score_adj() function is introduced
with the same semantics as the compare and swap CAS instruction, or
CMPXCHG on x86.  It is used to reinstate the previous value of
oom_score_adj if and only if the present value is the same as the old
value.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ying Han <yinghan@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-31 17:30:45 -07:00
David Rientjes
c9f01245b6 oom: remove oom_disable_count
This removes mm->oom_disable_count entirely since it's unnecessary and
currently buggy.  The counter was intended to be per-process but it's
currently decremented in the exit path for each thread that exits, causing
it to underflow.

The count was originally intended to prevent oom killing threads that
share memory with threads that cannot be killed since it doesn't lead to
future memory freeing.  The counter could be fixed to represent all
threads sharing the same mm, but it's better to remove the count since:

 - it is possible that the OOM_DISABLE thread sharing memory with the
   victim is waiting on that thread to exit and will actually cause
   future memory freeing, and

 - there is no guarantee that a thread is disabled from oom killing just
   because another thread sharing its mm is oom disabled.

Signed-off-by: David Rientjes <rientjes@google.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Ying Han <yinghan@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-31 17:30:45 -07:00
Minchan Kim
f80c067361 mm: zone_reclaim: make isolate_lru_page() filter-aware
In __zone_reclaim case, we don't want to shrink mapped page.  Nonetheless,
we have isolated mapped page and re-add it into LRU's head.  It's
unnecessary CPU overhead and makes LRU churning.

Of course, when we isolate the page, the page might be mapped but when we
try to migrate the page, the page would be not mapped.  So it could be
migrated.  But race is rare and although it happens, it's no big deal.

Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-31 17:30:44 -07:00
Minchan Kim
39deaf8585 mm: compaction: make isolate_lru_page() filter-aware
In async mode, compaction doesn't migrate dirty or writeback pages.  So,
it's meaningless to pick the page and re-add it to lru list.

Of course, when we isolate the page in compaction, the page might be dirty
or writeback but when we try to migrate the page, the page would be not
dirty, writeback.  So it could be migrated.  But it's very unlikely as
isolate and migration cycle is much faster than writeout.

So, this patch helps cpu overhead and prevent unnecessary LRU churning.

Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-31 17:30:44 -07:00
Minchan Kim
4356f21d09 mm: change isolate mode from #define to bitwise type
Change ISOLATE_XXX macro with bitwise isolate_mode_t type.  Normally,
macro isn't recommended as it's type-unsafe and making debugging harder as
symbol cannot be passed throught to the debugger.

Quote from Johannes
" Hmm, it would probably be cleaner to fully convert the isolation mode
into independent flags.  INACTIVE, ACTIVE, BOTH is currently a
tri-state among flags, which is a bit ugly."

This patch moves isolate mode from swap.h to mmzone.h by memcontrol.h

Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-31 17:30:44 -07:00
Christopher Yeoh
fcf634098c Cross Memory Attach
The basic idea behind cross memory attach is to allow MPI programs doing
intra-node communication to do a single copy of the message rather than a
double copy of the message via shared memory.

The following patch attempts to achieve this by allowing a destination
process, given an address and size from a source process, to copy memory
directly from the source process into its own address space via a system
call.  There is also a symmetrical ability to copy from the current
process's address space into a destination process's address space.

- Use of /proc/pid/mem has been considered, but there are issues with
  using it:
  - Does not allow for specifying iovecs for both src and dest, assuming
    preadv or pwritev was implemented either the area read from or
  written to would need to be contiguous.
  - Currently mem_read allows only processes who are currently
  ptrace'ing the target and are still able to ptrace the target to read
  from the target. This check could possibly be moved to the open call,
  but its not clear exactly what race this restriction is stopping
  (reason  appears to have been lost)
  - Having to send the fd of /proc/self/mem via SCM_RIGHTS on unix
  domain socket is a bit ugly from a userspace point of view,
  especially when you may have hundreds if not (eventually) thousands
  of processes  that all need to do this with each other
  - Doesn't allow for some future use of the interface we would like to
  consider adding in the future (see below)
  - Interestingly reading from /proc/pid/mem currently actually
  involves two copies! (But this could be fixed pretty easily)

As mentioned previously use of vmsplice instead was considered, but has
problems.  Since you need the reader and writer working co-operatively if
the pipe is not drained then you block.  Which requires some wrapping to
do non blocking on the send side or polling on the receive.  In all to all
communication it requires ordering otherwise you can deadlock.  And in the
example of many MPI tasks writing to one MPI task vmsplice serialises the
copying.

There are some cases of MPI collectives where even a single copy interface
does not get us the performance gain we could.  For example in an
MPI_Reduce rather than copy the data from the source we would like to
instead use it directly in a mathops (say the reduce is doing a sum) as
this would save us doing a copy.  We don't need to keep a copy of the data
from the source.  I haven't implemented this, but I think this interface
could in the future do all this through the use of the flags - eg could
specify the math operation and type and the kernel rather than just
copying the data would apply the specified operation between the source
and destination and store it in the destination.

Although we don't have a "second user" of the interface (though I've had
some nibbles from people who may be interested in using it for intra
process messaging which is not MPI).  This interface is something which
hardware vendors are already doing for their custom drivers to implement
fast local communication.  And so in addition to this being useful for
OpenMPI it would mean the driver maintainers don't have to fix things up
when the mm changes.

There was some discussion about how much faster a true zero copy would
go. Here's a link back to the email with some testing I did on that:

http://marc.info/?l=linux-mm&m=130105930902915&w=2

There is a basic man page for the proposed interface here:

http://ozlabs.org/~cyeoh/cma/process_vm_readv.txt

This has been implemented for x86 and powerpc, other architecture should
mainly (I think) just need to add syscall numbers for the process_vm_readv
and process_vm_writev. There are 32 bit compatibility versions for
64-bit kernels.

For arch maintainers there are some simple tests to be able to quickly
verify that the syscalls are working correctly here:

http://ozlabs.org/~cyeoh/cma/cma-test-20110718.tgz

Signed-off-by: Chris Yeoh <yeohc@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: <linux-man@vger.kernel.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-31 17:30:44 -07:00
Andrew Morton
6eea69dd8b include/linux/dmar.h: forward-declare struct acpi_dmar_header
x86_64 allnoconfig:

In file included from arch/x86/kernel/pci-dma.c:3:
include/linux/dmar.h:248: warning: 'struct acpi_dmar_header' declared inside parameter list
include/linux/dmar.h:248: warning: its scope is only this definition or declaration, which is probably not what you want

Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-31 17:30:44 -07:00
Paul Gortmaker
ec53cf23c0 irq: don't put module.h into irq.h for tracking irqgen modules.
Recent commit "irq: Track the  owner of irq descriptor" in
commit ID b6873807a7 placed module.h into linux/irq.h
but we are trying to limit module.h inclusion to just C files
that really need it, due to its size and number of children
includes.  This targets just reversing that include.

Add in the basic "struct module" since that is all we really need
to ensure things compile.  In theory, b687380 should have added the
module.h include to the irqdesc.h header as well, but the implicit
module.h everywhere presence masked this from showing up.  So give
it the "struct module" as well.

As for the C files, irqdesc.c is only using THIS_MODULE, so it
does not need module.h - give it export.h instead.  The C file
irq/manage.c is now (as of b687380) using try_module_get and
module_put and so it needs module.h (which it already has).

Also convert the irq_alloc_descs variants to macros, since all
they really do is is call the __irq_alloc_descs primitive.
This avoids including export.h and no debug info is lost.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:32:35 -04:00
Paul Gortmaker
de47725421 include: replace linux/module.h with "struct module" wherever possible
The <linux/module.h> pretty much brings in the kitchen sink along
with it, so it should be avoided wherever reasonably possible in
terms of being included from other commonly used <linux/something.h>
files, as it results in a measureable increase on compile times.

The worst culprit was probably device.h since it is used everywhere.
This file also had an implicit dependency/usage of mutex.h which was
masked by module.h, and is also fixed here at the same time.

There are over a dozen other headers that simply declare the
struct instead of pulling in the whole file, so follow their lead
and simply make it a few more.

Most of the implicit dependencies on module.h being present by
these headers pulling it in have been now weeded out, so we can
finally make this change with hopefully minimal breakage.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:32:32 -04:00
Paul Gortmaker
eb5589a8f0 include: convert various register fcns to macros to avoid include chaining
The original implementations reference THIS_MODULE in an inline.
We could include <linux/export.h>, but it is better to avoid chaining.

Fortunately someone else already thought of this, and made a similar
inline into a #define in <linux/device.h> for device_schedule_callback(),
[see commit 523ded71de] so follow that precedent here.

Also bubble up any __must_check that were used on the prev. wrapper inline
functions up one to the real __register functions, to preserve any prev.
sanity checks that were used in those instances.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:32:32 -04:00
Paul Gortmaker
7c926402a7 crypto.h: remove unused crypto_tfm_alg_modname() inline
The <linux/crypto.h> (which is in turn in common headers
like tcp.h) wants to use module_name() in an inline fcn.
But having all of <linux/module.h> along for the ride is
overkill and slows down compiles by a measureable amount,
since it in turn includes lots of headers.

Since the inline is never used anywhere in the kernel[1],
we can just remove it, and then also remove the module.h
include as well.

In all the many crypto modules, there were some relying on
crypto.h including module.h -- for them we now explicitly
call out module.h for inclusion.

[1] git grep shows some staging drivers also define the same
static inline, but they also never ever use it.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:32:31 -04:00
Paul Gortmaker
e2ffa376f6 uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
Once we clean up the implicit presence of module.h (and all its
sub-includes), we'll see an implicit dependency on page.h for
the PAGE_SIZE define.  So fix it in advance.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:32:31 -04:00
Paul Gortmaker
246359d379 pm_runtime.h: explicitly requires notifier.h
This file was getting notifier.h via device.h --> module.h but
the module.h inclusion is going away, so add notifier.h directly.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:32:30 -04:00
Paul Gortmaker
a8efa9d6bf linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
The implicit presence of module.h and all its sub-includes was
masking these implicit header usages:

include/linux/dmaengine.h:684: warning: 'struct page' declared inside parameter list
include/linux/dmaengine.h:684: warning: its scope is only this definition or declaration, which is probably not what you want
include/linux/dmaengine.h:687: warning: 'struct page' declared inside parameter list
include/linux/dmaengine.h:736:2: error: implicit declaration of function 'bitmap_zero'

With input from Stephen Rothwell <sfr@canb.auug.org.au>

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:32:30 -04:00
Paul Gortmaker
1986c93f09 miscdevice.h: fix up implicit use of lists and types
By removing the implicit presence of module.h from this file, we
will see things like:

In file included from fs/dlm/user.c:9:
include/linux/miscdevice.h:50: error: field ‘list’ has incomplete type
include/linux/miscdevice.h:54: error: expected specifier-qualifier-list before ‘mode_t’

Call out lists.h and types.h for inclusion to fix each of the
above respectively.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:32:29 -04:00
Paul Gortmaker
bb2eac66ee stop_machine.h: fix implicit use of smp.h for smp_processor_id
This will show up on MIPS when we fix all the implicit header presences
that are because of module.h being everywhere.

In file included from kernel/trace/ftrace.c:16:
include/linux/stop_machine.h: In function 'stop_one_cpu':
include/linux/stop_machine.h:50: error: implicit declaration of function 'smp_processor_id'
include/linux/stop_machine.h: In function 'stop_cpus':
include/linux/stop_machine.h:80: error: implicit declaration of function 'raw_smp_processor_id'

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:32:28 -04:00
Paul Gortmaker
d0a9940289 of: fix implicit use of errno.h in include/linux/of.h
It shows up as a build failure on MIPS, as it is used in
three of_property function stubs.

include/linux/of.h:275: error: 'ENOSYS' undeclared (first use in this function)
include/linux/of.h:282: error: 'ENOSYS' undeclared (first use in this function)
include/linux/of.h:295: error: 'ENOSYS' undeclared (first use in this function)

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:32:28 -04:00
Paul Gortmaker
7755c47123 of_platform.h: delete needless include <linux/module.h>
There is nothing modular in this file, and no reason to drag
in all the 357 headers that module.h brings with it, since
it just slows down compiles.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:32:27 -04:00
Paul Gortmaker
ddac6021fc miscdevice.h: delete unnecessary inclusion of module.h
This file has a define MODULE_ALIAS_MISCDEV which in turn will
use the MODULE_ALIAS define, but only if the former is explicitly
used by modular device driver code (and such code should be
already including module.h).

Delete the include, since module.h is such a giant thing that we
don't want it implicitly sneaking into compiles where it isn't
specifically required.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:32:26 -04:00
Paul Gortmaker
8a24454869 device_cgroup.h: delete needless include <linux/module.h>
There is nothing modular in this file, and no reason to drag
in all the extra headers that module.h brings with it, since
it just slows down compiles.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:32:26 -04:00
Paul Gortmaker
4eae0cc4f4 sysdev.h: dont include <linux/module.h> for no reason
The <linux/module.h> pretty much brings in the kitchen sink along
with it, so it should be avoided wherever reasonably possible in
terms of being included from other commonly used <linux/something.h>
files, as it results in a measureable increase on compile times.

There doesn't appear to be any module specifics in this file.
The obvious people who were relying on the presence of
the vast amount of stuff module.h sucked in have been fixed.

If other files are implicitly relying on it, then lets see who
they are and fix them too.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:32:24 -04:00
Paul Gortmaker
51d7815e4b vermagic: delete unused include of <linux/module.h>
This file consists of nothing other than things like:

  #ifdef CONFIG_FOO
  #define ....

There is no reason for it to require module.h

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:32:23 -04:00
Paul Gortmaker
ced55d4ef7 regulator: Fix implicit use of notifier.h by driver.h
This was implicitly appearing by way of module.h -- but when
we fix that, we'll get this:

In file included from drivers/regulator/dummy.c:21:
include/linux/regulator/driver.h:197: error: field 'notifier' has incomplete type
make[3]: *** [drivers/regulator/dummy.o] Error 1

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:32:16 -04:00
Linus Torvalds
32087d4eec Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6: (54 commits)
  [S390] Remove error checking from copy_oldmem_page()
  [S390] qdio: prevent dsci access without adapter interrupts
  [S390] irqstats: split IPI interrupt accounting
  [S390] add missing __tlb_flush_global() for !CONFIG_SMP
  [S390] sparse: fix sparse symbol shadow warning
  [S390] sparse: fix sparse NULL pointer warnings
  [S390] sparse: fix sparse warnings with __user pointers
  [S390] sparse: fix sparse warnings in math-emu
  [S390] sparse: fix sparse warnings about missing prototypes
  [S390] sparse: fix sparse ANSI-C warnings
  [S390] sparse: fix sparse static warnings
  [S390] sparse: fix access past end of array warnings
  [S390] dasd: prevent path verification before resume
  [S390] qdio: remove multicast polling
  [S390] qdio: reset outbound SBAL error states
  [S390] qdio: EQBS retry after CCQ 96
  [S390] qdio: add timestamp for last queue scan time
  [S390] Introduce get_clock_fast()
  [S390] kvm: Handle diagnose 0x10 (release pages)
  [S390] take mmap_sem when walking guest page table
  ...
2011-10-31 16:14:20 -07:00
Linus Torvalds
1eb6337835 Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (348 commits)
  [media] pctv452e: Remove bogus code
  [media] adv7175: Make use of media bus pixel codes
  [media] media: vb2: fix incorrect return value
  [media] em28xx: implement VIDIOC_ENUM_FRAMESIZES
  [media] cx23885: Stop the risc video fifo before reconfiguring it
  [media] cx23885: Avoid incorrect error handling and reporting
  [media] cx23885: Avoid stopping the risc engine during buffer timeout
  [media] cx23885: Removed a spurious function cx23885_set_scale()
  [media] cx23885: v4l2 api compliance, set the audioset field correctly
  [media] cx23885: hook the audio selection functions into the main driver
  [media] cx23885: add generic functions for dealing with audio input selection
  [media] cx23885: fixes related to maximum number of inputs and range checking
  [media] cx23885: Initial support for the MPX-885 mini-card
  [media] cx25840: Ensure AUDIO6 and AUDIO7 trigger line-in baseband use
  [media] cx23885: Enable audio line in support from the back panel
  [media] cx23885: Allow the audio mux config to be specified on a per input basis
  [media] cx25840: Enable support for non-tuner LR1/LR2 audio inputs
  [media] cx23885: Name an internal i2c part and declare a bitfield by name
  [media] cx23885: Ensure VBI buffers timeout quickly - bugfix for vbi hangs during streaming
  [media] cx23885: remove channel dump diagnostics when a vbi buffer times out
  ...

Fix up trivial conflicts in drivers/misc/altera-stapl/altera.c (header
file rename vs add)
2011-10-31 15:42:54 -07:00
Linus Torvalds
1a4ceab195 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits)
  vlan: allow nested vlan_do_receive()
  ipv6: fix route lookup in addrconf_prefix_rcv()
  bonding: eliminate bond_close race conditions
  qlcnic: fix beacon and LED test.
  qlcnic: Updated License file
  qlcnic: updated reset sequence
  qlcnic: reset loopback mode if promiscous mode setting fails.
  qlcnic: skip IDC ack check in fw reset path.
  i825xx: Fix incorrect dependency for BVME6000_NET
  ipv6: fix route error binding peer in func icmp6_dst_alloc
  ipv6: fix error propagation in ip6_ufo_append_data()
  stmmac: update normal descriptor structure (v2)
  stmmac: fix NULL pointer dereference in capabilities fixup (v2)
  stmmac: fix a bug while checking the HW cap reg (v2)
  be2net: Changing MAC Address of a VF was broken.
  be2net: Refactored be_cmds.c file.
  bnx2x: update driver version to 1.70.30-0
  bnx2x: use FW 7.0.29.0
  bnx2x: Enable changing speed when port type is PORT_DA
  bnx2x: Fix 54618se LED behavior
  ...
2011-10-31 15:22:44 -07:00
Jonathan E Brassow
5a25f0eb70 dm log userspace: add log device dependency
Allow userspace dm log implementations to register their log device so it
is no longer missing from the list of device dependencies.

When device mapper targets use a device they normally call dm_get_device
which includes it in the device list returned to userspace applications
such as LVM through the DM_TABLE_DEPS ioctl.  Userspace log devices
don't use dm_get_device as userspace opens them so they are missing from
the list of dependencies.

This patch extends the DM_ULOG_CTR operation to allow userspace to
respond with the name of the log device (if appropriate) to be
registered via 'dm_get_device'.  DM_ULOG_REQUEST_VERSION is incremented.

This is backwards compatible.  If the kernel and userspace log server
have both been updated, the new information will be passed down to the
kernel and the device will be registered.  If the kernel is new, but
the log server is old, the log server will not pass down any device
information and the kernel will simply bypass the device registration
as before.  If the kernel is old but the log server is new, the log
server will see the old version number and not pass the device info.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2011-10-31 20:21:24 +00:00
Alasdair G Kergon
36a0456fbf dm table: add immutable feature
Introduce DM_TARGET_IMMUTABLE to indicate that the target type cannot be mixed
with any other target type, and once loaded into a device, it cannot be
replaced with a table containing a different type.

The thin provisioning pool device will use this.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2011-10-31 20:19:04 +00:00
Alasdair G Kergon
cc6cbe141a dm table: add always writeable feature
Add a target feature flag DM_TARGET_ALWAYS_WRITEABLE to indicate that a target
does not support read-only mode.

The initial implementation of the thin provisioning target uses this.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2011-10-31 20:19:02 +00:00
Alasdair G Kergon
3791e2fc0e dm table: add singleton feature
Introduce the concept of a singleton table which contains exactly one target.

If a target type sets the DM_TARGET_SINGLETON feature bit device-mapper
will ensure that any table that includes that target contains no others.

The thin provisioning pool target uses this.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2011-10-31 20:19:00 +00:00
Mikulas Patocka
7f06965390 dm kcopyd: add dm_kcopyd_zero to zero an area
This patch introduces dm_kcopyd_zero() to make it easy to use
kcopyd to write zeros into the requested areas instead
instead of copying.  It is implemented by passing a NULL
copying source to dm_kcopyd_copy().

The forthcoming thin provisioning target uses this.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2011-10-31 20:18:58 +00:00
Namhyung Kim
71a16736a1 dm: use local printk ratelimit
printk_ratelimit() shares global ratelimiting state with all
other subsystems, so its usage is discouraged. Instead,
define and use dm's local state.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2011-10-31 20:18:54 +00:00
Mauro Carvalho Chehab
ddeb3547d4 edac: Move edac main structs to include/linux/edac.h
As we'll need to use those structs for trace functions, they should
be on a more public place. So, move struct mem_ctl_info & friends
to edac.h.

No functional changes on this patch.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
2011-10-31 15:10:04 -02:00
Peng Tao
92407e75ce nfs4: serialize layoutcommit
Current pnfs_layoutcommit_inode can not handle parallel layoutcommit.
And as Trond suggested , there is no need for client to optimize for
parallel layoutcommit. So add NFS_INO_LAYOUTCOMMITTING flag to
mark inflight layoutcommit and serialize lalyoutcommit with it.
Also mark_inode_dirty_sync if pnfs_layoutcommit_inode fails to issue
layoutcommit.

Reported-by: Vitaliy Gusev <gusev.vitaliy@nexenta.com>
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-31 11:51:28 -04:00
Paul Gortmaker
639938eb60 module.h: relocate MODULE_PARM_DESC into moduleparam.h
There are files which use module_param and MODULE_PARM_DESC
back to back.  They only include moduleparam.h which makes sense,
but the implicit presence of module.h everywhere hid the fact
that MODULE_PARM_DESC wasn't in moduleparam.h at all.  Relocate
the macro to moduleparam.h so that the moduleparam infrastructure
can be used independently of module.h

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 09:20:11 -04:00
Paul Gortmaker
f50169324d module.h: split out the EXPORT_SYMBOL into export.h
A lot of files pull in module.h when all they are really
looking for is the basic EXPORT_SYMBOL functionality. The
recent data from Ingo[1] shows that this is one of several
instances that has a significant impact on compile times,
and it should be targeted for factoring out (as done here).

Note that several commonly used header files in include/*
directly include <linux/module.h> themselves (some 34 of them!)
The most commonly used ones of these will have to be made
independent of module.h before the full benefit of this change
can be realized.

We also transition THIS_MODULE from module.h to export.h,
since there are lots of files with subsystem structs that
in turn will have a struct module *owner and only be doing:

	.owner = THIS_MODULE;

and absolutely nothing else modular. So, we also want to have
the THIS_MODULE definition present in the lightweight header.

[1] https://lkml.org/lkml/2011/5/23/76

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 09:20:11 -04:00
Arnd Bergmann
08cab72f91 Merge branch 'dt/gic' into next/dt
Conflicts:
	arch/arm/include/asm/localtimer.h
	arch/arm/mach-msm/board-msm8x60.c
	arch/arm/mach-omap2/board-generic.c
2011-10-31 14:08:10 +01:00
Arnd Bergmann
86c1e5a74a Merge branch 'omap/dt' into next/dt 2011-10-31 14:07:51 +01:00
Rob Herring
6d274309d0 irq: support domains with non-zero hwirq base
Interrupt controllers can have non-zero starting value for h/w irq numbers.
Adding support in irq_domain allows the domain hwirq numbering to match
the interrupt controllers' numbering.

As this makes looping over irqs for a domain more complicated, add loop
iterators to iterate over all hwirqs and irqs for a domain.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Reviewed-by: Jamie Iles <jamie@jamieiles.com>
Tested-by: Thomas Abraham <thomas.abraham@linaro.org>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
2011-10-31 14:03:23 +01:00
Rob Herring
c71a54b082 of/irq: introduce of_irq_init
of_irq_init will scan the devicetree for matching interrupt controller
nodes. Then it calls an initialization function for each found controller
in the proper order with parent nodes initialized before child nodes.

Based on initial pseudo code from Grant Likely.

Changes in v4:
- Drop unnecessary empty list check
- Be more verbose on errors
- Simplify "if (!desc) WARN_ON(1)" to "if (WARN_ON(!desc))"

Changes in v3:
- add missing kfree's found by Jamie
- Implement Grant's comments to simplify the init loop
- fix function comments

Changes in v2:
- Complete re-write of list searching code from Grant Likely

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Reviewed-by: Jamie Iles <jamie@jamieiles.com>
Tested-by: Thomas Abraham <thomas.abraham@linaro.org>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
2011-10-31 14:03:22 +01:00
Robin H. Johnson
99a700bcc7 [SCSI] mv_sas: OCZ RevoDrive3 & zDrive R4 support
In the OCZ RevoDrive3/zDrive R4 series, the "OCZ SuperScale Storage
Controller" with "Virtualized Controller Architecture 2.0" really seems
to be a Marvell 88SE9485 part, with OCZ firmware/BIOS.

Developed and tested on OCZ RevoDrive3 120GB [PCI 1b85:1021]

Should work on:
- OCZ RevoDrive3 (2x SandForce 2281)
- OCZ RevoDrive3 X2 (4x SandForce 2281)
- OCZ zDrive R4 CM84 (4x SandForce 2281)
- OCZ zDrive R4 CM88 (8x SandForce 2281)
- OCZ zDrive R4 RM84 (4x SandForce 2582)
- OCZ zDrive R4 RM88 (8x SandForce 2582)

All of this because a friend recently bought a OCZ RevoDrive3 and was
bitten by the lack of Linux support.

Notes from testing:
-------------------
- SMART works.
- VPD Device Identification is "OCZ-REVODRIVE3"
- Thin provisioning/TRIM seems to be implemented as WRITE SAME UNMAP,
  with deterministic (non-zero) read after TRIM, but I'm not sure if it
  works 100% in my testing.
- Some of the tuning in the firmware seems to ensure much better
  performance when in a RAID0 setup than using the two devices
  seperately.

I have not tested booting from the SSD, because all of this was
developed and tested remotely from the actual hardware.

Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
Thanks-To: Gordon Pritchard <gordp@sfu.ca>
Acked-by: Xiangliang Yu <yuxiangl@marvell.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-31 13:29:01 +04:00
Linus Torvalds
839d881074 Merge branch 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
* 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  i2c: Functions for byte-swapped smbus_write/read_word_data
  i2c-algo-pca: Return standard fault codes
  i2c-algo-bit: Return standard fault codes
  i2c-algo-bit: Be verbose on bus testing failure
  i2c-algo-bit: Let user test buses without failing
  i2c/scx200_acb: Fix section mismatch warning in scx200_pci_drv
  i2c: I2C_ELEKTOR should depend on HAS_IOPORT
2011-10-30 15:54:59 -07:00
Linus Torvalds
0cfdc72439 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (33 commits)
  iommu/core: Remove global iommu_ops and register_iommu
  iommu/msm: Use bus_set_iommu instead of register_iommu
  iommu/omap: Use bus_set_iommu instead of register_iommu
  iommu/vt-d: Use bus_set_iommu instead of register_iommu
  iommu/amd: Use bus_set_iommu instead of register_iommu
  iommu/core: Use bus->iommu_ops in the iommu-api
  iommu/core: Convert iommu_found to iommu_present
  iommu/core: Add bus_type parameter to iommu_domain_alloc
  Driver core: Add iommu_ops to bus_type
  iommu/core: Define iommu_ops and register_iommu only with CONFIG_IOMMU_API
  iommu/amd: Fix wrong shift direction
  iommu/omap: always provide iommu debug code
  iommu/core: let drivers know if an iommu fault handler isn't installed
  iommu/core: export iommu_set_fault_handler()
  iommu/omap: Fix build error with !IOMMU_SUPPORT
  iommu/omap: Migrate to the generic fault report mechanism
  iommu/core: Add fault reporting mechanism
  iommu/core: Use PAGE_SIZE instead of hard-coded value
  iommu/core: use the existing IS_ALIGNED macro
  iommu/msm: ->unmap() should return order of unmapped page
  ...

Fixup trivial conflicts in drivers/iommu/Makefile: "move omap iommu to
dedicated iommu folder" vs "Rename the DMAR and INTR_REMAP config
options" just happened to touch lines next to each other.
2011-10-30 15:46:19 -07:00
Linus Torvalds
1bc87b0055 Merge branch 'kvm-updates/3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'kvm-updates/3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (75 commits)
  KVM: SVM: Keep intercepting task switching with NPT enabled
  KVM: s390: implement sigp external call
  KVM: s390: fix register setting
  KVM: s390: fix return value of kvm_arch_init_vm
  KVM: s390: check cpu_id prior to using it
  KVM: emulate lapic tsc deadline timer for guest
  x86: TSC deadline definitions
  KVM: Fix simultaneous NMIs
  KVM: x86 emulator: convert push %sreg/pop %sreg to direct decode
  KVM: x86 emulator: switch lds/les/lss/lfs/lgs to direct decode
  KVM: x86 emulator: streamline decode of segment registers
  KVM: x86 emulator: simplify OpMem64 decode
  KVM: x86 emulator: switch src decode to decode_operand()
  KVM: x86 emulator: qualify OpReg inhibit_byte_regs hack
  KVM: x86 emulator: switch OpImmUByte decode to decode_imm()
  KVM: x86 emulator: free up some flag bits near src, dst
  KVM: x86 emulator: switch src2 to generic decode_operand()
  KVM: x86 emulator: expand decode flags to 64 bits
  KVM: x86 emulator: split dst decode to a generic decode_operand()
  KVM: x86 emulator: move memop, memopp into emulation context
  ...
2011-10-30 15:36:45 -07:00
Linus Torvalds
acff987d94 Merge branch 'fbdev-next' of git://github.com/schandinat/linux-2.6
* 'fbdev-next' of git://github.com/schandinat/linux-2.6: (270 commits)
  video: platinumfb: Add __devexit_p at necessary place
  drivers/video: fsl-diu-fb: merge diu_pool into fsl_diu_data
  drivers/video: fsl-diu-fb: merge diu_hw into fsl_diu_data
  drivers/video: fsl-diu-fb: only DIU modes 0 and 1 are supported
  drivers/video: fsl-diu-fb: remove unused panel operating mode support
  drivers/video: fsl-diu-fb: use an enum for the AOI index
  drivers/video: fsl-diu-fb: add several new video modes
  drivers/video: fsl-diu-fb: remove broken screen blanking support
  drivers/video: fsl-diu-fb: move some definitions out of the header file
  drivers/video: fsl-diu-fb: fix some ioctls
  video: da8xx-fb: Increased resolution configuration of revised LCDC IP
  OMAPDSS: picodlp: add missing #include <linux/module.h>
  fb: fix au1100fb bitrot.
  mx3fb: fix NULL pointer dereference in screen blanking.
  video: irq: Remove IRQF_DISABLED
  smscufx: change edid data to u8 instead of char
  OMAPDSS: DISPC: zorder support for DSS overlays
  OMAPDSS: DISPC: VIDEO3 pipeline support
  OMAPDSS/OMAP_VOUT: Fix incorrect OMAP3-alpha compatibility setting
  video/omap: fix build dependencies
  ...

Fix up conflicts in:
 - drivers/staging/xgifb/XGI_main_26.c
	Changes to XGIfb_pan_var()
 - drivers/video/omap/{lcd_apollon.c,lcd_ldp.c,lcd_overo.c}
	Removed (or in the case of apollon.c, merged into the generic
	DSS panel in drivers/video/omap2/displays/panel-generic-dpi.c)
2011-10-30 15:30:01 -07:00
Curt Wohlgemuth
0e175a1835 writeback: Add a 'reason' to wb_writeback_work
This creates a new 'reason' field in a wb_writeback_work
structure, which unambiguously identifies who initiates
writeback activity.  A 'wb_reason' enumeration has been
added to writeback.h, to enumerate the possible reasons.

The 'writeback_work_class' and tracepoint event class and
'writeback_queue_io' tracepoints are updated to include the
symbolic 'reason' in all trace events.

And the 'writeback_inodes_sbXXX' family of routines has had
a wb_stats parameter added to them, so callers can specify
why writeback is being started.

Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Curt Wohlgemuth <curtw@google.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-10-31 00:33:36 +08:00
Martin Schwidefsky
20b40a794b [S390] signal race with restarting system calls
For a ERESTARTNOHAND/ERESTARTSYS/ERESTARTNOINTR restarting system call
do_signal will prepare the restart of the system call with a rewind of
the PSW before calling get_signal_to_deliver (where the debugger might
take control). For A ERESTART_RESTARTBLOCK restarting system call
do_signal will set -EINTR as return code.
There are two issues with this approach:
1) strace never sees ERESTARTNOHAND, ERESTARTSYS, ERESTARTNOINTR or
   ERESTART_RESTARTBLOCK as the rewinding already took place or the
   return code has been changed to -EINTR
2) if get_signal_to_deliver does not return with a signal to deliver
   the restart via the repeat of the svc instruction is left in place.
   This opens a race if another signal is made pending before the
   system call instruction can be reexecuted. The original system call
   will be restarted even if the second signal would have ended the
   system call with -EINTR.

These two issues can be solved by dropping the early rewind of the
system call before get_signal_to_deliver has been called and by using
the TIF_RESTART_SVC magic to do the restart if no signal has to be
delivered. The only situation where the system call restart via the
repeat of the svc instruction is appropriate is when a SA_RESTART
signal is delivered to user space.

Unfortunately this breaks inferior calls by the debugger again. The
system call number and the length of the system call instruction is
lost over the inferior call and user space will see ERESTARTNOHAND/
ERESTARTSYS/ERESTARTNOINTR/ERESTART_RESTARTBLOCK. To correct this a
new ptrace interface is added to save/restore the system call number
and system call instruction length.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:43 +01:00
Michael Holzheu
558df7209e [S390] kdump: Add infrastructure for unmapping crashkernel memory
This patch introduces a mechanism that allows architecture backends to
remove page tables for the crashkernel memory. This can protect the loaded
kdump kernel from being overwritten by broken kernel code.  Two new
functions crash_map_reserved_pages() and crash_unmap_reserved_pages() are
added that can be implemented by architecture code.  The
crash_map_reserved_pages() function is called before and
crash_unmap_reserved_pages() after the crashkernel segments are loaded.  The
functions are also called in crash_shrink_memory() to create/remove page
tables when the crashkernel memory size is reduced.

To support architectures that have large pages this patch also introduces
a new define KEXEC_CRASH_MEM_ALIGN. The crashkernel start and size must
always be aligned with KEXEC_CRASH_MEM_ALIGN.

Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:42 +01:00