Commit Graph

28780 Commits

Author SHA1 Message Date
Yinghai Lu
c535b6a1a6 x86: let 32bit use apic_ops too
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:45:12 +02:00
Suresh Siddha
ad66dd340f x2apic: xen64 paravirt basic apic ops
Define the Xen specific basic apic ops, in additon to paravirt apic ops,
with some misc warning fixes.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: akpm@linux-foundation.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:45:10 +02:00
Suresh Siddha
2d9579a124 x64, x2apic/intr-remap: support for x2apic physical mode support
x2apic Physical mode  support. By default we will use x2apic cluster mode.
x2apic physical mode can be selected using "x2apic_phys" boot parameter.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:45:07 +02:00
Suresh Siddha
6e1cb38a2a x64, x2apic/intr-remap: add x2apic support, including enabling interrupt-remapping
x2apic support.  Interrupt-remapping must be enabled before enabling x2apic,
this is needed to ensure that IO interrupts continue to work properly after the
cpu mode is changed to x2apic(which uses 32bit extended physical/cluster
apic id).

On systems where apicid's are > 255, BIOS can handover the control to OS in
x2apic mode. Or if the OS handover was in legacy xapic mode, check
if it is capable of x2apic mode. And if we succeed in enabling
Interrupt-remapping, then we can enable x2apic mode in the CPU.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:45:06 +02:00
Suresh Siddha
75c46fa61b x64, x2apic/intr-remap: MSI and MSI-X support for interrupt remapping infrastructure
MSI and MSI-X support for interrupt remapping infrastructure.

MSI address register will be programmed with interrupt-remapping table
entry(IRTE) index and the IRTE will contain information about the vector,
cpu destination, etc.

For MSI-X, all the IRTE's will be consecutively allocated in the table,
and the address registers will contain the starting index to the block
and the data register will contain the subindex with in that block.

This also introduces a new irq_chip for cleaner irq migration (in the process
context as opposed to the current irq migration in the context of an interrupt.
interrupt-remapping infrastructure will help us achieve this).

As MSI is edge triggered, irq migration is a simple atomic update(of vector
and cpu destination) of IRTE and flushing the hardware cache.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:45:05 +02:00
Suresh Siddha
89027d35aa x64, x2apic/intr-remap: IO-APIC support for interrupt-remapping
IO-APIC support in the presence of interrupt-remapping infrastructure.

IO-APIC RTE will be programmed with interrupt-remapping table entry(IRTE)
index and the IRTE will contain information about the vector, cpu destination,
trigger mode etc, which traditionally was present in the IO-APIC RTE.

Introduce a new irq_chip for cleaner irq migration (in the process
context as opposed to the current irq migration in the context of an interrupt.
interrupt-remapping infrastructure will help us achieve this cleanly).

For edge triggered, irq migration is a simple atomic update(of vector
and cpu destination) of IRTE and flush the hardware cache.

For level triggered, we need to modify the io-apic RTE aswell with the update
vector information, along with modifying IRTE with vector and cpu destination.
So irq migration for level triggered is little  bit more complex compared to
edge triggered migration. But the good news is, we use the same algorithm
for level triggered migration as we have today, only difference being,
we now initiate the irq migration from process context instead of the
interrupt context.

In future, when we do a directed EOI (combined with cpu EOI broadcast
suppression) to the IO-APIC, level triggered irq migration will also be
as simple as edge triggered migration and we can do the irq migration
with a simple atomic update to IO-APIC RTE.

TBD: some tests/changes needed in the presence of fixup_irqs() for
level triggered irq migration.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:45:05 +02:00
Suresh Siddha
12a67cf685 x64, x2apic/intr-remap: x2apic cluster mode support
x2apic cluster mode support.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:45:03 +02:00
Suresh Siddha
cff73a6ffa x64, x2apic/intr-remap: introcude self IPI to genapic routines
Introduce self IPI op for genapic.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:45:02 +02:00
Suresh Siddha
13c88fb58d x64, x2apic/intr-remap: x2apic ops for x2apic mode support
x2apic ops for x2apic mode support. This uses MSR interface and differs
slightly from the xapic register layout.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:45:01 +02:00
Suresh Siddha
32e1d0a065 x64, x2apic/intr-remap: cpuid bits for x2apic feature
cpuid feature for x2apic.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:45:00 +02:00
Suresh Siddha
1b374e4d6f x64, x2apic/intr-remap: basic apic ops support
Introduce basic apic operations which handle the apic programming. This
will be used later to introduce another specific operations for x2apic.

For the perfomance critial accesses like IPI's, EOI etc, we use the
native operations as they are already referenced by different
indirections like genapic, irq_chip etc.

64bit Paravirt ops can also define their apic operations accordingly.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:44:59 +02:00
Suresh Siddha
0c81c746f9 x64, x2apic/intr-remap: introduce read_apic_id() to genapic routines
Move the read_apic_id()  to genapic routines.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:44:57 +02:00
Suresh Siddha
4dc2f96cac x64, x2apic/intr-remap: ioapic routines which deal with initial io-apic RTE setup
Generic ioapic specific routines which be used later during enabling
interrupt-remapping.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:44:56 +02:00
Suresh Siddha
d94d93ca5c x64, x2apic/intr-remap: 8259 specific mask/unmask routines
8259 specific mask/unmask routines which be used later while enabling
interrupt-remapping.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:44:55 +02:00
Suresh Siddha
72b1e22dfc x64, x2apic/intr-remap: generic irq migration support from process context
Generic infrastructure for migrating the irq from the process context in the
presence of CONFIG_GENERIC_PENDING_IRQ.

This will be used later for migrating irq in the presence of
interrupt-remapping.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:44:55 +02:00
Suresh Siddha
b6fcb33ad6 x64, x2apic/intr-remap: routines managing Interrupt remapping table entries.
Routines handling the management of interrupt remapping table entries.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:44:54 +02:00
Suresh Siddha
2ae2101069 x64, x2apic/intr-remap: Interrupt remapping infrastructure
Interrupt remapping (part of Intel Virtualization Tech for directed I/O)
infrastructure.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:44:53 +02:00
Suresh Siddha
ad3ad3f6a2 x64, x2apic/intr-remap: parse ioapic scope under vt-d structures
Parse the vt-d device scope structures to find the mapping between IO-APICs
and the interrupt remapping hardware units.

This will be used later for enabling Interrupt-remapping for IOAPIC devices.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:44:50 +02:00
Suresh Siddha
1886e8a90a x64, x2apic/intr-remap: code re-structuring, to be used by both DMA and Interrupt remapping
Allocate the iommu during the parse of DMA remapping hardware
definition structures. And also, introduce routines for device
scope initialization which will be explicitly called during
dma-remapping initialization.

These will be used for enabling interrupt remapping separately from the
existing DMA-remapping enabling sequence.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:44:48 +02:00
Ingo Molnar
ae94b8075a Merge branch 'linus' into x86/core
Conflicts:

	arch/x86/mm/ioremap.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 07:29:02 +02:00
Michael Karcher
5ac37f87ff x86: fix ldt limit for 64 bit
Fix size of LDT entries. On x86-64, ldt_desc is a double-sized descriptor.

Signed-off-by: Michael Karcher <kernel@mkarcher.dialup.fu-berlin.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 07:11:31 +02:00
Aneesh Kumar K.V
06d6cf6959 mm: Add range_cont mode for writeback
Filesystems like ext4 needs to start a new transaction in
the writepages for block allocation. This happens with delayed
allocation and there is limit to how many credits we can request
from the journal layer. So we call write_cache_pages multiple
times with wbc->nr_to_write set to the maximum possible value
limitted by the max journal credits available.

Add a new mode to writeback that enables us to handle this
behaviour. In the new mode we update the wbc->range_start
to point to the new offset to be written. Next call to
call to write_cache_pages will start writeout from specified
range_start offset. In the new mode we also limit writing
to the specified wbc->range_end.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-07-11 19:27:31 -04:00
Mingming Cao
e8ced39d5e percpu_counter: new function percpu_counter_sum_and_set
Delayed allocation need to check free blocks at every write time.
percpu_counter_read_positive() is not quit accurate. delayed
allocation need a more accurate accounting, but using
percpu_counter_sum_positive() is frequently is quite expensive.

This patch added a new function to update center counter when sum
per-cpu counter, to increase the accurate rate for next
percpu_counter_read() and require less calling expensive
percpu_counter_sum().

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-07-11 19:27:31 -04:00
Alex Tomas
29a814d2ee vfs: add hooks for ext4's delayed allocation support
Export mpage_bio_submit() and __mpage_writepage() for the benefit of
ext4's delayed allocation support.   Also change __block_write_full_page
so that if buffers that have the BH_Delay flag set it will call
get_block() to get the physical block allocated, just as in the
!BH_Mapped case.

Signed-off-by: Alex Tomas <alex@clusterfs.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-07-11 19:27:31 -04:00
Jan Kara
87c89c232c jbd2: Remove data=ordered mode support using jbd buffer heads
Signed-off-by: Jan Kara <jack@suse.cz>
2008-07-11 19:27:31 -04:00
Jan Kara
c851ed5401 jbd2: Implement data=ordered mode handling via inodes
This patch adds necessary framework into JBD2 to be able to track inodes
with each transaction and write-out their dirty data during transaction
commit time.

This new ordered mode brings all sorts of advantages such as possibility 
to get rid of journal heads and buffer heads for data buffers in ordered 
mode, better ordering of writes on transaction commit, simplification of
 some JBD code, no more anonymous pages when truncate of data being 
committed happens.  Also with this new ordered mode, delayed allocation 
on ordered mode is much simpler.

Signed-off-by: Jan Kara <jack@suse.cz>
2008-07-11 19:27:31 -04:00
Jan Kara
f4c0a0fdfa vfs: export filemap_fdatawrite_range()
Make filemap_fdatawrite_range() function public, so that it can later
be used in ordered mode rewrite by JBD/JBD2.

Signed-off-by: Jan Kara <jack@suse.cz>
2008-07-11 19:27:31 -04:00
Theodore Ts'o
736603ab29 jbd2: Add commit time into the commit block
Carlo Wood has demonstrated that it's possible to recover deleted
files from the journal.  Something that will make this easier is if we
can put the time of the commit into commit block.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-07-11 19:27:31 -04:00
Ingo Molnar
6c82a000a2 Merge branch 'x86/generalize-visws' into x86/core 2008-07-11 21:22:18 +02:00
Glauber Costa
392a0fc96b x86: merge dwarf2 headers
Merge dwarf2_32.h and dwarf2_64.h into dwarf2.h.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 20:49:39 +02:00
Glauber Costa
d73a731abe x86: use AS_CFI instead of UNWIND_INFO
In dwarf2_32.h, test for CONFIG_AS_CFI instead of
CONFIG_UNWIND_INFO. Turns out that searching for UNWIND_INFO
returns no match in any Kconfig or Makefile, so we're really
just throwing everything away regarding dwarf frames for i386.

The test that generates CONFIG_AS_CFI does not have anything
x86_64-specific, and right now, checking V=1 builds shows me
that the flags is there anyway, although unused.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 20:49:35 +02:00
Glauber Costa
70f1bba4c8 x86: use ignore macro instead of hash comment
In dwarf_64.h header, use the "ignore" macro the way
i386 does.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 20:49:32 +02:00
Ingo Molnar
d9fc3fd3fa x86: fix savesegment() bug causing crashes on 64-bit
i spent a fair amount of time chasing a 64-bit bootup crash that manifested
itself as bootup segfaults:

  S10network[1825]: segfault at 7f3e2b5d16b8 ip 00000031108748c9 sp 00007fffb9c14c70 error 4 in libc-2.7.so[3110800000+14d000]

eventually causing init to die and panic the system:

  Kernel panic - not syncing: Attempted to kill init!
  Pid: 1, comm: init Not tainted 2.6.26-rc9-tip #13878

after a maratonic bisection session, the bad commit turned out to be:

| b7675791859075418199c7af86a116ea34eaf5bd is first bad commit
| commit b7675791859075418199c7af86a116ea34eaf5bd
| Author: Jeremy Fitzhardinge <jeremy@goop.org>
| Date:   Wed Jun 25 00:19:00 2008 -0400
|
|     x86: remove open-coded save/load segment operations
|
|     This removes a pile of buggy open-coded implementations of savesegment
|     and loadsegment.

after some more bisection of this patch itself, it turns out that what
makes the difference are the savesegment() changes to __switch_to().

Taking a look at this portion of arch/x86/kernel/process_64.o revealed
this crutial difference:

| good:    99c:       8c e0                   mov    %fs,%eax
|          99e:       89 45 cc                mov    %eax,-0x34(%rbp)
|
| bad:     99c:       8c 65 cc                mov    %fs,-0x34(%rbp)

which is due to:

|                 unsigned fsindex;
| -               asm volatile("movl %%fs,%0" : "=r" (fsindex));
| +               savesegment(fs, fsindex);

savesegment() is implemented as:

 #define savesegment(seg, value)                                \
          asm("mov %%" #seg ",%0":"=rm" (value) : : "memory")

note the "m" modifier - it allows GCC to generate the segment move
into a memory operand as well.

But regarding segment operands there's a subtle detail in the x86
instruction set: the above 16-bit moves are zero-extend, but only
if it goes to a register.

If it goes to a memory operand, -0x34(%rbp) in the above case, there's
no zero-extend to 32-bit and the instruction will only save 16 bits
instead of the intended 32-bit.

The other 16 bits is random data - which can cause problems when that
value is used later on.

The solution is to only allow segment operands to go to registers.
This fix allows my test-system to boot up without crashing.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 19:51:47 +02:00
Joerg Roedel
d591b0a3ae x86, AMD IOMMU: replace DEVID macro with a function
This patch replaces the DEVID macro with a function and uses them where
apropriate (also in the core code).

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: iommu@lists.linux-foundation.org
Cc: bhavna.sarathy@amd.com
Cc: robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 18:01:18 +02:00
Joerg Roedel
83f5aac18c x86, AMD IOMMU: fix device table entry size
A device table entry is actually only 256 *bits* large. Not 256 bytes.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: iommu@lists.linux-foundation.org
Cc: bhavna.sarathy@amd.com
Cc: robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 18:01:17 +02:00
Joerg Roedel
8ea80d783e x86, AMD IOMMU: replace HIGH_U32 macro with upper_32_bits function
Removes a driver specific macro and replaces it with a generic function already
available in Linux.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: iommu@lists.linux-foundation.org
Cc: bhavna.sarathy@amd.com
Cc: robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 18:01:11 +02:00
Joerg Roedel
5694703f14 x86, AMD IOMMU: add comments to amd_iommu_types.h
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: iommu@lists.linux-foundation.org
Cc: bhavna.sarathy@amd.com
Cc: robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 18:01:09 +02:00
Steven Rostedt
af52a90a14 sched_clock: stop maximum check on NO HZ
Working with ftrace I would get large jumps of 11 millisecs or more with
the clock tracer. This killed the latencing timings of ftrace and also
caused the irqoff self tests to fail.

What was happening is with NO_HZ the idle would stop the jiffy counter and
before the jiffy counter was updated the sched_clock would have a bad
delta jiffies to compare with the gtod with the maximum.

The jiffies would stop and the last sched_tick would record the last gtod.
On wakeup, the sched clock update would compare the gtod + delta jiffies
(which would be zero) and compare it to the TSC. The TSC would have
correctly (with a stable TSC) moved forward several jiffies. But because the
jiffies has not been updated yet the clock would be prevented from moving
forward because it would appear that the TSC jumped too far ahead.

The clock would then virtually stop, until the jiffies are updated. Then
the next sched clock update would see that the clock was very much behind
since the delta jiffies is now correct. This would then jump the clock
forward by several jiffies.

This caused ftrace to report several milliseconds of interrupts off
latency at every resume from NO_HZ idle.

This patch adds hooks into the nohz code to disable the checking of the
maximum clock update when nohz is in effect. It resumes the max check
when nohz has updated the jiffies again.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 15:53:26 +02:00
Steven Rostedt
a2bb6a3d85 ftrace: add ftrace_kill_atomic
It has been suggested that I add a way to disable the function tracer
on an oops. This code adds a ftrace_kill_atomic. It is not meant to be
used in normal situations. It will disable the ftrace tracer, but will
not perform the nice shutdown that requires scheduling.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 15:49:21 +02:00
Jeremy Fitzhardinge
8d28aab59f x86_64: add pseudo-features for 32-bit compat syscall
Add pseudo-feature bits to describe whether the CPU supports sysenter
and/or syscall from ia32-compat userspace.  This removes a hardcoded
test in vdso32-setup.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 15:44:57 +02:00
David Woodhouse
a8931ef380 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2008-07-11 14:36:25 +01:00
Andre Noll
7e93a89251 md: Remove some unused macros.
Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: Neil Brown <neilb@suse.de>
2008-07-11 22:02:23 +10:00
Andre Noll
0f420358e3 md: Turn rdev->sb_offset into a sector-based quantity.
Rename it to sb_start to make sure all users have been converted.

Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: Neil Brown <neilb@suse.de>
2008-07-11 22:02:23 +10:00
Neil Brown
0306d5efbf Merge branch 'master' into for-next 2008-07-11 21:57:40 +10:00
FUJITA Tomonori
be54f9d1c8 x86: remove ifdef CONFIG_SWIOTLB in pci-dma.c
As other IOMMUs do, this puts dummy pci_swiotlb_init() in swiotlb.h
and remove ifdef CONFIG_SWIOTLB in pci-dma.c.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 11:00:55 +02:00
FUJITA Tomonori
ac7ded2adb x86: remove ifdef CONFIG_GART_IOMMU in pci-dma.c
Our way to handle gart_* functions for CONFIG_GART_IOMMU and
!CONFIG_GART_IOMMU cases is inconsistent.

We have some dummy gart_* functions in !CONFIG_GART_IOMMU case and
also use ifdef CONFIG_GART_IOMMU tricks in pci-dma.c to call some
gart_* functions in only CONFIG_GART_IOMMU case.

This patch removes ifdef CONFIG_GART_IOMMU in pci-dma.c and always use
dummy gart_* functions in iommu.h.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 11:00:54 +02:00
FUJITA Tomonori
46a7fa270a x86: make only GART code include gart.h
gart.h has only GART-specific stuff. Only GART code needs it. Other
IOMMU stuff should include iommu.h instead of gart.h.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 11:00:54 +02:00
Ingo Molnar
0c81b2a144 Merge branch 'linus' into core/rcu
Conflicts:

	include/linux/rculist.h
	kernel/rcupreempt.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 10:46:50 +02:00
Yinghai Lu
f361a450bf x86: introduce max_low_pfn_mapped for 64-bit
when more than 4g memory is installed, don't map the big hole below 4g.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 10:24:04 +02:00
Alexander Duyck
d815653404 net: add netif_napi_del function to allow for removal of napistructs
Adds netif_napi_del function which is used to remove the napi struct from
the netdev napi_list in cases where CONFIG_NETPOLL was enabled.
The motivation for adding this is to handle the case in which the number of
queues on a device changes due to a configuration change.  Previously the
napi structs for each queue would be left in the list until the netdev was
freed.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-07-11 01:20:33 -04:00