Commit Graph

4983 Commits

Author SHA1 Message Date
Cliff Wickman
e2a7147640 x86: correct the conversion of EFI memory types
This patch causes all the EFI_RESERVED_TYPE memory reservations to be recorded
in the e820 table as type E820_RESERVED.

(This patch replaces one called 'x86: vendor reserved memory type'.
 This version has been discussed a bit with Peter and Yinghai but not given
 a final opinion.)

Without this patch EFI_RESERVED_TYPE memory reservations may be
marked usable in the e820 table. There may be a collision between
kernel use and some reserver's use of this memory.

(An example use of this functionality is the UV system, which
 will access extremely large areas of memory with a memory engine
 that allows a user to address beyond the processor's range.  Such
 areas are reserved in the EFI table by the BIOS.
 Some loaders have a restricted number of entries possible in the e820 table,
 hence the need to record the reservations in the unrestricted EFI table.)

The call to do_add_efi_memmap() is only made if "add_efi_memmap" is specified
on the kernel command line.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-06-16 17:47:32 -07:00
H. Peter Anvin
95ee14e437 x86: cap iomem_resource to addressable physical memory
iomem_resource is by default initialized to -1, which means 64 bits of
physical address space if 64-bit resources are enabled.  However, x86
CPUs cannot address 64 bits of physical address space.  Thus, we want
to cap the physical address space to what the union of all CPU can
actually address.

Without this patch, we may end up assigning inaccessible values to
uninitialized 64-bit PCI memory resources.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Martin Mares <mj@ucw.cz>
Cc: stable@kernel.org
2009-06-16 17:47:31 -07:00
Hidetoshi Seto
1af0815f96 x86, mce: rename _64.c files which are no longer 64-bit-specific
Rename files that are no longer 64bit specific:
	mce_amd_64.c	=> mce_amd.c
	mce_intel_64.c	=> mce_intel.c

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-06-16 16:56:11 -07:00
Hidetoshi Seto
1149e72645 x86, mce: remove therm_throt.h
Now all symbols in the header are static.  Remove the header.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-06-16 16:56:09 -07:00
Hidetoshi Seto
8363fc82d3 x86, mce: remove intel_set_thermal_handler()
and make intel_thermal_interrupt() static.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-06-16 16:56:08 -07:00
Hidetoshi Seto
895287c0a6 x86, mce: squash mce_intel.c into therm_throt.c
move intel_init_thermal() into therm_throt.c

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-06-16 16:56:08 -07:00
Hidetoshi Seto
a65c88dd2c x86, mce: unify smp_thermal_interrupt
Put common functions into therm_throt.c, modify Makefile.

	unexpected_thermal_interrupt
	intel_thermal_interrupt
	smp_thermal_interrupt
	intel_set_thermal_handler

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-06-16 16:56:08 -07:00
Hidetoshi Seto
e8ce2c5ee8 x86, mce: unify smp_thermal_interrupt, prepare
Let them in same shape.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-06-16 16:56:08 -07:00
Hidetoshi Seto
5335612a57 x86, mce: unify smp_thermal_interrupt, prepare mce_intel_64
Break smp_thermal_interrupt() into two functions.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-06-16 16:56:08 -07:00
Hidetoshi Seto
3adacb70d3 x86, mce: unify smp_thermal_interrupt, prepare p4
Remove unused argument regs from handlers, and use inc_irq_stat.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-06-16 16:56:07 -07:00
Hidetoshi Seto
c697836985 x86, mce: make mce_disabled boolean
The mce_disabled on 32bit is a tristate variable [1,0,-1],
while 64bit version is boolean [0,1].
This patch makes mce_disabled always boolean, and use mce_p5_enabled
to indicate the third state instead.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-06-16 16:56:07 -07:00
Hidetoshi Seto
9e55e44e39 x86, mce: unify mce.h
There are 2 headers:
	arch/x86/include/asm/mce.h
	arch/x86/kernel/cpu/mcheck/mce.h
and in the latter small header:
	#include <asm/mce.h>

This patch move all contents in the latter header into the former,
and fix all files using the latter to include the former instead.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-06-16 16:56:07 -07:00
Hidetoshi Seto
9af43b54ab x86, mce: sysfs entries for new mce options
Add sysfs interface for admins who want to tweak these options without
rebooting the system.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-06-16 16:56:06 -07:00
Hidetoshi Seto
1020bcbcc7 x86, mce: rename static variables around trigger
"trigger" is not straight forward name for valiable that holds name
of user mode helper program which triggered by machine check events.

This patch renames this valiable and kins to more recognizable names.

	trigger		=> mce_helper
	trigger_argv	=> mce_helper_argv
	notify_user	=> mce_need_notify

No functional changes.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-06-16 16:56:06 -07:00
Hidetoshi Seto
4e5b3e690d x86, mce: add __read_mostly
Add __read_mostly to data written during setup.

Suggested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-06-16 16:56:05 -07:00
Hidetoshi Seto
7fb06fc967 x86, mce: cleanup mce_start()
Simplify interface of mce_start():

-       no_way_out = mce_start(no_way_out, &order);
+       order = mce_start(&no_way_out);

Now Monarch and Subjects share same exit(return) in usual path.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-06-16 16:56:05 -07:00
Hidetoshi Seto
33edbf02a9 x86, mce: don't init timer if !mce_available
In mce_cpu_restart, mce_init_timer is called unconditionally.
If !mce_available (e.g. mce is disabled), there are no useful work
for timer.  Stop running it.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-06-16 16:56:04 -07:00
Huang Ying
184e1fdfea x86, mce: fix a race condition about mce_callin and no_way_out
If one CPU has no_way_out == 1, all other CPUs should have no_way_out
== 1. But despite global_nwo is read after mce_callin, global_nwo is
updated after mce_callin too. So it is possible that some CPU read
global_nwo before some other CPU update global_nwo, so that no_way_out
== 1 for some CPU, while no_way_out == 0 for some other CPU.

This patch fixes this race condition via moving mce_callin updating
after global_nwo updating, with a smp_wmb in between. A smp_rmb is
added between their reading too.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
2009-06-16 16:56:04 -07:00
Linus Torvalds
b3fec0fe35 Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/vegard/kmemcheck
* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/vegard/kmemcheck: (39 commits)
  signal: fix __send_signal() false positive kmemcheck warning
  fs: fix do_mount_root() false positive kmemcheck warning
  fs: introduce __getname_gfp()
  trace: annotate bitfields in struct ring_buffer_event
  net: annotate struct sock bitfield
  c2port: annotate bitfield for kmemcheck
  net: annotate inet_timewait_sock bitfields
  ieee1394/csr1212: fix false positive kmemcheck report
  ieee1394: annotate bitfield
  net: annotate bitfields in struct inet_sock
  net: use kmemcheck bitfields API for skbuff
  kmemcheck: introduce bitfield API
  kmemcheck: add opcode self-testing at boot
  x86: unify pte_hidden
  x86: make _PAGE_HIDDEN conditional
  kmemcheck: make kconfig accessible for other architectures
  kmemcheck: enable in the x86 Kconfig
  kmemcheck: add hooks for the page allocator
  kmemcheck: add hooks for page- and sg-dma-mappings
  kmemcheck: don't track page tables
  ...
2009-06-16 13:09:51 -07:00
Ingo Molnar
8a4a6182fd Merge branch 'amd-iommu/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into x86/urgent 2009-06-16 11:51:24 +02:00
Chris Wright
6a047d8b9e amd-iommu: resume cleanup
Now that enable_iommus() will call iommu_disable() for each iommu,
the call to disable_iommus() during resume is redundant.  Also, the order
for an invalidation is to invalidate device table entries first, then
domain translations.

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-06-16 10:19:16 +02:00
Kay Sievers
07e9bb8eeb Driver Core: x86: add nodename for cpuid and msr drivers.
This adds support to the x86 cpuid and msr drivers to report the proper
device name to userspace for their devices.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-15 21:30:25 -07:00
Kay Sievers
d405640539 Driver Core: misc: add nodename support for misc devices.
This adds support for misc devices to report their requested nodename to
userspace.  It also updates a number of misc drivers to provide the
needed subdirectory and device name to be used for them.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-15 21:30:25 -07:00
Linus Torvalds
19035e5b5d Merge branch 'timers-for-linus-migration' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-for-linus-migration' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  timers: Logic to move non pinned timers
  timers: /proc/sys sysctl hook to enable timer migration
  timers: Identifying the existing pinned timers
  timers: Framework for identifying pinned timers
  timers: allow deferrable timers for intervals tv2-tv5 to be deferred

Fix up conflicts in kernel/sched.c and kernel/timer.c manually
2009-06-15 10:06:19 -07:00
Rusty Russell
8e7c25971b [CPUFREQ] cpumask: new cpumask operators for arch/x86/kernel/cpu/cpufreq/powernow-k8.c
Remove all old-style cpumask operators, and cpumask_t.

Also: get rid of the unused define_siblings function.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Mark Langsdorf <mark.langsdorf@amd.com>
Tested-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-06-15 11:49:43 -04:00
Rusty Russell
1ff6e97f1d [CPUFREQ] cpumask: avoid playing with cpus_allowed in powernow-k8.c
cpumask: avoid playing with cpus_allowed in powernow-k8.c

It's generally a very bad idea to mug some process's cpumask: it could
legitimately and reasonably be changed by root, which could break us
(if done before our code) or them (if we restore the wrong value).

I did not replace powernowk8_target; it needs fixing, but it grabs a
mutex (so no smp_call_function_single here) but Mark points out it can
be called multiple times per second, so work_on_cpu is too heavy.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
To: cpufreq@vger.kernel.org
Acked-by: Mark Langsdorf <mark.langsdorf@amd.com>
Tested-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-06-15 11:49:43 -04:00
Rusty Russell
e3f996c26f [CPUFREQ] cpumask: avoid cpumask games in arch/x86/kernel/cpu/cpufreq/speedstep-centrino.c
Impact: don't play with current's cpumask

It's generally a very bad idea to mug some process's cpumask: it could
legitimately and reasonably be changed by root, which could break us
(if done before our code) or them (if we restore the wrong value).

Use rdmsr_on_cpu and wrmsr_on_cpu instead.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
To: cpufreq@vger.kernel.org
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-06-15 11:49:43 -04:00
Rusty Russell
394122ab14 [CPUFREQ] cpumask: avoid playing with cpus_allowed in speedstep-ich.c
Impact: don't play with current's cpumask

It's generally a very bad idea to mug some process's cpumask: it could
legitimately and reasonably be changed by root, which could break us
(if done before our code) or them (if we restore the wrong value).

We use smp_call_function_single: this had the advantage of being more
efficient, too.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
To: cpufreq@vger.kernel.org
Cc: Dominik Brodowski <linux@brodo.de>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-06-15 11:49:43 -04:00
Naga Chumbalkar
e15bc4559b [CPUFREQ] powernow-k8: get drv data for correct CPU
Make powernowk8_get() similar to powernowk8_target() and powernowk8_verify()
in the way it obtains "powernow_data" for a given CPU.

Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Langsdorf, Mark <mark.langsdorf@amd.com>
Cc: Thomas Renninger <trenn@suse.de>

Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Tested-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Acked-by: Langsdorf, Mark <mark.langsdorf@amd.com>
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-06-15 11:49:42 -04:00
Naga Chumbalkar
532cfee6ba [CPUFREQ] powernow-k8: read P-state from HW
By definition, "cpuinfo_cur_freq" should report the value from HW. So, don't
depend on the cached value. Instead read P-state directly from HW, while
taking into account the erratum 311 workaround for Fam 11h processors.

Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Langsdorf, Mark <mark.langsdorf@amd.com>
Cc: Thomas Renninger <trenn@suse.de>

Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Tested-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Acked-by: Langsdorf, Mark <mark.langsdorf@amd.com>
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-06-15 11:49:42 -04:00
Andrew Morton
b394f1dfc0 [CPUFREQ] reduce scope of ACPI_PSS_BIOS_BUG_MSG[]
This symbol doesn't need file-global scope.

Cc: "Zhang, Rui" <rui.zhang@intel.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Langsdorf, Mark <mark.langsdorf@amd.com>
Cc: Leo Milano <lmilano@gmx.net>
Cc: Thomas Renninger <trenn@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-06-15 11:49:42 -04:00
Dave Jones
931db6a32d [CPUFREQ] Clean up convoluted code in arch/x86/kernel/tsc.c:time_cpufreq_notifier()
Christoph Hellwig noticed the following potential uninitialised use:

 > arch/x86/kernel/tsc.c: In function 'time_cpufreq_notifier':
 > arch/x86/kernel/tsc.c:634: warning: 'dummy' may be used uninitialized in this function
 >
 > where we do have CONFIG_SMP set, freq->flags & CPUFREQ_CONST_LOOPS is
 > true and ref_freq is false.

It seems plausable, though the circumstances for hitting it are really low.
Nearly all SMP capable cpufreq drivers set CPUFREQ_CONST_LOOPS.
powernow-k8 is really the only exception. The older CPUs were typically
only ever UP. (powernow-k7 never supported SMP for eg)

It's worth fixing regardless, as it cleans up the code.

Fix possible uninitialized use of dummy, by just removing it,
and making the setting of lpj more obvious.

Signed-off-by: Dave Jones <davej@redhat.com>
2009-06-15 11:49:42 -04:00
Luis Henriques
21335d0214 [CPUFREQ] powernow-k8.c: mess cleanup
Mess cleanup in powernow_k8_acpi_pst_values() function.

Signed-off-by: Luis Henriques <henrix@sapo.pt>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-06-15 11:49:41 -04:00
Thomas Renninger
86e13684aa [CPUFREQ] powernow-k8: Set transition latency to 1 if ACPI tables export 0
This doesn't fix anything, but it's expected that a transition latency of 0
could cause trouble in the future.

Signed-off-by: Thomas Renninger <trenn@suse.de>
Cc: Langsdorf, Mark <mark.langsdorf@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-06-15 11:49:41 -04:00
Joerg Roedel
09067207f6 amd-iommu: set event buffer head and tail to 0 manually
These registers may contain values from previous kernels. So reset them
to known values before enable the event buffer again.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-06-15 16:06:48 +02:00
Peter Zijlstra
74193ef0ec perf_counter: x86: Fix call-chain support to use NMI-safe methods
__copy_from_user_inatomic() isn't NMI safe in that it can trigger
the page fault handler which is another trap and its return path
invokes IRET which will also close the NMI context.

Therefore use a GUP based approach to copy the stack frames over.

We tried an alternative solution as well: we used a forward ported
version of Mathieu Desnoyers's "NMI safe INT3 and Page Fault" patch
that modifies the exception return path to use an open-coded IRET with
explicit stack unrolling and TF checking.

This didnt work as it interacted with faulting user-space instructions,
causing them not to restart properly, which corrupts user-space
registers.

Solving that would probably involve disassembling those instructions
and backtracing the RIP. But even without that, the code was deemed
rather complex to the already non-trivial x86 entry assembly code,
so instead we went for this GUP based method that does a
software-walk of the pagetables.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-15 15:57:53 +02:00
Chris Wright
a8c485bb68 amd-iommu: disable cmd buffer and evt logging before reprogramming iommu
The IOMMU spec states that IOMMU behavior may be undefined when the
IOMMU registers are rewritten while command or event buffer is enabled.
Disable them in IOMMU disable path.

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-06-15 15:53:45 +02:00
Vegard Nossum
722f2a6c87 Merge commit 'linus/master' into HEAD
Conflicts:
	MAINTAINERS

Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
2009-06-15 15:50:49 +02:00
Chris Wright
42a49f965a amd-iommu: flush domain tlb when attaching a new device
When kexec'ing to a new kernel (for example, when crashing and launching
a kdump session), the AMD IOMMU may have cached translations.  The kexec'd
kernel, during initialization, will invalidate the IOMMU device table
entries, but not the domain translations.  These stale entries can cause
a device's DMA to fail, makes it rough to write a dump to disk when the
disk controller can't DMA ;-)

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-06-15 15:42:00 +02:00
Joerg Roedel
61d047be99 x86: disable IOMMUs on kernel crash
If the IOMMUs are still enabled when the kexec kernel boots access to
the disk is not possible. This is bad for tools like kdump or anything
else which wants to use PCI devices.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-06-15 15:20:40 +02:00
Joerg Roedel
0975904276 amd-iommu: disable IOMMU hardware on shutdown
When the IOMMU stays enabled the BIOS may not be able to finish the
machine shutdown properly. So disable the hardware on shutdown.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-06-15 15:20:40 +02:00
Vegard Nossum
2dff440525 kmemcheck: add mm functions
With kmemcheck enabled, the slab allocator needs to do this:

1. Tell kmemcheck to allocate the shadow memory which stores the status of
   each byte in the allocation proper, e.g. whether it is initialized or
   uninitialized.
2. Tell kmemcheck which parts of memory that should be marked uninitialized.
   There are actually a few more states, such as "not yet allocated" and
   "recently freed".

If a slab cache is set up using the SLAB_NOTRACK flag, it will never return
memory that can take page faults because of kmemcheck.

If a slab cache is NOT set up using the SLAB_NOTRACK flag, callers can still
request memory with the __GFP_NOTRACK flag. This does not prevent the page
faults from occuring, however, but marks the object in question as being
initialized so that no warnings will ever be produced for this object.

In addition to (and in contrast to) __GFP_NOTRACK, the
__GFP_NOTRACK_FALSE_POSITIVE flag indicates that the allocation should
not be tracked _because_ it would produce a false positive. Their values
are identical, but need not be so in the future (for example, we could now
enable/disable false positives with a config option).

Parts of this patch were contributed by Pekka Enberg but merged for
atomicity.

Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

[rebased for mainline inclusion]
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
2009-06-15 12:40:03 +02:00
Vegard Nossum
f85612967c x86: add hooks for kmemcheck
The hooks that we modify are:
- Page fault handler (to handle kmemcheck faults)
- Debug exception handler (to hide pages after single-stepping
  the instruction that caused the page fault)

Also redefine memset() to use the optimized version if kmemcheck is
enabled.

(Thanks to Pekka Enberg for minimizing the impact on the page fault
handler.)

As kmemcheck doesn't handle MMX/SSE instructions (yet), we also disable
the optimized xor code, and rely instead on the generic C implementation
in order to avoid false-positive warnings.

Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>

[whitespace fixlet]
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

[rebased for mainline inclusion]
Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>
2009-06-15 12:40:02 +02:00
Ingo Molnar
038e836e97 perf_counter, x86: Fix kernel-space call-chains
Kernel-space call-chains were trimmed at the first entry because
we never processed anything beyond the first stack context.

Allow the backtrace to jump from NMI to IRQ stack then to task stack
and finally user-space stack.

Also calculate the stack and bp variables correctly so that the
stack walker does not exit early.

We can get deep traces as a result, visible in perf report -D output:

0x32af0 [0xe0]: PERF_EVENT (IP, 5): 15134: 0xffffffff815225fd period: 1
... chain: u:2, k:22, nr:24
.....  0: 0xffffffff815225fd
.....  1: 0xffffffff810ac51c
.....  2: 0xffffffff81018e29
.....  3: 0xffffffff81523939
.....  4: 0xffffffff81524b8f
.....  5: 0xffffffff81524bd9
.....  6: 0xffffffff8105e498
.....  7: 0xffffffff8152315a
.....  8: 0xffffffff81522c3a
.....  9: 0xffffffff810d9b74
..... 10: 0xffffffff810dbeec
..... 11: 0xffffffff810dc3fb

This is a 22-entries kernel-space chain.

(We still only record reliable stack entries.)

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-15 09:08:08 +02:00
Ingo Molnar
5a6cec3abb perf_counter, x86: Fix call-chain walking
Fix the ptregs variant when we hit user-mode tasks.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-14 22:37:15 +02:00
Thomas Gleixner
507fa3a3d8 x86: hpet: Mark per cpu interrupts IRQF_TIMER to prevent resume failure
timer interrupts are excluded from being disabled during suspend. The
clock events code manages the disabling of clock events on its own
because the timer interrupt needs to be functional before the resume
code reenables the device interrupts.

The hpet per cpu timers request their interrupt without setting the
IRQF_TIMER flag so suspend_device_irqs() disables them as well which
results in a fatal resume failure on the boot CPU.

Adding IRQF_TIMER to the interupt flags when requesting the hpet per
cpu timer interrupts solves the problem.

Reported-by: Benjamin S. <sbenni@gmx.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Benjamin S. <sbenni@gmx.de>
Cc: stable@kernel.org
2009-06-14 18:24:29 +02:00
Linus Torvalds
a2ee2981ae Merge branch 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (80 commits)
  x86, mce: Add boot options for corrected errors
  x86, mce: Fix mce printing
  x86, mce: fix for mce counters
  x86, mce: support action-optional machine checks
  x86, mce: define MCE_VECTOR
  x86, mce: rename mce_notify_user to mce_notify_irq
  x86: fix panic with interrupts off (needed for MCE)
  x86, mce: export MCE severities coverage via debugfs
  x86, mce: implement new status bits
  x86, mce: print header/footer only once for multiple MCEs
  x86, mce: default to panic timeout for machine checks
  x86, mce: improve mce_get_rip
  x86, mce: make non Monarch panic message "Fatal machine check" too
  x86, mce: switch x86 machine check handler to Monarch election.
  x86, mce: implement panic synchronization
  x86, mce: implement bootstrapping for machine check wakeups
  x86, mce: check early in exception handler if panic is needed
  x86, mce: add table driven machine check grading
  x86, mce: remove TSC print heuristic
  x86, mce: log corrected errors when panicing
  ...
2009-06-13 13:14:51 -07:00
Jaswinder Singh Rajput
f4db43a38f perf_counter, x86: Update AMD hw caching related event table
All AMD models share the same hw caching related event table.

Also complete the table with more events.

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1244835381.2802.2.camel@ht.satnam>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-13 12:58:25 +02:00
Jaswinder Singh Rajput
4d2be1267f perf_counter, x86: Check old-AMD performance monitoring support
AMD supports performance monitoring start from K7 (i.e. family 6),
so disable it for earlier AMD CPUs.

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1244714289.6923.0.camel@ht.satnam>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-13 12:58:25 +02:00
Len Brown
c4bf2f372d ACPI, PCI, x86: move MCFG parsing routine from ACPI to PCI file
Move
arch/x86/kernel/acpi/boot.c: acpi_parse_mcfg()
to
arch/x86/pci/mmconfig-shared.c: pci_parse_mcfg()
where it is used, and make it static.

Move associated globals and helper routine with it.

No functional change.

This code move is in preparation for SFI support,
which will allow the PCI code to find the MCFG table
on systems which do not support ACPI.

Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-06-12 20:50:38 -04:00