This approach is the first baby step towards solving many of the
structural problems the x86 MCE logging code is having today:
- It has a private ring-buffer implementation that has a number
of limitations and has been historically fragile and buggy.
- It is using a quirky /dev/mcelog ioctl driven ABI that is MCE
specific. /dev/mcelog is not part of any larger logging
framework and hence has remained on the fringes for many years.
- The MCE logging code is still very unclean partly due to its ABI
limitations. Fields are being reused for multiple purposes, and
the whole message structure is limited and x86 specific to begin
with.
All in all, the x86 tree would like to move away from this private
implementation of an event logging facility to a broader framework.
By using perf events we gain the following advantages:
- Multiple user-space agents can access MCE events. We can have an
mcelog daemon running but also a system-wide tracer capturing
important events in flight-recorder mode.
- Sampling support: the kernel and the user-space call-chain of MCE
events can be stored and analyzed as well. This way actual patterns
of bad behavior can be matched to precisely what kind of activity
happened in the kernel (and/or in the app) around that moment in
time.
- Coupling with other hardware and software events: the PMU can track a
number of other anomalies - monitoring software might choose to
monitor those plus the MCE events as well - in one coherent stream of
events.
- Discovery of MCE sources - tracepoints are enumerated and tools can
act upon the existence (or non-existence) of various channels of MCE
information.
- Filtering support: we just subscribe to and act upon the events we
are interested in. Then even on a per event source basis there are
in-kernel filter expressions available that can restrict the amount
of data that hits the event channel.
- Arbitrarily deep per cpu buffering of events - we can buffer 32
entries or we can buffer as much as we want, as long as we have
the RAM.
- An NMI-safe ring-buffer implementation - mappable to user-space.
- Built-in support for timestamping of events, PID markers, CPU
markers, etc.
- A rich ABI accessible over system call interface. Per cpu, per task
and per workload monitoring of MCE events can be done this way. The
ABI itself has a nice, meaningful structure.
- Extensible ABI: new fields can be added without breaking tooling.
New tracepoints can be added as the hardware side evolves. There
are various parsers that can be used.
- Lots of scheduling/buffering/batching modes of operation for MCE
events. poll() support. mmap() support. read() support. You name it.
- Rich tooling support: even without any MCE specific extensions added
the 'perf' tool today offers various views of MCE data: perf report,
perf stat, perf trace can all be used to view logged MCE events and
perhaps correlate them to certain user-space usage patterns. But it
can be used directly as well, for user-space agents and policy action
in mcelog, etc.
With this we hope to achieve significant code cleanup and feature
improvements in the MCE code, and we hope to be able to drop the
/dev/mcelog facility in the end.
This patch is just a plain dumb dump of mce_log() records to
the tracepoints / perf events framework - a first proof of
concept step.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
LKML-Reference: <4AD42A0D.7050104@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
recvmmsg means "receive multiple messages": one call receives several
datagrams, reducing the number of syscalls and net stack entry/exit
operations.
Next patches will introduce mechanisms where protocols that want to
optimize this operation will provide an unlocked_recvmsg operation.
This takes into account comments made by:
. Paul Moore: sock_recvmsg is called only for the first datagram,
sock_recvmsg_nosec is used for the rest.
. Caitlin Bestler: recvmmsg now has a struct timespec timeout, that
works in the same fashion as the ppoll one.
If the underlying protocol returns a datagram with MSG_OOB set, this
will make recvmmsg return right away with as many datagrams (+ the OOB
one) it has received so far.
. Rémi Denis-Courmont & Steven Whitehouse: If we receive N < vlen
datagrams and then recvmsg returns an error, recvmmsg will return
the successfully received datagrams, store the error and return it
in the next call.
This paves the way for a subsequent optimization, sk_prot->unlocked_recvmsg,
where we will be able to acquire the lock only at batch start and end, not at
every underlying recvmsg call.
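As a rough user-space illustration of the batching described above, the
sketch below drains several datagrams with one call. It assumes the glibc
recvmmsg() wrapper and struct mmsghdr; the helper name recv_batch_demo(),
the AF_UNIX socket pair and the buffer sizes are made up for the example
and are not part of this patch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

#define VLEN  8		/* max datagrams per batch (arbitrary) */
#define BUFSZ 64	/* per-datagram buffer size (arbitrary) */

/*
 * Hypothetical helper: queue `count` datagrams on a local socket pair,
 * then fetch them with a single recvmmsg() call.
 * Returns the number of datagrams received, or -1 on error.
 */
static int recv_batch_demo(int count)
{
	struct mmsghdr msgs[VLEN];
	struct iovec iov[VLEN];
	char bufs[VLEN][BUFSZ];
	int sv[2];
	int i, n;

	if (count > VLEN || socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return -1;

	for (i = 0; i < count; i++) {
		if (send(sv[0], "ping", 4, 0) != 4) {
			close(sv[0]);
			close(sv[1]);
			return -1;
		}
	}

	/* One mmsghdr per slot, each pointing at its own buffer. */
	memset(msgs, 0, sizeof(msgs));
	for (i = 0; i < VLEN; i++) {
		iov[i].iov_base = bufs[i];
		iov[i].iov_len = BUFSZ;
		msgs[i].msg_hdr.msg_iov = &iov[i];
		msgs[i].msg_hdr.msg_iovlen = 1;
	}

	/*
	 * MSG_DONTWAIT: return what is already queued instead of blocking
	 * until all VLEN slots are filled. The last argument is the
	 * optional struct timespec timeout; NULL means no timeout.
	 */
	n = recvmmsg(sv[1], msgs, VLEN, MSG_DONTWAIT, NULL);

	close(sv[0]);
	close(sv[1]);
	return n;
}
```

On return, msgs[i].msg_len holds the length of the i-th datagram, so a
caller can walk the first n entries without further syscalls.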
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There was namespace overlap due to a rename I did - this caused
the following build warning, reported by Stephen Rothwell against
linux-next x86_64 allmodconfig:
arch/x86/kernel/cpu/perf_event.c: In function 'intel_get_event_idx':
arch/x86/kernel/cpu/perf_event.c:1445: warning: 'event_constraint' is used uninitialized in this function
This is a real bug not just a warning: fix it by renaming the
global event-constraints table pointer to 'event_constraints'.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Stephane Eranian <eranian@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091013144223.369d616d.sfr@canb.auug.org.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The way to obtain a kernel-mode stack pointer from a struct pt_regs in
32-bit mode is "subtle": the stack doesn't actually contain the stack
pointer, but rather the location where it would have been marks the
actual previous stack frame. For clarity, use kernel_stack_pointer()
instead of coding this weirdness explicitly.
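To make the "subtle" part concrete, here is a simplified user-space model.
The struct layout and the model_* names are assumptions for illustration
only; the real definitions live in <asm/ptrace.h>. The point is that for a
trap taken in kernel mode the CPU does not push sp/ss, so the address of
the sp slot is itself the previous stack pointer.

```c
#include <assert.h>

/*
 * Hypothetical, trimmed-down tail of a 32-bit register frame.
 * On a trap from kernel mode only flags, cs and ip are pushed;
 * the sp and ss slots overlay whatever the interrupted code
 * already had on its stack.
 */
struct model_regs {
	unsigned long ip;
	unsigned long cs;
	unsigned long flags;
	unsigned long sp;	/* only written for traps from user mode */
	unsigned long ss;
};

/*
 * Model of the 32-bit kernel-mode case: the stack pointer at the time
 * of the trap is where the sp slot sits, not what the slot contains.
 */
static unsigned long model_kernel_stack_pointer(struct model_regs *regs)
{
	return (unsigned long)&regs->sp;
}
```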
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
The way to obtain a kernel-mode stack pointer from a struct
pt_regs in 32-bit mode is "subtle": the stack doesn't actually
contain the stack pointer, but rather the location where it would
have been marks the actual previous stack frame. For clarity, use
kernel_stack_pointer() instead of coding this weirdness
explicitly.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
The way to obtain a kernel-mode stack pointer from a struct pt_regs in
32-bit mode is "subtle": the stack doesn't actually contain the stack
pointer, but rather the location where it would have been marks the
actual previous stack frame. For clarity, use kernel_stack_pointer()
instead of coding this weirdness explicitly.
Furthermore, user_mode() is only valid when the process is known to
not run in V86 mode. Use the safer user_mode_vm() instead.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The way to obtain a kernel-mode stack pointer from a struct pt_regs in
32-bit mode is "subtle": the stack doesn't actually contain the stack
pointer, but rather the location where it would have been marks the
actual previous stack frame. For clarity, use kernel_stack_pointer()
instead of coding this weirdness explicitly.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
This is the counterpart to "x86: export k8 physical topology" for
SRAT. It is not as invasive because the acpi code already separates
node setup into detection and registration steps, with the
exception of registering e820 active regions in
acpi_numa_memory_affinity_init(). This is now moved to
acpi_scan_nodes() if NUMA emulation is disabled or deferred.
acpi_numa_init() now returns a value which specifies whether an
underlying SRAT was located. If so, that topology can be used by
the emulation code to interleave emulated nodes over physical nodes
or to register the nodes for ACPI.
acpi_get_nodes() may now be used to export the srat physical
topology of the machine for NUMA emulation.
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Ankita Garg <ankita@in.ibm.com>
Cc: Len Brown <len.brown@intel.com>
LKML-Reference: <alpine.DEB.1.00.0909251518580.14754@chino.kir.corp.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
To eventually interleave emulated nodes over physical nodes, we
need to know the physical topology of the machine without actually
registering it. This does the k8 node setup in two parts:
detection and registration. NUMA emulation can then use the
physical topology detected to setup the address ranges of emulated
nodes accordingly. If emulation isn't used, the k8 nodes are
registered as normal.
Two formals are added to the x86 NUMA setup functions: `acpi' and
`k8'. These represent whether ACPI or K8 NUMA has been detected;
both cannot be true at the same time. This specifies to the NUMA
emulation code whether an underlying physical NUMA topology exists
and which interface to use.
This patch deals solely with separating the k8 setup path into
Northbridge detection and registration steps and leaves the ACPI
changes for a subsequent patch. The `acpi' formal is added here,
however, to avoid touching all the header files again in the next
patch.
This approach also ensures emulated nodes will not span physical
nodes so the true memory latency is not misrepresented.
k8_get_nodes() may now be used to export the k8 physical topology
of the machine for NUMA emulation.
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Ankita Garg <ankita@in.ibm.com>
Cc: Len Brown <len.brown@intel.com>
LKML-Reference: <alpine.DEB.1.00.0909251518400.14754@chino.kir.corp.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The latest kernel panics while booting on an i386 machine when
profile=2 is set on the kernel command line. This is due to 'sp'
being incorrect in profile_pc().
BUG: unable to handle kernel NULL pointer dereference at 00000246
IP: [<c01288b6>] profile_pc+0x2a/0x48
*pde = 00000000
Oops: 0000 [#1] SMP
This differs from the original version by Alex Shi in that we use the
kernel_stack_pointer() inline already defined in <asm/ptrace.h> for
this purpose, instead of #ifdef.
Originally-by: Alex Shi <alex.shi@intel.com>
Cc: "Chen, Tim C" <tim.c.chen@intel.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
We want this to happen after the PCI quirks, which are now running at
the very end of the fs_initcalls.
This works around the BIOS problems which were originally addressed by
commit db8be50c43 ('USB: Work around BIOS
bugs by quiescing USB controllers earlier'), which was reverted in
commit d93a8f829f.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
this_cpu_inc/dec reduces the number of instructions needed.
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Add an atomic notifier which ensures proper locking when conveying
MCE info to EDAC for decoding. The actual notifier call overrides a
default, negative priority notifier.
Note: make sure we register the default decoder only once since
mcheck_init() runs on each CPU.
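The override relies on notifier priority ordering. A minimal user-space
sketch of that idea follows; it is not the kernel's notifier code, it
leaves out all locking, and every name in it (chain_register(),
chain_call(), the two decode callbacks) is invented for the example.

```c
#include <assert.h>
#include <stddef.h>

#define NOTIFY_DONE 0
#define NOTIFY_STOP 1

struct notifier {
	int (*call)(struct notifier *nb, void *data);
	int priority;
	struct notifier *next;
};

/* Keep the chain sorted so that higher priority entries run first. */
static void chain_register(struct notifier **head, struct notifier *nb)
{
	while (*head && (*head)->priority >= nb->priority)
		head = &(*head)->next;
	nb->next = *head;
	*head = nb;
}

/* Walk the chain until a callback asks to stop. */
static int chain_call(struct notifier *head, void *data)
{
	int ret = NOTIFY_DONE;

	for (; head; head = head->next) {
		ret = head->call(head, data);
		if (ret == NOTIFY_STOP)
			break;
	}
	return ret;
}

/* Default decoder: registered with a negative priority. */
static int default_decode(struct notifier *nb, void *data)
{
	(void)nb;
	*(int *)data = 1;	/* 1 = handled by the default decoder */
	return NOTIFY_STOP;
}

/* Stand-in for a real decoder: higher priority, so it runs first. */
static int edac_decode(struct notifier *nb, void *data)
{
	(void)nb;
	*(int *)data = 2;	/* 2 = handled by the real decoder */
	return NOTIFY_STOP;
}
```

With only the default registered, the default handles the event; once the
higher-priority decoder is registered it runs first and stops the chain.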
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <20091003065752.GA8935@liondog.tnic>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
As reported in
http://bugzilla.kernel.org/show_bug.cgi?id=13940
on some systems with ACPI enabled, ACPI clears some BARs for some
devices without reason, and the kernel then needs to allocate
resources for them. It then apparently hits some undocumented resource
conflict, resulting in non-working devices.
Try increasing the alignment to get a safer range for unassigned devices.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After m68k's task_thread_info() doesn't refer to current,
it's possible to remove sched.h from interrupt.h and not break m68k!
Many thanks to Heiko Carstens for allowing this.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
The only thing left that differs between the standard and compat
start_thread functions is the actual segment numbers and the
prototype, so have a single common function which contains the guts
and two very small wrappers.
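The shape of the result is roughly the following. This is a user-space
model, not the code from this patch: the register struct is trimmed down,
and the MODEL_* segment values are placeholders chosen for the sketch.

```c
#include <assert.h>

/* Placeholder segment selectors for the sketch. */
#define MODEL_USER_CS	0x33
#define MODEL_USER_DS	0x2b
#define MODEL_USER32_CS	0x23
#define MODEL_USER32_DS	0x2b

struct model_regs {
	unsigned long ip, sp;
	unsigned int cs, ss;
};

/* The guts: everything both variants have in common. */
static void start_thread_common(struct model_regs *regs,
				unsigned long new_ip, unsigned long new_sp,
				unsigned int cs, unsigned int ss)
{
	regs->ip = new_ip;
	regs->sp = new_sp;
	regs->cs = cs;
	regs->ss = ss;
}

/* Native wrapper: only the segments and the prototype differ. */
static void model_start_thread(struct model_regs *regs,
			       unsigned long new_ip, unsigned long new_sp)
{
	start_thread_common(regs, new_ip, new_sp,
			    MODEL_USER_CS, MODEL_USER_DS);
}

/* Compat wrapper: 32-bit arguments, 32-bit segments. */
static void model_start_thread_ia32(struct model_regs *regs,
				    unsigned int new_ip, unsigned int new_sp)
{
	start_thread_common(regs, new_ip, new_sp,
			    MODEL_USER32_CS, MODEL_USER32_DS);
}
```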
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
For no real good reason, compat_start_thread() was embedded inline in
<asm/elf.h> whereas the native start_thread() lives in process_*.c.
Move compat_start_thread() to process_64.c, remove gratuitous
differences, and fix a few items which mostly look like bit rot.
In particular, compat_start_thread() didn't do free_thread_xstate(),
which means it was hanging on to the xstate store area even when it
was not needed. It was also not setting old_rsp, but it looks like
that generally shouldn't matter for a 32-bit process.
Note: compat_start_thread *has* to be a macro, since it is tested with
start_thread_ia32() as the out of line function name.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
There is an erratum for IOMMU hardware which documents
undefined behavior when forwarding SMI requests from
peripherals and the DTE of that peripheral has a sysmgt
value of 01b. This problem caused weird IO_PAGE_FAULTS in my
case.
This patch implements the suggested workaround for that
erratum into the AMD IOMMU driver. The erratum is
documented with number 63.
Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This reverts commit 9bcbdd9c58.
The real bug producing LatencyTop latencies has been fixed in:
f5dc375: sched: Update the clock of runqueue select_task_rq() selected
And the commit being reverted here triggers local timer processing
from every device IRQ. If device IRQs come in at a high frequency,
this could cause a performance regression.
The commit being reverted here purely 'fixed' the reported latency
as a side effect, because CPUs were being moved out of idle more
often.
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Frans Pop <elendil@planet.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <20091008064041.67219b13@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Refuse to add events when the group wouldn't fit onto the PMU
anymore.
Naive implementation.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@gmail.com>
LKML-Reference: <1254911461.26976.239.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
On some Intel processors, not all events can be measured in all
counters. Some events can only be measured in one particular
counter, for instance. Assigning an event to the wrong counter does
not crash the machine, but it yields bogus counts, i.e., a silent
error.
This patch changes the event to counter assignment logic to take
into account event constraints for Intel P6, Core and Nehalem
processors. There are no constraints on Intel Atom. There are
constraints on Intel Yonah (Core Duo) but they are not provided in
this patch given that this processor is not yet supported by
perf_events.
As a result of the constraints, it is possible for some event
groups to never actually be loaded onto the PMU if they contain two
events which can only be measured on a single counter. That
situation can be detected with the scaling information extracted
with read().
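A toy model of constraint-aware assignment shows why such a group can
never be loaded. This is only a sketch: the counter count, the bitmask
representation and the function names are assumptions, and the first-fit
strategy here is deliberately simpler than a real scheduler.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_COUNTERS 4	/* assumed number of generic counters */

/*
 * Pick the first free counter permitted by `allowed_mask` (bit i set
 * means counter i may host the event). Returns the counter index and
 * marks it used, or returns -1 if no permitted counter is free.
 */
static int assign_counter(uint32_t allowed_mask, uint32_t *used_mask)
{
	int i;

	for (i = 0; i < NUM_COUNTERS; i++) {
		uint32_t bit = 1u << i;

		if ((allowed_mask & bit) && !(*used_mask & bit)) {
			*used_mask |= bit;
			return i;
		}
	}
	return -1;
}

/* A group is all-or-nothing: fail if any member cannot be placed. */
static int schedule_group(const uint32_t *allowed, int n)
{
	uint32_t used = 0;
	int i;

	for (i = 0; i < n; i++) {
		if (assign_counter(allowed[i], &used) < 0)
			return -1;
	}
	return 0;
}
```

Two events that are both restricted to counter 0 (mask 0x1) make
schedule_group() fail every time, which is the case described above.
Because the sketch is first-fit, the order of events in the group also
matters.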
Signed-off-by: Stephane Eranian <eranian@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1254840129-6198-3-git-send-email-eranian@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Intel fixed counters do not support all the filters possible with a
generic counter. Thus, if a fixed counter event is passed but with
certain filters set, then the fixed_mode_idx() function must fail
and the event must be measured in a generic counter instead.
Rejected filters are: inv, edge, cnt-mask.
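The check amounts to testing a few bits of the event configuration. The
sketch below uses the architectural event-select layout (edge detect in
bit 18, invert in bit 23, counter mask in bits 24-31); the macro and
function names are invented here and are not the ones used by the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Filter fields of the event-select register (names are local). */
#define EVTSEL_EDGE	(1ULL << 18)
#define EVTSEL_INV	(1ULL << 23)
#define EVTSEL_CMASK	(0xFFULL << 24)

/*
 * Returns 1 if the event may go to a fixed counter, 0 if any of the
 * unsupported filters is set and a generic counter must be used.
 */
static int fixed_counter_allowed(uint64_t config)
{
	return (config & (EVTSEL_EDGE | EVTSEL_INV | EVTSEL_CMASK)) == 0;
}
```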
Signed-off-by: Stephane Eranian <eranian@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1254840129-6198-2-git-send-email-eranian@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter picked up my patch for tip/x86/cpu that removes the bkl in
cpuid_open. Ingo subsequently merged that into tip/master.
This patch folds back in tglx's 55968ede164ae523692f00717f50cd926f1382a0
to my patch that removed the bkl.
This simplifies the code, and makes it consistent with the changes to
kill the bkl in msr.c as well.
Originally-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Kacur <jkacur@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Now that range timers and deferred timers are common, I found a
problem with these using the "perf timechart" tool. Frans Pop also
reported high scheduler latencies via LatencyTop, when using
iwlagn.
It turns out that on x86, these two 'opportunistic' timers only get
checked when another "real" timer happens. These opportunistic
timers have the objective to save power by hitchhiking on other
wakeups, so as to avoid CPU wakeups by themselves as much as possible.
The change in this patch runs this check not only at timer
interrupts, but at all (device) interrupts. The effect is that:
1) the deferred timers/range timers get delayed less
2) the range timers cause fewer wakeups by themselves because
the percentage of hitchhiking on existing wakeup events goes up.
I've verified that the patch works using "perf timechart"; the
originally exposed bug is gone with this patch. Frans also reported
success - the latencies are now down in the expected ~10 msec
range.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Tested-by: Frans Pop <elendil@planet.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <20091008064041.67219b13@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Remove the big kernel lock from msr_open() as it doesn't protect
anything there.
The only racy event that can happen here is a concurrent cpu shutdown.
So let's look at what could be racy during/after the above event:
- The cpu_online() check is racy, but the bkl doesn't help with
that anyway: it disables preemption, but we may be checking a
cpu other than the current one.
Also the cpu can still become offlined between open and read calls.
- The cpu_data(cpu) returns a safe pointer too. It won't be released on
cpu offlining. But some fields can be changed from
arch/x86/kernel/smpboot.c:remove_siblinginfo() :
- phys_proc_id
- cpu_core_id
Those are not read from msr_open(). What we are checking is the
x86_capability that is left untouched on offlining.
So this removal looks safe.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Sven-Thorsten Dietrich <sdietrich@suse.de>
LKML-Reference: <1254944602-7382-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The current bound checks for copy_from_user in the MTRR driver are
not as obvious as they could be, and gcc agrees with that.
This patch simplifies the boundary checks to the point that gcc can
now prove to itself that the copy_from_user() is never going past
its bounds.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20090926205150.30797709@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Make decoding of MCEs happen only on AMD hardware by registering a
non-default callback only on CPU families which support it.
While looking at the interaction of decode_mce() with the other MCE
code I also noticed a few other things and made the following
cleanups/fixes:
- Fixed the mce_decode() weak alias - a weak alias is really not
good here, it should be a proper callback. A weak alias will be
overridden if a piece of code is built into the kernel - not
good, obviously.
- The patch initializes the callback on AMD family 10h and 11h.
- Added the more correct fallback printk of:
No support for human readable MCE decoding on this CPU type.
Transcribe the message and run it through 'mcelog --ascii' to decode.
on CPUs that don't have a decoder.
- Made the surrounding code more readable.
Note that the callback allows us to have a default fallback -
without having to check the CPU versions during the printout
itself. When an EDAC module registers itself, it can install the
decode-print function.
(there's no unregister needed as this is core code.)
version -v2 by Borislav Petkov:
- add K8 to the set of supported CPUs
- always build in edac_mce_amd since we use an early_initcall now
- fix checkpatch warnings
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andi Kleen <andi@firstfloor.org>
LKML-Reference: <20091001141432.GA11410@aftab>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add a general per-cpu notifier that is called whenever the kernel is
about to return to userspace. The notifier uses a thread_info flag
and existing checks, so there is no impact on user return or context
switch fast paths.
This will be used initially to speed up KVM task switching by lazily
updating MSRs.
Signed-off-by: Avi Kivity <avi@redhat.com>
LKML-Reference: <1253342422-13811-1-git-send-email-avi@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Commit c953094 ("early_printk: Allow more than one early console")
introduced a regression in the parsing of the earlyprintk= kernel
arguments.
If you specify "earlyprintk=serial,ttyS0,115200" as a kernel
argument, the "serial,ttyS" should be parsed as a single argument
and not as "serial" and then "ttyS".
Also update the documentation to reflect you can specify the ttyS
directly without the "serial" argument.
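The intended grouping can be sketched as a small stand-alone parser. This
is not the kernel's parsing code: parse_early_serial() is a hypothetical
helper, it handles only the "serial" and "ttyS<n>" spellings, and the
9600 default baud rate is an assumption of the sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Accepts "serial", "serial,ttyS<n>[,<baud>]" or "ttyS<n>[,<baud>]".
 * "serial,ttyS<n>" is consumed as one unit, so "ttyS" is never seen
 * as a separate console name. Returns 0 on success, -1 otherwise.
 */
static int parse_early_serial(const char *arg, int *port, long *baud)
{
	char *end;

	*port = 0;
	*baud = 9600;	/* assumed default */

	if (strncmp(arg, "serial", 6) == 0) {
		arg += 6;
		if (*arg == '\0')
			return 0;	/* plain "serial": defaults */
		if (*arg != ',')
			return -1;
		arg++;			/* skip the comma, expect ttyS<n> */
	}

	if (strncmp(arg, "ttyS", 4) != 0)
		return -1;
	arg += 4;

	*port = (int)strtol(arg, &end, 10);
	if (end == arg)
		return -1;		/* no port number given */

	if (*end == ',')
		*baud = strtol(end + 1, NULL, 10);
	return 0;
}
```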
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg KH <gregkh@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
LKML-Reference: <4ABB7D5E.6000301@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched_clock: Fix atomicity/continuity bug by using cmpxchg64()
x86: Provide an alternative() based cmpxchg64()
cmpxchg64() today generates, to quote Linus, "barf bag" code.
cmpxchg64() is about to get used in the scheduler to fix a bug there,
but it's a prerequisite that cmpxchg64() first be made non-sucking.
This patch turns cmpxchg64() into an efficient implementation that
uses the alternative() mechanism to just use the raw instruction on
all modern systems.
Note: the fallback is NOT smp safe, just like the current fallback
is not SMP safe. (Interested parties with i486 based SMP systems
are welcome to submit fix patches for that.)
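For readers unfamiliar with how a 64-bit compare-and-exchange is typically
used, here is a user-space sketch of a monotonic clock update. It uses C11
<stdatomic.h> rather than the kernel's cmpxchg64(), and the function name
is made up; it shows the usage pattern only, not the implementation added
by this patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Move *clock forward to `candidate`, never backwards, without a lock.
 * Returns the value in effect after the call.
 */
static uint64_t clock_update_monotonic(_Atomic uint64_t *clock,
				       uint64_t candidate)
{
	uint64_t old = atomic_load(clock);

	while (old < candidate) {
		/*
		 * Succeeds only if *clock still equals `old`; on failure
		 * `old` is refreshed with the current value and we retry.
		 */
		if (atomic_compare_exchange_weak(clock, &old, candidate))
			return candidate;
	}
	return old;	/* someone else already moved the clock past us */
}
```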
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
[ fixed asm constraint bug ]
Fixed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090930170754.0886ff2e@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This reverts commit 22223c9b41, as
requested by Andi Kleen:
"Obviously kernels compiled with AMD support can still run on non AMD
systems, so messages like this can never be removed at compile time."
Requested-by: Andi Kleen <andi@firstfloor.org>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Remove redundant non-NUMA topology functions
x86: early_printk: Protect against using the same device twice
x86: Reduce verbosity of "PAT enabled" kernel message
x86: Reduce verbosity of "TSC is reliable" message
x86: mce: Use safer ways to access MCE registers
x86: mce, inject: Use real inject-msg in raise_local
x86: mce: Fix thermal throttling message storm
x86: mce: Clean up thermal throttling state tracking code
x86: split NX setup into separate file to limit unstack-protected code
xen: check EFER for NX before setting up GDT mapping
x86: Cleanup linker script using new linker script macros.
x86: Use section .data.page_aligned for the idt_table.
x86: convert to use __HEAD and HEAD_TEXT macros.
x86: convert compressed loader to use __HEAD and HEAD_TEXT macros.
x86: fix fragile computation of vsyscall address
gcc (4.x) supports the __builtin_object_size() builtin, which
reports the size of an object that a pointer points to, when known
at compile time. If the buffer size is not known at compile time, a
constant -1 is returned.
This patch uses this feature to add a sanity check to
copy_from_user(); if the target buffer is known to be smaller than
the copy size, the copy is aborted and a WARNing is emitted in
memory debug mode.
These extra checks compile away when the object size is not known,
or if both the buffer size and the copy length are constants.
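The same idea can be tried in user space. The sketch below wraps memcpy()
rather than copy_from_user(), and both checked_copy() and
checked_copy_sized() are names invented for the example; only
__builtin_object_size() itself is the real compiler builtin.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Refuse the copy when the destination size is known and too small.
 * dst_size == (size_t)-1 means "unknown", so the check is skipped.
 * Returns 0 if the copy was done, -1 if it was refused.
 */
static int checked_copy_sized(void *dst, size_t dst_size,
			      const void *src, size_t n)
{
	if (dst_size != (size_t)-1 && n > dst_size)
		return -1;
	memcpy(dst, src, n);
	return 0;
}

/* Let the compiler supply the destination size where it knows it. */
#define checked_copy(dst, src, n) \
	checked_copy_sized((dst), __builtin_object_size((dst), 0), (src), (n))
```

Whether the compiler can determine the size at a given call site depends
on the compiler and the optimization level, which is why the unknown case
has to fall through to the plain copy.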
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
LKML-Reference: <20090926143301.2c396b94@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
If you use the kernel argument:
earlyprintk=serial,ttyS0,115200
This will cause a recursive hang printing the same line
again and again:
BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data)
BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
bootconsole [earlyser0] enabled
Linux version 2.6.31-07863-gb64ada6 (mingo@sirius) (gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC) ) #16789 SMP Wed Sep 23 21:09:43 CEST 2009
Linux version 2.6.31-07863-gb64ada6 (mingo@sirius) (gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC) ) #16789 SMP Wed Sep 23 21:09:43 CEST 2009
Linux version 2.6.31-07863-gb64ada6 (mingo@sirius) (gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC) ) #16789 SMP Wed Sep 23 21:09:43 CEST 2009
Linux version 2.6.31-07863-gb64ada6 (mingo@sirius) (gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC) ) #16789 SMP Wed Sep 23 21:09:43 CEST 2009
Linux version 2.6.31-07863-gb64ada6 (mingo@sirius) (gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC) ) #16789 SMP Wed Sep 23 21:09:43 CEST 2009
Instead warn the end user that they specified the device
a second time, and ignore that second console.
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg KH <gregkh@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <4ABAAB89.1080407@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
On modern systems, the kernel prints the message
Skipping synchronization checks as TSC is reliable.
once for every non-boot CPU.
This gets kind of ridiculous on huge systems; for example, on a
64-thread system I was lucky enough to get:
$ dmesg | grep 'TSC is reliable' | wc
63 567 4221
There's no point to doing this for every CPU, since the code is
just checking the boot CPU anyway, so change this to a
printk_once() to make the message appear only once.
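The print-once pattern can be modelled in user space as below. This is a
sketch, not the kernel macro: print_once(), the messages_printed counter
and check_tsc_sync_for_cpu() are invented for the example, the macro
relies on the GNU ##__VA_ARGS__ extension, and it is not thread-safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static int messages_printed;	/* demo-only counter */

/* Each expansion gets its own static flag, so each call site prints once. */
#define print_once(fmt, ...)					\
	do {							\
		static bool printed_already;			\
		if (!printed_already) {				\
			printed_already = true;			\
			printf(fmt, ##__VA_ARGS__);		\
			messages_printed++;			\
		}						\
	} while (0)

/* Stand-in for the per-CPU check that used to print every time. */
static void check_tsc_sync_for_cpu(int cpu)
{
	(void)cpu;
	print_once("Skipping synchronization checks as TSC is reliable.\n");
}
```

Calling check_tsc_sync_for_cpu() for CPUs 1 through 63 emits the message
a single time instead of 63 times.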
Signed-off-by: Roland Dreier <rolandd@cisco.com>
LKML-Reference: <adazl8l2swc.fsf@cisco.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>