The mpc_apic_id setup is handled by a x86_quirk. Make it a
x86_init_ops function with a default implementation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
32bit and also the numaq code have special requirements on the
ioapic_id setup. Convert it to a x86_init_ops function and get rid
of the quirks and #ifdefs
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The x86 quirkification introduced an extra ugly hackery with a
variable pointer in the mpparse code. If the pointer is initialized
then it is dereferenced and the variable set to 0 or incremented.
Create a x86_init_ops function and let the affected numaq code
hold the function. Default init is a setup noop.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
memory_setup is overridden by x86_quirks and by paravirts with weak
functions and quirks. Unify the whole mess and make it an
unconditional x86_init_ops function which defaults to the standard
function and can be overridden by the early platform code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
As far as I see there is no external poking of mp_lapic_addr in
this procedure which could lead to unpredited changes and
require local storage unit for it. Lets use it plain forward.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20090826171324.GC4548@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2.6.31-rc7 does not boot on vSMP systems:
[ 8.501108] CPU31: Thermal monitoring enabled (TM1)
[ 8.501127] CPU 31 MCA banks SHD:2 SHD:3 SHD:5 SHD:6 SHD:8
[ 8.650254] CPU31: Intel(R) Xeon(R) CPU E5540 @ 2.53GHz stepping 04
[ 8.710324] Brought up 32 CPUs
[ 8.713916] Total of 32 processors activated (162314.96 BogoMIPS).
[ 8.721489] ERROR: parent span is not a superset of domain->span
[ 8.727686] ERROR: domain->groups does not contain CPU0
[ 8.733091] ERROR: groups don't span domain->span
[ 8.737975] ERROR: domain->cpu_power not set
[ 8.742416]
Ravikiran Thirumalai bisected it to:
| commit 2759c3287d
| x86: don't call read_apic_id if !cpu_has_apic
The problem is that on vSMP systems the CPUID derived
initial-APICIDs are overlapping - so we need to fall
back on hard_smp_processor_id() which reads the local
APIC.
Both come from the hardware (influenced by firmware
though) so it's a tough call which one to trust.
Doing the quirk expresses the vSMP property properly
and also does not affect other systems, so we go for
this solution instead of a revert.
Reported-and-Tested-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Shai Fultheim <shai@scalex86.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <4A944D3C.5030100@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
alloc_bootmem() already panics on allocation failure. There is
no need to check the result.
Also there is a way to unbind global variable from its body and
use it as a parameter which allow us to simplify
ioapic_init_mappings as well -- "for" cycle already uses
nr_ioapics as a conditional variable and there is no need to
check if ioapic_setup_resources was returning NULL again.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20090824175551.493629148@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The default_send_IPI_mask_logical() function uses the "flat" APIC mode
to send an IPI to a set of CPU's at once, but if that set happens to be
empty, some older local APIC's will apparently be rather unhappy. So
just warn if a caller gives us an empty mask, and ignore it.
This fixes a regression in 2.6.30.x, due to commit 4595f9620 ("x86:
change flush_tlb_others to take a const struct cpumask"), documented
here:
http://bugzilla.kernel.org/show_bug.cgi?id=13933
which causes a silent lock-up. It only seems to happen on PPro, P2, P3
and Athlon XP cores. Most developers sadly (or not so sadly, if you're
a developer..) have more modern CPU's. Also, on x86-64 we don't use the
flat APIC mode, so it would never trigger there even if the APIC didn't
like sending an empty IPI mask.
Reported-by: Pavel Vilim <wylda@volny.cz>
Reported-and-tested-by: Thomas Björnell <thomas.bjornell@gmail.com>
Reported-and-tested-by: Martin Rogge <marogge@onlinehome.de>
Cc: Mike Travis <travis@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The function uv_acpi_madt_oem_check() has been marked __init,
the struct apic_x2apic_uv_x has been marked __refdata.
The aim is to address the following section mismatch messages:
WARNING: arch/x86/kernel/apic/built-in.o(.data+0x1368): Section mismatch in reference from the variable apic_x2apic_uv_x to the function .cpuinit.text:uv_wakeup_secondary()
The variable apic_x2apic_uv_x references
the function __cpuinit uv_wakeup_secondary()
If the reference is valid then annotate the
variable with __init* or __refdata (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,
WARNING: arch/x86/kernel/built-in.o(.data+0x68e8): Section mismatch in reference from the variable apic_x2apic_uv_x to the function .cpuinit.text:uv_wakeup_secondary()
The variable apic_x2apic_uv_x references
the function __cpuinit uv_wakeup_secondary()
If the reference is valid then annotate the
variable with __init* or __refdata (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,
WARNING: arch/x86/built-in.o(.text+0x7b36f): Section mismatch in reference from the function uv_acpi_madt_oem_check() to the function .init.text:early_ioremap()
The function uv_acpi_madt_oem_check() references
the function __init early_ioremap().
This is often because uv_acpi_madt_oem_check lacks a __init
annotation or the annotation of early_ioremap is wrong.
WARNING: arch/x86/built-in.o(.text+0x7b38d): Section mismatch in reference from the function uv_acpi_madt_oem_check() to the function .init.text:early_iounmap()
The function uv_acpi_madt_oem_check() references
the function __init early_iounmap().
This is often because uv_acpi_madt_oem_check lacks a __init
annotation or the annotation of early_iounmap is wrong.
WARNING: arch/x86/built-in.o(.data+0x8668): Section mismatch in reference from the variable apic_x2apic_uv_x to the function .cpuinit.text:uv_wakeup_secondary()
The variable apic_x2apic_uv_x references
the function __cpuinit uv_wakeup_secondary()
If the reference is valid then annotate the
variable with __init* or __refdata (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,
Signed-off-by: Leonardo Potenza <lpotenza@inwind.it>
LKML-Reference: <200908161855.48302.lpotenza@inwind.it>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Though the most time we are to panic on irq-pin allocation
fails, for PCI interrupts it's not the case and we could
continue operate even if irq-pin allocation failed.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20090805200931.GB5319@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
KVM would like to provide x2APIC interface to a guest without emulating
interrupt remapping device. The reason KVM prefers guest to use x2APIC
is that x2APIC interface is better virtualizable and provides better
performance than mmio xAPIC interface:
- msr exits are faster than mmio (no page table walk, emulation)
- no need to read back ICR to look at the busy bit
- one 64 bit ICR write instead of two 32 bit writes
- shared code with the Hyper-V paravirt interface
Included patch changes x2APIC enabling logic to enable it even if IR
initialization failed, but kernel runs under KVM and no apic id is
greater than 255 (if there is one spec requires BIOS to move to x2apic
mode before starting an OS).
-v2: fix build
-v3: fix bug causing compiler warning
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Sheng Yang <sheng@linux.intel.com>
Cc: "avi@redhat.com" <avi@redhat.com>
LKML-Reference: <20090720122417.GR5638@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
cpu_has_apic has already investigated boot_cpu_data
X86_FEATURE_APIC bit for being clear if condition is
triggered.
So there is no need to clear this bit second time.
Signed-off-by: Cyrill Gorcuno v <gorcunov@openvz.org>
Cc: "Maciej W. Rozycki" <macro@linux-mips.org>
LKML-Reference: <20090722205259.GE15805@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In uv_setup_irq(), the call to create_irq() initially assigns
IRQ vectors to cpu 0. The subsequent call to
assign_irq_vector() in arch_enable_uv_irq() migrates the IRQ to
another cpu and frees the cpu 0 vector - at least it will be
freed as soon as the "IRQ move" completes.
arch_enable_uv_irq() needs to send a cleanup IPI to complete
the IRQ move. Otherwise, assignment of GRU interrupts on large
systems (>200 cpus) will exhaust the cpu 0 interrupt vectors
and initialization of the GRU driver will fail.
Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20090720142840.GA8885@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Change SGI UV default apicid mode to "physical". This is
required to match settings in the UV hub chip.
Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20090727143856.GA8905@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The UV BIOS has added additional MMR ranges that are mapped via
EFI virtual mode mappings. These ranges should be deleted from
ranges mapped by uv_system_init().
Signed-off-by: Jack Steiner <steiner@sgi.com>
Cc: linux-mm@kvack.org
LKML-Reference: <20090727143656.GA7698@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
UV blades may not have any blade-local memory. Add a field
(nid) to the UV blade structure to indicates whether the node
has local memory. This is needed by the GRU driver (pushed
separately).
Signed-off-by: Jack Steiner <steiner@sgi.com>
Cc: linux-mm@kvack.org
LKML-Reference: <20090727143507.GA7006@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
As Andrew noted, my previous patch ("debug lockups: Improve lockup
detection") broke/removed SysRq-L support from architecture that do
not provide a __trigger_all_cpu_backtrace implementation.
Restore a fallback path and clean up the SysRq-L machinery a bit:
- Rename the arch method to arch_trigger_all_cpu_backtrace()
- Simplify the define
- Document the method a bit - in the hope of more architectures
adding support for it.
[ The patch touches Sparc code for the rename. ]
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "David S. Miller" <davem@davemloft.net>
LKML-Reference: <20090802140809.7ec4bb6b.akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When debugging a recent lockup bug i found various deficiencies
in how our current lockup detection helpers work:
- SysRq-L is not very efficient as it uses a workqueue, hence
it cannot punch through hard lockups and cannot see through
most soft lockups either.
- The SysRq-L code depends on the NMI watchdog - which is off
by default.
- We dont print backtraces from the RCU code's built-in
'RCU state machine is stuck' debug code. This debug
code tends to be one of the first (and only) mechanisms
that show that a lockup has occured.
This patch changes the code so taht we:
- Trigger the NMI backtrace code from SysRq-L instead of using
a workqueue (which cannot punch through hard lockups)
- Trigger print-all-CPU-backtraces from the RCU lockup detection
code
Also decouple the backtrace printing code from the NMI watchdog:
- Dont use variable size cpumasks (it might not be initialized
and they are a bit more fragile anyway)
- Trigger an NMI immediately via an IPI, instead of waiting
for the NMI tick to occur. This is a lot faster and can
produce more relevant backtraces. It will also work if the
NMI watchdog is disabled.
- Dont print the 'dazed and confused' message when we print
a backtrace from the NMI
- Do a show_regs() plus a dump_stack() to get maximum info
out of the dump. Worst-case we get two stacktraces - which
is not a big deal. Sometimes, if register content is
corrupted, the precise stack walker in show_regs() wont
give us a full backtrace - in this case dump_stack() will
do it.
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The 32 and 64-bit versions of ioapic_retrigger_irq() are identical
except the 64-bit one takes vector_lock. vector_lock is defined and
used on 32-bit too, so just use a common ioapic_retrigger_irq().
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
There's no need for a control variable in replace_pin_at_irq_node();
it can just return if it finds the old apic/pin to replace.
If the loop terminates, then it didn't find the old apic/pin, so it can
add the new ones.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Rather than duplicating the same alloc/init code twice, restructure
the function to look for duplicates and then add an entry
if none is found.
This function is not performance critical; all but one of its callers
are __init functions, and the non-__init caller is for PCI device setup.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Convert the unconventional loop in io_apic_level_ack_pending() to
a conventional for() loop.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
The comment got separated from its subject, so move it to what
appears to be the right place, and update to describe the current
structure.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
The structure is defined immediately below, so there's no need
to forward declare it.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
While no 64-bit hardware will have a version 0x11 I/O APIC which needs
the level/edge bug workaround, that's not a particular reason to use
CONFIG_X86_32 to #ifdef the code out. Most 32-bit machines will no
longer need the workaround either, so the test to see whether it is
necessary should be more fine-grained than "32-bit=yes, 64-bit=no".
(Also fix formatting of block comment.)
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
The main difference between 32 and 64-bit __mask_IO_APIC_irq() does a
readback from the I/O APIC to synchronize it.
If there's a hardware requirement to do a readback sync after updating
an APIC register, then it will be a hardware requrement regardless of
whether the kernel is compiled 32 or 64-bit.
Unify __mask_IO_APIC_irq() using the 64-bit version which always syncs
with io_apic_sync().
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
If ioapic_modify_irq() is marked inline, it gets inlined several times.
Un-inlining it saves around 200 bytes in .text for me.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
The variable apic_numaq placed in noninit section references the
function wakeup_secondary_cpu_via_nmi(), which is in __cpuinit
section. Thus causes a section mismatch warning. To avoid such
mismatch we mark apic_numaq as __refdata.
We were warned by the following warning:
WARNING: arch/x86/kernel/built-in.o(.data+0x932c): Section mismatch in
reference from the variable apic_numaq to the function
.cpuinit.text:wakeup_secondary_cpu_via_nmi()
Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
LKML-Reference: <b9df5fa10907120407p6b4f67dtf4d563155488188a@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The variable apic_es7000_cluster references the function __cpuinit
wakeup_secondary_cpu_via_mip() from a noninit section. So we've been
warned by the following warning. To avoid possible collision between
init/noninit, its best to mark the variable as __refdata.
We were warned by the following warning:
LD arch/x86/kernel/apic/built-in.o
WARNING: arch/x86/kernel/apic/built-in.o(.data+0x198c): Section
mismatch in reference from the variable apic_es7000_cluster to the
function .cpuinit.text:wakeup_secondary_cpu_via_mip()
Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
LKML-Reference: <b9df5fa10907120404k6279a10ch5e9682432272706f@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Stephen reported that his DL585 G2 needed noapic after 2.6.22 (?)
Dann bisected it down to:
commit 30a18d6c3f
Date: Tue Feb 19 03:21:20 2008 -0800
x86: multi pci root bus with different io resource range, on
64-bit
It turns out that:
1. that AMD-based systems have two HT chains.
2. BIOS doesn't allocate resources for BAR 6 of devices under 8132 etc
3. that multi-peer-root patch will try to split root resources to peer
root resources according to PCI conf of NB
4. PCI core assigns unassigned resources, but they overlap with BARs
that are used by ioapic addr of io4 and 8132.
The reason: at that point ioapic address are not inserted yet. Solution
is to insert ioapic resources into the tree a bit earlier.
Reported-by: Stephen Frost <sfrost@snowman.net>
Reported-and-Tested-by: dann frazier <dannf@hp.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: stable@kernel.org
Signed-off-by: Jesse Barnes <jbarnes@jbarnes-g45.(none)>
We already use a lot of cpu_has_ helpers.
Lets do here the same for consistency.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20090705160154.GB4791@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Drop the CONFIG_X86_NEW_MCE symbol and change all
references to it to check for CONFIG_X86_MCE directly.
No code changes
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Commit 5fd29d6ccb ("printk: clean up
handling of log-levels and newlines") changed printk semantics. printk
lines with multiple KERN_<level> prefixes are no longer emitted as
before the patch.
<level> is now included in the output on each additional use.
Remove all uses of multiple KERN_<level>s in formats.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
To support domain-isolation usages, the platform hardware must be
capable of uniquely identifying the requestor (source-id) for each
interrupt message. Without source-id checking for interrupt remapping
, a rouge guest/VM with assigned devices can launch interrupt attacks
to bring down anothe guest/VM or the VMM itself.
This patch adds source-id checking for interrupt remapping, and then
really isolates interrupts for guests/VMs with assigned devices.
Because PCI subsystem is not initialized yet when set up IOAPIC
entries, use read_pci_config_byte to access PCI config space directly.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>