native_{rdmsr,wrmsr}_safe_regs are two new interfaces which allow
presetting of a subset of eight x86 GPRs before executing the rd/wrmsr
instructions. This is needed at least on AMD K8 for accessing an erratum
workaround MSR.
Originally based on an idea by H. Peter Anvin.
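A minimal userspace sketch of how such an interface might be used,
assuming the registers are passed as an array of eight 32-bit slots.
The slot order, the mock MSR number, the key value and all function
names below are assumptions for illustration and are not taken from
the patch; no real rdmsr instruction is executed. The only
architectural fact relied on is that rdmsr takes the MSR index in ECX
and returns the value in EDX:EAX.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed slot order for this sketch only. */
enum {
	REG_EAX, REG_ECX, REG_EDX, REG_EBX,
	REG_ESP, REG_EBP, REG_ESI, REG_EDI,
	NR_GPRS
};

/* Hypothetical MSR which only answers when EDI holds a key value. */
#define MOCK_MSR	0x1234u
#define MOCK_KEY	0xabcd0001u

/* Mock of a "_safe_regs" style read: returns 0 on success, -1 on fault. */
static int mock_rdmsr_safe_regs(uint32_t regs[NR_GPRS])
{
	if (regs[REG_ECX] != MOCK_MSR || regs[REG_EDI] != MOCK_KEY)
		return -1;	/* stands in for the trapped fault */

	regs[REG_EAX] = 0xdeadbeefu;	/* low 32 bits of the mock value */
	regs[REG_EDX] = 0x0u;		/* high 32 bits of the mock value */
	return 0;
}

/* Preset the GPRs the mock MSR cares about, then read it. */
static int read_protected_msr(uint64_t *val)
{
	uint32_t regs[NR_GPRS] = { 0 };

	assert(val);
	regs[REG_ECX] = MOCK_MSR;
	regs[REG_EDI] = MOCK_KEY;

	if (mock_rdmsr_safe_regs(regs))
		return -1;

	*val = ((uint64_t)regs[REG_EDX] << 32) | regs[REG_EAX];
	return 0;
}
```

Passing the whole register set as one array keeps the caller free of
inline assembly and lets the same helper serve both reads and writes.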
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
LKML-Reference: <1251705011-18636-1-git-send-email-petkovbb@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
boot_cpu_physical_apicid is a global variable and used as function
argument as well. Rename the function arguments to avoid confusion.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The proposed Moorestown support patches use an extra feature flag
mechanism to make the ioapic work w/o an i8259. There is a much
simpler solution.
Most i8259 specific functions are already called depending on the irq
number being less than NR_IRQS_LEGACY. Replacing that constant by a
read_mostly variable which can be set to 0 by the platform setup code
allows us to achieve the same without any special feature flags.
That trivial change allows us to proceed with MRST w/o doing a full
blown overhaul of the ioapic code which would delay MRST unduly.
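The idea can be sketched as follows. This is an illustration only:
the variable name nr_legacy_irqs, the platform hook and the helper are
hypothetical stand-ins, not the code of the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_IRQS_LEGACY	16	/* size of the i8259 irq range */

/* Hypothetical read-mostly variable replacing the constant. */
static int nr_legacy_irqs = NR_IRQS_LEGACY;

/* Hypothetical platform setup hook for a platform without an i8259. */
static void platform_setup_no_i8259(void)
{
	nr_legacy_irqs = 0;
}

/* True if the irq has to go through the i8259 specific code paths. */
static bool irq_is_legacy(int irq)
{
	assert(nr_legacy_irqs >= 0 && nr_legacy_irqs <= NR_IRQS_LEGACY);
	return irq >= 0 && irq < nr_legacy_irqs;
}
```

Once the platform code has set the variable to 0, every existing
"irq < legacy count" check is false and the i8259 paths are skipped
without a separate feature flag.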
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Moorestown MID devices need to be detected early in the boot process
so that they get their own setup and do not call
x86_default_early_setup, as there is no EBDA region to reserve.
[ Copied the minimal code from Jacob's latest MRST series ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jacob Pan <jacob.jun.pan@intel.com>
x86 boot protocol 2.07 introduced a hardware_subarch ID in the boot
parameters provided by the firmware. We use it to identify Moorestown
platforms.
[ tglx: Cleanup and paravirt fix ]
Signed-off-by: Jacob Pan <jacob.jun.pan@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Platforms like Moorestown require early setup and want to avoid the
call to reserve_ebda_region. The x86_init override is too late when
the MRST detection happens in setup_arch. Move the default i386
x86_init overrides and the call to reserve_ebda_region into a separate
function which is called as the default of a switch case depending on
the hardware_subarch id in boot params. This allows us to add a case
for MRST and let MRST have its own early setup function.
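A rough sketch of the dispatch described above. The struct, the enum
values and the function names are assumptions made for the
illustration; the two setup functions only record that they ran.

```c
#include <assert.h>

/* Subarch ids; the numeric values are assumptions for this sketch. */
enum subarch_sketch {
	SUBARCH_PC	= 0,
	SUBARCH_MRST	= 3,
};

/* Hypothetical stand-in for the relevant part of the boot params. */
struct boot_params_sketch {
	unsigned int hardware_subarch;
};

static int ebda_reserved;	/* set by the default early setup */
static int mrst_setup_done;	/* set by the MRST early setup */

/* Stands for the default i386 overrides plus reserve_ebda_region. */
static void default_early_setup_sketch(void)
{
	ebda_reserved = 1;
}

/* Stands for a MRST specific early setup which skips the EBDA. */
static void mrst_early_setup_sketch(void)
{
	mrst_setup_done = 1;
}

static void early_platform_setup(const struct boot_params_sketch *bp)
{
	assert(bp);

	switch (bp->hardware_subarch) {
	case SUBARCH_MRST:
		mrst_early_setup_sketch();
		break;
	default:
		default_early_setup_sketch();
		break;
	}
}
```

Keeping the standard PC path in the default case means an unknown
subarch id still gets the conventional setup.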
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We do not need the TSC before late_time_init. Move the tsc_init to the
late time init code so we can also utilize HPET for calibration (which
we claimed to do but never did except in some older kernel
version). This also helps Moorestown to calibrate the TSC with the
AHBT timer which needs to be initialized in late_time_init like HPET.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
TSC calibration is modified by the vmware hypervisor and paravirt by
separate means. Moorestown wants to add its own calibration routine as
well. So make calibrate_tsc a proper x86_init_ops function and
override it by paravirt or by the early setup of the vmware
hypervisor.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Move the code to where its only user is. Also we need to look whether
this hardwired hackery might interfere with perfcounters.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The timer and timer irq setup code is identical in 32 and 64 bit. Give
it the same formatting as well. Also add the global variables under
the necessary ifdefs to both files.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
MCA_bus is constant 0 when CONFIG_MCA=n. So the compiler removes that
code w/o needing an extra #ifdef
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Let the compiler optimize the timer_ack magic away in the 32bit timer
interrupt and put the same code into time_64.c. It's optimized out for
CONFIG_X86_IO_APIC on 32bit and for 64bit because timer_ack is const 0
in both cases.
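A small sketch of why no #ifdef is needed at the call site, assuming
timer_ack is a compile time constant 0 in the configurations named
above. SKETCH_NEEDS_TIMER_ACK and the helper names are hypothetical.

```c
#include <assert.h>

#ifdef SKETCH_NEEDS_TIMER_ACK
static int timer_ack = 1;	/* real variable, decided at runtime */
#else
#define timer_ack 0		/* constant: the branch below is dead code */
#endif

static int acks_sent;

static void ack_timer_sketch(void)
{
	acks_sent++;
}

/* The same source works for both cases without an #ifdef here. */
static void timer_interrupt_sketch(void)
{
	if (timer_ack)
		ack_timer_sketch();
	/* ... common timer interrupt work would follow here ... */
}

/* Sanity check: with timer_ack == 0 no ack is ever sent. */
static void timer_sketch_selfcheck(void)
{
	timer_interrupt_sketch();
	assert(timer_ack || acks_sent == 0);
}
```

With the constant definition an optimizing compiler can drop the
whole if statement, which is the effect the commit relies on.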
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This is a leftover of the old x86 sub arch support. Remove it and
open code it like we do in time_64.c.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The timer init code is convoluted with several quirks and the paravirt
timer chooser. Figuring out which code path is actually taken is not
for the faint hearted.
Move the numaq TSC quirk to tsc_pre_init x86_init_ops function and
replace the paravirt time chooser and the remaining x86 quirk with a
simple x86_init_ops function.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
paravirt overrides the setup of the default apic timers as per cpu
timers. Moorestown needs to override that as well.
Move it to x86_init_ops setup and create a separate x86_cpuinit struct
which holds the function for the secondary, possibly hotpluggable CPUs.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We really do not need two paravirt/x86_init_ops functions which are
called in two consecutive source lines. Move the only user of
post_allocator_init into the already existing pagetable_setup_done
function.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Replace another obscure paravirt magic and move it to
x86_init_ops. Such a hook is also useful for embedded and special
hardware.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
ARCH_SETUP is a horrible leftover from the old arch/i386 mach support
code. It still has a lonely user in xen. Move it to x86_init_ops.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
irq_init is overridden by x86_quirks and by paravirts. Unify the whole
mess and make it an unconditional x86_init_ops function which defaults
to the standard function and can be overridden by the early platform
code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Replace the quirk machinery by an x86_init_ops function which
defaults to the standard implementation. This is also a preparatory
patch for Moorestown support which needs to replace the default
init_ISA_irqs as well.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Replace the quirk machinery by an x86_init_ops function which defaults
to the standard implementation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Call BUG() when a probe has been hit in the kprobe processing
path, because probes of that kind are currently unrecoverable
(recovering from one would cause an infinite loop and a stack overflow).
The original code seems to assume that this is caused by an int3
which another subsystem inserted into the out-of-line singlestep buffer
if the probe that was hit is the same as the current probe. However, in
that case the int3 address is in the out-of-line buffer and should be
different from the first (current) int3 address.
Thus, I decided to remove the code.
I also remove arch_disarm_kprobe() because it would involve other code
in text_poke().
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <20090827172258.8246.61889.stgit@localhost.localdomain>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Since parse_early_param() may (e.g. for earlyprintk=dbgp)
involve calls to page table manipulation functions (here
set_fixmap_nocache()), NX hardware support must be determined
before calling that function (so that __supported_pte_mask gets
properly set up).
But the call after parse_early_param() cannot go away either, as
it is the one that honors a possible disabling of the NX
functionality on the command line.
( This will then just result in whatever mappings got
established during parse_early_param() having the NX bit set
despite it being disabled on the command line, but I think
that's tolerable).
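The ordering described above can be modelled roughly like this. All
names ending in _sketch, the flag variables and the fake early mapping
are assumptions for the illustration; only the position of the NX bit
(bit 63 of a page table entry) is an architectural fact.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_NX	(1ULL << 63)

static bool cpu_has_nx_sketch = true;	/* pretend the CPU supports NX */
static bool nx_disabled_on_cmdline;
static uint64_t supported_pte_mask = ~PTE_NX;	/* NX not yet known */
static uint64_t early_fixmap_pte;	/* mapping made while parsing */

/* Set or clear NX in the mask from CPU support and command line. */
static void check_nx_sketch(void)
{
	if (cpu_has_nx_sketch && !nx_disabled_on_cmdline)
		supported_pte_mask |= PTE_NX;
	else
		supported_pte_mask &= ~PTE_NX;
	assert(cpu_has_nx_sketch || !(supported_pte_mask & PTE_NX));
}

/* Models an early param handler which maps a page, then a "noexec" off. */
static void parse_early_param_sketch(bool noexec_off)
{
	early_fixmap_pte = (0x1000ULL | PTE_NX) & supported_pte_mask;
	nx_disabled_on_cmdline = noexec_off;
}

static void setup_sketch(bool noexec_off)
{
	check_nx_sketch();		/* new: before early param parsing */
	parse_early_param_sketch(noexec_off);
	check_nx_sketch();		/* kept: honors the command line */
}
```

Calling setup_sketch(true) leaves the early mapping with NX set while
the final mask has NX cleared, which is the tolerated side effect
mentioned in the parenthesis.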
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
LKML-Reference: <4A97F3BD02000078000121B9@vpn.id2.novell.com>
[ merged to x86/pat to resolve a conflict. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Merge reason: the SFI (Simple Firmware Interface) feature in the ACPI
tree needs this cleanup, pull it into the APIC branch as
well so that there are no interactions.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
arch/x86/kernel/sfi.c serves the dual-purpose of supporting the
SFI core with arch specific code, as well as a home for the
arch-specific code that uses SFI.
Analogous to ACPI, drivers/sfi/Kconfig is pulled in by arch/x86/Kconfig.
Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: x86@kernel.org
Some IO-APIC routines are ACPI specific now, but need to
be exposed when CONFIG_ACPI=n for the benefit of SFI.
Remove #ifdef ACPI around these routines:
io_apic_get_unique_id(int ioapic, int apic_id);
io_apic_get_version(int ioapic);
io_apic_get_redir_entries(int ioapic);
Move these routines from ACPI-specific boot.c to io_apic.c:
uniq_ioapic_id(u8 id)
mp_find_ioapic()
mp_find_ioapic_pin()
mp_register_ioapic()
Also, since uniq_ioapic_id() is now no longer static,
re-name it to io_apic_unique_id() for consistency
with the other public io_apic routines.
For simplicity, do not #ifdef the resulting code ACPI || SFI,
though that could be done in the future if it is important
to optimize the !ACPI !SFI IO-APIC x86 kernel for size.
Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: x86@kernel.org
Martin Schwidefsky analyzed it:
To register a clocksource the clocksource_mutex is acquired and if
necessary timekeeping_notify is called to install the clocksource as
the timekeeper clock. timekeeping_notify uses stop_machine which needs
to take cpu_add_remove_lock mutex.
Starting a new cpu is done with the cpu_add_remove_lock mutex held.
native_cpu_up checks the tsc of the new cpu and if the tsc is no good
clocksource_change_rating is called, which needs the clocksource_mutex,
and the deadlock is complete.
The solution is to replace the TSC via the clocksource watchdog
mechanism. Mark the TSC as unstable and schedule the watchdog work so
it gets removed in the watchdog thread context.
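A single-threaded model of the deferral, to make the lock ordering
visible. The struct, the flags and the function names are hypothetical
and the rating value is only illustrative; real locking and the real
watchdog are not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced view of a clocksource. */
struct clocksource_sketch {
	int rating;
	bool unstable;
};

static struct clocksource_sketch tsc_sketch = { .rating = 300 };
static bool watchdog_work_pending;
static bool cpu_add_remove_lock_held;	/* models the hotplug mutex */

/*
 * Called from cpu bringup with cpu_add_remove_lock held. It only
 * sets flags, so it never needs the clocksource mutex.
 */
static void mark_tsc_unstable_sketch(void)
{
	tsc_sketch.unstable = true;
	watchdog_work_pending = true;
}

/*
 * Runs later in watchdog context, outside of cpu bringup, where
 * changing the rating cannot deadlock against the hotplug mutex.
 */
static void watchdog_work_sketch(void)
{
	assert(!cpu_add_remove_lock_held);

	if (!watchdog_work_pending)
		return;
	watchdog_work_pending = false;

	if (tsc_sketch.unstable)
		tsc_sketch.rating = 0;
}
```

The bringup path and the rating change never run under the same lock
in this model, which is the property the fix is after.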
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <new-submission>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: John Stultz <johnstul@us.ibm.com>
The mpc_apic_id setup is handled by an x86_quirk. Make it an
x86_init_ops function with a default implementation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
32bit and also the numaq code have special requirements on the
ioapic_id setup. Convert it to an x86_init_ops function and get rid
of the quirks and #ifdefs.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The x86 quirkification introduced an extra ugly hack with a
variable pointer in the mpparse code. If the pointer is initialized
then it is dereferenced and the variable set to 0 or incremented.
Create an x86_init_ops function and let the affected numaq code
hold the function. Default init is a setup noop.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
memory_setup is overridden by x86_quirks and by paravirts with weak
functions and quirks. Unify the whole mess and make it an
unconditional x86_init_ops function which defaults to the standard
function and can be overridden by the early platform code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
reserve_ebda_region needs to be called before start_kernel. Moorestown
needs to override it. Make it a x86_init_ops function and initialize
it with the default reserve_ebda_region.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The 32bit and the 64bit code are slightly different in the reservation
of standard resources. Also the upcoming Moorestown support needs its
own version of that.
Add it to x86_init_ops and initialize it with the 64bit default. 32bit
overrides it in early boot. Now Moorestown can add its own override
w/o sprinkling the code with more #ifdefs.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
probe_roms is only used on 32bit. Add it to the x86_init ops and
remove the #ifdefs.
Default initializer is x86_init_noop() which is overridden in
the 32bit boot code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The upcoming Moorestown support brings the embedded world to x86. The
setup code of x86 already has a couple of hooks which are either
x86_quirks or paravirt ops. Some of those setup hooks are pretty
convoluted like the timer setup and the tsc calibration code. But
there are other places which could do with a cleanup.
Instead of having inline functions/macros which are modified at
compile time I decided to introduce x86_init ops which are
unconditional in the code and make it clear that they can be changed
either during compile time or in the early boot process. The function
pointers are initialized by default functions which can be noops so
that the pointer can be called unconditionally in most cases. This
also allows us to remove 32bit/64bit, paravirt and other #ifdeffery.
paravirt guests are just a hardware platform in the setup code, so we
should treat them as such and not hide all behind multiple layers of
indirection and compile time dependencies.
It's more obvious that x86_init.timers.timer_init() is a function
pointer than the late_time_init = choose_time_init() obscurity. It's
also way simpler to grep for x86_init.timers.timer_init and find all
the places which modify that function pointer instead of analyzing
weak functions, macros and paravirt indirections.
Note: this is not a general paravirt_ops replacement. It will just
move setup related hooks which are potentially useful for other
platform setup purposes as well out of the paravirt domain.
Add the base infrastructure without any functionality.
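The pattern can be sketched as below. The struct layout and every
name ending in _sketch are assumptions for the illustration, as is
the platform function; only the shape (an ops struct with default
function pointers that may be noops and may be overridden in early
boot) follows the description above.

```c
#include <assert.h>

/* Hypothetical, reduced ops structure with one timer hook. */
struct x86_init_timers_sketch {
	void (*timer_init)(void);
};

struct x86_init_ops_sketch {
	struct x86_init_timers_sketch timers;
};

/* Default used where a platform has nothing to do. */
static void x86_init_noop_sketch(void)
{
}

static int platform_timer_ready;

/* Hypothetical platform specific timer setup. */
static void platform_timer_init_sketch(void)
{
	platform_timer_ready = 1;
}

/* Every pointer has a default, so callers never test for NULL. */
static struct x86_init_ops_sketch x86_init_sketch = {
	.timers = {
		.timer_init = x86_init_noop_sketch,
	},
};

/* Early platform code overrides the default by plain assignment. */
static void platform_early_setup_sketch(void)
{
	x86_init_sketch.timers.timer_init = platform_timer_init_sketch;
}

/* The generic code calls the hook unconditionally. */
static void late_time_init_sketch(void)
{
	assert(x86_init_sketch.timers.timer_init);
	x86_init_sketch.timers.timer_init();
}
```

Because the override is a plain assignment to a named member, a grep
for that member finds every place that changes the behaviour.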
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Kprobes can enter into a probing recursion, i.e. a kprobe that loops
endlessly because one of the core functions used during probing is
itself probed.
This patch helps pinpoint the kprobe that raised such a recursion
by dumping it and raising a BUG instead of a warning (we also disarm
the kprobe to try to avoid recursion in BUG itself). Having a BUG
instead of a warning stops the stacktrace in the right place and
doesn't pollute the logs with hundreds of traces that eventually end
up in a stack overflow.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Reason: Change to is_new_memtype_allowed() in x86/urgent
Resolved semantic conflicts in:
arch/x86/mm/pat.c
arch/x86/mm/ioremap.c
Signed-off-by: H. Peter Anvin <hpa@zytor.com>