Commit Graph

9691 Commits

Author SHA1 Message Date
Linus Torvalds
ac07f5c3cb Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/fpu update from Ingo Molnar:
 "The biggest change is the addition of the non-lazy (eager) FPU saving
  support model and enabling it on CPUs with optimized xsaveopt/xrstor
  FPU state saving instructions.

  There are also various Sparse fixes"

Fix up trivial add-add conflict in arch/x86/kernel/traps.c

* 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, kvm: fix kvm's usage of kernel_fpu_begin/end()
  x86, fpu: remove cpu_has_xmm check in the fx_finit()
  x86, fpu: make eagerfpu= boot param tri-state
  x86, fpu: enable eagerfpu by default for xsaveopt
  x86, fpu: decouple non-lazy/eager fpu restore from xsave
  x86, fpu: use non-lazy fpu restore for processors supporting xsave
  lguest, x86: handle guest TS bit for lazy/non-lazy fpu host models
  x86, fpu: always use kernel_fpu_begin/end() for in-kernel FPU usage
  x86, kvm: use kernel_fpu_begin/end() in kvm_load/put_guest_fpu()
  x86, fpu: remove unnecessary user_fpu_end() in save_xstate_sig()
  x86, fpu: drop_fpu() before restoring new state from sigframe
  x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels
  x86, fpu: Consolidate inline asm routines for saving/restoring fpu state
  x86, signal: Cleanup ifdefs and is_ia32, is_x32
2012-10-01 11:10:52 -07:00
Linus Torvalds
58ae9c0d54 Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 debug update from Ingo Molnar:
 "Various small enhancements"

* 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/debug: Dump family, model, stepping of the boot CPU
  x86/iommu: Use NULL instead of plain 0 for __IOMMU_INIT
  x86/iommu: Drop duplicate const in __IOMMU_INIT
  x86/fpu/xsave: Keep __user annotation in casts
  x86/pci/probe_roms: Add missing __iomem annotation to pci_map_biosrom()
  x86/signals: ia32_signal.c: add __user casts to fix sparse warnings
  x86/vdso: Add __user annotation to VDSO32_SYMBOL
  x86: Fix __user annotations in asm/sys_ia32.h
2012-10-01 11:07:31 -07:00
Linus Torvalds
4a553e1450 Merge branches 'x86-cpu-for-linus' and 'x86-cpufeature-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/cpu and x86/cpufeature from Ingo Molnar:
 "One tiny cleanup, and prepare for SMAP (Supervisor Mode Access
  Prevention) support on x86"

* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Remove the useless branch in c_start()

* 'x86-cpufeature-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, cpufeature: Add feature bit for SMAP
2012-10-01 11:06:35 -07:00
Linus Torvalds
08815bc267 Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/cleanups from Ingo Molnar:
 "Smaller cleanups"

* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  arch/x86: Remove unecessary semicolons
  x86, boot: Remove obsolete and unused constant RAMDISK
2012-10-01 10:47:45 -07:00
Linus Torvalds
da8347969f Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/asm changes from Ingo Molnar:
 "The one change that stands out is the alternatives patching change
  that prevents us from ever patching back instructions from SMP to UP:
  this simplifies things and speeds up CPU hotplug.

  Other than that it's smaller fixes, cleanups and improvements."

* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Unspaghettize do_trap()
  x86_64: Work around old GAS bug
  x86: Use REP BSF unconditionally
  x86: Prefer TZCNT over BFS
  x86/64: Adjust types of temporaries used by ffs()/fls()/fls64()
  x86: Drop unnecessary kernel_eflags variable on 64-bit
  x86/smp: Don't ever patch back to UP if we unplug cpus
2012-10-01 10:46:27 -07:00
Linus Torvalds
80749df4a1 Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/apic changes from Ingo Molnar:
 "Smaller fixes and cleanups"

* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/api: Rename mp_register_lapic in a comment
  x86/irq/i8259: Fix incorrect comment
  x86: dt: Use linear irq domain for ioapic(s)
2012-10-01 10:45:54 -07:00
Linus Torvalds
4cba333582 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fix from Ingo Molnar:
 "Leftover perf/urgent fix from the v3.6 cycle"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86: Fix typo in uncore_pmu_to_box
2012-10-01 10:34:56 -07:00
Linus Torvalds
7e92daaefa Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf update from Ingo Molnar:
 "Lots of changes in this cycle as well, with hundreds of commits from
  over 30 contributors.  Most of the activity was on the tooling side.

  Higher level changes:

   - New 'perf kvm' analysis tool, from Xiao Guangrong.

   - New 'perf trace' system-wide tracing tool

   - uprobes fixes + cleanups from Oleg Nesterov.

   - Lots of patches to make perf build on Android out of box, from
     Irina Tirdea

   - Extend ftrace function tracing utility to be more dynamic for its
     users.  It allows for data passing to the callback functions, as
     well as reading regs as if a breakpoint were to trigger at function
     entry.

     The main goal of this patch series was to allow kprobes to use
     ftrace as an optimized probe point when a probe is placed on an
     ftrace nop.  With lots of help from Masami Hiramatsu, and going
     through lots of iterations, we finally came up with a good
     solution.

   - Add cpumask for uncore pmu, use it in 'stat', from Yan, Zheng.

   - Various tracing updates from Steve Rostedt

   - Clean up and improve 'perf sched' performance by elliminating lots
     of needless calls to libtraceevent.

   - Event group parsing support, from Jiri Olsa

   - UI/gtk refactorings and improvements from Namhyung Kim

   - Add support for non-tracepoint events in perf script python, from
     Feng Tang

   - Add --symbols to 'script', similar to the one in 'report', from
     Feng Tang.

  Infrastructure enhancements and fixes:

   - Convert the trace builtins to use the growing evsel/evlist
     tracepoint infrastructure, removing several open coded constructs
     like switch like series of strcmp to dispatch events, etc.
     Basically what had already been showcased in 'perf sched'.

   - Add evsel constructor for tracepoints, that uses libtraceevent just
     to parse the /format events file, use it in a new 'perf test' to
     make sure the libtraceevent format parsing regressions can be more
     readily caught.

   - Some strange errors were happening in some builds, but not on the
     next, reported by several people, problem was some parser related
     files, generated during the build, didn't had proper make deps, fix
     from Eric Sandeen.

   - Introduce struct and cache information about the environment where
     a perf.data file was captured, from Namhyung Kim.

   - Fix handling of unresolved samples when --symbols is used in
     'report', from Feng Tang.

   - Add union member access support to 'probe', from Hyeoncheol Lee.

   - Fixups to die() removal, from Namhyung Kim.

   - Render fixes for the TUI, from Namhyung Kim.

   - Don't enable annotation in non symbolic view, from Namhyung Kim.

   - Fix pipe mode in 'report', from Namhyung Kim.

   - Move related stats code from stat to util/, will be used by the
     'stat' kvm tool, from Xiao Guangrong.

   - Remove die()/exit() calls from several tools.

   - Resolve vdso callchains, from Jiri Olsa

   - Don't pass const char pointers to basename, so that we can
     unconditionally use libgen.h and thus avoid ifdef BIONIC lines,
     from David Ahern

   - Refactor hist formatting so that it can be reused with the GTK
     browser, From Namhyung Kim

   - Fix build for another rbtree.c change, from Adrian Hunter.

   - Make 'perf diff' command work with evsel hists, from Jiri Olsa.

   - Use the only field_sep var that is set up: symbol_conf.field_sep,
     fix from Jiri Olsa.

   - .gitignore compiled python binaries, from Namhyung Kim.

   - Get rid of die() in more libtraceevent places, from Namhyung Kim.

   - Rename libtraceevent 'private' struct member to 'priv' so that it
     works in C++, from Steven Rostedt

   - Remove lots of exit()/die() calls from tools so that the main perf
     exit routine can take place, from David Ahern

   - Fix x86 build on x86-64, from David Ahern.

   - {int,str,rb}list fixes from Suzuki K Poulose

   - perf.data header fixes from Namhyung Kim

   - Allow user to indicate objdump path, needed in cross environments,
     from Maciek Borzecki

   - Fix hardware cache event name generation, fix from Jiri Olsa

   - Add round trip test for sw, hw and cache event names, catching the
     problem Jiri fixed, after Jiri's patch, the test passes
     successfully.

   - Clean target should do clean for lib/traceevent too, fix from David
     Ahern

   - Check the right variable for allocation failure, fix from Namhyung
     Kim

   - Set up evsel->tp_format regardless of evsel->name being set
     already, fix from Namhyung Kim

   - Oprofile fixes from Robert Richter.

   - Remove perf_event_attr needless version inflation, from Jiri Olsa

   - Introduce libtraceevent strerror like error reporting facility,
     from Namhyung Kim

   - Add pmu mappings to perf.data header and use event names from cmd
     line, from Robert Richter

   - Fix include order for bison/flex-generated C files, from Ben
     Hutchings

   - Build fixes and documentation corrections from David Ahern

   - Assorted cleanups from Robert Richter

   - Let O= makes handle relative paths, from Steven Rostedt

   - perf script python fixes, from Feng Tang.

   - Initial bash completion support, from Frederic Weisbecker

   - Allow building without libelf, from Namhyung Kim.

   - Support DWARF CFI based unwind to have callchains when %bp based
     unwinding is not possible, from Jiri Olsa.

   - Symbol resolution fixes, while fixing support PPC64 files with an
     .opt ELF section was the end goal, several fixes for code that
     handles all architectures and cleanups are included, from Cody
     Schafer.

   - Assorted fixes for Documentation and build in 32 bit, from Robert
     Richter

   - Cache the libtraceevent event_format associated to each evsel
     early, so that we avoid relookups, i.e.  calling pevent_find_event
     repeatedly when processing tracepoint events.

     [ This is to reduce the surface contact with libtraceevents and
        make clear what is that the perf tools needs from that lib: so
        far parsing the common and per event fields.  ]

   - Don't stop the build if the audit libraries are not installed, fix
     from Namhyung Kim.

   - Fix bfd.h/libbfd detection with recent binutils, from Markus
     Trippelsdorf.

   - Improve warning message when libunwind devel packages not present,
     from Jiri Olsa"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (282 commits)
  perf trace: Add aliases for some syscalls
  perf probe: Print an enum type variable in "enum variable-name" format when showing accessible variables
  perf tools: Check libaudit availability for perf-trace builtin
  perf hists: Add missing period_* fields when collapsing a hist entry
  perf trace: New tool
  perf evsel: Export the event_format constructor
  perf evsel: Introduce rawptr() method
  perf tools: Use perf_evsel__newtp in the event parser
  perf evsel: The tracepoint constructor should store sys:name
  perf evlist: Introduce set_filter() method
  perf evlist: Renane set_filters method to apply_filters
  perf test: Add test to check we correctly parse and match syscall open parms
  perf evsel: Handle endianity in intval method
  perf evsel: Know if byte swap is needed
  perf tools: Allow handling a NULL cpu_map as meaning "all cpus"
  perf evsel: Improve tracepoint constructor setup
  tools lib traceevent: Fix error path on pevent_parse_event
  perf test: Fix build failure
  trace: Move trace event enable from fs_initcall to core_initcall
  tracing: Add an option for disabling markers
  ...
2012-10-01 10:28:49 -07:00
Al Viro
969ae0bfb0 x86: get rid of duplicate code in case of CONFIG_VM86
no need to have the call of do_notify_resume() + checks around it
duplicated for vm86 case - a bit of rearranging of ifdefs and we'll
have a perfectly fine copy to jump back to.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-01 09:58:15 -04:00
Al Viro
6783eaa2e1 x86, um/x86: switch to generic sys_execve and kernel_execve
32bit wrapper is lost on that; 64bit one is *not*, since
we need to arrange for full pt_regs on stack when we call
sys_execve() and we need to load callee-saved ones from
there afterwards.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-30 22:53:32 -04:00
Al Viro
7076aada10 x86: split ret_from_fork
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-30 22:53:31 -04:00
Thomas Renninger
53aac44c90 ACPI: Store valid ACPI tables passed via early initrd in reserved memblock areas
A later patch will compare them with ACPI tables that get loaded at boot or
runtime and if criteria match, a stored one is loaded.

Signed-off-by: Thomas Renninger <trenn@suse.de>
Link: http://lkml.kernel.org/r/1349043837-22659-4-git-send-email-trenn@suse.de
Cc: Len Brown <lenb@kernel.org>
Cc: Robert Moore <robert.moore@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Eric Piel <eric.piel@tremplin-utc.net>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-09-30 18:03:23 -07:00
Thomas Renninger
8e30524dcc x86, acpi: Introduce x86 arch specific arch_reserve_mem_area() for e820 handling
This is needed for ACPI table overriding via initrd. Beside reserving
memblocks, X86 also requires to flag the memory area to E820_RESERVED or
E820_ACPI in the e820 mappings to be able to io(re)map it later.

Signed-off-by: Thomas Renninger <trenn@suse.de>
Link: http://lkml.kernel.org/r/1349043837-22659-3-git-send-email-trenn@suse.de
Cc: Len Brown <lenb@kernel.org>
Cc: Robert Moore <robert.moore@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Eric Piel <eric.piel@tremplin-utc.net>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-09-30 18:03:13 -07:00
Oleg Nesterov
db023ea595 uprobes: Move clear_thread_flag(TIF_UPROBE) to uprobe_notify_resume()
Move clear_thread_flag(TIF_UPROBE) from do_notify_resume() to
uprobe_notify_resume() for !CONFIG_UPROBES case.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
2012-09-29 21:21:53 +02:00
Tomoki Sekiyama
fd0f586972 x86: Distinguish TLB shootdown interrupts from other functions call interrupts
As TLB shootdown requests to other CPU cores are now using function call
interrupts, TLB shootdowns entry in /proc/interrupts is always shown as 0.

This behavior change was introduced by commit 52aec3308d ("x86/tlb:
replace INVALIDATE_TLB_VECTOR by CALL_FUNCTION_VECTOR").

This patch reverts TLB shootdowns entry in /proc/interrupts to count TLB
shootdowns separately from the other function call interrupts.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama.qu@hitachi.com>
Link: http://lkml.kernel.org/r/20120926021128.22212.20440.stgit@hpxw
Acked-by: Alex Shi <alex.shi@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-09-27 22:52:34 -07:00
Naveen N. Rao
450cc20103 x86/mce: Provide boot argument to honour bios-set CMCI threshold
The ACPI spec doesn't provide for a way for the bios to pass down
recommended thresholds to the OS on a _per-bank_ basis. This patch adds
a new boot option, which if passed, tells Linux to use CMCI thresholds
set by the bios.

As fail-safe, we initialize threshold to 1 if some banks have not been
initialized by the bios and warn the user.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2012-09-27 10:08:00 -07:00
H. Peter Anvin
b2cc2a074d x86, smep, smap: Make the switching functions one-way
There is no fundamental reason why we should switch SMEP and SMAP on
during early cpu initialization just to switch them off again.  Now
with %eflags and %cr4 forced to be initialized to a clean state, we
only need the one-way enable.  Also, make the functions inline to make
them (somewhat) harder to abuse.

This does mean that SMEP and SMAP do not get initialized anywhere near
as early.  Even using early_param() instead of __setup() doesn't give
us control early enough to do this during the early cpu initialization
phase.  This seems reasonable to me, because SMEP and SMAP should not
matter until we have userspace to protect ourselves from, but it does
potentially make it possible for a bug involving a "leak of
permissions to userspace" to get uncaught.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-09-27 09:52:38 -07:00
Yan, Zheng
402537fd9f perf/x86: Fix typo in uncore_pmu_to_box
The variable box should not be declared as static.

This could probably cause crashes with sufficiently parallel
uncore PMU use.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Cc: eranian@google.com
Cc: a.p.zijlstra@chello.nl
Link: http://lkml.kernel.org/r/1348709606-2759-1-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-27 08:05:13 +02:00
H. Peter Anvin
73201dbec6 x86, suspend: On wakeup always initialize cr4 and EFER
We already have a flag word to indicate the existence of MISC_ENABLES,
so use the same flag word to indicate existence of cr4 and EFER, and
always restore them if they exist.  That way if something passes a
nonzero value when the value *should* be zero, we will still
initialize it.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Link: http://lkml.kernel.org/r/1348529239-17943-1-git-send-email-hpa@linux.intel.com
2012-09-26 15:06:22 -07:00
H. Peter Anvin
5a5a51db78 x86-32: Start out eflags and cr4 clean
%cr4 is supposed to reflect a set of features into which the operating
system is opting in.  If the BIOS or bootloader leaks bits here, this
is not desirable.  Consider a bootloader passing in %cr4.pae set to a
legacy paging kernel, for example -- it will not have any immediate
effect, but the kernel would crash when turning paging on.

A similar argument applies to %eflags, and since we have to look for
%eflags.id being settable we can use a sequence which clears %eflags
as a side effect.

Note that we already do this for x86-64.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1348529239-17943-1-git-send-email-hpa@linux.intel.com
2012-09-26 15:06:22 -07:00
Frederic Weisbecker
edf55fda35 x86: Exit RCU extended QS on notify resume
do_notify_resume() may be called on irq or exception
exit. But at that time the exception has already called
rcu_user_enter() and the irq has already called rcu_irq_exit().

Since it can use RCU read side critical section, we must call
rcu_user_exit() before doing anything there. Then we must call
back rcu_user_enter() after this function because we know we are
going to userspace from there.

This complete support for userspace RCU extended quiescent state
in x86-64.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Kevin Hilman <khilman@ti.com>
Cc: Max Krasnyansky <maxk@qualcomm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2012-09-26 15:47:14 +02:00
Frederic Weisbecker
0430499ce9 x86: Use the new schedule_user API on userspace preemption
This way we can exit the RCU extended quiescent state before
we schedule a new task from irq/exception exit.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Kevin Hilman <khilman@ti.com>
Cc: Max Krasnyansky <maxk@qualcomm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2012-09-26 15:47:12 +02:00
Frederic Weisbecker
6ba3c97a38 x86: Exception hooks for userspace RCU extended QS
Add necessary hooks to x86 exception for userspace
RCU extended quiescent state support.

This includes traps, page fault, debug exceptions, etc...

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Kevin Hilman <khilman@ti.com>
Cc: Max Krasnyansky <maxk@qualcomm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-09-26 15:47:07 +02:00
Frederic Weisbecker
ef3f628872 x86: Unspaghettize do_general_protection()
There is some unnatural label based layout in this function.
Convert the unnecessary goto to readable conditional blocks.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
2012-09-26 15:47:06 +02:00
Frederic Weisbecker
bf5a3c13b9 x86: Syscall hooks for userspace RCU extended QS
Add syscall slow path hooks to notify syscall entry
and exit on CPUs that want to support userspace RCU
extended quiescent state.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Kevin Hilman <khilman@ti.com>
Cc: Max Krasnyansky <maxk@qualcomm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2012-09-26 15:47:04 +02:00
Frederic Weisbecker
c416ddf5b9 x86: Unspaghettize do_trap()
Cleanup the label maze in this function. Having a
seperate function to first handle the traps that don't
generate a signal makes it easier to convert into
more readable conditional paths.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1348577479-2564-1-git-send-email-fweisbec@gmail.com
[ Fixed 32-bit build failure. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-26 13:36:50 +02:00
Tao Guo
1b2b23d857 x86_64: Work around old GAS bug
GAS in binutils(2.16.91) could not parse parentheses within
macro parameters unless fully parenthesized, and this is a
workaround to make old gas work without generating below errors:

 arch/x86/kernel/entry_64.S: Assembler messages:
 arch/x86/kernel/entry_64.S:387: Error: too many positional arguments
 arch/x86/kernel/entry_64.S:389: Error: too many positional arguments
 [...]

Signed-off-by: Tao Guo <glorioustao@gmail.com>
Reluctantly-Acked-by: Jan Beulich <jbeulich@novell.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1348648102-12653-1-git-send-email-glorioustao@gmail.com
[ Jan argues that these old GAS versions are fragile - which is so, but lets give them a chance. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-26 13:35:32 +02:00
Yasuaki Ishimatsu
57c078ce13 x86/api: Rename mp_register_lapic in a comment
Commit 31d2092eb0 ("x86: move
mp_register_lapic_address to boot.c") renamed mp_register_lapic
to acpi_register_lapic. But mp_register_lapic remains in a
comment. So the patch rename it.

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/50625239.3050403@jp.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-26 13:29:36 +02:00
Michael Wang
dec08a837f x86: Remove the useless branch in c_start()
Since 'cpu == -1' in cpumask_next() is legal, no need to handle
'*pos == 0' specially.

About the comments:

	/* just in case, cpu 0 is not the first */

A test with a cpumask in which cpu 0 is not the first has been
done, and it works well.

This patch will remove that useless branch to clean the code.

Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Cc: kjwinchester@gmail.com
Cc: borislav.petkov@amd.com
Cc: ak@linux.intel.com
Link: http://lkml.kernel.org/r/1348033343-23658-1-git-send-email-wangyun@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-26 13:27:56 +02:00
H. Peter Anvin
e139e95590 x86, smap: Do not abuse the [f][x]rstor_checking() functions for user space
With SMAP, the [f][x]rstor_checking() functions are no longer usable
for user-space pointers by applying a simple __force cast.  Instead,
create new [f][x]rstor_user() functions which do the proper SMAP
magic.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1343171129-2747-3-git-send-email-suresh.b.siddha@intel.com
2012-09-25 15:42:18 -07:00
Paul E. McKenney
593d1006cd Merge remote-tracking branch 'tip/core/rcu' into next.2012.09.25b
Resolved conflict in kernel/sched/core.c using Peter Zijlstra's
approach from https://lkml.org/lkml/2012/9/5/585.
2012-09-25 10:03:56 -07:00
John Stultz
650ea02475 time: Convert x86_64 to using new update_vsyscall
Switch x86_64 to using sub-ns precise vsyscall

Cc: Tony Luck <tony.luck@intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2012-09-24 12:38:09 -04:00
John Stultz
7063942116 time: Convert CONFIG_GENERIC_TIME_VSYSCALL to CONFIG_GENERIC_TIME_VSYSCALL_OLD
To help migrate archtectures over to the new update_vsyscall method,
redfine CONFIG_GENERIC_TIME_VSYSCALL as CONFIG_GENERIC_TIME_VSYSCALL_OLD

Cc: Tony Luck <tony.luck@intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2012-09-24 12:38:07 -04:00
John Stultz
189374aed6 time: Move update_vsyscall definitions to timekeeper_internal.h
Since users will need to include timekeeper_internal.h, move
update_vsyscall definitions to timekeeper_internal.h.

Cc: Tony Luck <tony.luck@intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2012-09-24 12:38:06 -04:00
John Stultz
b3c869d35b jiffies: Remove compile time assumptions about CLOCK_TICK_RATE
CLOCK_TICK_RATE is used to accurately caclulate exactly how
a tick will be at a given HZ.

This is useful, because while we'd expect NSEC_PER_SEC/HZ,
the underlying hardware will have some granularity limit,
so we won't be able to have exactly HZ ticks per second.

This slight error can cause timekeeping quality problems
when using the jiffies or other jiffies driven clocksources.
Thus we currently use compile time CLOCK_TICK_RATE value to
generate SHIFTED_HZ and NSEC_PER_JIFFIES, which we then use
to adjust the jiffies clocksource to correct this error.

Unfortunately though, since CLOCK_TICK_RATE is a compile
time value, and the jiffies clocksource is registered very
early during boot, there are a number of cases where there
are different possible hardware timers that have different
tick rates. This causes problems in cases like ARM where
there are numerous different types of hardware, each having
their own compile-time CLOCK_TICK_RATE, making it hard to
accurately support different hardware with a single kernel.

For the most part, this doesn't matter all that much, as not
too many systems actually utilize the jiffies or jiffies driven
clocksource. Usually there are other highres clocksources
who's granularity error is negligable.

Even so, we have some complicated calcualtions that we do
everywhere to handle these edge cases.

This patch removes the compile time SHIFTED_HZ value, and
introduces a register_refined_jiffies() function. This results
in the default jiffies clock as being assumed a perfect HZ
freq, and allows archtectures that care about jiffies accuracy
to call register_refined_jiffies() with the tick rate, specified
dynamically at boot.

This allows us, where necessary, to not have a compile time
CLOCK_TICK_RATE constant, simplifies the jiffies code, and
still provides a way to have an accurate jiffies clock.

NOTE: Since this patch does not add register_refinied_jiffies()
calls for every arch, it may cause time quality regressions
in some cases. Its likely these will not be noticable, but
if they are an issue, adding the following to the end of
setup_arch() should resolve the regression:
	register_refinied_jiffies(CLOCK_TICK_RATE)

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2012-09-24 12:38:05 -04:00
Silas Boyd-Wickizer
429227bbe5 Use get_online_cpus to avoid races involving CPU hotplug
If arch/x86/kernel/cpuid.c is a module, a CPU might offline or online
between the for_each_online_cpu() loop and the call to
register_hotcpu_notifier in cpuid_init or the call to
unregister_hotcpu_notifier in cpuid_exit.  The potential races can
lead to leaks/duplicates, attempts to destroy non-existant devices, or
random pointer dereferences.

For example, in cpuid_exit if:

        for_each_online_cpu(cpu)
                cpuid_device_destroy(cpu);
        class_destroy(cpuid_class);
        __unregister_chrdev(CPUID_MAJOR, 0, NR_CPUS, "cpu/cpuid");
        <----- CPU onlines
        unregister_hotcpu_notifier(&cpuid_class_cpu_notifier);

the hotcpu notifier will attempt to create a device for the
cpuid_class, which the module already destroyed.

This fix surrounds for_each_online_cpu and register_hotcpu_notifier or
unregister_hotcpu_notifier with get_online_cpus+put_online_cpus.

Tested on a VM.

Signed-off-by: Silas Boyd-Wickizer <sbw@mit.edu>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-09-23 07:43:56 -07:00
Silas Boyd-Wickizer
a2db672aa3 Use get_online_cpus to avoid races involving CPU hotplug
If arch/x86/kernel/msr.c is a module, a CPU might offline or online
between the for_each_online_cpu(i) loop and the call to
register_hotcpu_notifier in msr_init or the call to
unregister_hotcpu_notifier in msr_exit. The potential races can lead
to leaks/duplicates, attempts to destroy non-existant devices, or
random pointer dereferences.

For example, in msr_init if:

        for_each_online_cpu(i) {
                err = msr_device_create(i);
                if (err != 0)
                        goto out_class;
        }
        <----- CPU offlines
        register_hotcpu_notifier(&msr_class_cpu_notifier);

and the CPU never onlines before msr_exit, then the module will never
call msr_device_destroy for the associated CPU.

This fix surrounds for_each_online_cpu and register_hotcpu_notifier or
unregister_hotcpu_notifier with get_online_cpus+put_online_cpus.

Tested on a VM.

Signed-off-by: Silas Boyd-Wickizer <sbw@mit.edu>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-09-23 07:43:56 -07:00
H. Peter Anvin
49b8c695e3 Merge branch 'x86/fpu' into x86/smap
Reason for merge:
       x86/fpu changed the structure of some of the code that x86/smap
       changes; mostly fpu-internal.h but also minor changes to the
       signal code.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

Resolved Conflicts:
	arch/x86/ia32/ia32_signal.c
	arch/x86/include/asm/fpu-internal.h
	arch/x86/kernel/signal.c
2012-09-21 17:18:44 -07:00
Suresh Siddha
b1a74bf821 x86, kvm: fix kvm's usage of kernel_fpu_begin/end()
Preemption is disabled between kernel_fpu_begin/end() and as such
it is not a good idea to use these routines in kvm_load/put_guest_fpu()
which can be very far apart.

kvm_load/put_guest_fpu() routines are already called with
preemption disabled and KVM already uses the preempt notifier to save
the guest fpu state using kvm_put_guest_fpu().

So introduce __kernel_fpu_begin/end() routines which don't touch
preemption and use them instead of kernel_fpu_begin/end()
for KVM's use model of saving/restoring guest FPU state.

Also with this change (and with eagerFPU model), fix the host cr0.TS vm-exit
state in the case of VMX. For eagerFPU case, host cr0.TS is always clear.
So no need to worry about it. For the traditional lazyFPU restore case,
change the cr0.TS bit for the host state during vm-exit to be always clear
and cr0.TS bit is set in the __vmx_load_host_state() when the FPU
(guest FPU or the host task's FPU) state is not active. This ensures
that the host/guest FPU state is properly saved, restored
during context-switch and with interrupts (using irq_fpu_usable()) not
stomping on the active FPU state.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1348164109.26695.338.camel@sbsiddha-desk.sc.intel.com
Cc: Avi Kivity <avi@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-09-21 16:59:04 -07:00
H. Peter Anvin
e59d1b0a24 x86-32, smap: Add STAC/CLAC instructions to 32-bit kernel entry
The changes to entry_32.S got missed in checkin:

63bcff2a x86, smap: Add STAC and CLAC instructions to control user space access

The resulting kernel was largely functional but SMAP protection could
have been bypassed.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1348256595-29119-9-git-send-email-hpa@linux.intel.com
2012-09-21 14:04:27 -07:00
H. Peter Anvin
5e88353d8b x86, smap: Reduce the SMAP overhead for signal handling
Signal handling contains a bunch of accesses to individual user space
items, which causes an excessive number of STAC and CLAC
instructions.  Instead, let get/put_user_try ... get/put_user_catch()
contain the STAC and CLAC instructions.

This means that get/put_user_try no longer nests, and furthermore that
it is no longer legal to use user space access functions other than
__get/put_user_ex() inside those blocks.  However, these macros are
x86-specific anyway and are only used in the signal-handling paths; a
simple reordering of moving the larger subroutine calls out of the
try...catch blocks resolves that problem.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1348256595-29119-12-git-send-email-hpa@linux.intel.com
2012-09-21 12:45:27 -07:00
H. Peter Anvin
52b6179ac8 x86, smap: Turn on Supervisor Mode Access Prevention
If Supervisor Mode Access Prevention is available and not disabled by
the user, turn it on.  Also fix the expansion of SMEP (Supervisor Mode
Execution Prevention.)

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1348256595-29119-10-git-send-email-hpa@linux.intel.com
2012-09-21 12:45:27 -07:00
H. Peter Anvin
63bcff2a30 x86, smap: Add STAC and CLAC instructions to control user space access
When Supervisor Mode Access Prevention (SMAP) is enabled, access to
userspace from the kernel is controlled by the AC flag.  To make the
performance of manipulating that flag acceptable, there are two new
instructions, STAC and CLAC, to set and clear it.

This patch adds those instructions, via alternative(), when the SMAP
feature is enabled.  It also adds X86_EFLAGS_AC unconditionally to the
SYSCALL entry mask; there is simply no reason to make that one
conditional.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1348256595-29119-9-git-send-email-hpa@linux.intel.com
2012-09-21 12:45:27 -07:00
Al Viro
e76623d694 x86: get rid of TIF_IRET hackery
TIF_NOTIFY_RESUME will work in precisely the same way; all that
is achieved by TIF_IRET is appearing that there's some work to be
done, so we end up on the iret exit path.  Just use NOTIFY_RESUME.
And for execve() do that in 32bit start_thread(), not sys_execve()
itself.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-20 09:50:17 -04:00
Borislav Petkov
50a011f640 kprobes/x86: Move skip_singlestep up
I get this warning:

  arch/x86/kernel/kprobes.c:544:23: warning: ‘skip_singlestep’ declared ‘static’ but never defined

on tip/auto-latest.

Put the skip_singlestep function declaration up, in
KPROBES_CAN_USE_FTRACE and drop the superfluous forward
declaration.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1348145034-16603-1-git-send-email-bp@amd64.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-20 14:48:16 +02:00
Dan Carpenter
2d29748003 x86, microcode, AMD: Fix use after free in free_cache()
list_for_each_entry_reverse() dereferences the iterator, but we already
freed it. I don't see a reason that this has to be done in reverse order
so change it to use list_for_each_entry_safe().

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2012-09-19 18:06:25 +02:00
Peter Senna Tschudin
4b8073e467 arch/x86: Remove unecessary semicolons
Found by http://coccinelle.lip6.fr/

Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com>
Cc: avi@redhat.com
Cc: mtosatti@redhat.com
Cc: a.p.zijlstra@chello.nl
Cc: rusty@rustcorp.com.au
Cc: masami.hiramatsu.pt@hitachi.com
Cc: suresh.b.siddha@intel.com
Cc: joerg.roedel@amd.com
Cc: agordeev@redhat.com
Cc: yinghai@kernel.org
Cc: bhelgaas@google.com
Cc: liuj97@gmail.com
Link: http://lkml.kernel.org/r/1347986174-30287-7-git-send-email-peter.senna@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-19 17:32:48 +02:00
Stephane Eranian
20a36e39d5 perf/x86: Fix Intel Ivy Bridge support
This patch updates the existing Intel IvyBridge (model 58)
support with proper PEBS event constraints. It cannot reuse
the same as SandyBridge because some events (0xd3) are
specific to IvyBridge.

Also there is no UOPS_DISPATCHED.THREAD on IVB, so do not
populate the PERF_COUNT_HW_STALLED_CYCLES_BACKEND mapping.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: ak@linux.intel.com
Link: http://lkml.kernel.org/r/20120910230701.GA5898@quad
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-19 17:28:47 +02:00
Borislav Petkov
924e101a7a x86/debug: Dump family, model, stepping of the boot CPU
When acting on a user bug report, we find ourselves constantly
asking for /proc/cpuinfo in order to know the exact family,
model, stepping of the CPU in question.

Instead of having to ask this, add this to dmesg so that it is
visible and no ambiguities can ensue from looking at the
official name string of the CPU coming from CPUID and trying
to map it to f/m/s.

Output then looks like this:

[    0.146041] smpboot: CPU0: AMD FX(tm)-8100 Eight-Core Processor (fam: 15, model: 01, stepping: 02)

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Link: http://lkml.kernel.org/r/1347640666-13638-1-git-send-email-bp@amd64.org
[ tweaked it minimally to add commas. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-19 17:12:01 +02:00
Dan Carpenter
1e6dd8adc7 perf: Fix off by one test in perf_reg_value()
The test should be >= ARRAY_SIZE() instead of > ARRAY_SIZE().

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/20120905123126.GC6128@elgon.mountain
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-19 17:08:40 +02:00