Commit Graph

12766 Commits

Author SHA1 Message Date
Linus Torvalds
350e4f4985 Merge tag 'trace-seq-buf-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull nmi-safe seq_buf printk update from Steven Rostedt:
 "This code is a fork from the trace-3.19 pull as it needed the
  trace_seq clean ups from that branch.

  This code solves the issue of performing stack dumps from NMI context.
  The issue is that printk() is not safe from NMI context as if the NMI
  were to trigger when a printk() was being performed, the NMI could
  deadlock from the printk() internal locks.  This has been seen in
  practice.

  With lots of review from Petr Mladek, this code went through several
  iterations, and we feel that it is now at a point of quality to be
  accepted into mainline.

  Here's what is contained in this patch set:

   - Creates a "seq_buf" generic buffer utility that allows a descriptor
     to be passed around where functions can write their own "printk()"
     formatted strings into it.  The generic version was pulled out of
     the trace_seq() code that was made specifically for tracing.

   - The seq_buf code was change to model the seq_file code.  I have a
     patch (not included for 3.19) that converts the seq_file.c code
     over to use seq_buf.c like the trace_seq.c code does.  This was
     done to make sure that seq_buf.c is compatible with seq_file.c.  I
     may try to get that patch in for 3.20.

   - The seq_buf.c file was moved to lib/ to remove it from being
     dependent on CONFIG_TRACING.

   - The printk() was updated to allow for a per_cpu "override" of the
     internal calls.  That is, instead of writing to the console, a call
     to printk() may do something else.  This made it easier to allow
     the NMI to change what printk() does in order to call dump_stack()
     without needing to update that code as well.

   - Finally, the dump_stack from all CPUs via NMI code was converted to
     use the seq_buf code.  The caller to trigger the NMI code would
     wait till all the NMIs finished, and then it would print the
     seq_buf data to the console safely from a non NMI context

  One added bonus is that this code also makes the NMI dump stack work
  on PREEMPT_RT kernels.  As printk() includes sleeping locks on
  PREEMPT_RT, printk() only writes to console if the console does not
  use any rt_mutex converted spin locks.  Which a lot do"

* tag 'trace-seq-buf-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  x86/nmi: Fix use of unallocated cpumask_var_t
  printk/percpu: Define printk_func when printk is not defined
  x86/nmi: Perform a safe NMI stack trace on all CPUs
  printk: Add per_cpu printk func to allow printk to be diverted
  seq_buf: Move the seq_buf code to lib/
  seq-buf: Make seq_buf_bprintf() conditional on CONFIG_BINARY_PRINTF
  tracing: Add seq_buf_get_buf() and seq_buf_commit() helper functions
  tracing: Have seq_buf use full buffer
  seq_buf: Add seq_buf_can_fit() helper function
  tracing: Add paranoid size check in trace_printk_seq()
  tracing: Use trace_seq_used() and seq_buf_used() instead of len
  tracing: Clean up tracing_fill_pipe_page()
  seq_buf: Create seq_buf_used() to find out how much was written
  tracing: Add a seq_buf_clear() helper and clear len and readpos in init
  tracing: Convert seq_buf fields to be like seq_file fields
  tracing: Convert seq_buf_path() to be like seq_path()
  tracing: Create seq_buf layer in trace_seq
2014-12-10 20:35:41 -08:00
Linus Torvalds
1dd7dcb6ea Merge tag 'trace-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
 "There was a lot of clean ups and minor fixes.  One of those clean ups
  was to the trace_seq code.  It also removed the return values to the
  trace_seq_*() functions and use trace_seq_has_overflowed() to see if
  the buffer filled up or not.  This is similar to work being done to
  the seq_file code as well in another tree.

  Some of the other goodies include:

   - Added some "!" (NOT) logic to the tracing filter.

   - Fixed the frame pointer logic to the x86_64 mcount trampolines

   - Added the logic for dynamic trampolines on !CONFIG_PREEMPT systems.
     That is, the ftrace trampoline can be dynamically allocated and be
     called directly by functions that only have a single hook to them"

* tag 'trace-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (55 commits)
  tracing: Truncated output is better than nothing
  tracing: Add additional marks to signal very large time deltas
  Documentation: describe trace_buf_size parameter more accurately
  tracing: Allow NOT to filter AND and OR clauses
  tracing: Add NOT to filtering logic
  ftrace/fgraph/x86: Have prepare_ftrace_return() take ip as first parameter
  ftrace/x86: Get rid of ftrace_caller_setup
  ftrace/x86: Have save_mcount_regs macro also save stack frames if needed
  ftrace/x86: Add macro MCOUNT_REG_SIZE for amount of stack used to save mcount regs
  ftrace/x86: Simplify save_mcount_regs on getting RIP
  ftrace/x86: Have save_mcount_regs store RIP in %rdi for first parameter
  ftrace/x86: Rename MCOUNT_SAVE_FRAME and add more detailed comments
  ftrace/x86: Move MCOUNT_SAVE_FRAME out of header file
  ftrace/x86: Have static tracing also use ftrace_caller_setup
  ftrace/x86: Have static function tracing always test for function graph
  kprobes: Add IPMODIFY flag to kprobe_ftrace_ops
  ftrace, kprobes: Support IPMODIFY flag to find IP modify conflict
  kprobes/ftrace: Recover original IP if pre_handler doesn't change it
  tracing/trivial: Fix typos and make an int into a bool
  tracing: Deletion of an unnecessary check before iput()
  ...
2014-12-10 19:58:13 -08:00
Linus Torvalds
3a5dc1fafb Merge branch 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 microcode loading updates from Ingo Molnar:
 "The main changes in this cycle are:

   - Reload microcode when resuming and the case when only the early
     loader has been utilized.  (Borislav Petkov)

   - Also, do not load the driver on paravirt guests.  (Boris
     Ostrovsky)"

* 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/microcode/intel: Fish out the stashed microcode for the BSP
  x86, microcode: Reload microcode on resume
  x86, microcode: Don't initialize microcode code on paravirt
  x86, microcode, intel: Drop unused parameter
  x86, microcode, AMD: Do not use smp_processor_id() in preemtible context
2014-12-10 15:01:43 -08:00
Linus Torvalds
3100e448e7 Merge branch 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 vdso updates from Ingo Molnar:
 "Various vDSO updates from Andy Lutomirski, mostly cleanups and
  reorganization to improve maintainability, but also some
  micro-optimizations and robustization changes"

* 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86_64/vsyscall: Restore orig_ax after vsyscall seccomp
  x86_64: Add a comment explaining the TASK_SIZE_MAX guard page
  x86_64,vsyscall: Make vsyscall emulation configurable
  x86_64, vsyscall: Rewrite comment and clean up headers in vsyscall code
  x86_64, vsyscall: Turn vsyscalls all the way off when vsyscall==none
  x86,vdso: Use LSL unconditionally for vgetcpu
  x86: vdso: Fix build with older gcc
  x86_64/vdso: Clean up vgetcpu init and merge the vdso initcalls
  x86_64/vdso: Remove jiffies from the vvar page
  x86/vdso: Make the PER_CPU segment 32 bits
  x86/vdso: Make the PER_CPU segment start out accessed
  x86/vdso: Change the PER_CPU segment to use struct desc_struct
  x86_64/vdso: Move getcpu code from vsyscall_64.c to vdso/vma.c
  x86_64/vsyscall: Move all of the gate_area code to vsyscall_64.c
2014-12-10 14:24:20 -08:00
Linus Torvalds
c9f861c772 Merge branch 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 RAS update from Ingo Molnar:
 "The biggest change in this cycle is better support for UCNA
  (UnCorrected No Action) events:

    "Handle all uncorrected error reports in the same way (soft
     offline the page). We used to only do that for SRAO
     (software recoverable action optional) machine checks, but
     it makes sense to also do it for UCNA (UnCorrected No
     Action) logs found by CMCI or polling."

  plus various x86 MCE handling updates and fixes"

* 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce: Spell "panicked" correctly
  x86, mce: Support memory error recovery for both UCNA and Deferred error in machine_check_poll
  x86, mce, severity: Extend the the mce_severity mechanism to handle UCNA/DEFERRED error
  x86, MCE, AMD: Assign interrupt handler only when bank supports it
  x86, MCE, AMD: Drop software-defined bank in error thresholding
  x86, MCE, AMD: Move invariant code out from loop body
  x86, MCE, AMD: Correct thresholding error logging
  x86, MCE, AMD: Use macros to compute bank MSRs
  RAS, HWPOISON: Fix wrong error recovery status
  GHES: Make ghes_estatus_caches static
  APEI, GHES: Cleanup unnecessary function for lockless list
2014-12-10 14:20:10 -08:00
Linus Torvalds
773fed910d Merge branches 'x86-platform-for-linus' and 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 platform changes from Ingo Molnar:
 "A handful of numachip APIC driver updates/fixes, and two small SGI/UV
  fixes"

* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: numachip: APIC driver cleanups
  x86: numachip: Elide self-IPI ICR polling
  x86: numachip: Fix 16-bit APIC ID truncation

* 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: UV BAU: Increase maximum CPUs per socket/hub
  x86: UV BAU: Avoid NULL pointer reference in ptc_seq_show
2014-12-10 13:40:11 -08:00
Linus Torvalds
206f18f2ca Merge branches 'x86-build-for-linus', 'x86-cleanups-for-linus' and 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 build, cleanup and defconfig updates from Ingo Molnar:
 "A single minor build change to suppress a repetitive build messages,
  misc cleanups and a defconfig update"

* 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/purgatory, build: Suppress kexec-purgatory.c is up to date message

* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, CPU, AMD: Move K8 TLB flush filter workaround to K8 code
  x86, espfix: Remove stale ptemask
  x86, msr: Use seek definitions instead of hard-coded values
  x86, msr: Convert printk to pr_foo()
  x86, msr: Use PTR_ERR_OR_ZERO
  x86/simplefb: Use PTR_ERR_OR_ZERO
  x86/sysfb: Use PTR_ERR_OR_ZERO
  x86, cpuid: Use PTR_ERR_OR_ZERO

* 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/kconfig/defconfig: Enable CONFIG_FHANDLE=y
2014-12-10 12:35:46 -08:00
Linus Torvalds
b6444bd0a1 Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 boot and percpu updates from Ingo Molnar:
 "This tree contains a bootable images documentation update plus three
  slightly misplaced x86/asm percpu changes/optimizations"

* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86-64: Use RIP-relative addressing for most per-CPU accesses
  x86-64: Handle PC-relative relocations on per-CPU data
  x86: Convert a few more per-CPU items to read-mostly ones
  x86, boot: Document intermediates more clearly
2014-12-10 12:10:24 -08:00
Linus Torvalds
9d0cf6f564 Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 asm updates from Ingo Molnar:
 "Misc changes:

   - context switch micro-optimization
   - debug printout micro-optimization
   - comment enhancements and typo fix"

* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Replace seq_printf() with seq_puts()
  x86/asm: Fix typo in arch/x86/kernel/asm_offset_64.c
  sched/x86: Add a comment clarifying LDT context switching
  sched/x86_64: Don't save flags on context switch
2014-12-10 12:09:26 -08:00
Linus Torvalds
3eb5b893eb Merge branch 'x86-mpx-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 MPX support from Thomas Gleixner:
 "This enables support for x86 MPX.

  MPX is a new debug feature for bound checking in user space.  It
  requires kernel support to handle the bound tables and decode the
  bound violating instruction in the trap handler"

* 'x86-mpx-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  asm-generic: Remove asm-generic arch_bprm_mm_init()
  mm: Make arch_unmap()/bprm_mm_init() available to all architectures
  x86: Cleanly separate use of asm-generic/mm_hooks.h
  x86 mpx: Change return type of get_reg_offset()
  fs: Do not include mpx.h in exec.c
  x86, mpx: Add documentation on Intel MPX
  x86, mpx: Cleanup unused bound tables
  x86, mpx: On-demand kernel allocation of bounds tables
  x86, mpx: Decode MPX instruction to get bound violation information
  x86, mpx: Add MPX-specific mmap interface
  x86, mpx: Introduce VM_MPX to indicate that a VMA is MPX specific
  x86, mpx: Add MPX to disabled features
  ia64: Sync struct siginfo with general version
  mips: Sync struct siginfo with general version
  mpx: Extend siginfo structure to include bound violation information
  x86, mpx: Rename cfg_reg_u and status_reg
  x86: mpx: Give bndX registers actual names
  x86: Remove arbitrary instruction size limit in instruction decoder
2014-12-10 09:34:43 -08:00
Linus Torvalds
9e66645d72 Merge branch 'irq-irqdomain-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq domain updates from Thomas Gleixner:
 "The real interesting irq updates:

   - Support for hierarchical irq domains:

     For complex interrupt routing scenarios where more than one
     interrupt related chip is involved we had no proper representation
     in the generic interrupt infrastructure so far.  That made people
     implement rather ugly constructs in their nested irq chip
     implementations.  The main offenders are x86 and arm/gic.

     To distangle that mess we have now hierarchical irqdomains which
     seperate the various interrupt chips and connect them via the
     hierarchical domains.  That keeps the domain specific details
     internal to the particular hierarchy level and removes the
     criss/cross referencing of chip internals.  The resulting hierarchy
     for a complex x86 system will look like this:

        vector          mapped: 74
          msi-0         mapped: 2
          dmar-ir-1     mapped: 69
            ioapic-1    mapped: 4
            ioapic-0    mapped: 20
            pci-msi-2   mapped: 45
          dmar-ir-0     mapped: 3
            ioapic-2    mapped: 1
            pci-msi-1   mapped: 2
          htirq         mapped: 0

     Neither ioapic nor pci-msi know about the dmar interrupt remapping
     between themself and the vector domain.  If interrupt remapping is
     disabled ioapic and pci-msi become direct childs of the vector
     domain.

     In hindsight we should have done that years ago, but in hindsight
     we always know better :)

   - Support for generic MSI interrupt domain handling

     We have more and more non PCI related MSI interrupts, so providing
     a generic infrastructure for this is better than having all
     affected architectures implementing their own private hacks.

   - Support for PCI-MSI interrupt domain handling, based on the generic
     MSI support.

     This part carries the pci/msi branch from Bjorn Helgaas pci tree to
     avoid a massive conflict.  The PCI/MSI parts are acked by Bjorn.

  I have two more branches on top of this.  The full conversion of x86
  to hierarchical domains and a partial conversion of arm/gic"

* 'irq-irqdomain-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
  genirq: Move irq_chip_write_msi_msg() helper to core
  PCI/MSI: Allow an msi_controller to be associated to an irq domain
  PCI/MSI: Provide mechanism to alloc/free MSI/MSIX interrupt from irqdomain
  PCI/MSI: Enhance core to support hierarchy irqdomain
  PCI/MSI: Move cached entry functions to irq core
  genirq: Provide default callbacks for msi_domain_ops
  genirq: Introduce msi_domain_alloc/free_irqs()
  asm-generic: Add msi.h
  genirq: Add generic msi irq domain support
  genirq: Introduce callback irq_chip.irq_write_msi_msg
  genirq: Work around __irq_set_handler vs stacked domains ordering issues
  irqdomain: Introduce helper function irq_domain_add_hierarchy()
  irqdomain: Implement a method to automatically call parent domains alloc/free
  genirq: Introduce helper irq_domain_set_info() to reduce duplicated code
  genirq: Split out flow handler typedefs into seperate header file
  genirq: Add IRQ_SET_MASK_OK_DONE to support stacked irqchip
  genirq: Introduce irq_chip.irq_compose_msi_msg() to support stacked irqchip
  genirq: Add more helper functions to support stacked irq_chip
  genirq: Introduce helper functions to support stacked irq_chip
  irqdomain: Do irq_find_mapping and set_type for hierarchy irqdomain in case OF
  ...
2014-12-10 09:01:01 -08:00
Andy Lutomirski
29fa682546 x86, kvm: Clear paravirt_enabled on KVM guests for espfix32's benefit
paravirt_enabled has the following effects:

 - Disables the F00F bug workaround warning.  There is no F00F bug
   workaround any more because Linux's standard IDT handling already
   works around the F00F bug, but the warning still exists.  This
   is only cosmetic, and, in any event, there is no such thing as
   KVM on a CPU with the F00F bug.

 - Disables 32-bit APM BIOS detection.  On a KVM paravirt system,
   there should be no APM BIOS anyway.

 - Disables tboot.  I think that the tboot code should check the
   CPUID hypervisor bit directly if it matters.

 - paravirt_enabled disables espfix32.  espfix32 should *not* be
   disabled under KVM paravirt.

The last point is the purpose of this patch.  It fixes a leak of the
high 16 bits of the kernel stack address on 32-bit KVM paravirt
guests.  Fixes CVE-2014-8134.

Cc: stable@vger.kernel.org
Suggested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-10 12:49:39 +01:00
Borislav Petkov
25cdb9c868 x86/microcode/intel: Fish out the stashed microcode for the BSP
I'm such a moron! The simple solution of saving the BSP patch
for use on resume was too simple (and wrong!), hint:
sizeof(struct microcode_intel).

What needs to be done instead is to fish out the microcode patch
we have stashed previously and apply that on the BSP in case the
late loader hasn't been utilized.

So do that instead.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20141208110820.GB20057@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-12-10 11:36:28 +01:00
Linus Torvalds
bee2782f30 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull leftover perf fixes from Ingo Molnar:
 "Two perf fixes left over from the previous cycle"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf session: Do not fail on processing out of order event
  x86/asm/traps: Disable tracing and kprobes in fixup_bad_iret and sync_regs
2014-12-09 21:18:06 -08:00
Linus Torvalds
5706ffd045 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf events update from Ingo Molnar:
 "On the kernel side there's few changes, the one that stands out is
  PEBS machine state sampling support on x86, by Stephane Eranian.

  On the tooling side:

  User visible tooling changes:

   - Don't open the DWARF info multiple times, keeping instead a dwfl
     handle in struct dso, greatly speeding up 'perf report' on powerpc.
     (Sukadev Bhattiprolu)

   - Introduce PARSE_OPT_DISABLED option flag and use it to avoid
     showing undersired options in tools that provides frontends to
     'perf record', like sched, kvm, etc (Namhyung Kim)

   - Fallback to kallsyms when using the minimal 'ELF' loader (Arnaldo
     Carvalho de Melo)

   - Fix annotation with kcore (Adrian Hunter)

   - Support source line numbers in annotate using a hotkey (Andi Kleen)

   - Callchain improvements including:
     * Enable printing the srcline in the history
     * Make get_srcline fall back to sym+offset (Andi Kleen)

   - TUI hist_entry browser fixes, including showing missing overhead
     value for first level callchain.  Detected comparing the output of
     --stdio/--gui (that matched) with --tui, that had this problem.
     (Namhyung Kim)

   - Support handling complete branch stacks as histograms (Andi Kleen)

  Tooling infrastructure changes:

   - Prep work for supporting per-pkg and snapshot counters in 'perf
     stat' (Jiri Olsa)

   - 'perf stat' refactorings, moving stuff from it to evsel.c to use in
     per-pkg/snapshot format changes (Jiri Olsa)

   - Add per-pkg format file parsing (Matt Fleming)

   - Clean up libelf feature support code (Namhyung Kim)

   - Add gzip decompression support for kernel modules (Namhyung Kim)

   - More prep patches for Intel PT, including a a thread stack and more
     stuff made available via the database export mechanism (Adrian
     Hunter)

   - More Intel PT work, including a facility to export sample data
     (comms, threads, symbol names, etc) in a database friendly way,
     with an script to use this to create a postgresql database.
     (Adrian Hunter)

   - Make sure that thread->mg->machine points to the machine where the
     thread exists (it was being set only for the kmaps kernel modules
     case, do it as well for the mmaps) and use it to shorten function
     signatures (Arnaldo Carvalho de Melo)

  ... and lots of other fixes and smaller improvements"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (91 commits)
  perf report: In branch stack mode use address history sorting
  perf report: Add --branch-history option
  perf callchain: Support handling complete branch stacks as histograms
  perf stat: Add support for snapshot counters
  perf stat: Add support for per-pkg counters
  perf tools: Remove perf_evsel__read interface
  perf stat: Use read_counter in read_counter_aggr
  perf stat: Make read_counter work over the thread dimension
  perf stat: Use perf_evsel__read_cb in read_counter
  perf tools: Add snapshot format file parsing
  perf tools: Add per-pkg format file parsing
  perf evsel: Introduce perf_evsel__read_cb function
  perf evsel: Introduce perf_counts_values__scale function
  perf evsel: Introduce perf_evsel__compute_deltas function
  perf tools: Allow to force redirect pr_debug to stderr.
  perf tools: Fix segfault due to invalid kernel dso access
  perf callchain: Make get_srcline fall back to sym+offset
  perf symbols: Move bfd_demangle stubbing to its only user
  perf callchain: Enable printing the srcline in the history
  perf tools: Collapse first level callchain entry if it has sibling
  ...
2014-12-09 20:55:37 -08:00
Linus Torvalds
0160928e79 Merge tag 'edac_for_3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp
Pull EDAC updates from Borislav Petkov:
 "EDAC updates all over the place:

   - Enablement for AMD F15h models 0x60 CPUs.  Most notably DDR4 RAM
     support.  Out of tree stuff is adding the required PCI IDs.  From
     Aravind Gopalakrishnan.

   - Enable amd64_edac for 32-bit due to popular demand.  From Tomasz
     Pala.

   - Convert the AMD MCE injection module to debugfs, where it belongs.

   - Misc EDAC cleanups"

* tag 'edac_for_3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  EDAC, MCE, AMD: Correct formatting of decoded text
  EDAC, mce_amd_inj: Add an injector function
  EDAC, mce_amd_inj: Add hw-injection attributes
  EDAC, mce_amd_inj: Enable direct writes to MCE MSRs
  EDAC, mce_amd_inj: Convert mce_amd_inj module to debugfs
  EDAC: Delete unnecessary check before calling pci_dev_put()
  EDAC, pci_sysfs: remove unneccessary ifdef around entire file
  ghes_edac: Use snprintf() to silence a static checker warning
  amd64_edac: Build module on x86-32
  EDAC, MCE, AMD: Add decoding table for MC6 xec
  amd64_edac: Add F15h M60h support
  {mv64x60,ppc4xx}_edac,: Remove deprecated IRQF_DISABLED
  EDAC: Sync memory types and names
  EDAC: Add DDR3 LRDIMM entries to edac_mem_types
  x86, amd_nb: Add device IDs to NB tables for F15h M60h
  pci_ids: Add PCI device IDs for F15h M60h
2014-12-08 20:17:49 -08:00
Rafael J. Wysocki
cfc75ed68b Merge branch 'pm-cpufreq'
* pm-cpufreq: (21 commits)
  intel_pstate: skip this driver if Sun server has _PPC method
  cpufreq: arm_big_little: free OPP table created during ->init()
  imx6q: free OPP table created during ->init()
  exynos5440: free OPP table created during ->init()
  cpufreq-dt: free OPP table created during ->init()
  cpufreq-dt: register cooling device from ->ready() callback
  cpufreq: Introduce ->ready() callback for cpufreq drivers
  cpufreq-dt: pass 'policy->related_cpus' to of_cpufreq_cooling_register()
  cpufreq: Fix formatting issues in 'struct cpufreq_driver'
  cpufreq: pxa2xx: Add Kconfig entry
  cpufreq: Ref the policy object sooner
  cpufreq: Kconfig: Remove architecture specific menu entries
  cpufreq: pcc: Enable autoload of pcc-cpufreq for ACPI processors
  intel_pstate: Add CPUID for BDW-H CPU
  intel_pstate: Add support for HWP
  x86: Add support for Intel HWP feature detection.
  cpufreq: respect the min/max settings from user space
  cpufreq: cpufreq-dt: Handle regulator_get_voltage() failure
  cpufreq: cpufreq-dt: Improve debug about matching OPP
  cpufreq: Loongson1: Add cpufreq driver for Loongson1B
  ...
2014-12-08 20:00:15 +01:00
Rafael J. Wysocki
648fcab2b0 Merge branch 'pm-cpuidle'
* pm-cpuidle:
  cpuidle: add MAINTAINERS entry for ARM Exynos cpuidle driver
  drivers: cpuidle: Remove cpuidle-arm64 duplicate error messages
  drivers: cpuidle: Add idle-state-name description to ARM idle states
  drivers: cpuidle: Add status property to ARM idle states
  cpuidle: Invert CPUIDLE_FLAG_TIME_VALID logic
2014-12-08 20:00:09 +01:00
Dan Carpenter
e10abb2f77 x86_64/traps: Fix always true condition
We should be checking IS_ERR() here.  PTR_ERR() is always true.

Fixes: fe3d197f84 ('x86, mpx: On-demand kernel allocation of
bounds tables')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Link: http://lkml.kernel.org/r/20141125172114.GA24535@mwanda
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-12-08 12:06:59 +01:00
Ingo Molnar
2a2662bf88 Merge branch 'perf/core-v3' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into perf/hw_breakpoints
Pull AMD range breakpoints support from Frederic Weisbecker:

" - Extend breakpoint tools and core to support address range through perf
    event with initial backend support for AMD extended breakpoints.

    Syntax is:

           perf record -e mem:addr/len:type

    For example set write breakpoint from 0x1000 to 0x1200 (0x1000 + 512)

           perf record -e mem:0x1000/512:w

 - Clean up a bit breakpoint code validation

 It has been acked by Jiri and Oleg. "

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-12-08 11:50:24 +01:00
Rasmus Villemoes
3736708f03 x86: Replace seq_printf() with seq_puts()
seq_puts is a lot cheaper than seq_printf, so use that to print
literal strings.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Link: http://lkml.kernel.org/r/1417208622-12264-1-git-send-email-linux@rasmusvillemoes.dk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-12-08 11:48:15 +01:00
Borislav Petkov
c7c9b3929b x86/mce: Spell "panicked" correctly
We need the additional "k" to make it a hard-c:

  https://en.wiktionary.org/wiki/panicked

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1417642605-15730-1-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-12-08 11:12:46 +01:00
Dave Airlie
8c86394470 Merge tag 'v3.18' into drm-next
Linux 3.18

Backmerge Linus tree into -next as we had conflicts in i915/radeon/nouveau,
and everyone was solving them individually.

* tag 'v3.18': (57 commits)
  Linux 3.18
  watchdog: s3c2410_wdt: Fix the mask bit offset for Exynos7
  uapi: fix to export linux/vm_sockets.h
  i2c: cadence: Set the hardware time-out register to maximum value
  i2c: davinci: generate STP always when NACK is received
  ahci: disable MSI on SAMSUNG 0xa800 SSD
  context_tracking: Restore previous state in schedule_user
  slab: fix nodeid bounds check for non-contiguous node IDs
  lib/genalloc.c: export devm_gen_pool_create() for modules
  mm: fix anon_vma_clone() error treatment
  mm: fix swapoff hang after page migration and fork
  fat: fix oops on corrupted vfat fs
  ipc/sem.c: fully initialize sem_array before making it visible
  drivers/input/evdev.c: don't kfree() a vmalloc address
  cxgb4: Fill in supported link mode for SFP modules
  xen-netfront: Remove BUGs on paged skb data which crosses a page boundary
  mm/vmpressure.c: fix race in vmpressure_work_fn()
  mm: frontswap: invalidate expired data on a dup-store failure
  mm: do not overwrite reserved pages counter at show_mem()
  drm/radeon: kernel panic in drm_calc_vbltimestamp_from_scanoutpos with 3.18.0-rc6
  ...

Conflicts:
	drivers/gpu/drm/i915/intel_display.c
	drivers/gpu/drm/nouveau/nouveau_drm.c
	drivers/gpu/drm/radeon/radeon_cs.c
2014-12-08 10:33:52 +10:00
Borislav Petkov
fbae4ba8c4 x86, microcode: Reload microcode on resume
Normally, we do reapply microcode on resume. However, in the cases where
that microcode comes from the early loader and the late loader hasn't
been utilized yet, there's no easy way for us to go and apply the patch
applied during boot by the early loader.

Thus, reuse the patch stashed by the early loader for the BSP.

Signed-off-by: Borislav Petkov <bp@suse.de>
2014-12-06 13:03:03 +01:00
Boris Ostrovsky
a18a0f6850 x86, microcode: Don't initialize microcode code on paravirt
Paravirtual guests are not expected to load microcode into processors
and therefore it is not necessary to initialize microcode loading
logic.

In fact, under certain circumstances initializing this logic may cause
the guest to crash. Specifically, 32-bit kernels use __pa_nodebug()
macro which does not work in Xen (the code path that leads to this macro
happens during resume when we call mc_bp_resume()->load_ucode_ap()
->check_loader_disabled_ap())

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: http://lkml.kernel.org/r/1417469264-31470-1-git-send-email-boris.ostrovsky@oracle.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2014-12-06 12:59:03 +01:00
Borislav Petkov
47768626c6 x86, microcode, intel: Drop unused parameter
apply_microcode_early() doesn't use mc_saved_data, kill it.

Signed-off-by: Borislav Petkov <bp@suse.de>
2014-12-06 12:58:56 +01:00
Linus Torvalds
beb5af4033 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "Two final fixlets for 3.18:
   - Prevent microcode reload wreckage on 32bit
   - Unbreak cross compilation"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, microcode: Limit the microcode reloading to 64-bit for now
  x86: Use $(OBJDUMP) instead of plain objdump
2014-12-05 10:47:19 -08:00
Paolo Bonzini
ba7b39203a x86: export get_xsave_addr
get_xsave_addr is the API to access XSAVE states, and KVM would
like to use it.  Export it.

Cc: stable@vger.kernel.org
Cc: x86@kernel.org
Cc: H. Peter Anvin <hpa@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-05 13:55:44 +01:00
Jacob Shin
36748b9518 perf/x86: Remove get_hbp_len and replace with bp_len
Clean up the logic for determining the breakpoint length

Signed-off-by: Jacob Shin <jacob.w.shin@gmail.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: xiakaixu <xiakaixu@huawei.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2014-12-03 15:14:30 +01:00
Jacob Shin
d6d55f0b9d perf/x86/amd: AMD support for bp_len > HW_BREAKPOINT_LEN_8
Implement hardware breakpoint address mask for AMD Family 16h and
above processors. CPUID feature bit indicates hardware support for
DRn_ADDR_MASK MSRs. These masks further qualify DRn/DR7 hardware
breakpoint addresses to allow matching of larger addresses ranges.

Valuable advice and pseudo code from Oleg Nesterov <oleg@redhat.com>

Signed-off-by: Jacob Shin <jacob.w.shin@gmail.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: xiakaixu <xiakaixu@huawei.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2014-12-03 15:14:26 +01:00
Dave Airlie
e8115e79aa Merge tag 'v3.18-rc7' into drm-next
This fixes a bunch of conflicts prior to merging i915 tree.

Linux 3.18-rc7

Conflicts:
	drivers/gpu/drm/exynos/exynos_drm_drv.c
	drivers/gpu/drm/i915/i915_drv.c
	drivers/gpu/drm/i915/intel_pm.c
	drivers/gpu/drm/tegra/dc.c
2014-12-02 10:58:33 +10:00
Steven Rostedt (Red Hat)
6a06bdbf7f ftrace/fgraph/x86: Have prepare_ftrace_return() take ip as first parameter
The function graph helper function prepare_ftrace_return() which does the work
to hijack the parent pointer has that parent pointer as its first parameter.
Instead, if we make it the second parameter and have ip as the first parameter
(self_addr), then it can use the %rdi from save_mcount_regs that loads it
already.

Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1411262304010.3961@nanos

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-12-01 14:08:58 -05:00
Steven Rostedt (Red Hat)
f1ab00af81 ftrace/x86: Get rid of ftrace_caller_setup
Move all the work from ftrace_caller_setup into save_mcount_regs. This
simplifies the code and makes it easier to understand.

Link: http://lkml.kernel.org/r/CA+55aFxUTUbdxpjVMW8X9c=o8sui7OB_MYPfcbJuDyfUWtNrNg@mail.gmail.com
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1411262304010.3961@nanos

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-12-01 14:08:43 -05:00
Steven Rostedt (Red Hat)
0687c36e45 ftrace/x86: Have save_mcount_regs macro also save stack frames if needed
The save_mcount_regs macro saves and restores the required mcount regs that
need to be saved before calling C code. It is done for all the function hook
utilities (static tracing, dynamic tracing, regs, function graph).

When frame pointers are enabled, the ftrace trampolines need to set up
frames and pointers such that a back trace (dump stack) can continue passed
them. Currently, a separate macro is used (create_frame) to do this, but
it's only done for the ftrace_caller and ftrace_reg_caller functions. It
is not done for the static tracer or function graph tracing.

Instead of having a separate macro doing the recording of the frames,
have the save_mcount_regs perform this task. This also has all tracers
saving the frame pointers when needed.

Link: http://lkml.kernel.org/r/CA+55aFwF+qCGSKdGaEgW4p6N65GZ5_XTV=1NbtWDvxnd5yYLiw@mail.gmail.com
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1411262304010.3961@nanos

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-12-01 14:08:27 -05:00
Steven Rostedt (Red Hat)
85f6f0290c ftrace/x86: Add macro MCOUNT_REG_SIZE for amount of stack used to save mcount regs
The macro save_mcount_regs saves regs onto the stack. But to uncouple the
amount of stack used in that macro from the users of the macro, we need
to have a define that tells all the users how much stack is used by that
macro. This way we can change the amount of stack the macro uses without
breaking its users.

Also remove some dead code that was left over from commit fdc841b58c
"ftrace: x86: Remove check of obsolete variable function_trace_stop".

Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1411262304010.3961@nanos

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-12-01 14:08:15 -05:00
Steven Rostedt (Red Hat)
527aa75b33 ftrace/x86: Simplify save_mcount_regs on getting RIP
Currently save_mcount_regs is passed a "skip" parameter to know how much
stack updated the pt_regs, as it tries to keep the saved pt_regs in the
same location for all users. This is rather stupid, especially since the
part stored on the pt_regs has nothing to do with what is suppose to be
in that location.

Instead of doing that, just pass in an "added" parameter that lets that
macro know how much stack was added before it was called so that it
can get to the RIP.  But the difference is that it will now offset the
pt_regs by that "added" count. The caller now needs to take care of
the offset of the pt_regs.

This will make it easier to simplify the code later.

Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1411262304010.3961@nanos

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-12-01 14:07:50 -05:00
Steven Rostedt (Red Hat)
094dfc5451 ftrace/x86: Have save_mcount_regs store RIP in %rdi for first parameter
Instead of having save_mcount_regs store the RIP in %rdx as a temp register
to place it in the proper location of the pt_regs on the stack. Use the
%rdi register as the temp register. This lets us remove the extra store
in the ftrace_caller_setup macro.

Link: http://lkml.kernel.org/r/CA+55aFwF+qCGSKdGaEgW4p6N65GZ5_XTV=1NbtWDvxnd5yYLiw@mail.gmail.com
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1411262304010.3961@nanos

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-12-01 14:07:39 -05:00
Steven Rostedt (Red Hat)
05df710ec3 ftrace/x86: Rename MCOUNT_SAVE_FRAME and add more detailed comments
The name MCOUNT_SAVE_FRAME is rather confusing as it really isn't a
function frame that is saved, but just the required mcount registers
that are needed to be saved before C code may be called. The word
"frame" confuses it as being a function frame which it is not.

Rename MCOUNT_SAVE_FRAME and MCOUNT_RESTORE_FRAME to save_mcount_regs
and restore_mcount_regs respectively. Noticed the lower case, which
keeps it from screaming at the reviewers.

Link: http://lkml.kernel.org/r/CA+55aFwF+qCGSKdGaEgW4p6N65GZ5_XTV=1NbtWDvxnd5yYLiw@mail.gmail.com
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1411262304010.3961@nanos

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-12-01 14:07:27 -05:00
Steven Rostedt (Red Hat)
4bcdf1522f ftrace/x86: Move MCOUNT_SAVE_FRAME out of header file
Linus pointed out that MCOUNT_SAVE_FRAME is used in only a single file
and that there's no reason that it should be in a header file.
Move the macro to the code that uses it.

Link: http://lkml.kernel.org/r/CA+55aFwF+qCGSKdGaEgW4p6N65GZ5_XTV=1NbtWDvxnd5yYLiw@mail.gmail.com
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1411262304010.3961@nanos

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-12-01 14:07:16 -05:00
Steven Rostedt (Red Hat)
76c2f13c55 ftrace/x86: Have static tracing also use ftrace_caller_setup
Linus pointed out that there were locations that did the hard coded
update of the parent and rip parameters. One of them was the static tracer
which could also use the ftrace_caller_setup to do that work. In fact,
because it did not use it, it is prone to bugs, and since the static
tracer is hardly ever used (who wants function tracing code always being
called?) it doesn't get tested very often. I only run a few "does it still
work" tests on it. But I do not run stress tests on that code. Although,
since it is never turned off, just having it on should be stressful enough.
(especially for the performance folks)

There's no reason that the static tracer can't also use ftrace_caller_setup.
Have it do so.

Link: http://lkml.kernel.org/r/CA+55aFwF+qCGSKdGaEgW4p6N65GZ5_XTV=1NbtWDvxnd5yYLiw@mail.gmail.com
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1411262304010.3961@nanos

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-12-01 14:06:52 -05:00
Borislav Petkov
2ef84b3bb9 x86, microcode, AMD: Do not use smp_processor_id() in preemtible context
Hand down the cpu number instead, otherwise lockdep screams when doing

echo 1 > /sys/devices/system/cpu/microcode/reload.

BUG: using smp_processor_id() in preemptible [00000000] code: amd64-microcode/2470
caller is debug_smp_processor_id+0x12/0x20
CPU: 1 PID: 2470 Comm: amd64-microcode Not tainted 3.18.0-rc6+ #26
...

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1417428741-4501-1-git-send-email-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-12-01 11:51:05 +01:00
Borislav Petkov
02ecc41abc x86, microcode: Limit the microcode reloading to 64-bit for now
First, there was this: https://bugzilla.kernel.org/show_bug.cgi?id=88001

The problem there was that microcode patches are not being reapplied
after suspend-to-ram. It was important to reapply them, though, because
of for example Haswell's TSX erratum which disabled TSX instructions
with a microcode patch.

A simple fix was fb86b97300 ("x86, microcode: Update BSPs microcode
on resume") but, as it is often the case, simple fixes are too
simple. This one causes 32-bit resume to fail:

https://bugzilla.kernel.org/show_bug.cgi?id=88391

Properly fixing this would require more involved changes for which it
is too late now, right before the merge window. Thus, limit this to
64-bit only temporarily.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1417353999-32236-1-git-send-email-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-12-01 10:55:08 +01:00
Rafael J. Wysocki
8a497cfdc0 Merge back earlier cpufreq material for 3.19-rc1. 2014-12-01 02:46:24 +01:00
Sasha Levin
db08655437 x86/nmi: Fix use of unallocated cpumask_var_t
Commit "x86/nmi: Perform a safe NMI stack trace on all CPUs" has introduced
a cpumask_var_t variable:

	+static cpumask_var_t printtrace_mask;

But never allocated it before using it, which caused a NULL ptr deref when
trying to print the stack trace:

[ 1110.296154] BUG: unable to handle kernel NULL pointer dereference at           (null)
[ 1110.296169] IP: __memcpy (arch/x86/lib/memcpy_64.S:151)
[ 1110.296178] PGD 4c34b3067 PUD 4c351b067 PMD 0
[ 1110.296186] Oops: 0002 [#1] PREEMPT SMP KASAN
[ 1110.296234] Dumping ftrace buffer:
[ 1110.296330]    (ftrace buffer empty)
[ 1110.296339] Modules linked in:
[ 1110.296345] CPU: 1 PID: 10538 Comm: trinity-c99 Not tainted 3.18.0-rc5-next-20141124-sasha-00058-ge2a8c09-dirty #1499
[ 1110.296348] task: ffff880152650000 ti: ffff8804c3560000 task.ti: ffff8804c3560000
[ 1110.296357] RIP: __memcpy (arch/x86/lib/memcpy_64.S:151)
[ 1110.296360] RSP: 0000:ffff8804c3563870  EFLAGS: 00010246
[ 1110.296363] RAX: 0000000000000000 RBX: ffffe8fff3c4a809 RCX: 0000000000000000
[ 1110.296366] RDX: 0000000000000008 RSI: ffffffff9e254040 RDI: 0000000000000000
[ 1110.296369] RBP: ffff8804c3563908 R08: 0000000000ffffff R09: 0000000000ffffff
[ 1110.296371] R10: 0000000000000000 R11: 0000000000000006 R12: 0000000000000000
[ 1110.296375] R13: 0000000000000000 R14: ffffffff9e254040 R15: ffffe8fff3c4a809
[ 1110.296379] FS:  00007f9e43b0b700(0000) GS:ffff880107e00000(0000) knlGS:0000000000000000
[ 1110.296382] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1110.296385] CR2: 0000000000000000 CR3: 00000004e4334000 CR4: 00000000000006a0
[ 1110.296400] Stack:
[ 1110.296406]  ffffffff81b1e46c 0000000000000000 ffff880107e03fb8 000000000000000b
[ 1110.296413]  ffff880107dfffc0 ffff880107e03fc0 0000000000000008 ffffffff93f2e9c8
[ 1110.296419]  0000000000000000 ffffda0020fc07f7 0000000000000008 ffff8804c3563901
[ 1110.296420] Call Trace:
[ 1110.296429] ? memcpy (mm/kasan/kasan.c:275)
[ 1110.296437] ? arch_trigger_all_cpu_backtrace (include/linux/bitmap.h:215 include/linux/cpumask.h:506 arch/x86/kernel/apic/hw_nmi.c:76)
[ 1110.296444] arch_trigger_all_cpu_backtrace (include/linux/bitmap.h:215 include/linux/cpumask.h:506 arch/x86/kernel/apic/hw_nmi.c:76)
[ 1110.296451] ? dump_stack (./arch/x86/include/asm/preempt.h:95 lib/dump_stack.c:55)
[ 1110.296458] do_raw_spin_lock (./arch/x86/include/asm/spinlock.h:86 kernel/locking/spinlock_debug.c:130 kernel/locking/spinlock_debug.c:137)
[ 1110.296468] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151)
[ 1110.296474] ? __page_check_address (include/linux/spinlock.h:309 mm/rmap.c:630)
[ 1110.296481] __page_check_address (include/linux/spinlock.h:309 mm/rmap.c:630)
[ 1110.296487] ? preempt_count_sub (kernel/sched/core.c:2615)
[ 1110.296493] try_to_unmap_one (include/linux/rmap.h:202 mm/rmap.c:1146)
[ 1110.296504] ? anon_vma_interval_tree_iter_next (mm/interval_tree.c:72 mm/interval_tree.c:103)
[ 1110.296514] rmap_walk (mm/rmap.c:1653 mm/rmap.c:1725)
[ 1110.296521] ? page_get_anon_vma (include/linux/rcupdate.h:423 include/linux/rcupdate.h:935 mm/rmap.c:435)
[ 1110.296530] try_to_unmap (mm/rmap.c:1545)
[ 1110.296536] ? page_get_anon_vma (mm/rmap.c:437)
[ 1110.296545] ? try_to_unmap_nonlinear (mm/rmap.c:1138)
[ 1110.296551] ? SyS_msync (mm/rmap.c:1501)
[ 1110.296558] ? page_remove_rmap (mm/rmap.c:1409)
[ 1110.296565] ? page_get_anon_vma (mm/rmap.c:448)
[ 1110.296571] ? anon_vma_ctor (mm/rmap.c:1496)
[ 1110.296579] migrate_pages (mm/migrate.c:913 mm/migrate.c:956 mm/migrate.c:1136)
[ 1110.296586] ? _raw_spin_unlock_irq (./arch/x86/include/asm/preempt.h:95 include/linux/spinlock_api_smp.h:169 kernel/locking/spinlock.c:199)
[ 1110.296593] ? buffer_migrate_lock_buffers (mm/migrate.c:1584)
[ 1110.296601] ? handle_mm_fault (mm/memory.c:3163 mm/memory.c:3223 mm/memory.c:3336 mm/memory.c:3365)
[ 1110.296607] migrate_misplaced_page (mm/migrate.c:1738)
[ 1110.296614] handle_mm_fault (mm/memory.c:3170 mm/memory.c:3223 mm/memory.c:3336 mm/memory.c:3365)
[ 1110.296623] __do_page_fault (arch/x86/mm/fault.c:1246)
[ 1110.296630] ? vtime_account_user (kernel/sched/cputime.c:701)
[ 1110.296638] ? get_parent_ip (kernel/sched/core.c:2559)
[ 1110.296646] ? context_tracking_user_exit (kernel/context_tracking.c:144)
[ 1110.296656] trace_do_page_fault (arch/x86/mm/fault.c:1329 include/linux/jump_label.h:114 include/linux/context_tracking_state.h:27 include/linux/context_tracking.h:45 arch/x86/mm/fault.c:1330)
[ 1110.296664] do_async_page_fault (arch/x86/kernel/kvm.c:280)
[ 1110.296670] async_page_fault (arch/x86/kernel/entry_64.S:1285)
[ 1110.296755] Code: 08 4c 8b 54 16 f0 4c 8b 5c 16 f8 4c 89 07 4c 89 4f 08 4c 89 54 17 f0 4c 89 5c 17 f8 c3 90 83 fa 08 72 1b 4c 8b 06 4c 8b 4c 16 f8 <4c> 89 07 4c 89 4c 17 f8 c3 66 2e 0f 1f 84 00 00 00 00 00 83 fa
All code
========
   0:   08 4c 8b 54             or     %cl,0x54(%rbx,%rcx,4)
   4:   16                      (bad)
   5:   f0 4c 8b 5c 16 f8       lock mov -0x8(%rsi,%rdx,1),%r11
   b:   4c 89 07                mov    %r8,(%rdi)
   e:   4c 89 4f 08             mov    %r9,0x8(%rdi)
  12:   4c 89 54 17 f0          mov    %r10,-0x10(%rdi,%rdx,1)
  17:   4c 89 5c 17 f8          mov    %r11,-0x8(%rdi,%rdx,1)
  1c:   c3                      retq
  1d:   90                      nop
  1e:   83 fa 08                cmp    $0x8,%edx
  21:   72 1b                   jb     0x3e
  23:   4c 8b 06                mov    (%rsi),%r8
  26:   4c 8b 4c 16 f8          mov    -0x8(%rsi,%rdx,1),%r9
  2b:*  4c 89 07                mov    %r8,(%rdi)               <-- trapping instruction
  2e:   4c 89 4c 17 f8          mov    %r9,-0x8(%rdi,%rdx,1)
  33:   c3                      retq
  34:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
  3b:   00 00 00
  3e:   83 fa 00                cmp    $0x0,%edx

Code starting with the faulting instruction
===========================================
   0:   4c 89 07                mov    %r8,(%rdi)
   3:   4c 89 4c 17 f8          mov    %r9,-0x8(%rdi,%rdx,1)
   8:   c3                      retq
   9:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
  10:   00 00 00
  13:   83 fa 00                cmp    $0x0,%edx
[ 1110.296760] RIP __memcpy (arch/x86/lib/memcpy_64.S:151)
[ 1110.296763]  RSP <ffff8804c3563870>
[ 1110.296765] CR2: 0000000000000000

Link: http://lkml.kernel.org/r/1416931560-10603-1-git-send-email-sasha.levin@oracle.com

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-11-25 14:15:30 -05:00
Andy Lutomirski
7ddc6a2199 x86/asm/traps: Disable tracing and kprobes in fixup_bad_iret and sync_regs
These functions can be executed on the int3 stack, so kprobes
are dangerous. Tracing is probably a bad idea, too.

Fixes: b645af2d59 ("x86_64, traps: Rework bad_iret")
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: <stable@vger.kernel.org> # Backport as far back as it would apply
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/50e33d26adca60816f3ba968875801652507d0c4.1416870125.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-25 07:26:55 +01:00
Steven Rostedt (Red Hat)
62a207d748 ftrace/x86: Have static function tracing always test for function graph
New updates to the ftrace generic code had ftrace_stub not always being
called when ftrace is off. This causes the static tracer to always save
and restore functions. But it also showed that when function tracing is
running, the function graph tracer can not. We should always check to see
if function graph tracing is running even if the function tracer is running
too. The function tracer code is not the only one that uses the hook to
function mcount.

Cc: Markos Chandras <Markos.Chandras@imgtec.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-11-24 15:02:25 -05:00
Linus Torvalds
00c89b2f11 Merge branch 'x86-traps' (trap handling from Andy Lutomirski)
Merge x86-64 iret fixes from Andy Lutomirski:
 "This addresses the following issues:

   - an unrecoverable double-fault triggerable with modify_ldt.
   - invalid stack usage in espfix64 failed IRET recovery from IST
     context.
   - invalid stack usage in non-espfix64 failed IRET recovery from IST
     context.

  It also makes a good but IMO scary change: non-espfix64 failed IRET
  will now report the correct error.  Hopefully nothing depended on the
  old incorrect behavior, but maybe Wine will get confused in some
  obscure corner case"

* emailed patches from Andy Lutomirski <luto@amacapital.net>:
  x86_64, traps: Rework bad_iret
  x86_64, traps: Stop using IST for #SS
  x86_64, traps: Fix the espfix64 #DF fixup and rewrite it in C
2014-11-23 13:56:55 -08:00
Andy Lutomirski
b645af2d59 x86_64, traps: Rework bad_iret
It's possible for iretq to userspace to fail.  This can happen because
of a bad CS, SS, or RIP.

Historically, we've handled it by fixing up an exception from iretq to
land at bad_iret, which pretends that the failed iret frame was really
the hardware part of #GP(0) from userspace.  To make this work, there's
an extra fixup to fudge the gs base into a usable state.

This is suboptimal because it loses the original exception.  It's also
buggy because there's no guarantee that we were on the kernel stack to
begin with.  For example, if the failing iret happened on return from an
NMI, then we'll end up executing general_protection on the NMI stack.
This is bad for several reasons, the most immediate of which is that
general_protection, as a non-paranoid idtentry, will try to deliver
signals and/or schedule from the wrong stack.

This patch throws out bad_iret entirely.  As a replacement, it augments
the existing swapgs fudge into a full-blown iret fixup, mostly written
in C.  It's should be clearer and more correct.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-11-23 13:56:19 -08:00
Andy Lutomirski
6f442be2fb x86_64, traps: Stop using IST for #SS
On a 32-bit kernel, this has no effect, since there are no IST stacks.

On a 64-bit kernel, #SS can only happen in user code, on a failed iret
to user space, a canonical violation on access via RSP or RBP, or a
genuine stack segment violation in 32-bit kernel code.  The first two
cases don't need IST, and the latter two cases are unlikely fatal bugs,
and promoting them to double faults would be fine.

This fixes a bug in which the espfix64 code mishandles a stack segment
violation.

This saves 4k of memory per CPU and a tiny bit of code.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-11-23 13:56:19 -08:00
Andy Lutomirski
af726f21ed x86_64, traps: Fix the espfix64 #DF fixup and rewrite it in C
There's nothing special enough about the espfix64 double fault fixup to
justify writing it in assembly.  Move it to C.

This also fixes a bug: if the double fault came from an IST stack, the
old asm code would return to a partially uninitialized stack frame.

Fixes: 3891a04aaf
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-11-23 13:56:18 -08:00