Commit Graph

6717 Commits

Author SHA1 Message Date
Bandan Das
4399c03c67 x86/apic: Remove verify_local_APIC()
__verify_local_APIC() is detritus from the early APIC days.
Its return value isn't used anywhere and the information it
prints when debug is enabled is already part of APIC
initialization messages printed to syslog. Off with it!

Signed-off-by: Bandan Das <bsd@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/jpgy4mcsxsq.fsf@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-01 10:47:57 +02:00
Joe Perches
1d804d079a x86: Use bool function return values of true/false not 1/0
Use the normal return values for bool functions

Signed-off-by: Joe Perches <joe@perches.com>
Message-Id: <9f593eb2f43b456851cd73f7ed09654ca58fb570.1427759009.git.joe@perches.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-03-31 18:05:09 +02:00
Ingo Molnar
55474c48b4 x86/asm/entry: Remove user_mode_ignore_vm86()
user_mode_ignore_vm86() can be used instead of user_mode(), in
places where we have already done a v8086_mode() security
check of ptregs.

But doing this check in the wrong place would be a bug that
could result in security problems, and also the naming still
isn't very clear.

Furthermore, it only affects 32-bit kernels, while most
development happens on 64-bit kernels.

If we replace them with user_mode() checks then the cost is only
a very minor increase in various slowpaths:

   text             data   bss     dec              hex    filename
   10573391         703562 1753042 13029995         c6d26b vmlinux.o.before
   10573423         703562 1753042 13030027         c6d28b vmlinux.o.after

So lets get rid of this distinction once and for all.

Acked-by: Borislav Petkov <bp@suse.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150329090233.GA1963@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-31 11:45:19 +02:00
Hector Marco-Gisbert
4e26d11f52 x86/mm: Improve AMD Bulldozer ASLR workaround
The ASLR implementation needs to special-case AMD F15h processors by
clearing out bits [14:12] of the virtual address in order to avoid I$
cross invalidations and thus performance penalty for certain workloads.
For details, see:

  dfb09f9b7a ("x86, amd: Avoid cache aliasing penalties on AMD family 15h")

This special case reduces the mmapped file's entropy by 3 bits.

The following output is the run on an AMD Opteron 62xx class CPU
processor under x86_64 Linux 4.0.0:

  $ for i in `seq 1 10`; do cat /proc/self/maps | grep "r-xp.*libc" ; done
  b7588000-b7736000 r-xp 00000000 00:01 4924       /lib/i386-linux-gnu/libc.so.6
  b7570000-b771e000 r-xp 00000000 00:01 4924       /lib/i386-linux-gnu/libc.so.6
  b75d0000-b777e000 r-xp 00000000 00:01 4924       /lib/i386-linux-gnu/libc.so.6
  b75b0000-b775e000 r-xp 00000000 00:01 4924       /lib/i386-linux-gnu/libc.so.6
  b7578000-b7726000 r-xp 00000000 00:01 4924       /lib/i386-linux-gnu/libc.so.6
  ...

Bits [12:14] are always 0, i.e. the address always ends in 0x8000 or
0x0000.

32-bit systems, as in the example above, are especially sensitive
to this issue because 32-bit randomness for VA space is 8 bits (see
mmap_rnd()). With the Bulldozer special case, this diminishes to only 32
different slots of mmap virtual addresses.

This patch randomizes per boot the three affected bits rather than
setting them to zero. Since all the shared pages have the same value
at bits [12..14], there is no cache aliasing problems. This value gets
generated during system boot and it is thus not known to a potential
remote attacker. Therefore, the impact from the Bulldozer workaround
gets diminished and ASLR randomness increased.

More details at:

  http://hmarco.org/bugs/AMD-Bulldozer-linux-ASLR-weakness-reducing-mmaped-files-by-eight.html

Original white paper by AMD dealing with the issue:

  http://developer.amd.com/wordpress/media/2012/10/SharedL1InstructionCacheonAMD15hCPU.pdf

Mentored-by: Ismael Ripoll <iripoll@disca.upv.es>
Signed-off-by: Hector Marco-Gisbert <hecmargi@upv.es>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jan-Simon <dl9pf@gmx.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-fsdevel@vger.kernel.org
Link: http://lkml.kernel.org/r/1427456301-3764-1-git-send-email-hecmargi@upv.es
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-31 10:01:17 +02:00
Nadav Amit
b32a991800 KVM: x86: Remove redundant definitions
Some constants are redfined in emulate.c. Avoid it.

s/SELECTOR_RPL_MASK/SEGMENT_RPL_MASK
s/SELECTOR_TI_MASK/SEGMENT_TI_MASK

No functional change.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Message-Id: <1427635984-8113-3-git-send-email-namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-03-30 16:46:42 +02:00
Nadav Amit
0efb04406d KVM: x86: removing redundant eflags bits definitions
The eflags are redefined (using other defines) in emulate.c.
Use the definition from processor-flags.h as some mess already started.
No functional change.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Message-Id: <1427635984-8113-2-git-send-email-namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-03-30 16:46:37 +02:00
Ingo Molnar
4bfe186dbe Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU updates from Paul E. McKenney:

  - Documentation updates.

  - Changes permitting use of call_rcu() and friends very early in
    boot, for example, before rcu_init() is invoked.

  - Miscellaneous fixes.

  - Add in-kernel API to enable and disable expediting of normal RCU
    grace periods.

  - Improve RCU's handling of (hotplug-) outgoing CPUs.

    Note: ARM support is lagging a bit here, and these improved
    diagnostics might generate (harmless) splats.

  - NO_HZ_FULL_SYSIDLE fixes.

  - Tiny RCU updates to make it more tiny.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27 10:04:06 +01:00
Denys Vlasenko
aa6d9a128b x86/irq/tracing: Do not save callee-preserved registers around lockdep_sys_exit_thunk
Internally, lockdep_sys_exit_thunk saves callee-clobbered
registers, and calls a C function, lockdep_sys_exit. Thus,
callee-preserved registers won't be mangled, there is no need to
save them.

Patch was run-tested.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1427314468-12763-4-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27 10:01:49 +01:00
Denys Vlasenko
7dc7cc0780 x86/irq/tracing: Fold ARCH_LOCKDEP_SYS_EXIT defines into their users
There is no need to have an extra level of macro indirection
here.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1427314468-12763-3-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27 10:01:49 +01:00
Denys Vlasenko
40e2ec657d x86/irq/tracing: Move ARCH_LOCKDEP_SYS_EXIT defines closer to their users
This change simply moves defines around (even if it's not
obvious in a patch form). Nothing is changed.

This is a preparation for folding ARCH_LOCKDEP_SYS_EXIT defines
into their users.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1427314468-12763-2-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27 10:01:48 +01:00
Ingo Molnar
936c663aed Merge branch 'perf/x86' into perf/core, because it's ready
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27 09:46:19 +01:00
Jan Kiszka
b3a2a9076d KVM: nVMX: Add support for rdtscp
If the guest CPU is supposed to support rdtscp and the host has rdtscp
enabled in the secondary execution controls, we can also expose this
feature to L1. Just extend nested_vmx_exit_handled to properly route
EXIT_REASON_RDTSCP.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2015-03-26 22:33:48 -03:00
Greg Kroah-Hartman
ff85f707ac Merge 4.0-rc5 into char-misc-next
We want those fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-25 10:51:53 +01:00
Ingo Molnar
72d64cc769 x86/asm: Further improve segment.h readability
- extend/clarify explanations where necessary

 - move comments from macro values to before the macro, to
   make them more consistent, and to reduce preprocessor overhead

 - sort GDT index and selector values likewise by number

 - use consistent, modern kernel coding style across the file

 - capitalize consistently

 - use consistent vertical spacing

 - remove the unused get_limit() method (noticed by Andy Lutomirski)

No change in code (verified with objdump -d):

 64-bit defconfig+kvmconfig:

   815a129bc1f80de6445c1d8ca5b97cad  vmlinux.o.before.asm
   815a129bc1f80de6445c1d8ca5b97cad  vmlinux.o.after.asm

 32-bit defconfig+kvmconfig:

   e659ef045159ddf41a0771b33a34aae5  vmlinux.o.before.asm
   e659ef045159ddf41a0771b33a34aae5  vmlinux.o.after.asm

Acked-by: Andy Lutomirski <luto@amacapital.net>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Drewry <wad@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-24 21:13:38 +01:00
Ingo Molnar
dca5b52ad7 x86/asm/entry/64: Rename THREAD_INFO() to ASM_THREAD_INFO()
The THREAD_INFO() macro has a somewhat confusingly generic name,
defined in a generic .h C header file. It also does not make it
clear that it constructs a memory operand for use in assembly
code.

Rename it to ASM_THREAD_INFO() to make it all glaringly
obvious on first glance.

Acked-by: Borislav Petkov <bp@suse.de>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/20150324184442.GC14760@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-24 20:57:31 +01:00
Ingo Molnar
f9d71854b4 x86/asm/entry/64: Merge the field offset into the THREAD_INFO() macro
Before:

   TI_sysenter_return+THREAD_INFO(%rsp,3*8),%r10d

After:

   movl    THREAD_INFO(TI_sysenter_return, %rsp, 3*8), %r10d

to turn it into a clear thread_info accessor.

No code changed:

 md5:
   fb4cb2b3ce05d89940ca304efc8ff183  ia32entry.o.before.asm
   fb4cb2b3ce05d89940ca304efc8ff183  ia32entry.o.after.asm

   e39f2958a5d1300158e276e4f7663263  entry_64.o.before.asm
   e39f2958a5d1300158e276e4f7663263  entry_64.o.after.asm

Acked-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/20150324184411.GB14760@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-24 20:57:31 +01:00
Ingo Molnar
1ddc6f3c60 x86/asm/entry/64: Improve the THREAD_INFO() macro explanation
Explain the background, and add a real example.

Acked-by: Denys Vlasenko <dvlasenk@redhat.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/20150324184311.GA14760@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-24 20:57:30 +01:00
Denys Vlasenko
84f5378845 x86/asm: Deobfuscate segment.h
This file just defines a number of constants, and a few macros
and inline functions. It is particularly badly written.

For example, it is not trivial to see how descriptors are
numbered (you'd expect that should be easy, right?).

This change deobfuscates it via the following changes:

Group all GDT_ENTRY_foo together (move intervening stuff away).

Number them explicitly: use a number, not PREV_DEFINE+1, +2, +3:
I want to immediately see that GDT_ENTRY_PNPBIOS_CS32 is 18.
Seeing (GDT_ENTRY_KERNEL_BASE+6) instead is not useful.

The above change allows to remove GDT_ENTRY_KERNEL_BASE
and GDT_ENTRY_PNPBIOS_BASE, which weren't used anywhere else.

After a group of GDT_ENTRY_foo, define all selector values.

Remove or improve some comments. In particular:
Comment deleted as stating the obvious:
    /*
     * The GDT has 32 entries
     */
    #define GDT_ENTRIES 32

"The segment offset needs to contain a RPL. Grr. -AK"
    changed to
"Selectors need to also have a correct RPL (+3 thingy)"

"GDT layout to get 64bit syscall right (sysret hardcodes gdt
offsets)" expanded into a description *how exactly* sysret
hardcodes them.

Patch was tested to compile and not change vmlinux.o
on 32-bit and 64-bit builds (verified with objdump).

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Drewry <wad@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-24 20:47:07 +01:00
Denys Vlasenko
ef593260f0 x86/asm/entry: Get rid of KERNEL_STACK_OFFSET
PER_CPU_VAR(kernel_stack) was set up in a way where it points
five stack slots below the top of stack.

Presumably, it was done to avoid one "sub $5*8,%rsp"
in syscall/sysenter code paths, where iret frame needs to be
created by hand.

Ironically, none of them benefits from this optimization,
since all of them need to allocate additional data on stack
(struct pt_regs), so they still have to perform subtraction.

This patch eliminates KERNEL_STACK_OFFSET.

PER_CPU_VAR(kernel_stack) now points directly to top of stack.
pt_regs allocations are adjusted to allocate iret frame as well.
Hopefully we can merge it later with 32-bit specific
PER_CPU_VAR(cpu_current_top_of_stack) variable...

Net result in generated code is that constants in several insns
are changed.

This change is necessary for changing struct pt_regs creation
in SYSCALL64 code path from MOV to PUSH instructions.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Acked-by: Borislav Petkov <bp@suse.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1426785469-15125-2-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-24 19:42:38 +01:00
Denys Vlasenko
b3fe8ba320 x86/asm/entry/64: Change the THREAD_INFO() definition to not depend on KERNEL_STACK_OFFSET
This changes the THREAD_INFO() definition and all its callsites
so that they do not count stack position from
(top of stack - KERNEL_STACK_OFFSET), but from top of stack.

Semi-mysterious expressions THREAD_INFO(%rsp,RIP) - "why RIP??"
are now replaced by more logical THREAD_INFO(%rsp,SIZEOF_PTREGS)
- "calculate thread_info's address using information that
rsp is SIZEOF_PTREGS bytes below top of stack".

While at it, replace "(off)-THREAD_SIZE(reg)" with equivalent
"((off)-THREAD_SIZE)(reg)". The form without parentheses
falsely looks like we invoke THREAD_SIZE() macro.

Improve comment atop THREAD_INFO macro definition.

This patch does not change generated code (verified by objdump).

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Acked-by: Borislav Petkov <bp@suse.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1426785469-15125-1-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-24 19:42:37 +01:00
Aravind Gopalakrishnan
43eaa2a1ad x86/mce: Define mce_severity function pointer
Rename mce_severity() to mce_severity_intel() and assign the
mce_severity function pointer to mce_severity_amd() during init on AMD.
This way, we can avoid a test to call mce_severity_amd every time we get
into mce_severity(). And it's cleaner to do it this way.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Suggested-by: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1427125373-2918-3-git-send-email-Aravind.Gopalakrishnan@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-03-24 12:14:15 +01:00
Aravind Gopalakrishnan
bf80bbd7dc x86/mce: Add an AMD severities-grading function
Add a severities function that caters to AMD processors. This allows us
to do some vendor-specific work within the function if necessary.

Also, introduce a vendor flag bitfield for vendor-specific settings. The
severities code uses this to define error scope based on the prescence
of the flags field.

This is based off of work by Boris Petkov.

Testing details:
Fam10h, Model 9h (Greyhound)
Fam15h: Models 0h-0fh (Orochi), 30h-3fh (Kaveri) and 60h-6fh (Carrizo),
Fam16h Model 00h-0fh (Kabini)

Boris:
Intel SNB
AMD K8 (JH-E0)

Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: linux-edac@vger.kernel.org
Link: http://lkml.kernel.org/r/1427125373-2918-2-git-send-email-Aravind.Gopalakrishnan@amd.com
[ Fixup build, clean up comments. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-03-24 12:13:34 +01:00
Rusty Russell
2f921b5bb0 lguest: suppress interrupts for single insn, not range.
The last patch reduced our interrupt-suppression region to one address,
so simplify the code somewhat.

Also, remove the obsolete undefined instruction ranges and the comment
which refers to lguest_guest.S instead of head_32.S.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-03-24 11:52:08 +10:30
Marcelo Tosatti
0a4e6be9ca x86: kvm: Revert "remove sched notifier for cross-cpu migrations"
The following point:

    2. per-CPU pvclock time info is updated if the
       underlying CPU changes.

Is not true anymore since "KVM: x86: update pvclock area conditionally,
on cpu migration".

Add task migration notification back.

Problem noticed by Andy Lutomirski.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
CC: stable@kernel.org # 3.11+
2015-03-23 20:22:48 -03:00
Greg Kroah-Hartman
caa445d808 Merge 4.0-rc5 into tty-next
We want the tty/serial fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-23 21:45:24 +01:00
Andy Lutomirski
7a2806741e x86/asm/entry: Remove user_mode_vm()
It has no callers anymore.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/a594afd6a0bddb1311bd7c92a15201c87fbb8681.1426728647.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 11:14:33 +01:00
Andy Lutomirski
efa7045103 x86/asm/entry: Make user_mode() work correctly if regs came from VM86 mode
user_mode() is now identical to user_mode_vm().  Subsequent patches
will change all callers of user_mode_vm() to user_mode() and then
delete user_mode_vm().

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/0dd03eacb5f0a2b5ba0240de25347a31b493c289.1426728647.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 11:13:51 +01:00
Andy Lutomirski
a67e7277d0 x86/asm/entry: Add user_mode_ignore_vm86()
user_mode() is dangerous and user_mode_vm() has a confusing name.

Add user_mode_ignore_vm86() (equivalent to current user_mode()).
We'll change the small number of legitimate users of user_mode()
to user_mode_ignore_vm86().

Inspired by grsec, although this works rather differently.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/202c56ca63823c338af8e2e54948dbe222da6343.1426728647.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 11:13:36 +01:00
Ingo Molnar
e4518ab90f Merge tag 'v4.0-rc5' into x86/asm, to resolve conflicts
Conflicts:
	arch/x86/kernel/entry_64.S

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 11:13:15 +01:00
Ingo Molnar
e1b63dec2d Merge branch 'sched/urgent' into sched/core, to pick up fixes before applying new patches
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 10:50:29 +01:00
Borislav Petkov
b85e67d148 x86/fpu: Rename drop_init_fpu() to fpu_reset_state()
Call it what it does and in accordance with the context where it is
used: we reset the FPU state either because we were unable to restore it
from the one saved in the task or because we simply want to reset it.

Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 10:13:59 +01:00
Borislav Petkov
d2d0ac9a46 x86/fpu: Fold __drop_fpu() into its sole user
Fold it into drop_fpu(). Phew, one less FPU function to pay attention
to.

No functionality change.

Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pekka Riikonen <priikone@iki.fi>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 10:13:59 +01:00
Oleg Nesterov
8f4d81863b x86/fpu: Introduce restore_init_xstate()
Extract the "use_eager_fpu()" code from drop_init_fpu() into a new,
simple helper restore_init_xstate(). The next patch adds another user.

- It is not clear why we do not check use_fxsr() like fpu_restore_checking()
  does. eager_fpu_init_bp() calls setup_init_fpu_buf() too, and we have the
  "eagerfpu=on" kernel option.

- Ignoring the fact that init_xstate_buf is "struct xsave_struct *", not
  "union thread_xstate *", it is not clear why we can not simply use
  fpu_restore_checking() and avoid the code duplication.

- It is not clear why we can't call setup_init_fpu_buf() unconditionally
  to always create init_xstate_buf(). Then do_device_not_available() path
  (at least) could use restore_init_xstate() too. It doesn't need to init
  fpu->state, its content doesn't matter until unlazy_fpu()/__switch_to()/etc
  which overwrites this memory anyway.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pekka Riikonen <priikone@iki.fi>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Link: http://lkml.kernel.org/r/20150311173429.GD5032@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 10:13:58 +01:00
Oleg Nesterov
fb14b4eadf x86/fpu: Document user_fpu_begin()
Currently, user_fpu_begin() has a single caller and it is not clear why
do we actually need it and why we should not worry about preemption
right after preempt_enable().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pekka Riikonen <priikone@iki.fi>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Link: http://lkml.kernel.org/r/20150311173409.GC5032@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 10:13:58 +01:00
Ingo Molnar
eda2360ad1 Merge tag 'v4.0-rc5' into x86/fpu, to prevent conflicts
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 10:13:36 +01:00
Brian Gerst
1daeaa3151 x86/asm/entry: Fix execve() and sigreturn() syscalls to always return via IRET
Both the execve() and sigreturn() family of syscalls have the
ability to change registers in ways that may not be compatabile
with the syscall path they were called from.

In particular, SYSRET and SYSEXIT can't handle non-default %cs and %ss,
and some bits in eflags.

These syscalls have stubs that are hardcoded to jump to the IRET path,
and not return to the original syscall path.

The following commit:

   76f5df43ca ("Always allocate a complete "struct pt_regs" on the kernel stack")

recently changed this for some 32-bit compat syscalls, but introduced a bug where
execve from a 32-bit program to a 64-bit program would fail because it still returned
via SYSRETL. This caused Wine to fail when built for both 32-bit and 64-bit.

This patch sets TIF_NOTIFY_RESUME for execve() and sigreturn() so
that the IRET path is always taken on exit to userspace.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1426978461-32089-1-git-send-email-brgerst@gmail.com
[ Improved the changelog and comments. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 08:52:46 +01:00
Linus Torvalds
3d7a6db537 Merge tag 'pm+acpi-4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI fixes from Rafael Wysocki:
 "These are fixes for recent regressions (PCI/ACPI resources and at91
  RTC locking), a stable-candidate powercap RAPL driver fix and two ARM
  cpuidle fixes (one stable-candidate too).

  Specifics:

   - Revert a recent PCI commit related to IRQ resources management that
     introduced a regression for drivers attempting to bind to devices
     whose previous drivers did not balance pci_enable_device() and
     pci_disable_device() as expected (Rafael J Wysocki).

   - Fix a deadlock in at91_rtc_interrupt() introduced by a typo in a
     recent commit related to wakeup interrupt handling (Dan Carpenter).

   - Allow the power capping RAPL (Running-Average Power Limit) driver
     to use different energy units for domains within one CPU package
     which is necessary to handle Intel Haswell EP processors correctly
     (Jacob Pan).

   - Improve the cpuidle mvebu driver's handling of Armada XP SoCs by
     updating the target residency and exit latency numbers for those
     chips (Sebastien Rannou).

   - Prevent the cpuidle mvebu driver from calling cpu_pm_enter() twice
     in a row before cpu_pm_exit() is called on the same CPU which
     breaks the core's assumptions regarding the usage of those
     functions (Gregory Clement)"

* tag 'pm+acpi-4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  Revert "x86/PCI: Refine the way to release PCI IRQ resources"
  rtc: at91rm9200: double locking bug in at91_rtc_interrupt()
  powercap / RAPL: handle domains with different energy units
  cpuidle: mvebu: Update cpuidle thresholds for Armada XP SOCs
  cpuidle: mvebu: Fix the CPU PM notifier usage
2015-03-21 12:51:36 -07:00
Rafael J. Wysocki
9e8ce4b96b Revert "x86/PCI: Refine the way to release PCI IRQ resources"
Commit b4b55cda58 (Refine the way to release PCI IRQ resources)
introduced a regression in the PCI IRQ resource management by causing
the IRQ resource of a device, established when pci_enabled_device()
is called on a fully disabled device, to be released when the driver
is unbound from the device, regardless of the enable_cnt.

This leads to the situation that an ill-behaved driver can now make a
device unusable to subsequent drivers by an imbalance in their use of
pci_enable/disable_device().  That is a serious problem for secondary
drivers like vfio-pci, which are innocent of the transgressions of
the previous driver.

Since the solution of this problem is not immediate and requires
further discussion, revert commit b4b55cda58 and the issue it was
supposed to address (a bug related to xen-pciback) will be taken
care of in a different way going forward.

Reported-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-03-20 14:56:19 +01:00
Zhonghui Fu
431d452af1 PM / sleep: add pm-trace support for suspending phase
Occasionally, the system can't come back up after suspend/resume
due to problems of device suspending phase. This patch make
PM_TRACE infrastructure cover device suspending phase of
suspend/resume process, and the information in RTC can tell
developers which device suspending function make system hang.

Signed-off-by: Zhonghui Fu <zhonghui.fu@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-03-18 15:54:27 +01:00
Ingo Molnar
ac9af4983e x86/asm/entry/64: Remove thread_struct::usersp
Nothing uses thread_struct::usersp anymore, so remove it.

Originally-from: Denys Vlasenko <dvlasenk@redhat.com>
Tested-by: Borislav Petkov <bp@alien8.de>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Drewry <wad@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-17 16:01:41 +01:00
Ingo Molnar
9854dd74c3 x86/asm/entry/64: Simplify 'old_rsp' usage
Remove all manipulations of PER_CPU(old_rsp) in C code:

 - it is not used on SYSRET return anymore, and system entries
   are atomic, so updating it from the fork and context switch
   paths is pointless.

 - Tweak a few related comments as well: we no longer have a
   "partial stack frame" on entry, ever.

Based on (split out of) patch from Denys Vlasenko.

Originally-from: Denys Vlasenko <dvlasenk@redhat.com>
Tested-by: Borislav Petkov <bp@alien8.de>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1426599779-8010-2-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-17 16:01:41 +01:00
Denys Vlasenko
d828c71fba x86/asm/entry/32: Document the 32-bit SYSENTER "emergency stack" better
Before the patch, the 'tss_struct::stack' field was not referenced anywhere.

It was used only to set SYSENTER's stack to point after the last byte
of tss_struct, thus the trailing field, stack[64], was used.

But grep would not know it. You can comment it out, compile,
and kernel will even run until an unlucky NMI corrupts
io_bitmap[] (which is also not easily detectable).

This patch changes code so that the purpose and usage of this
field is not mysterious anymore, and can be easily grepped for.

This does change generated code, for a subtle reason:
since tss_struct is ____cacheline_aligned, there happens to be
5 longs of padding at the end. Old code was using the padding
too; new code will strictly use it only for SYSENTER_stack[].

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1425912738-559-2-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-17 09:25:29 +01:00
Denys Vlasenko
5c39403e00 x86/asm/entry: Simplify task_pt_regs() macro definition
Before this change, task_pt_regs() was using KSTK_TOP(),
and it was the only use of that macro. In turn, KSTK_TOP used
THREAD_SIZE_LONGS, and it was the only use of that macro too.

Fold these macros into task_pt_regs(). Tweak comment
about "- 8" - we now use a symbolic constant, not literal 8.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1426255743-5394-1-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-17 09:25:28 +01:00
Andy Lutomirski
76e4c4908a x86/asm/entry/32: Document our abuse of x86_hw_tss::ss1 and x86_hw_tss::sp1
This has confused me for a while.  Now that I figured it out, document it.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/b7efc1b7364039824776f68e9ddee9ec1500e894.1426009661.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-17 09:25:27 +01:00
Andy Lutomirski
d9e05cc5a5 x86/asm/entry: Unify and fix initial thread_struct::sp0 values
x86_32 and x86_64 need slightly different thread_struct::sp0 values, and
x86_32's was incorrect for init.

This never mattered -- the init thread never runs user code, so we never
used thread_struct::sp0 for anything.

Fix it and mostly unify them.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1b810c1d2e797e27bb4a7708c426101161edd1f6.1426009661.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-17 09:25:27 +01:00
Andy Lutomirski
3ee4298f44 x86/asm/entry: Create and use a 'TOP_OF_KERNEL_STACK_PADDING' macro
x86_32, unlike x86_64, pads the top of the kernel stack, because the
hardware stack frame formats are variable in size.

Document this padding and give it a name.

This should make no change whatsoever to the compiled kernel
image. It also doesn't fix any of the current bugs in this area.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/02bf2f54b8dcb76a62a142b6dfe07d4ef7fc582e.1426009661.git.luto@amacapital.net
[ Fixed small details, such as a missed magic constant in entry_32.S pointed out by Denys Vlasenko. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-17 09:25:26 +01:00
Andy Lutomirski
9a036b93a3 x86/signal/64: Remove 'fs' and 'gs' from sigcontext
As far as I can tell, these fields have been set to zero on save
and ignored on restore since Linux was imported into git.
Rename them '__pad1' and '__pad2' to avoid confusion.  This may
also allow us to recycle them some day.

This also adds a comment clarifying the history of those fields.

I'm intentionally avoiding calling either of them '__pad0': the
field formerly known as '__pad0' is now 'ss'.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/844f8490e938780c03355be4c9b69eb4c494bf4e.1426193719.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-17 09:25:26 +01:00
Andy Lutomirski
c6f2062935 x86/signal/64: Fix SS handling for signals delivered to 64-bit programs
The comment in the signal code says that apps can save/restore
other segments on their own.  It's true that apps can *save* SS
on their own, but there's no way for apps to restore it: SYSCALL
effectively resets SS to __USER_DS, so any value that user code
tries to load into SS gets lost on entry to sigreturn.

This recycles two padding bytes in the segment selector area for SS.

While we're at it, we need a second change to make this useful.

If the signal we're delivering is caused by a bad SS value,
saving that value isn't enough.  We need to remove that bad
value from the regs before we try to deliver the signal.  Oddly,
the i386 code already got this right.

I suspect that 64-bit programs that try to run 16-bit code and
use signals will have a lot of trouble without this.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/405594361340a2ec32f8e2b115c142df0e180d8e.1426193719.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-17 09:25:25 +01:00
Borislav Petkov
69797dafe3 Revert "x86/mm/ASLR: Propagate base load address calculation"
This reverts commit:

  f47233c2d3 ("x86/mm/ASLR: Propagate base load address calculation")

The main reason for the revert is that the new boot flag does not work
at all currently, and in order to make this work, we need non-trivial
changes to the x86 boot code which we didn't manage to get done in
time for merging.

And even if we did, they would've been too risky so instead of
rushing things and break booting 4.1 on boxes left and right, we
will be very strict and conservative and will take our time with
this to fix and test it properly.

Reported-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: H. Peter Anvin <hpa@linux.intel.com
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Junjie Mao <eternal.n08@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Link: http://lkml.kernel.org/r/20150316100628.GD22995@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-16 11:18:21 +01:00
Len Brown
b253149b84 sched/idle/x86: Restore mwait_idle() to fix boot hangs, to improve power savings and to improve performance
In Linux-3.9 we removed the mwait_idle() loop:

  69fb3676df ("x86 idle: remove mwait_idle() and "idle=mwait" cmdline param")

The reasoning was that modern machines should be sufficiently
happy during the boot process using the default_idle() HALT
loop, until cpuidle loads and either acpi_idle or intel_idle
invoke the newer MWAIT-with-hints idle loop.

But two machines reported problems:

 1. Certain Core2-era machines support MWAIT-C1 and HALT only.
    MWAIT-C1 is preferred for optimal power and performance.
    But if they support just C1, cpuidle never loads and
    so they use the boot-time default idle loop forever.

 2. Some laptops will boot-hang if HALT is used,
    but will boot successfully if MWAIT is used.
    This appears to be a hidden assumption in BIOS SMI,
    that is presumably valid on the proprietary OS
    where the BIOS was validated.

       https://bugzilla.kernel.org/show_bug.cgi?id=60770

So here we effectively revert the patch above, restoring
the mwait_idle() loop.  However, we don't bother restoring
the idle=mwait cmdline parameter, since it appears to add
no value.

Maintainer notes:

  For 3.9, simply revert 69fb3676df
  for 3.10, patch -F3 applies, fuzz needed due to __cpuinit use in
  context For 3.11, 3.12, 3.13, this patch applies cleanly

Tested-by: Mike Galbraith <bitbucket@online.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Mike Galbraith <bitbucket@online.de>
Cc: <stable@vger.kernel.org> # 3.9+
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ian Malone <ibmalone@gmail.com>
Cc: Josh Boyer <jwboyer@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/345254a551eb5a6a866e048d7ab570fd2193aca4.1389763084.git.len.brown@intel.com
[ Ported to recent kernels. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-16 11:14:21 +01:00