android_kernel_xiaomi_sm8450

xiaomi-sm8450/android_kernel_xiaomi_sm8450

Author	SHA1	Message	Date
Ingo Molnar	dca5b52ad7	x86/asm/entry/64: Rename THREAD_INFO() to ASM_THREAD_INFO() The THREAD_INFO() macro has a somewhat confusingly generic name, defined in a generic .h C header file. It also does not make it clear that it constructs a memory operand for use in assembly code. Rename it to ASM_THREAD_INFO() to make it all glaringly obvious on first glance. Acked-by: Borislav Petkov <bp@suse.de> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/20150324184442.GC14760@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-24 20:57:31 +01:00
Ingo Molnar	f9d71854b4	x86/asm/entry/64: Merge the field offset into the THREAD_INFO() macro Before: TI_sysenter_return+THREAD_INFO(%rsp,38),%r10d After: movl THREAD_INFO(TI_sysenter_return, %rsp, 38), %r10d to turn it into a clear thread_info accessor. No code changed: md5: fb4cb2b3ce05d89940ca304efc8ff183 ia32entry.o.before.asm fb4cb2b3ce05d89940ca304efc8ff183 ia32entry.o.after.asm e39f2958a5d1300158e276e4f7663263 entry_64.o.before.asm e39f2958a5d1300158e276e4f7663263 entry_64.o.after.asm Acked-by: Andy Lutomirski <luto@kernel.org> Acked-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/20150324184411.GB14760@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-24 20:57:31 +01:00
Ingo Molnar	d56fe4bf5f	x86/asm/entry/64: Always set up SYSENTER MSRs On CONFIG_IA32_EMULATION=y kernels we set up MSR_IA32_SYSENTER_CS/ESP/EIP, but on !CONFIG_IA32_EMULATION kernels we leave them unchanged. Clear them to make sure the instruction is disabled properly. SYSCALL is set up properly in both cases. Acked-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Andy Lutomirski <luto@amacapital.net> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-24 20:57:25 +01:00
Denys Vlasenko	65c2377486	x86/asm/entry/64: Get rid of int_ret_from_sys_call_fixup With the FIXUP_TOP_OF_STACK macro removed, this intermediate jump is unnecessary. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1426785469-15125-5-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-24 19:42:38 +01:00
Denys Vlasenko	a71ffdd780	x86/asm/entry/64: Get rid of the FIXUP_TOP_OF_STACK/RESTORE_TOP_OF_STACK macros The FIXUP_TOP_OF_STACK macro is only necessary because we don't save %r11 to pt_regs->r11 on SYSCALL64 fast path, but we want ptrace to see it populated. Bite the bullet, add a single additional PUSH instruction, and remove the FIXUP_TOP_OF_STACK macro. The RESTORE_TOP_OF_STACK macro is already a nop. Remove it too. On SandyBridge CPU, it does not get slower: measured 54.22 ns per getpid syscall before and after last two changes on defconfig kernel. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1426785469-15125-4-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-24 19:42:38 +01:00
Denys Vlasenko	9ed8e7d860	x86/asm/entry/64: Use PUSH instructions to build pt_regs on stack With this change, on SYSCALL64 code path we are now populating pt_regs->cs, pt_regs->ss and pt_regs->rcx unconditionally and therefore don't need to do that in FIXUP_TOP_OF_STACK. We lose a number of large instructions there: text data bss dec hex filename 13298 0 0 13298 33f2 entry_64_before.o 12978 0 0 12978 32b2 entry_64.o What's more important, we convert two "MOVQ $imm,off(%rsp)" to "PUSH $imm" (the ones which fill pt_regs->cs,ss). Before this patch, placing them on fast path was slowing it down by two cycles: this form of MOV is very large, 12 bytes, and this probably reduces decode bandwidth to one instruction per cycle when CPU sees them. Therefore they were living in FIXUP_TOP_OF_STACK instead (away from fast path). "PUSH $imm" is a small 2-byte instruction. Moving it to fast path does not slow it down in my measurements. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1426785469-15125-3-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-24 19:42:38 +01:00
Denys Vlasenko	ef593260f0	x86/asm/entry: Get rid of KERNEL_STACK_OFFSET PER_CPU_VAR(kernel_stack) was set up in a way where it points five stack slots below the top of stack. Presumably, it was done to avoid one "sub $5*8,%rsp" in syscall/sysenter code paths, where iret frame needs to be created by hand. Ironically, none of them benefits from this optimization, since all of them need to allocate additional data on stack (struct pt_regs), so they still have to perform subtraction. This patch eliminates KERNEL_STACK_OFFSET. PER_CPU_VAR(kernel_stack) now points directly to top of stack. pt_regs allocations are adjusted to allocate iret frame as well. Hopefully we can merge it later with 32-bit specific PER_CPU_VAR(cpu_current_top_of_stack) variable... Net result in generated code is that constants in several insns are changed. This change is necessary for changing struct pt_regs creation in SYSCALL64 code path from MOV to PUSH instructions. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1426785469-15125-2-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-24 19:42:38 +01:00
Denys Vlasenko	b3fe8ba320	x86/asm/entry/64: Change the THREAD_INFO() definition to not depend on KERNEL_STACK_OFFSET This changes the THREAD_INFO() definition and all its callsites so that they do not count stack position from (top of stack - KERNEL_STACK_OFFSET), but from top of stack. Semi-mysterious expressions THREAD_INFO(%rsp,RIP) - "why RIP??" are now replaced by more logical THREAD_INFO(%rsp,SIZEOF_PTREGS) - "calculate thread_info's address using information that rsp is SIZEOF_PTREGS bytes below top of stack". While at it, replace "(off)-THREAD_SIZE(reg)" with equivalent "((off)-THREAD_SIZE)(reg)". The form without parentheses falsely looks like we invoke THREAD_SIZE() macro. Improve comment atop THREAD_INFO macro definition. This patch does not change generated code (verified by objdump). Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1426785469-15125-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-24 19:42:37 +01:00
Denys Vlasenko	a76c7f4604	x86/asm/entry/64: Fold syscall32_cpu_init() into its sole user Having syscall32/sysenter32 initialization in a separate tiny function, called from within a function that is already syscall init specific, serves no real purpose. Its existense also caused an unintended effect of having wrmsrl(MSR_CSTAR) performed twice: once we set it to a dummy function returning -ENOSYS, and immediately after (if CONFIG_IA32_EMULATION), we set it to point to the proper syscall32 entry point, ia32_cstar_target. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Andy Lutomirski <luto@amacapital.net> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-24 08:20:51 +01:00
Denys Vlasenko	34061f134f	x86/asm/entry/64: Fix incorrect comment The recent old_rsp -> rsp_scratch rename also changed this comment, but in this case "old_rsp" was not referring to PER_CPU(old_rsp). Fix this. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1427115839-6397-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-23 14:28:54 +01:00
Andy Lutomirski	d74ef1118a	x86/asm/entry: Replace some open-coded VM86 checks with v8086_mode() checks This allows us to remove some unnecessary ifdefs. There should be no change to the generated code. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brad Spengler <spender@grsecurity.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/f7e00f0d668e253abf0bd8bf36491ac47bd761ff.1426728647.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-23 11:14:40 +01:00
Andy Lutomirski	f39b6f0ef8	x86/asm/entry: Change all 'user_mode_vm()' calls to 'user_mode()' user_mode_vm() and user_mode() are now the same. Change all callers of user_mode_vm() to user_mode(). The next patch will remove the definition of user_mode_vm. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brad Spengler <spender@grsecurity.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/43b1f57f3df70df5a08b0925897c660725015554.1426728647.git.luto@kernel.org [ Merged to a more recent kernel. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-23 11:14:17 +01:00
Andy Lutomirski	ae60f0710a	x86/asm/entry: Use user_mode_ignore_vm86() where appropriate A few of the user_mode() checks in traps.c are immediately after explicit checks for vm86 mode. Change them to user_mode_ignore_vm86(). Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brad Spengler <spender@grsecurity.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/0b324d5b75c3402be07f8d3c6245ed7f4995029e.1426728647.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-23 11:13:46 +01:00
Andy Lutomirski	383f3af3f8	x86/asm/entry, perf: Explicitly optimize vm86 handling in code_segment_base() There's no point in checking the VM bit on 64-bit, and, since we're explicitly checking it, we can use user_mode_ignore_vm86() after the check. While we're at it, rearrange the #ifdef slightly to make the code flow a bit clearer. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brad Spengler <spender@grsecurity.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/dc1457a734feccd03a19bb3538a7648582f57cdd.1426728647.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-23 11:13:41 +01:00
Ingo Molnar	e4518ab90f	Merge tag 'v4.0-rc5' into x86/asm, to resolve conflicts Conflicts: arch/x86/kernel/entry_64.S Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-23 11:13:15 +01:00
Andy Lutomirski	c56716af8d	x86/asm/entry, perf: Fix incorrect TIF_IA32 check in code_segment_base() We want to check whether user code is in 32-bit mode, not whether the task is nominally 32-bit. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brad Spengler <spender@grsecurity.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/33e5107085ce347a8303560302b15c2cadd62c4c.1426728647.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-23 10:08:21 +01:00
Brian Gerst	1daeaa3151	x86/asm/entry: Fix execve() and sigreturn() syscalls to always return via IRET Both the execve() and sigreturn() family of syscalls have the ability to change registers in ways that may not be compatabile with the syscall path they were called from. In particular, SYSRET and SYSEXIT can't handle non-default %cs and %ss, and some bits in eflags. These syscalls have stubs that are hardcoded to jump to the IRET path, and not return to the original syscall path. The following commit: `76f5df43ca` ("Always allocate a complete "struct pt_regs" on the kernel stack") recently changed this for some 32-bit compat syscalls, but introduced a bug where execve from a 32-bit program to a 64-bit program would fail because it still returned via SYSRETL. This caused Wine to fail when built for both 32-bit and 64-bit. This patch sets TIF_NOTIFY_RESUME for execve() and sigreturn() so that the IRET path is always taken on exit to userspace. Signed-off-by: Brian Gerst <brgerst@gmail.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1426978461-32089-1-git-send-email-brgerst@gmail.com [ Improved the changelog and comments. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-23 08:52:46 +01:00
Ingo Molnar	c38e503804	x86/asm/entry/64: Rename 'old_rsp' to 'rsp_scratch' Make clear that the usage of PER_CPU(old_rsp) is purely temporary, by renaming it to 'rsp_scratch'. Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-17 16:01:42 +01:00
Ingo Molnar	7fcb3bc361	x86/asm/entry/64: Update comments about stack frames Tweak a few outdated comments that were obsoleted by recent changes to syscall entry code: - we no longer have a "partial stack frame" on entry, ever. - explain the syscall entry usage of old_rsp. Partially based on a (split out of) patch from Denys Vlasenko. Originally-from: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Borislav Petkov <bp@alien8.de> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-17 16:01:41 +01:00
Ingo Molnar	ac9af4983e	x86/asm/entry/64: Remove thread_struct::usersp Nothing uses thread_struct::usersp anymore, so remove it. Originally-from: Denys Vlasenko <dvlasenk@redhat.com> Tested-by: Borislav Petkov <bp@alien8.de> Acked-by: Borislav Petkov <bp@alien8.de> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-17 16:01:41 +01:00
Ingo Molnar	9854dd74c3	x86/asm/entry/64: Simplify 'old_rsp' usage Remove all manipulations of PER_CPU(old_rsp) in C code: - it is not used on SYSRET return anymore, and system entries are atomic, so updating it from the fork and context switch paths is pointless. - Tweak a few related comments as well: we no longer have a "partial stack frame" on entry, ever. Based on (split out of) patch from Denys Vlasenko. Originally-from: Denys Vlasenko <dvlasenk@redhat.com> Tested-by: Borislav Petkov <bp@alien8.de> Acked-by: Borislav Petkov <bp@alien8.de> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1426599779-8010-2-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-17 16:01:41 +01:00
Denys Vlasenko	33db1fd48a	x86/asm/entry/64: Enable interrupts after we fetch PER_CPU_VAR(old_rsp) We want to use PER_CPU_VAR(old_rsp) as a simple temporary register, to shuffle user-space RSP into (and from) when we set up the system call stack frame. At that point we cannot shuffle values into general purpose registers, because we have not saved them yet. To be able to do this shuffling into a memory location, we must be atomic and must not be preempted while we do the shuffling, otherwise the 'temporary' register gets overwritten by some other task's temporary register contents ... Tested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Borislav Petkov <bp@alien8.de> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1426600344-8254-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-17 16:01:40 +01:00
Ingo Molnar	8b6c0ab1a1	x86/asm/entry: Document and clean up the enable_sep_cpu() and syscall32_cpu_init() functions Clean up the flow and document the functions a bit better. Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-17 09:25:29 +01:00
Denys Vlasenko	d828c71fba	x86/asm/entry/32: Document the 32-bit SYSENTER "emergency stack" better Before the patch, the 'tss_struct::stack' field was not referenced anywhere. It was used only to set SYSENTER's stack to point after the last byte of tss_struct, thus the trailing field, stack[64], was used. But grep would not know it. You can comment it out, compile, and kernel will even run until an unlucky NMI corrupts io_bitmap[] (which is also not easily detectable). This patch changes code so that the purpose and usage of this field is not mysterious anymore, and can be easily grepped for. This does change generated code, for a subtle reason: since tss_struct is ____cacheline_aligned, there happens to be 5 longs of padding at the end. Old code was using the padding too; new code will strictly use it only for SYSENTER_stack[]. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1425912738-559-2-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-17 09:25:29 +01:00
Andy Lutomirski	d9e05cc5a5	x86/asm/entry: Unify and fix initial thread_struct::sp0 values x86_32 and x86_64 need slightly different thread_struct::sp0 values, and x86_32's was incorrect for init. This never mattered -- the init thread never runs user code, so we never used thread_struct::sp0 for anything. Fix it and mostly unify them. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1b810c1d2e797e27bb4a7708c426101161edd1f6.1426009661.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-17 09:25:27 +01:00
Andy Lutomirski	3ee4298f44	x86/asm/entry: Create and use a 'TOP_OF_KERNEL_STACK_PADDING' macro x86_32, unlike x86_64, pads the top of the kernel stack, because the hardware stack frame formats are variable in size. Document this padding and give it a name. This should make no change whatsoever to the compiled kernel image. It also doesn't fix any of the current bugs in this area. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Acked-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/02bf2f54b8dcb76a62a142b6dfe07d4ef7fc582e.1426009661.git.luto@amacapital.net [ Fixed small details, such as a missed magic constant in entry_32.S pointed out by Denys Vlasenko. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-17 09:25:26 +01:00
Andy Lutomirski	9a036b93a3	x86/signal/64: Remove 'fs' and 'gs' from sigcontext As far as I can tell, these fields have been set to zero on save and ignored on restore since Linux was imported into git. Rename them '__pad1' and '__pad2' to avoid confusion. This may also allow us to recycle them some day. This also adds a comment clarifying the history of those fields. I'm intentionally avoiding calling either of them '__pad0': the field formerly known as '__pad0' is now 'ss'. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/844f8490e938780c03355be4c9b69eb4c494bf4e.1426193719.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-17 09:25:26 +01:00
Andy Lutomirski	c6f2062935	x86/signal/64: Fix SS handling for signals delivered to 64-bit programs The comment in the signal code says that apps can save/restore other segments on their own. It's true that apps can save SS on their own, but there's no way for apps to restore it: SYSCALL effectively resets SS to __USER_DS, so any value that user code tries to load into SS gets lost on entry to sigreturn. This recycles two padding bytes in the segment selector area for SS. While we're at it, we need a second change to make this useful. If the signal we're delivering is caused by a bad SS value, saving that value isn't enough. We need to remove that bad value from the regs before we try to deliver the signal. Oddly, the i386 code already got this right. I suspect that 64-bit programs that try to run 16-bit code and use signals will have a lot of trouble without this. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/405594361340a2ec32f8e2b115c142df0e180d8e.1426193719.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-17 09:25:25 +01:00
Borislav Petkov	69797dafe3	Revert "x86/mm/ASLR: Propagate base load address calculation" This reverts commit: `f47233c2d3` ("x86/mm/ASLR: Propagate base load address calculation") The main reason for the revert is that the new boot flag does not work at all currently, and in order to make this work, we need non-trivial changes to the x86 boot code which we didn't manage to get done in time for merging. And even if we did, they would've been too risky so instead of rushing things and break booting 4.1 on boxes left and right, we will be very strict and conservative and will take our time with this to fix and test it properly. Reported-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Baoquan He <bhe@redhat.com> Cc: H. Peter Anvin <hpa@linux.intel.com Cc: Jiri Kosina <jkosina@suse.cz> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Junjie Mao <eternal.n08@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt.fleming@intel.com> Link: http://lkml.kernel.org/r/20150316100628.GD22995@pd.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-16 11:18:21 +01:00
Oleg Nesterov	a7c80ebcac	x86/fpu: Avoid math_state_restore() without used_math() in __restore_xstate_sig() math_state_restore() assumes it is called with irqs disabled, but this is not true if the caller is __restore_xstate_sig(). This means that if ia32_fxstate == T and __copy_from_user() fails, __restore_xstate_sig() returns with irqs disabled too. This triggers: BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:41 dump_stack ___might_sleep ? _raw_spin_unlock_irqrestore __might_sleep down_read ? _raw_spin_unlock_irqrestore print_vma_addr signal_fault sys32_rt_sigreturn Change __restore_xstate_sig() to call set_used_math() unconditionally. This avoids enabling and disabling interrupts in math_state_restore(). If copy_from_user() fails, we can simply do fpu_finit() by hand. [ Note: this is only the first step. math_state_restore() should not check used_math(), it should set this flag. While init_fpu() should simply die. ] Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pekka Riikonen <priikone@iki.fi> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Rik van Riel <riel@redhat.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150307153844.GB25954@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-13 12:44:28 +01:00
Daniel J Blueman	c8a470cab0	x86/apic/numachip: Fix sibling map with NumaChip On NumaChip systems, the physical processor ID assignment wasn't accounting for the number of nodes in AMD multi-module processors, giving an incorrect sibling map: $ cd /sys/devices/system/cpu/cpu29/topology $ grep . * core_id:5 core_siblings:00000000,ff000000 core_siblings_list:24-31 physical_package_id:3 thread_siblings:00000000,30000000 thread_siblings_list:28-29 This fixes it: $ cd /sys/devices/system/cpu/cpu29/topology $ grep . * core_id:5 core_siblings:00000000,ffff0000 core_siblings_list:16-31 physical_package_id:1 thread_siblings:00000000,30000000 thread_siblings_list:28-29 Signed-off-by: Daniel J Blueman <daniel@numascale.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Steffen Persvold <sp@numascale.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1426135950-10110-1-git-send-email-daniel@numascale.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-12 16:58:59 +01:00
Li, Aubrey	7486341a98	x86/platform, acpi: Bypass legacy PIC and PIT in ACPI hardware reduced mode On a platform in ACPI Hardware-reduced mode, the legacy PIC and PIT may not be initialized even though they may be present in silicon. Touching these legacy components causes unexpected results on the system. On the Bay Trail-T(ASUS-T100) platform, touching these legacy components blocks platform hardware low idle power state(S0ix) during system suspend. So we should bypass them in ACPI hardware reduced mode. Suggested-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Li Aubrey <aubrey.li@linux.intel.com> Cc: <alan@linux.intel.com> Cc: Alan Cox <alan@linux.intel.com> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Len Brown <len.brown@intel.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Link: http://lkml.kernel.org/r/54FFF81C.20703@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-12 12:07:13 +01:00
Denys Vlasenko	263042e463	x86/asm/entry/64: Save user RSP in pt_regs->sp on SYSCALL64 fastpath Prepare for the removal of 'usersp', by simplifying PER_CPU(old_rsp) usage: - use it only as temp storage - store the userspace stack pointer immediately in pt_regs->sp on syscall entry, instead of using it later, on syscall exit. - change C code to use pt_regs->sp only, instead of PER_CPU(old_rsp) and task->thread.usersp. FIXUP/RESTORE_TOP_OF_STACK are simplified as well. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1425926364-9526-4-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-10 13:56:10 +01:00
Denys Vlasenko	616ab249f1	x86/asm/entry/64: Remove stub_iopl stub_iopl is no longer needed: pt_regs->flags needs no fixing up after previous change. Remove it. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1425984307-2143-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-10 13:56:10 +01:00
Denys Vlasenko	29722cd4ef	x86/asm/entry/64: Save R11 into pt_regs->flags on SYSCALL64 fastpath Before this patch, R11 was saved in pt_regs->r11. Which looks natural, but requires messy shuffling to/from iret frame whenever ptrace or e.g. sys_iopl() wants to modify flags - because that's how this register is used by SYSCALL/SYSRET. This patch saves R11 in pt_regs->flags, and uses that value for the SYSRET64 instruction. Shuffling is eliminated. FIXUP/RESTORE_TOP_OF_STACK are simplified. stub_iopl is no longer needed: pt_regs->flags needs no fixing up. Testing shows that syscall fast path is ~54.3 ns before and after the patch (on 2.7 GHz Sandy Bridge CPU). Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1425926364-9526-2-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-10 13:56:10 +01:00
Andy Lutomirski	394838c960	x86/asm/entry/32: Fix user_mode() misuses The one in do_debug() is probably harmless, but better safe than sorry. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: <stable@vger.kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/d67deaa9df5458363623001f252d1aee3215d014.1425948056.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-10 04:21:51 +01:00
Denys Vlasenko	3e1aa7cb59	x86/asm: Optimize unnecessarily wide TEST instructions By the nature of the TEST operation, it is often possible to test a narrower part of the operand: "testl $3, mem" -> "testb $3, mem", "testq $3, %rcx" -> "testb $3, %cl" This results in shorter instructions, because the TEST instruction has no sign-entending byte-immediate forms unlike other ALU ops. Note that this change does not create any LCP (Length-Changing Prefix) stalls, which happen when adding a 0x66 prefix, which happens when 16-bit immediates are used, which changes such TEST instructions: [test_opcode] [modrm] [imm32] to: [0x66] [test_opcode] [modrm] [imm16] where [imm16] has a different length now: 2 bytes instead of 4. This confuses the decoder and slows down execution. REX prefixes were carefully designed to almost never hit this case: adding REX prefix does not change instruction length except MOVABS and MOV [addr],RAX instruction. This patch does not add instructions which would use a 0x66 prefix, code changes in assembly are: -48 f7 07 01 00 00 00 testq $0x1,(%rdi) +f6 07 01 testb $0x1,(%rdi) -48 f7 c1 01 00 00 00 test $0x1,%rcx +f6 c1 01 test $0x1,%cl -48 f7 c1 02 00 00 00 test $0x2,%rcx +f6 c1 02 test $0x2,%cl -41 f7 c2 01 00 00 00 test $0x1,%r10d +41 f6 c2 01 test $0x1,%r10b -48 f7 c1 04 00 00 00 test $0x4,%rcx +f6 c1 04 test $0x4,%cl -48 f7 c1 08 00 00 00 test $0x8,%rcx +f6 c1 08 test $0x8,%cl Linus further notes: "There are no stalls from using 8-bit instruction forms. Now, changing from 64-bit or 32-bit 'test' instructions to 8-bit ones could cause problems if it ends up having forwarding issues, so that instead of just forwarding the result, you end up having to wait for it to be stable in the L1 cache (or possibly the register file). The forwarding from the store buffer is simplest and most reliable if the read is done at the exact same address and the exact same size as the write that gets forwarded. But that's true only if: (a) the write was very recent and is still in the write queue. I'm not sure that's the case here anyway. (b) on at least most Intel microarchitectures, you have to test a different byte than the lowest one (so forwarding a 64-bit write to a 8-bit read ends up working fine, as long as the 8-bit read is of the low 8 bits of the written data). A very similar issue might show up for registers too, not just memory writes, if you use 'testb' with a high-byte register (where instead of forwarding the value from the original producer it needs to go through the register file and then shifted). But it's mainly a problem for store buffers. But afaik, the way Denys changed the test instructions, neither of the above issues should be true. The real problem for store buffer forwarding tends to be "write 8 bits, read 32 bits". That can be really surprisingly expensive, because the read ends up having to wait until the write has hit the cacheline, and we might talk tens of cycles of latency here. But "write 32 bits, read the low 8 bits" should be fast on pretty much all x86 chips, afaik." Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Andy Lutomirski <luto@amacapital.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1425675332-31576-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-07 11:12:43 +01:00
Andy Lutomirski	a7fcf28d43	x86/asm/entry: Replace this_cpu_sp0() with current_top_of_stack() and fix it on x86_32 I broke 32-bit kernels. The implementation of sp0 was correct as far as I can tell, but sp0 was much weirder on x86_32 than I realized. It has the following issues: - Init's sp0 is inconsistent with everything else's: non-init tasks are offset by 8 bytes. (I have no idea why, and the comment is unhelpful.) - vm86 does crazy things to sp0. Fix it up by replacing this_cpu_sp0() with current_top_of_stack() and using a new percpu variable to track the top of the stack on x86_32. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: `75182b1632` ("x86/asm/entry: Switch all C consumers of kernel_stack to this_cpu_sp0()") Link: http://lkml.kernel.org/r/d09dbe270883433776e0cbee3c7079433349e96d.1425692936.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-07 09:34:03 +01:00
Andy Lutomirski	b27559a433	x86/asm/entry: Delay loading sp0 slightly on task switch The change: `75182b1632` ("x86/asm/entry: Switch all C consumers of kernel_stack to this_cpu_sp0()") had the unintended side effect of changing the return value of current_thread_info() during part of the context switch process. Change it back. This has no effect as far as I can tell -- it's just for consistency. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/9fcaa47dd8487db59eed7a3911b6ae409476763e.1425692936.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-07 09:34:03 +01:00
Andy Lutomirski	9b47668843	x86/asm/entry: Rename 'INIT_TSS_IST' to 'CPU_TSS_IST' This has nothing to do with the init thread or the initial anything. It's just the CPU's TSS. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/a0bd5e26b32a2e1f08ff99017d0997118fbb2485.1425611534.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-06 08:32:58 +01:00
Andy Lutomirski	d0a0de21f8	x86/asm/entry: Remove INIT_TSS and fold the definitions into 'cpu_tss' The INIT_TSS is unnecessary. Just define the initial TSS where 'cpu_tss' is defined. While we're at it, merge the 32-bit and 64-bit definitions. The only syntactic change is that 32-bit kernels were computing sp0 as long, but now they compute it as unsigned long. Verified by objdump: the contents and relocations of .data..percpu..shared_aligned are unchanged on 32-bit and 64-bit kernels. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/8fc39fa3f6c5d635e93afbdd1a0fe0678a6d7913.1425611534.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-06 08:32:58 +01:00
Andy Lutomirski	24933b82c0	x86/asm/entry: Rename 'init_tss' to 'cpu_tss' It has nothing to do with init -- there's only one TSS per cpu. Other names considered include: - current_tss: Confusing because we never switch the tss. - singleton_tss: Too long. This patch was generated with 's/init_tss/cpu_tss/g'. Followup patches will fix INIT_TSS and INIT_TSS_IST by hand. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/da29fb2a793e4f649d93ce2d1ed320ebe8516262.1425611534.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-06 08:32:58 +01:00
Andy Lutomirski	9d0c914c60	x86/asm/entry/64/compat: Change the 32-bit sysenter code to use sp0 The ia32 sysenter code loaded the top of the kernel stack into rsp by loading kernel_stack and then adjusting it. It can be simplified to just read sp0 directly. This requires the addition of a new asm-offsets entry for sp0. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/88ff9006163d296a0665338585c36d9bfb85235d.1425611534.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-06 08:32:58 +01:00
Andy Lutomirski	75182b1632	x86/asm/entry: Switch all C consumers of kernel_stack to this_cpu_sp0() This will make modifying the semantics of kernel_stack easier. The change to ist_begin_non_atomic() is necessary because sp0 no longer points to the same THREAD_SIZE-aligned region as RSP; it's one byte too high for that. At Denys' suggestion, rather than offsetting it, just check explicitly that we're in the correct range ending at sp0. This has the added benefit that we no longer assume that the thread stack is aligned to THREAD_SIZE. Suggested-by: Denys Vlasenko <dvlasenk@redhat.com> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/ef8254ad414cbb8034c9a56396eeb24f5dd5b0de.1425611534.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-06 08:32:57 +01:00
Andy Lutomirski	8ef46a672a	x86/asm/entry: Add this_cpu_sp0() to read sp0 for the current cpu We currently store references to the top of the kernel stack in multiple places: kernel_stack (with an offset) and init_tss.x86_tss.sp0 (no offset). The latter is defined by hardware and is a clean canonical way to find the top of the stack. Add an accessor so we can start using it. This needs minor paravirt tweaks. On native, sp0 defines the top of the kernel stack and is therefore always correct. On Xen and lguest, the hypervisor tracks the top of the stack, but we want to start reading sp0 in the kernel. Fixing this is simple: just update our local copy of sp0 as well as the hypervisor's copy on task switches. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/8d675581859712bee09a055ed8f785d80dac1eca.1425611534.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-06 08:32:57 +01:00
Andy Lutomirski	956421fbb7	x86/asm/entry/64: Remove a bogus 'ret_from_fork' optimization 'ret_from_fork' checks TIF_IA32 to determine whether 'pt_regs' and the related state make sense for 'ret_from_sys_call'. This is entirely the wrong check. TS_COMPAT would make a little more sense, but there's really no point in keeping this optimization at all. This fixes a return to the wrong user CS if we came from int 0x80 in a 64-bit task. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/4710be56d76ef994ddf59087aad98c000fbab9a4.1424989793.git.luto@amacapital.net [ Backported from tip:x86/asm. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-05 01:12:23 +01:00
Wang Nan	5eca7453d6	x86/traps: Separate set_intr_gate() and clean up early_trap_init() As early_trap_init() doesn't use IST, replace set_intr_gate_ist() and set_system_intr_gate_ist() with their standard counterparts. set_intr_gate() requires a trace_debug symbol which we don't have and won't use. This patch separates set_intr_gate() into two parts, and uses base version in early_trap_init(). Reported-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Andy Lutomirski <luto@amacapital.net> Cc: <dave.hansen@linux.intel.com> Cc: <lizefan@huawei.com> Cc: <masami.hiramatsu.pt@hitachi.com> Cc: <oleg@redhat.com> Cc: <rostedt@goodmis.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1425010789-13714-1-git-send-email-wangnan0@huawei.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-05 00:47:29 +01:00
Andy Lutomirski	1e3fbb8a1d	x86/asm/entry/64: Remove a bogus 'ret_from_fork' optimization 'ret_from_fork' checks TIF_IA32 to determine whether 'pt_regs' and the related state make sense for 'ret_from_sys_call'. This is entirely the wrong check. TS_COMPAT would make a little more sense, but there's really no point in keeping this optimization at all. This fixes a return to the wrong user CS if we came from int 0x80 in a 64-bit task. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/4710be56d76ef994ddf59087aad98c000fbab9a4.1424989793.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-04 22:50:53 +01:00
Denys Vlasenko	d441c1f2b7	x86/asm/entry/64: Simplify optimistic SYSRET Avoid redundant load of %r11 (it is already loaded a few instructions before). Also simplify %rsp restoration, instead of two steps: add $0x80, %rsp mov 0x18(%rsp), %rsp we can do a simplified single step to restore user-space RSP: mov 0x98(%rsp), %rsp and get the same result. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Signed-off-by: Andy Lutomirski <luto@amacapital.net> [ Clarified the changelog. ] Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1aef69b346a6db0d99cdfb0f5ba83e8c985e27d7.1424989793.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-04 22:50:52 +01:00
Denys Vlasenko	911d2bb5cc	x86/asm/entry/64: Use more readable constants Constants such as SS+8 or SS+8-RIP are mysterious. In most cases, SS+8 is just meant to be SIZEOF_PTREGS, SS+8-RIP is RIP's offset in the iret frame. This patch changes some of these constants to be less mysterious. No code changes (verified with objdump). Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1d20491384773bd606e23a382fac23ddb49b5178.1424989793.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-04 22:50:52 +01:00

1 2 3 4 5 ...

10641 Commits