android_kernel_xiaomi_sm8450

xiaomi-sm8450/android_kernel_xiaomi_sm8450

Author	SHA1	Message	Date
Denys Vlasenko	0784b36448	x86/asm/entry/64: Fold the 'test_in_nmi' macro into its only user No code changes. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1427899858-7165-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-04-02 12:00:10 +02:00
Steffen Liebergeld	f59df35fc2	kgdb/x86: Fix reporting of 'si' in kgdb on x86_64 This patch fixes an error in kgdb for x86_64 which would report the value of dx when asked to give the value of si. Signed-off-by: Steffen Liebergeld <steffen.liebergeld@kernkonzept.com> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-04-02 11:32:16 +02:00
Andy Lutomirski	7ea2416909	x86/asm/entry/64: Disable opportunistic SYSRET if regs->flags has TF set When I wrote the opportunistic SYSRET code, I missed an important difference between SYSRET and IRET. Both instructions are capable of setting EFLAGS.TF, but they behave differently when doing so: - IRET will not issue a #DB trap after execution when it sets TF. This is critical -- otherwise you'd never be able to make forward progress when returning to userspace. - SYSRET, on the other hand, will trap with #DB immediately after returning to CPL3, and the next instruction will never execute. This breaks anything that opportunistically SYSRETs to a user context with TF set. For example, running this code with TF set and a SIGTRAP handler loaded never gets past 'post_nop': extern unsigned char post_nop[]; asm volatile ("pushfq\n\t" "popq %%r11\n\t" "nop\n\t" "post_nop:" : : "c" (post_nop) : "r11"); In my defense, I can't find this documented in the AMD or Intel manual. Fix it by using IRET to restore TF. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: `2a23c6b8a9` ("x86_64, entry: Use sysret to return to userspace when possible") Link: http://lkml.kernel.org/r/9472f1ca4c19a38ecda45bba9c91b7168135fcfa.1427923514.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-04-02 11:09:54 +02:00
Christoph Hellwig	ec776ef6bb	x86/mm: Add support for the non-standard protected e820 type Various recent BIOSes support NVDIMMs or ADR using a non-standard e820 memory type, and Intel supplied reference Linux code using this type to various vendors. Wire this e820 table type up to export platform devices for the pmem driver so that we can use it in Linux. Based on earlier work from: Dave Jiang <dave.jiang@intel.com> Dan Williams <dan.j.williams@intel.com> Includes fixes for NUMA regions from Boaz Harrosh. Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Dan Williams <dan.j.williams@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Boaz Harrosh <boaz@plexistor.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jens Axboe <axboe@fb.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Keith Busch <keith.busch@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-nvdimm@ml01.01.org Link: http://lkml.kernel.org/r/1427872339-6688-2-git-send-email-hch@lst.de [ Minor cleanups. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-04-01 17:02:43 +02:00
Stefan Lippers-Hollmann	80313b3078	x86/reboot: Add ASRock Q1900DC-ITX mainboard reboot quirk The ASRock Q1900DC-ITX mainboard (Baytrail-D) hangs randomly in both BIOS and UEFI mode while rebooting unless reboot=pci is used. Add a quirk to reboot via the pci method. The problem is very intermittent and hard to debug, it might succeed rebooting just fine 40 times in a row - but fails half a dozen times the next day. It seems to be slightly less common in BIOS CSM mode than native UEFI (with the CSM disabled), but it does happen in either mode. Since I've started testing this patch in late january, rebooting has been 100% reliable. Most of the time it already hangs during POST, but occasionally it might even make it through the bootloader and the kernel might even start booting, but then hangs before the mode switch. The same symptoms occur with grub-efi, gummiboot and grub-pc, just as well as (at least) kernel 3.16-3.19 and 4.0-rc6 (I haven't tried older kernels than 3.16). Upgrading to the most current mainboard firmware of the ASRock Q1900DC-ITX, version 1.20, does not improve the situation. ( Searching the web seems to suggest that other Bay Trail-D mainboards might be affected as well. ) -- Signed-off-by: Stefan Lippers-Hollmann <s.l-h@gmx.de> Cc: <stable@vger.kernel.org> Cc: Matt Fleming <matt.fleming@intel.com> Link: http://lkml.kernel.org/r/20150330224427.0fb58e42@mir Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-04-01 14:08:09 +02:00
Denys Vlasenko	a6de5a21fb	x86/asm/entry/64: Use local label to skip around sycall dispatch Logically, we just want to jump around the following instruction and its prologue/epilogue: call *sys_call_table(,%rax,8) if the syscall number is too big - we do not specifically target the "int_ret_from_sys_call" label. Use a local, numerical label for this jump, for more clarity. This also makes the code smaller: -ffffffff8187756b: 0f 87 0f 00 00 00 ja ffffffff81877580 <int_ret_from_sys_call> +ffffffff8187756b: 77 0f ja ffffffff8187757c <int_ret_from_sys_call> because jumps to global labels are never translated to short jump instructions by GAS. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1427821211-25099-9-git-send-email-dvlasenk@redhat.com [ Improved the changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-04-01 13:17:39 +02:00
Denys Vlasenko	a734b4a23e	x86/asm: Replace "MOVQ $imm, %reg" with MOVL There is no reason to use MOVQ to load a non-negative immediate constant value into a 64-bit register. MOVL does the same, since the upper 32 bits are zero-extended by the CPU. This makes the code a bit smaller, while leaving functionality unchanged. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1427821211-25099-8-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-04-01 13:17:39 +02:00
Denys Vlasenko	36acef2510	x86/asm/entry/64: Simplify looping around preempt_schedule_irq() At the 'exit_intr' label we test whether interrupt/exception was in kernel. If it did, we jump to the preemption check. If preemption does happen (IOW if we call preempt_schedule_irq()), we go back to 'exit_intr'. But it's pointless, we already know that the test succeeded last time, preemption doesn't change the fact that interrupt/exception was in the kernel. We can go back directly to checking PER_CPU_VAR(__preempt_count) instead. This makes the 'exit_intr' label unused, drop it. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1427821211-25099-5-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-04-01 13:17:39 +02:00
Denys Vlasenko	32a04077fe	x86/asm/entry/64: Remove redundant DISABLE_INTERRUPTS() At this location, we already have interrupts off, always. To be more specific, we already disabled them here: ret_from_intr: DISABLE_INTERRUPTS(CLBR_NONE) Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1427821211-25099-4-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-04-01 13:17:38 +02:00
Denys Vlasenko	6ba71b7617	x86/asm/entry/64: Simplify retint_kernel label usage, make retint_restore_args label local Get rid of #define obfuscation of retint_kernel in CONFIG_PREEMPT case by defining retint_kernel label always, not only for CONFIG_PREEMPT. Strip retint_kernel of .global-ness (ENTRY macro) - it has no users outside of this file. This looks like cosmetics, but it is not: "je LABEL" can be optimized to short jump by assember only if LABEL is not global, for global labels jump is always a near one with relocation. Convert retint_restore_args to a local numeric label, making it clearer that it is not used elsewhere in the file. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1427821211-25099-3-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-04-01 13:17:38 +02:00
Denys Vlasenko	4416c5a6da	x86/asm/entry/64: Do not TRACE_IRQS fast SYSRET64 path SYSRET code path has a small irq-off block. On this code path, TRACE_IRQS_ON can't be called right before interrupts are enabled for real, we can't clobber registers there. So current code does it earlier, in a safe place. But with this, TRACE_IRQS_OFF/ON frames just two fast instructions, which is ridiculous: now most of irq-off block is _outside_ of the framing. Do the same thing that we do on SYSCALL entry: do not track this irq-off block, it is very small to ever cause noticeable irq latency. Be careful: make sure that "jnz int_ret_from_sys_call_irqs_off" now does invoke TRACE_IRQS_OFF - move int_ret_from_sys_call_irqs_off label before TRACE_IRQS_OFF. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1427821211-25099-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-04-01 13:17:38 +02:00
Bandan Das	4399c03c67	x86/apic: Remove verify_local_APIC() __verify_local_APIC() is detritus from the early APIC days. Its return value isn't used anywhere and the information it prints when debug is enabled is already part of APIC initialization messages printed to syslog. Off with it! Signed-off-by: Bandan Das <bsd@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/jpgy4mcsxsq.fsf@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-04-01 10:47:57 +02:00
Ingo Molnar	55474c48b4	x86/asm/entry: Remove user_mode_ignore_vm86() user_mode_ignore_vm86() can be used instead of user_mode(), in places where we have already done a v8086_mode() security check of ptregs. But doing this check in the wrong place would be a bug that could result in security problems, and also the naming still isn't very clear. Furthermore, it only affects 32-bit kernels, while most development happens on 64-bit kernels. If we replace them with user_mode() checks then the cost is only a very minor increase in various slowpaths: text data bss dec hex filename 10573391 703562 1753042 13029995 c6d26b vmlinux.o.before 10573423 703562 1753042 13030027 c6d28b vmlinux.o.after So lets get rid of this distinction once and for all. Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brad Spengler <spender@grsecurity.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150329090233.GA1963@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-31 11:45:19 +02:00
Hector Marco-Gisbert	4e26d11f52	x86/mm: Improve AMD Bulldozer ASLR workaround The ASLR implementation needs to special-case AMD F15h processors by clearing out bits [14:12] of the virtual address in order to avoid I$ cross invalidations and thus performance penalty for certain workloads. For details, see: `dfb09f9b7a` ("x86, amd: Avoid cache aliasing penalties on AMD family 15h") This special case reduces the mmapped file's entropy by 3 bits. The following output is the run on an AMD Opteron 62xx class CPU processor under x86_64 Linux 4.0.0: $ for i in `seq 1 10`; do cat /proc/self/maps \| grep "r-xp.*libc" ; done b7588000-b7736000 r-xp 00000000 00:01 4924 /lib/i386-linux-gnu/libc.so.6 b7570000-b771e000 r-xp 00000000 00:01 4924 /lib/i386-linux-gnu/libc.so.6 b75d0000-b777e000 r-xp 00000000 00:01 4924 /lib/i386-linux-gnu/libc.so.6 b75b0000-b775e000 r-xp 00000000 00:01 4924 /lib/i386-linux-gnu/libc.so.6 b7578000-b7726000 r-xp 00000000 00:01 4924 /lib/i386-linux-gnu/libc.so.6 ... Bits [12:14] are always 0, i.e. the address always ends in 0x8000 or 0x0000. 32-bit systems, as in the example above, are especially sensitive to this issue because 32-bit randomness for VA space is 8 bits (see mmap_rnd()). With the Bulldozer special case, this diminishes to only 32 different slots of mmap virtual addresses. This patch randomizes per boot the three affected bits rather than setting them to zero. Since all the shared pages have the same value at bits [12..14], there is no cache aliasing problems. This value gets generated during system boot and it is thus not known to a potential remote attacker. Therefore, the impact from the Bulldozer workaround gets diminished and ASLR randomness increased. More details at: http://hmarco.org/bugs/AMD-Bulldozer-linux-ASLR-weakness-reducing-mmaped-files-by-eight.html Original white paper by AMD dealing with the issue: http://developer.amd.com/wordpress/media/2012/10/SharedL1InstructionCacheonAMD15hCPU.pdf Mentored-by: Ismael Ripoll <iripoll@disca.upv.es> Signed-off-by: Hector Marco-Gisbert <hecmargi@upv.es> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Kees Cook <keescook@chromium.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jan-Simon <dl9pf@gmx.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-fsdevel@vger.kernel.org Link: http://lkml.kernel.org/r/1427456301-3764-1-git-send-email-hecmargi@upv.es Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-31 10:01:17 +02:00
Michael S. Tsirkin	46423ffaf4	x86/microcode/amd: Drop the pci_ids.h dependency This file doesn't use any macros from pci_ids.h anymore, drop the include. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andreas Herrmann <herrmann.der.user@googlemail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1427635734-24786-80-git-send-email-mst@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-31 09:54:32 +02:00
Denys Vlasenko	a3675b32aa	x86/asm/entry/64: Do not GET_THREAD_INFO() too early At exit_intr, we GET_THREAD_INFO(%rcx) and then jump to retint_kernel if saved CS was from kernel. But the code at retint_kernel doesn't need %rcx. Move GET_THREAD_INFO(%rcx) down, after CS check and branch. While at it, remove "has a correct top of stack" comment. After recent changes which eliminated FIXUP_TOP_OF_STACK, we always have a correct pt_regs layout. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1427738975-7391-5-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-31 09:31:11 +02:00
Denys Vlasenko	627276cb55	x86/asm/entry/64: Move retint_kernel code block closer to its user The "retint_kernel" code block is misplaced. Since its logical continuation is "retint_restore_args", it is more natural to place it above that label. This also makes two jumps "short". This change only moves code block around, without changing logic. This enables the next simplification: making "retint_restore_args" label a local numeric one. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1427738975-7391-2-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-31 09:31:11 +02:00
Ingo Molnar	c5e77f5216	Merge tag 'v4.0-rc6' into timers/core, before applying new patches Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-31 09:08:13 +02:00
Denys Vlasenko	27be87c5d5	x86/asm/entry/64: Add missing CFI annotation This is a missing bit of the recent MOV-to-PUSH conversion. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1427452582-21624-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-27 12:27:57 +01:00
Denys Vlasenko	487d1edb9a	x86/asm/entry/64: Fix comment about SYSENTER MSRs The comment is ancient, it dates to the time when only AMD's x86_64 implementation existed. AMD wasn't (and still isn't) supporting SYSENTER, so these writes were "just in case" back then. This has changed: Intel's x86_64 appeared, and Intel does support SYSENTER in long mode. "Some future 64-bit CPU" is here already. The code may appear "buggy" for AMD as it stands, since MSR_IA32_SYSENTER_EIP is only 32-bit for AMD CPUs. Writing a kernel function's address to it would drop high bits. Subsequent use of this MSR for branch via SYSENTER seem to allow user to transition to CPL0 while executing his code. Scary, eh? Explain why that is not a bug: because SYSENTER insn would not work on AMD CPU. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1427453956-21931-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-27 12:23:16 +01:00
Peter Zijlstra	34f439278c	perf: Add per event clockid support While thinking on the whole clock discussion it occurred to me we have two distinct uses of time: 1) the tracking of event/ctx/cgroup enabled/running/stopped times which includes the self-monitoring support in struct perf_event_mmap_page. 2) the actual timestamps visible in the data records. And we've been conflating them. The first is all about tracking time deltas, nobody should really care in what time base that happens, its all relative information, as long as its internally consistent it works. The second however is what people are worried about when having to merge their data with external sources. And here we have the discussion on MONOTONIC vs MONOTONIC_RAW etc.. Where MONOTONIC is good for correlating between machines (static offset), MONOTNIC_RAW is required for correlating against a fixed rate hardware clock. This means configurability; now 1) makes that hard because it needs to be internally consistent across groups of unrelated events; which is why we had to have a global perf_clock(). However, for 2) it doesn't really matter, perf itself doesn't care what it writes into the buffer. The below patch makes the distinction between these two cases by adding perf_event_clock() which is used for the second case. It further makes this configurable on a per-event basis, but adds a few sanity checks such that we cannot combine events with different clocks in confusing ways. And since we then have per-event configurability we might as well retain the 'legacy' behaviour as a default. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-27 10:13:22 +01:00
Ingo Molnar	b381e63b48	Merge branch 'perf/core' into perf/timer, before applying new changes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-27 10:10:47 +01:00
Ingo Molnar	4e6d7c2aa9	Merge branch 'timers/core' into perf/timer, to apply dependent patch An upcoming patch will depend on tai_ns() and NMI-safe ktime_get_raw_fast(), so merge timers/core here in a separate topic branch until it's all cooked and timers/core is merged upstream. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-27 10:09:21 +01:00
Ingo Molnar	4bfe186dbe	Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: - Documentation updates. - Changes permitting use of call_rcu() and friends very early in boot, for example, before rcu_init() is invoked. - Miscellaneous fixes. - Add in-kernel API to enable and disable expediting of normal RCU grace periods. - Improve RCU's handling of (hotplug-) outgoing CPUs. Note: ARM support is lagging a bit here, and these improved diagnostics might generate (harmless) splats. - NO_HZ_FULL_SYSIDLE fixes. - Tiny RCU updates to make it more tiny. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-27 10:04:06 +01:00
Denys Vlasenko	47eb582e70	x86/asm/entry/64: Use smaller instructions The $AUDIT_ARCH_X86_64 parameter to syscall_trace_enter_phase1/2 is a 32-bit constant, loading it with 32-bit MOV produces 5-byte insn instead of 10-byte MOVABS one. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1427303896-24023-3-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-27 09:57:06 +01:00
Denys Vlasenko	146b2b097d	x86/asm/entry/64: Use better label name, fix comments A named label "ret_from_sys_call" implies that there are jumps to this location from elsewhere, as happens with many other labels in this file. But this label is used only by the JMP a few insns above. To make that obvious, use local numeric label instead. Improve comments: "and return regs->ax" isn't too informative. We always return regs->ax. The comment suggesting that it'd be cool to use rip relative addressing for CALL is deleted. It's unclear why that would be an improvement - we aren't striving to use position-independent code here. PIC code here would require something like LEA sys_call_table(%rip),reg + CALL (reg,%rax8)... "iret frame is also incomplete" is no longer true, fix that too. Also fix typo in comment. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1427303896-24023-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-27 09:57:05 +01:00
David Ahern	9332d250b4	perf/x86: Remove redundant calls to perf_pmu_{dis\|en}able() perf_pmu_disable() is called before pmu->add() and perf_pmu_enable() is called afterwards. No need to call these inside of x86_pmu_add() as well. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/1424281543-67335-1-git-send-email-dsahern@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-27 09:49:44 +01:00
Ingo Molnar	936c663aed	Merge branch 'perf/x86' into perf/core, because it's ready Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-27 09:46:19 +01:00
Ingo Molnar	072e5a1cfa	Merge branch 'perf/urgent' into perf/core, to pick up fixes and to refresh the tree Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-27 09:46:03 +01:00
Peter Zijlstra	876e78818d	time: Rename timekeeper::tkr to timekeeper::tkr_mono In preparation of adding another tkr field, rename this one to tkr_mono. Also rename tk_read_base::base_mono to tk_read_base::base, since the structure is not specific to CLOCK_MONOTONIC and the mono name got added to the tk_read_base instance. Lots of trivial churn. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: John Stultz <john.stultz@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150319093400.344679419@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-27 09:45:06 +01:00
Andi Kleen	294fe0f52a	perf/x86/intel: Add INST_RETIRED.ALL workarounds On Broadwell INST_RETIRED.ALL cannot be used with any period that doesn't have the lowest 6 bits cleared. And the period should not be smaller than 128. This is erratum BDM11 and BDM55: http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/5th-gen-core-family-spec-update.pdf BDM11: When using a period < 100; we may get incorrect PEBS/PMI interrupts and/or an invalid counter state. BDM55: When bit0-5 of the period are !0 we may get redundant PEBS records on overflow. Add a new callback to enforce this, and set it for Broadwell. How does this handle the case when an app requests a specific period with some of the bottom bits set? Short answer: Any useful instruction sampling period needs to be 4-6 orders of magnitude larger than 128, as an PMI every 128 instructions would instantly overwhelm the system and be throttled. So the +-64 error from this is really small compared to the period, much smaller than normal system jitter. Long answer (by Peterz): IFF we guarantee perf_event_attr::sample_period >= 128. Suppose we start out with sample_period=192; then we'll set period_left to 192, we'll end up with left = 128 (we truncate the lower bits). We get an interrupt, find that period_left = 64 (>0 so we return 0 and don't get an overflow handler), up that to 128. Then we trigger again, at n=256. Then we find period_left = -64 (<=0 so we return 1 and do get an overflow). We increment with sample_period so we get left = 128. We fire again, at n=384, period_left = 0 (<=0 so we return 1 and get an overflow). And on and on. So while the individual interrupts are 'wrong' we get then with interval=256,128 in exactly the right ratio to average out at 192. And this works for everything >=128. So the num_samples*fixed_period thing is still entirely correct +- 127, which is good enough I'd say, as you already have that error anyhow. So no need to 'fix' the tools, al we need to do is refuse to create INST_RETIRED:ALL events with sample_period < 128. Signed-off-by: Andi Kleen <ak@linux.intel.com> [ Updated comments and changelog a bit. ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/1424225886-18652-3-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-27 09:14:03 +01:00
Andi Kleen	91f1b70582	perf/x86/intel: Add Broadwell core support Add Broadwell support for Broadwell to perf. The basic support is very similar to Haswell. We use the new cache event list added for Haswell earlier. The only differences are a few bits related to remote nodes. To avoid an extra, mostly identical, table these are patched up in the initialization code. The constraint list has one new event that needs to be handled over Haswell. Includes code and testing from Kan Liang. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/1424225886-18652-2-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-27 09:14:02 +01:00
Andi Kleen	0f1b5ca240	perf/x86/intel: Add new cache events table for Haswell Haswell offcore events are quite different from Sandy Bridge. Add a new table to handle Haswell properly. Note that the offcore bits listed in the SDM are not quite correct (this is currently being fixed). An uptodate list of bits is in the patch. The basic setup is similar to Sandy Bridge. The prefetch columns have been removed, as prefetch counting is not very reliable on Haswell. One L1 event that is not in the event list anymore has been also removed. - data reads do not include code reads (comparable to earlier Sandy Bridge tables) - data counts include speculative execution (except L1 write, dtlb, bpu) - remote node access includes both remote memory, remote cache, remote mmio. - prefetches are not included in the counts for consistency (different from Sandy Bridge, which includes prefetches in the remote node) Signed-off-by: Andi Kleen <ak@linux.intel.com> [ Removed the HSM30 comments; we don't have them for SNB/IVB either. ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/1424225886-18652-1-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-27 09:14:01 +01:00
Catalin Marinas	828aef376d	ACPI / processor: Introduce phys_cpuid_t for CPU hardware ID CPU hardware ID (phys_id) is defined as u32 in structure acpi_processor, but phys_id is used as int in acpi processor driver, so it will lead to some inconsistence for the drivers. Furthermore, to cater for ACPI arch ports that implement 64 bits CPU ids a generic CPU physical id type is required. So introduce typedef u32 phys_cpuid_t in a common file, and introduce a macro PHYS_CPUID_INVALID as (phys_cpuid_t)(-1) if it's not defined by other archs, this will solve the inconsistence in acpi processor driver, and will prepare for the ACPI on ARM64 for the 64 bit CPU hardware ID in the following patch. CC: Rafael J Wysocki <rjw@rjwysocki.net> Suggested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Grant Likely <grant.likely@linaro.org> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> [hj: reworked cpu physid map return codes] Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2015-03-26 15:12:51 +00:00
Ingo Molnar	06ab9c1ba6	Merge branch 'x86/urgent' into x86/asm, to resolve conflict Conflicts: arch/x86/kernel/entry_64.S Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-25 13:19:43 +01:00
Andy Lutomirski	b3494a4ab2	x86/asm/entry: Check for syscall exit work with IRQs disabled We currently have a race: if we're preempted during syscall exit, we can fail to process syscall return work that is queued up while we're preempted in ret_from_sys_call after checking ti.flags. Fix it by disabling interrupts before checking ti.flags. Reported-by: Stefan Seyfried <stefan.seyfried@googlemail.com> Reported-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Tejun Heo <tj@kernel.org> Fixes: `96b6352c12` ("x86_64, entry: Remove the syscall exit audit") Link: http://lkml.kernel.org/r/189320d42b4d671df78c10555976bb10af1ffc75.1427137498.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-24 21:08:28 +01:00
Ingo Molnar	dca5b52ad7	x86/asm/entry/64: Rename THREAD_INFO() to ASM_THREAD_INFO() The THREAD_INFO() macro has a somewhat confusingly generic name, defined in a generic .h C header file. It also does not make it clear that it constructs a memory operand for use in assembly code. Rename it to ASM_THREAD_INFO() to make it all glaringly obvious on first glance. Acked-by: Borislav Petkov <bp@suse.de> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/20150324184442.GC14760@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-24 20:57:31 +01:00
Ingo Molnar	f9d71854b4	x86/asm/entry/64: Merge the field offset into the THREAD_INFO() macro Before: TI_sysenter_return+THREAD_INFO(%rsp,38),%r10d After: movl THREAD_INFO(TI_sysenter_return, %rsp, 38), %r10d to turn it into a clear thread_info accessor. No code changed: md5: fb4cb2b3ce05d89940ca304efc8ff183 ia32entry.o.before.asm fb4cb2b3ce05d89940ca304efc8ff183 ia32entry.o.after.asm e39f2958a5d1300158e276e4f7663263 entry_64.o.before.asm e39f2958a5d1300158e276e4f7663263 entry_64.o.after.asm Acked-by: Andy Lutomirski <luto@kernel.org> Acked-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/20150324184411.GB14760@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-24 20:57:31 +01:00
Ingo Molnar	d56fe4bf5f	x86/asm/entry/64: Always set up SYSENTER MSRs On CONFIG_IA32_EMULATION=y kernels we set up MSR_IA32_SYSENTER_CS/ESP/EIP, but on !CONFIG_IA32_EMULATION kernels we leave them unchanged. Clear them to make sure the instruction is disabled properly. SYSCALL is set up properly in both cases. Acked-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Andy Lutomirski <luto@amacapital.net> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-24 20:57:25 +01:00
Denys Vlasenko	65c2377486	x86/asm/entry/64: Get rid of int_ret_from_sys_call_fixup With the FIXUP_TOP_OF_STACK macro removed, this intermediate jump is unnecessary. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1426785469-15125-5-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-24 19:42:38 +01:00
Denys Vlasenko	a71ffdd780	x86/asm/entry/64: Get rid of the FIXUP_TOP_OF_STACK/RESTORE_TOP_OF_STACK macros The FIXUP_TOP_OF_STACK macro is only necessary because we don't save %r11 to pt_regs->r11 on SYSCALL64 fast path, but we want ptrace to see it populated. Bite the bullet, add a single additional PUSH instruction, and remove the FIXUP_TOP_OF_STACK macro. The RESTORE_TOP_OF_STACK macro is already a nop. Remove it too. On SandyBridge CPU, it does not get slower: measured 54.22 ns per getpid syscall before and after last two changes on defconfig kernel. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1426785469-15125-4-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-24 19:42:38 +01:00
Denys Vlasenko	9ed8e7d860	x86/asm/entry/64: Use PUSH instructions to build pt_regs on stack With this change, on SYSCALL64 code path we are now populating pt_regs->cs, pt_regs->ss and pt_regs->rcx unconditionally and therefore don't need to do that in FIXUP_TOP_OF_STACK. We lose a number of large instructions there: text data bss dec hex filename 13298 0 0 13298 33f2 entry_64_before.o 12978 0 0 12978 32b2 entry_64.o What's more important, we convert two "MOVQ $imm,off(%rsp)" to "PUSH $imm" (the ones which fill pt_regs->cs,ss). Before this patch, placing them on fast path was slowing it down by two cycles: this form of MOV is very large, 12 bytes, and this probably reduces decode bandwidth to one instruction per cycle when CPU sees them. Therefore they were living in FIXUP_TOP_OF_STACK instead (away from fast path). "PUSH $imm" is a small 2-byte instruction. Moving it to fast path does not slow it down in my measurements. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1426785469-15125-3-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-24 19:42:38 +01:00
Denys Vlasenko	ef593260f0	x86/asm/entry: Get rid of KERNEL_STACK_OFFSET PER_CPU_VAR(kernel_stack) was set up in a way where it points five stack slots below the top of stack. Presumably, it was done to avoid one "sub $5*8,%rsp" in syscall/sysenter code paths, where iret frame needs to be created by hand. Ironically, none of them benefits from this optimization, since all of them need to allocate additional data on stack (struct pt_regs), so they still have to perform subtraction. This patch eliminates KERNEL_STACK_OFFSET. PER_CPU_VAR(kernel_stack) now points directly to top of stack. pt_regs allocations are adjusted to allocate iret frame as well. Hopefully we can merge it later with 32-bit specific PER_CPU_VAR(cpu_current_top_of_stack) variable... Net result in generated code is that constants in several insns are changed. This change is necessary for changing struct pt_regs creation in SYSCALL64 code path from MOV to PUSH instructions. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1426785469-15125-2-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-24 19:42:38 +01:00
Denys Vlasenko	b3fe8ba320	x86/asm/entry/64: Change the THREAD_INFO() definition to not depend on KERNEL_STACK_OFFSET This changes the THREAD_INFO() definition and all its callsites so that they do not count stack position from (top of stack - KERNEL_STACK_OFFSET), but from top of stack. Semi-mysterious expressions THREAD_INFO(%rsp,RIP) - "why RIP??" are now replaced by more logical THREAD_INFO(%rsp,SIZEOF_PTREGS) - "calculate thread_info's address using information that rsp is SIZEOF_PTREGS bytes below top of stack". While at it, replace "(off)-THREAD_SIZE(reg)" with equivalent "((off)-THREAD_SIZE)(reg)". The form without parentheses falsely looks like we invoke THREAD_SIZE() macro. Improve comment atop THREAD_INFO macro definition. This patch does not change generated code (verified by objdump). Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1426785469-15125-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-24 19:42:37 +01:00
Aravind Gopalakrishnan	43eaa2a1ad	x86/mce: Define mce_severity function pointer Rename mce_severity() to mce_severity_intel() and assign the mce_severity function pointer to mce_severity_amd() during init on AMD. This way, we can avoid a test to call mce_severity_amd every time we get into mce_severity(). And it's cleaner to do it this way. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Suggested-by: Tony Luck <tony.luck@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Chen Yucong <slaoub@gmail.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1427125373-2918-3-git-send-email-Aravind.Gopalakrishnan@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-03-24 12:14:15 +01:00
Aravind Gopalakrishnan	bf80bbd7dc	x86/mce: Add an AMD severities-grading function Add a severities function that caters to AMD processors. This allows us to do some vendor-specific work within the function if necessary. Also, introduce a vendor flag bitfield for vendor-specific settings. The severities code uses this to define error scope based on the prescence of the flags field. This is based off of work by Boris Petkov. Testing details: Fam10h, Model 9h (Greyhound) Fam15h: Models 0h-0fh (Orochi), 30h-3fh (Kaveri) and 60h-6fh (Carrizo), Fam16h Model 00h-0fh (Kabini) Boris: Intel SNB AMD K8 (JH-E0) Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Acked-by: Tony Luck <tony.luck@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Chen Yucong <slaoub@gmail.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: linux-edac@vger.kernel.org Link: http://lkml.kernel.org/r/1427125373-2918-2-git-send-email-Aravind.Gopalakrishnan@amd.com [ Fixup build, clean up comments. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2015-03-24 12:13:34 +01:00
Denys Vlasenko	a76c7f4604	x86/asm/entry/64: Fold syscall32_cpu_init() into its sole user Having syscall32/sysenter32 initialization in a separate tiny function, called from within a function that is already syscall init specific, serves no real purpose. Its existense also caused an unintended effect of having wrmsrl(MSR_CSTAR) performed twice: once we set it to a dummy function returning -ENOSYS, and immediately after (if CONFIG_IA32_EMULATION), we set it to point to the proper syscall32 entry point, ia32_cstar_target. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Andy Lutomirski <luto@amacapital.net> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-24 08:20:51 +01:00
Marcelo Tosatti	0a4e6be9ca	x86: kvm: Revert "remove sched notifier for cross-cpu migrations" The following point: 2. per-CPU pvclock time info is updated if the underlying CPU changes. Is not true anymore since "KVM: x86: update pvclock area conditionally, on cpu migration". Add task migration notification back. Problem noticed by Andy Lutomirski. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> CC: stable@kernel.org # 3.11+	2015-03-23 20:22:48 -03:00
Greg Kroah-Hartman	caa445d808	Merge 4.0-rc5 into tty-next We want the tty/serial fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-03-23 21:45:24 +01:00
Denys Vlasenko	34061f134f	x86/asm/entry/64: Fix incorrect comment The recent old_rsp -> rsp_scratch rename also changed this comment, but in this case "old_rsp" was not referring to PER_CPU(old_rsp). Fix this. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1427115839-6397-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-23 14:28:54 +01:00

... 39 40 41 42 43 ...

12766 Commits