Commit Graph

10246 Commits

Author SHA1 Message Date
Thomas Gleixner
d0a7166bc7 x86/smp: Move smp_function_call implementations into IPI code
Move it where it belongs. That allows to keep all the shorthand logic in
one place.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190722105220.677835995@linutronix.de
2019-07-25 16:12:01 +02:00
Thomas Gleixner
22ca7ee933 x86/apic: Provide and use helper for send_IPI_allbutself()
To support IPI shorthands wrap invocations of apic->send_IPI_allbutself()
in a helper function, so the static key controlling the shorthand mode is
only in one place.

Fixup all callers.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190722105220.492691679@linutronix.de
2019-07-25 16:12:00 +02:00
Thomas Gleixner
6a1cb5f5c6 x86/apic: Add static key to Control IPI shorthands
The IPI shorthand functionality delivers IPI/NMI broadcasts to all CPUs in
the system. This can have similar side effects as the MCE broadcasting when
CPUs are waiting in the BIOS or are offlined.

The kernel tracks already the state of offlined CPUs whether they have been
brought up at least once so that the CR4 MCE bit is set to make sure that
MCE broadcasts can't brick the machine.

Utilize that information and compare it to the cpu_present_mask. If all
present CPUs have been brought up at least once then the broadcast side
effect is mitigated by disabling regular interrupt/IPI delivery in the APIC
itself and by the cpu offline check at the begin of the NMI handler.

Use a static key to switch between broadcasting via shorthands or sending
the IPI/NMI one by one.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190722105220.386410643@linutronix.de
2019-07-25 16:12:00 +02:00
Thomas Gleixner
60dcaad573 x86/hotplug: Silence APIC and NMI when CPU is dead
In order to support IPI/NMI broadcasting via the shorthand mechanism side
effects of shorthands need to be mitigated:

 Shorthand IPIs and NMIs hit all CPUs including unplugged CPUs

Neither of those can be handled on unplugged CPUs for obvious reasons.

It would be trivial to just fully disable the APIC via the enable bit in
MSR_APICBASE. But that's not possible because clearing that bit on systems
based on the 3 wire APIC bus would require a hardware reset to bring it
back as the APIC would lose track of bus arbitration. On systems with FSB
delivery APICBASE could be disabled, but it has to be guaranteed that no
interrupt is sent to the APIC while in that state and it's not clear from
the SDM whether it still responds to INIT/SIPI messages.

Therefore stay on the safe side and switch the APIC into soft disabled mode
so it won't deliver any regular vector to the CPU.

NMIs are still propagated to the 'dead' CPUs. To mitigate that add a check
for the CPU being offline on early nmi entry and if so bail.

Note, this cannot use the stop/restart_nmi() magic which is used in the
alternatives code. A dead CPU cannot invoke nmi_enter() or anything else
due to RCU and other reasons.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1907241723290.1791@nanos.tec.linutronix.de
2019-07-25 16:11:59 +02:00
Thomas Gleixner
9c92374b63 x86/cpu: Move arch_smt_update() to a neutral place
arch_smt_update() will be used to control IPI/NMI broadcasting via the
shorthand mechanism. Keeping it in the bugs file and calling the apic
function from there is possible, but not really intuitive.

Move it to a neutral place and invoke the bugs function from there.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190722105219.910317273@linutronix.de
2019-07-25 16:11:59 +02:00
Thomas Gleixner
82e5747823 x86/apic/uv: Make x2apic_extra_bits static
Not used outside of the UV apic source.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190722105219.725264153@linutronix.de
2019-07-25 16:11:58 +02:00
Thomas Gleixner
ba77b2a02e x86/apic: Move apic_flat_64 header into apic directory
Only used locally.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190722105219.526508168@linutronix.de
2019-07-25 16:11:58 +02:00
Thomas Gleixner
8b542da372 x86/apic: Move ipi header into apic directory
Only used locally.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190722105219.434738036@linutronix.de
2019-07-25 16:11:57 +02:00
Thomas Gleixner
cdc86c9d1f x86/apic: Move IPI inlines into ipi.c
No point in having them in an header file.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190722105219.252225936@linutronix.de
2019-07-25 16:11:57 +02:00
Masahiro Yamada
d9c5252295 treewide: add "WITH Linux-syscall-note" to SPDX tag of uapi headers
UAPI headers licensed under GPL are supposed to have exception
"WITH Linux-syscall-note" so that they can be included into non-GPL
user space application code.

The exception note is missing in some UAPI headers.

Some of them slipped in by the treewide conversion commit b24413180f
("License cleanup: add SPDX GPL-2.0 license identifier to files with
no license"). Just run:

  $ git show --oneline b24413180f -- arch/x86/include/uapi/asm/

I believe they are not intentional, and should be fixed too.

This patch was generated by the following script:

  git grep -l --not -e Linux-syscall-note --and -e SPDX-License-Identifier \
    -- :arch/*/include/uapi/asm/*.h :include/uapi/ :^*/Kbuild |
  while read file
  do
          sed -i -e '/[[:space:]]OR[[:space:]]/s/\(GPL-[^[:space:]]*\)/(\1 WITH Linux-syscall-note)/g' \
          -e '/[[:space:]]or[[:space:]]/s/\(GPL-[^[:space:]]*\)/(\1 WITH Linux-syscall-note)/g' \
          -e '/[[:space:]]OR[[:space:]]/!{/[[:space:]]or[[:space:]]/!s/\(GPL-[^[:space:]]*\)/\1 WITH Linux-syscall-note/g}' $file
  done

After this patch is applied, there are 5 UAPI headers that do not contain
"WITH Linux-syscall-note". They are kept untouched since this exception
applies only to GPL variants.

  $ git grep --not -e Linux-syscall-note --and -e SPDX-License-Identifier \
    -- :arch/*/include/uapi/asm/*.h :include/uapi/ :^*/Kbuild
  include/uapi/drm/panfrost_drm.h:/* SPDX-License-Identifier: MIT */
  include/uapi/linux/batman_adv.h:/* SPDX-License-Identifier: MIT */
  include/uapi/linux/qemu_fw_cfg.h:/* SPDX-License-Identifier: BSD-3-Clause */
  include/uapi/linux/vbox_err.h:/* SPDX-License-Identifier: MIT */
  include/uapi/linux/virtio_iommu.h:/* SPDX-License-Identifier: BSD-3-Clause */

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-25 11:05:10 +02:00
Linus Torvalds
7626077457 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
 "Bugfixes, a pvspinlock optimization, and documentation moving"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: X86: Boost queue head vCPU to mitigate lock waiter preemption
  Documentation: move Documentation/virtual to Documentation/virt
  KVM: nVMX: Set cached_vmcs12 and cached_shadow_vmcs12 NULL after free
  KVM: X86: Dynamically allocate user_fpu
  KVM: X86: Fix fpu state crash in kvm guest
  Revert "kvm: x86: Use task structs fpu field for user"
  KVM: nVMX: Clear pending KVM_REQ_GET_VMCS12_PAGES when leaving nested
2019-07-24 09:46:13 -07:00
Jan Kiszka
21e450d21c x86/mm: Avoid redundant interrupt disable in load_mm_cr4()
load_mm_cr4() is always called with interrupts disabled from:

 - switch_mm_irqs_off()
 - refresh_pce(), which is a on_each_cpu() callback

Thus, disabling interrupts in cr4_set/clear_bits() is redundant.

Implement cr4_set/clear_bits_irqsoff() helpers, rename load_mm_cr4() to
load_mm_cr4_irqsoff() and use the new helpers. The new helpers do not need
a lockdep assert as __cr4_set() has one already.

The renaming in combination with the checks in __cr4_set() ensure that any
changes in the boundary conditions at the call sites will be detected.

[ tglx: Massaged change log ]

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/0fbbcb64-5f26-4ffb-1bb9-4f5f48426893@siemens.com
2019-07-24 14:43:37 +02:00
Masahiro Yamada
bdd50d7421 x86/bitops: Use __builtin_constant_p() directly instead of IS_IMMEDIATE()
__builtin_constant_p(nr) is used everywhere now. It does not make much
sense to define IS_IMMEDIATE() as its alias.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190723074415.26811-1-yamada.masahiro@socionext.com
2019-07-23 13:44:18 +02:00
Masahiro Yamada
7010105321 x86/build: Remove unneeded uapi asm-generic wrappers
These are listed in include/uapi/asm-generic/Kbuild, so Kbuild will
automatically generate them.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190723112646.14046-1-yamada.masahiro@socionext.com
2019-07-23 13:42:14 +02:00
Wanpeng Li
d9a710e5fc KVM: X86: Dynamically allocate user_fpu
After reverting commit 240c35a378 (kvm: x86: Use task structs fpu field
for user), struct kvm_vcpu is 19456 bytes on my server, PAGE_ALLOC_COSTLY_ORDER(3)
is the order at which allocations are deemed costly to service. In serveless
scenario, one host can service hundreds/thoudands firecracker/kata-container
instances, howerver, new instance will fail to launch after memory is too
fragmented to allocate kvm_vcpu struct on host, this was observed in some
cloud provider product environments.

This patch dynamically allocates user_fpu, kvm_vcpu is 15168 bytes now on my
Skylake server.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-22 13:55:48 +02:00
Paolo Bonzini
ec269475cb Revert "kvm: x86: Use task structs fpu field for user"
This reverts commit 240c35a378
("kvm: x86: Use task structs fpu field for user", 2018-11-06).
The commit is broken and causes QEMU's FPU state to be destroyed
when KVM_RUN is preempted.

Fixes: 240c35a378 ("kvm: x86: Use task structs fpu field for user")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-22 13:55:47 +02:00
Josh Poimboeuf
be261ffce6 x86: Remove X86_FEATURE_MFENCE_RDTSC
AMD and Intel both have serializing lfence (X86_FEATURE_LFENCE_RDTSC).
They've both had it for a long time, and AMD has had it enabled in Linux
since Spectre v1 was announced.

Back then, there was a proposal to remove the serializing mfence feature
bit (X86_FEATURE_MFENCE_RDTSC), since both AMD and Intel have
serializing lfence.  At the time, it was (ahem) speculated that some
hypervisors might not yet support its removal, so it remained for the
time being.

Now a year-and-a-half later, it should be safe to remove.

I asked Andrew Cooper about whether it's still needed:

  So if you're virtualised, you've got no choice in the matter.  lfence
  is either dispatch-serialising or not on AMD, and you won't be able to
  change it.

  Furthermore, you can't accurately tell what state the bit is in, because
  the MSR might not be virtualised at all, or may not reflect the true
  state in hardware.  Worse still, attempting to set the bit may not be
  successful even if there isn't a fault for doing so.

  Xen sets the DE_CFG bit unconditionally, as does Linux by the looks of
  things (see MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT).  ISTR other hypervisor
  vendors saying the same, but I don't have any information to hand.

  If you are running under a hypervisor which has been updated, then
  lfence will almost certainly be dispatch-serialising in practice, and
  you'll almost certainly see the bit already set in DE_CFG.  If you're
  running under a hypervisor which hasn't been patched since Spectre,
  you've already lost in many more ways.

  I'd argue that X86_FEATURE_MFENCE_RDTSC is not worth keeping.

So remove it.  This will reduce some code rot, and also make it easier
to hook barrier_nospec() up to a cmdline disable for performance
raisins, without having to need an alternative_3() macro.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/d990aa51e40063acb9888e8c1b688e41355a9588.1562255067.git.jpoimboe@redhat.com
2019-07-22 12:00:51 +02:00
Pingfan Liu
6973210242 x86/realmode: Remove trampoline_status
There is no reader of trampoline_status, it's only written.

It turns out that after commit ce4b1b1650 ("x86/smpboot: Initialize
secondary CPU only if master CPU will wait for it"), trampoline_status is
not needed any more.

Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/1563266424-3472-1-git-send-email-kernelfans@gmail.com
2019-07-22 11:30:18 +02:00
Maya Nakamura
8c3e44bde7 x86/hyperv: Add functions to allocate/deallocate page for Hyper-V
Introduce two new functions, hv_alloc_hyperv_page() and
hv_free_hyperv_page(), to allocate/deallocate memory with the size and
alignment that Hyper-V expects as a page. Although currently they are not
used, they are ready to be used to allocate/deallocate memory on x86 when
their ARM64 counterparts are implemented, keeping symmetry between
architectures with potentially different guest page sizes.

Signed-off-by: Maya Nakamura <m.maya.nakamura@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/lkml/alpine.DEB.2.21.1906272334560.32342@nanos.tec.linutronix.de/
Link: https://lore.kernel.org/lkml/87muindr9c.fsf@vitty.brq.redhat.com/
Link: https://lkml.kernel.org/r/706b2e71eb3e587b5f8801e50f090fae2a00e35d.1562916939.git.m.maya.nakamura@gmail.com
2019-07-22 11:06:45 +02:00
Maya Nakamura
fcd3f6222a x86/hyperv: Create and use Hyper-V page definitions
Define HV_HYP_PAGE_SHIFT, HV_HYP_PAGE_SIZE, and HV_HYP_PAGE_MASK because
the Linux guest page size and hypervisor page size concepts are different,
even though they happen to be the same value on x86.

Also, replace PAGE_SIZE with HV_HYP_PAGE_SIZE.

Signed-off-by: Maya Nakamura <m.maya.nakamura@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lkml.kernel.org/r/e95111629abf65d016e983f72494cbf110ce605f.1562916939.git.m.maya.nakamura@gmail.com
2019-07-22 11:06:44 +02:00
Gayatri Kammela
018ebca8bd x86/cpufeatures: Enable a new AVX512 CPU feature
Add a new AVX512 instruction group/feature for enumeration in
/proc/cpuinfo: AVX512_VP2INTERSECT.

CPUID.(EAX=7,ECX=0):EDX[bit 8]  AVX512_VP2INTERSECT

Detailed information of CPUID bits for this feature can be found in
the Intel Architecture Intsruction Set Extensions Programming Reference
document (refer to Table 1-2). A copy of this document is available at
https://bugzilla.kernel.org/show_bug.cgi?id=204215.

Signed-off-by: Gayatri Kammela <gayatri.kammela@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190717234632.32673-3-gayatri.kammela@intel.com
2019-07-22 10:38:25 +02:00
Andy Lutomirski
6365b842aa x86/syscalls: Split the x32 syscalls into their own table
For unfortunate historical reasons, the x32 syscalls and the x86_64
syscalls are not all numbered the same.  As an example, ioctl() is nr 16 on
x86_64 but 514 on x32.

This has potentially nasty consequences, since it means that there are two
valid RAX values to do ioctl(2) and two invalid RAX values.  The valid
values are 16 (i.e. ioctl(2) using the x86_64 ABI) and (514 | 0x40000000)
(i.e. ioctl(2) using the x32 ABI).

The invalid values are 514 and (16 | 0x40000000).  514 will enter the
"COMPAT_SYSCALL_DEFINE3(ioctl, ...)" entry point with in_compat_syscall()
and in_x32_syscall() returning false, whereas (16 | 0x40000000) will enter
the native entry point with in_compat_syscall() and in_x32_syscall()
returning true.  Both are bogus, and both will exercise code paths in the
kernel and in any running seccomp filters that really ought to be
unreachable.

Splitting out the x32 syscalls into their own tables, allows both bogus
invocations to return -ENOSYS.  I've checked glibc, musl, and Bionic, and
all of them appear to call syscalls with their correct numbers, so this
change should have no effect on them.

There is an added benefit going forward: new syscalls that need special
handling on x32 can share the same number on x32 and x86_64.  This means
that the special syscall range 512-547 can be treated as a legacy wart
instead of something that may need to be extended in the future.

Also add a selftest to verify the new behavior.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/208024256b764312598f014ebfb0a42472c19354.1562185330.git.luto@kernel.org
2019-07-22 10:31:23 +02:00
Andy Lutomirski
45e29d119e x86/syscalls: Make __X32_SYSCALL_BIT be unsigned long
Currently, it's an int.  This is bizarre.  Fortunately, the code using it
still works: ~__X32_SYSCALL_BIT is also int, so, if nr is unsigned long,
then C kindly sign-extends the ~__X32_SYSCALL_BIT part, and it actually
results in the desired value.

This is far more subtle than it deserves to be.  Syscall numbers are, for
all practical purposes, unsigned long, so make __X32_SYSCALL_BIT be
unsigned long.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/99b0d83ad891c67105470a1a6b63243fd63a5061.1562185330.git.luto@kernel.org
2019-07-22 10:31:22 +02:00
Andrew Cooper
83b584d9c6 x86/paravirt: Drop {read,write}_cr8() hooks
There is a lot of infrastructure for functionality which is used
exclusively in __{save,restore}_processor_state() on the suspend/resume
path.

cr8 is an alias of APIC_TASKPRI, and APIC_TASKPRI is saved/restored by
lapic_{suspend,resume}().  Saving and restoring cr8 independently of the
rest of the Local APIC state isn't a clever thing to be doing.

Delete the suspend/resume cr8 handling, which shrinks the size of struct
saved_context, and allows for the removal of both PVOPS.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lkml.kernel.org/r/20190715151641.29210-1-andrew.cooper3@citrix.com
2019-07-22 10:12:33 +02:00
Linus Torvalds
c6dd78fcb8 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "A set of x86 specific fixes and updates:

   - The CR2 corruption fixes which store CR2 early in the entry code
     and hand the stored address to the fault handlers.

   - Revert a forgotten leftover of the dropped FSGSBASE series.

   - Plug a memory leak in the boot code.

   - Make the Hyper-V assist functionality robust by zeroing the shadow
     page.

   - Remove a useless check for dead processes with LDT

   - Update paravirt and VMware maintainers entries.

   - A few cleanup patches addressing various compiler warnings"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/entry/64: Prevent clobbering of saved CR2 value
  x86/hyper-v: Zero out the VP ASSIST PAGE on allocation
  x86, boot: Remove multiple copy of static function sanitize_boot_params()
  x86/boot/compressed/64: Remove unused variable
  x86/boot/efi: Remove unused variables
  x86/mm, tracing: Fix CR2 corruption
  x86/entry/64: Update comments and sanity tests for create_gap
  x86/entry/64: Simplify idtentry a little
  x86/entry/32: Simplify common_exception
  x86/paravirt: Make read_cr2() CALLEE_SAVE
  MAINTAINERS: Update PARAVIRT_OPS_INTERFACE and VMWARE_HYPERVISOR_INTERFACE
  x86/process: Delete useless check for dead process with LDT
  x86: math-emu: Hide clang warnings for 16-bit overflow
  x86/e820: Use proper booleans instead of 0/1
  x86/apic: Silence -Wtype-limits compiler warnings
  x86/mm: Free sme_early_buffer after init
  x86/boot: Fix memory leak in default_get_smp_config()
  Revert "x86/ptrace: Prevent ptrace from clearing the FS/GS selector" and fix the test
2019-07-20 11:24:49 -07:00
Linus Torvalds
e6023adc5c Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core fixes from Thomas Gleixner:

 - A collection of objtool fixes which address recent fallout partially
   exposed by newer toolchains, clang, BPF and general code changes.

 - Force USER_DS for user stack traces

[ Note: the "objtool fixes" are not all to objtool itself, but for
  kernel code that triggers objtool warnings.

  Things like missing function size annotations, or code that confuses
  the unwinder etc.   - Linus]

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
  objtool: Support conditional retpolines
  objtool: Convert insn type to enum
  objtool: Fix seg fault on bad switch table entry
  objtool: Support repeated uses of the same C jump table
  objtool: Refactor jump table code
  objtool: Refactor sibling call detection logic
  objtool: Do frame pointer check before dead end check
  objtool: Change dead_end_function() to return boolean
  objtool: Warn on zero-length functions
  objtool: Refactor function alias logic
  objtool: Track original function across branches
  objtool: Add mcsafe_handle_tail() to the uaccess safe list
  bpf: Disable GCC -fgcse optimization for ___bpf_prog_run()
  x86/uaccess: Remove redundant CLACs in getuser/putuser error paths
  x86/uaccess: Don't leak AC flag into fentry from mcsafe_handle_tail()
  x86/uaccess: Remove ELF function annotation from copy_user_handle_tail()
  x86/head/64: Annotate start_cpu0() as non-callable
  x86/entry: Fix thunk function ELF sizes
  x86/kvm: Don't call kvm_spurious_fault() from .fixup
  x86/kvm: Replace vmx_vmenter()'s call to kvm_spurious_fault() with UD2
  ...
2019-07-20 10:45:15 -07:00
Linus Torvalds
07ab9d5bc5 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull more KVM updates from Paolo Bonzini:
 "Mostly bugfixes, but also:

   - s390 support for KVM selftests

   - LAPIC timer offloading to housekeeping CPUs

   - Extend an s390 optimization for overcommitted hosts to all
     architectures

   - Debugging cleanups and improvements"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (25 commits)
  KVM: x86: Add fixed counters to PMU filter
  KVM: nVMX: do not use dangling shadow VMCS after guest reset
  KVM: VMX: dump VMCS on failed entry
  KVM: x86/vPMU: refine kvm_pmu err msg when event creation failed
  KVM: s390: Use kvm_vcpu_wake_up in kvm_s390_vcpu_wakeup
  KVM: Boost vCPUs that are delivering interrupts
  KVM: selftests: Remove superfluous define from vmx.c
  KVM: SVM: Fix detection of AMD Errata 1096
  KVM: LAPIC: Inject timer interrupt via posted interrupt
  KVM: LAPIC: Make lapic timer unpinned
  KVM: x86/vPMU: reset pmc->counter to 0 for pmu fixed_counters
  KVM: nVMX: Ignore segment base for VMX memory operand when segment not FS or GS
  kvm: x86: ioapic and apic debug macros cleanup
  kvm: x86: some tsc debug cleanup
  kvm: vmx: fix coccinelle warnings
  x86: kvm: avoid constant-conversion warning
  x86: kvm: avoid -Wsometimes-uninitized warning
  KVM: x86: expose AVX512_BF16 feature to guest
  KVM: selftests: enable pgste option for the linker on s390
  KVM: selftests: Move kvm_create_max_vcpus test to generic code
  ...
2019-07-20 10:20:27 -07:00
Eric Hankland
30cd860432 KVM: x86: Add fixed counters to PMU filter
Updates KVM_CAP_PMU_EVENT_FILTER so it can also whitelist or blacklist
fixed counters.

Signed-off-by: Eric Hankland <ehankland@google.com>
[No need to check padding fields for zero. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-20 09:00:48 +02:00
Linus Torvalds
b5d72dda89 Merge tag 'for-linus-5.3a-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
 "Fixes and features:

   - A series to introduce a common command line parameter for disabling
     paravirtual extensions when running as a guest in virtualized
     environment

   - A fix for int3 handling in Xen pv guests

   - Removal of the Xen-specific tmem driver as support of tmem in Xen
     has been dropped (and it was experimental only)

   - A security fix for running as Xen dom0 (XSA-300)

   - A fix for IRQ handling when offlining cpus in Xen guests

   - Some small cleanups"

* tag 'for-linus-5.3a-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: let alloc_xenballooned_pages() fail if not enough memory free
  xen/pv: Fix a boot up hang revealed by int3 self test
  x86/xen: Add "nopv" support for HVM guest
  x86/paravirt: Remove const mark from x86_hyper_xen_hvm variable
  xen: Map "xen_nopv" parameter to "nopv" and mark it obsolete
  x86: Add "nopv" parameter to disable PV extensions
  x86/xen: Mark xen_hvm_need_lapic() and xen_x2apic_para_available() as __init
  xen: remove tmem driver
  Revert "x86/paravirt: Set up the virt_spin_lock_key after static keys get initialized"
  xen/events: fix binding user event channels to cpus
2019-07-19 11:41:26 -07:00
Josh Poimboeuf
3901336ed9 x86/kvm: Don't call kvm_spurious_fault() from .fixup
After making a change to improve objtool's sibling call detection, it
started showing the following warning:

  arch/x86/kvm/vmx/nested.o: warning: objtool: .fixup+0x15: sibling call from callable instruction with modified stack frame

The problem is the ____kvm_handle_fault_on_reboot() macro.  It does a
fake call by pushing a fake RIP and doing a jump.  That tricks the
unwinder into printing the function which triggered the exception,
rather than the .fixup code.

Instead of the hack to make it look like the original function made the
call, just change the macro so that the original function actually does
make the call.  This allows removal of the hack, and also makes objtool
happy.

I triggered a vmx instruction exception and verified that the stack
trace is still sane:

  kernel BUG at arch/x86/kvm/x86.c:358!
  invalid opcode: 0000 [#1] SMP PTI
  CPU: 28 PID: 4096 Comm: qemu-kvm Not tainted 5.2.0+ #16
  Hardware name: Lenovo THINKSYSTEM SD530 -[7X2106Z000]-/-[7X2106Z000]-, BIOS -[TEE113Z-1.00]- 07/17/2017
  RIP: 0010:kvm_spurious_fault+0x5/0x10
  Code: 00 00 00 00 00 8b 44 24 10 89 d2 45 89 c9 48 89 44 24 10 8b 44 24 08 48 89 44 24 08 e9 d4 40 22 00 0f 1f 40 00 0f 1f 44 00 00 <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 55 49 89 fd 41
  RSP: 0018:ffffbf91c683bd00 EFLAGS: 00010246
  RAX: 000061f040000000 RBX: ffff9e159c77bba0 RCX: ffff9e15a5c87000
  RDX: 0000000665c87000 RSI: ffff9e15a5c87000 RDI: ffff9e159c77bba0
  RBP: 0000000000000000 R08: 0000000000000000 R09: ffff9e15a5c87000
  R10: 0000000000000000 R11: fffff8f2d99721c0 R12: ffff9e159c77bba0
  R13: ffffbf91c671d960 R14: ffff9e159c778000 R15: 0000000000000000
  FS:  00007fa341cbe700(0000) GS:ffff9e15b7400000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007fdd38356804 CR3: 00000006759de003 CR4: 00000000007606e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  PKRU: 55555554
  Call Trace:
   loaded_vmcs_init+0x4f/0xe0
   alloc_loaded_vmcs+0x38/0xd0
   vmx_create_vcpu+0xf7/0x600
   kvm_vm_ioctl+0x5e9/0x980
   ? __switch_to_asm+0x40/0x70
   ? __switch_to_asm+0x34/0x70
   ? __switch_to_asm+0x40/0x70
   ? __switch_to_asm+0x34/0x70
   ? free_one_page+0x13f/0x4e0
   do_vfs_ioctl+0xa4/0x630
   ksys_ioctl+0x60/0x90
   __x64_sys_ioctl+0x16/0x20
   do_syscall_64+0x55/0x1c0
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7fa349b1ee5b

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/64a9b64d127e87b6920a97afde8e96ea76f6524e.1563413318.git.jpoimboe@redhat.com
2019-07-18 21:01:04 +02:00
Josh Poimboeuf
083db67648 x86/paravirt: Fix callee-saved function ELF sizes
The __raw_callee_save_*() functions have an ELF symbol size of zero,
which confuses objtool and other tools.

Fixes a bunch of warnings like the following:

  arch/x86/xen/mmu_pv.o: warning: objtool: __raw_callee_save_xen_pte_val() is missing an ELF size annotation
  arch/x86/xen/mmu_pv.o: warning: objtool: __raw_callee_save_xen_pgd_val() is missing an ELF size annotation
  arch/x86/xen/mmu_pv.o: warning: objtool: __raw_callee_save_xen_make_pte() is missing an ELF size annotation
  arch/x86/xen/mmu_pv.o: warning: objtool: __raw_callee_save_xen_make_pgd() is missing an ELF size annotation

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/afa6d49bb07497ca62e4fc3b27a2d0cece545b4e.1563413318.git.jpoimboe@redhat.com
2019-07-18 21:01:03 +02:00
Linus Torvalds
818e95c768 Merge tag 'trace-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
 "The main changes in this release include:

   - Add user space specific memory reading for kprobes

   - Allow kprobes to be executed earlier in boot

  The rest are mostly just various clean ups and small fixes"

* tag 'trace-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (33 commits)
  tracing: Make trace_get_fields() global
  tracing: Let filter_assign_type() detect FILTER_PTR_STRING
  tracing: Pass type into tracing_generic_entry_update()
  ftrace/selftest: Test if set_event/ftrace_pid exists before writing
  ftrace/selftests: Return the skip code when tracing directory not configured in kernel
  tracing/kprobe: Check registered state using kprobe
  tracing/probe: Add trace_event_call accesses APIs
  tracing/probe: Add probe event name and group name accesses APIs
  tracing/probe: Add trace flag access APIs for trace_probe
  tracing/probe: Add trace_event_file access APIs for trace_probe
  tracing/probe: Add trace_event_call register API for trace_probe
  tracing/probe: Add trace_probe init and free functions
  tracing/uprobe: Set print format when parsing command
  tracing/kprobe: Set print format right after parsed command
  kprobes: Fix to init kprobes in subsys_initcall
  tracepoint: Use struct_size() in kmalloc()
  ring-buffer: Remove HAVE_64BIT_ALIGNED_ACCESS
  ftrace: Enable trampoline when rec count returns back to one
  tracing/kprobe: Do not run kprobe boot tests if kprobe_event is on cmdline
  tracing: Make a separate config for trace event self tests
  ...
2019-07-18 11:51:00 -07:00
Peter Zijlstra
a0d14b8909 x86/mm, tracing: Fix CR2 corruption
Despite the current efforts to read CR2 before tracing happens there still
exist a number of possible holes:

  idtentry page_fault             do_page_fault           has_error_code=1
    call error_entry
      TRACE_IRQS_OFF
        call trace_hardirqs_off*
          #PF // modifies CR2

      CALL_enter_from_user_mode
        __context_tracking_exit()
          trace_user_exit(0)
            #PF // modifies CR2

    call do_page_fault
      address = read_cr2(); /* whoopsie */

And similar for i386.

Fix it by pulling the CR2 read into the entry code, before any of that
stuff gets a chance to run and ruin things.

Reported-by: He Zhe <zhe.he@windriver.com>
Reported-by: Eiichi Tsukata <devel@etsukata.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: bp@alien8.de
Cc: rostedt@goodmis.org
Cc: torvalds@linux-foundation.org
Cc: hpa@zytor.com
Cc: dave.hansen@linux.intel.com
Cc: jgross@suse.com
Cc: joel@joelfernandes.org
Link: https://lkml.kernel.org/r/20190711114336.116812491@infradead.org

Debugged-by: Steven Rostedt <rostedt@goodmis.org>
2019-07-17 23:17:38 +02:00
Peter Zijlstra
55aedddb61 x86/paravirt: Make read_cr2() CALLEE_SAVE
The one paravirt read_cr2() implementation (Xen) is actually quite trivial
and doesn't need to clobber anything other than the return register.

Making read_cr2() CALLEE_SAVE avoids all the PUSH/POP nonsense and allows
more convenient use from assembly.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: bp@alien8.de
Cc: rostedt@goodmis.org
Cc: luto@kernel.org
Cc: torvalds@linux-foundation.org
Cc: hpa@zytor.com
Cc: dave.hansen@linux.intel.com
Cc: zhe.he@windriver.com
Cc: joel@joelfernandes.org
Cc: devel@etsukata.com
Link: https://lkml.kernel.org/r/20190711114335.887392493@infradead.org
2019-07-17 23:17:37 +02:00
Zhenzhong Duan
b23e5844df xen/pv: Fix a boot up hang revealed by int3 self test
Commit 7457c0da02 ("x86/alternatives: Add int3_emulate_call()
selftest") is used to ensure there is a gap setup in int3 exception stack
which could be used for inserting call return address.

This gap is missed in XEN PV int3 exception entry path, then below panic
triggered:

[    0.772876] general protection fault: 0000 [#1] SMP NOPTI
[    0.772886] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.2.0+ #11
[    0.772893] RIP: e030:int3_magic+0x0/0x7
[    0.772905] RSP: 3507:ffffffff82203e98 EFLAGS: 00000246
[    0.773334] Call Trace:
[    0.773334]  alternative_instructions+0x3d/0x12e
[    0.773334]  check_bugs+0x7c9/0x887
[    0.773334]  ? __get_locked_pte+0x178/0x1f0
[    0.773334]  start_kernel+0x4ff/0x535
[    0.773334]  ? set_init_arg+0x55/0x55
[    0.773334]  xen_start_kernel+0x571/0x57a

For 64bit PV guests, Xen's ABI enters the kernel with using SYSRET, with
%rcx/%r11 on the stack. To convert back to "normal" looking exceptions,
the xen thunks do 'xen_*: pop %rcx; pop %r11; jmp *'.

E.g. Extracting 'xen_pv_trap xenint3' we have:
xen_xenint3:
 pop %rcx;
 pop %r11;
 jmp xenint3

As xenint3 and int3 entry code are same except xenint3 doesn't generate
a gap, we can fix it by using int3 and drop useless xenint3.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2019-07-17 08:09:59 +02:00
Zhenzhong Duan
bef6e0ae74 x86/xen: Add "nopv" support for HVM guest
PVH guest needs PV extentions to work, so "nopv" parameter should be
ignored for PVH but not for HVM guest.

If PVH guest boots up via the Xen-PVH boot entry, xen_pvh is set early,
we know it's PVH guest and ignore "nopv" parameter directly.

If PVH guest boots up via the normal boot entry same as HVM guest, it's
hard to distinguish PVH and HVM guest at that time. In this case, we
have to panic early if PVH is detected and nopv is enabled to avoid a
worse situation later.

Remove static from bool_x86_init_noop/x86_op_int_noop so they could be
used globally. Move xen_platform_hvm() after xen_hvm_guest_late_init()
to avoid compile error.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Juergen Gross <jgross@suse.com>
2019-07-17 08:09:59 +02:00
Zhenzhong Duan
cc8f3b4dd2 x86/paravirt: Remove const mark from x86_hyper_xen_hvm variable
.. as "nopv" support needs it to be changeable at boot up stage.

Checkpatch reports warning, so move variable declarations from
hypervisor.c to hypervisor.h

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Juergen Gross <jgross@suse.com>
2019-07-17 08:09:59 +02:00
Zhenzhong Duan
3097834637 x86: Add "nopv" parameter to disable PV extensions
In virtualization environment, PV extensions (drivers, interrupts,
timers, etc) are enabled in the majority of use cases which is the
best option.

However, in some cases (kexec not fully working, benchmarking)
we want to disable PV extensions. We have "xen_nopv" for that purpose
but only for XEN. For a consistent admin experience a common command
line parameter "nopv" set across all PV guest implementations is a
better choice.

There are guest types which just won't work without PV extensions,
like Xen PV, Xen PVH and jailhouse. add a "ignore_nopv" member to
struct hypervisor_x86 set to true for those guest types and call
the detect functions only if nopv is false or ignore_nopv is true.

Suggested-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
2019-07-17 08:09:58 +02:00
Zhenzhong Duan
1b37683cda x86/xen: Mark xen_hvm_need_lapic() and xen_x2apic_para_available() as __init
.. as they are only called at early bootup stage. In fact, other
functions in x86_hyper_xen_hvm.init.* are all marked as __init.

Unexport xen_hvm_need_lapic as it's never used outside.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Juergen Gross <jgross@suse.com>
2019-07-17 08:09:58 +02:00
Robin Murphy
175967318c mm: introduce ARCH_HAS_PTE_DEVMAP
ARCH_HAS_ZONE_DEVICE is somewhat meaningless in itself, and combined
with the long-out-of-date comment can lead to the impression than an
architecture may just enable it (since __add_pages() now "comprehends
device memory" for itself) and expect things to work.

In practice, however, ZONE_DEVICE users have little chance of
functioning correctly without __HAVE_ARCH_PTE_DEVMAP, so let's clean
that up the same way as ARCH_HAS_PTE_SPECIAL and make it the proper
dependency so the real situation is clearer.

Link: http://lkml.kernel.org/r/87554aa78478a02a63f2c4cf60a847279ae3eb3b.1558547956.git.robin.murphy@arm.com
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16 19:23:25 -07:00
Stephen Kitt
3a7f0adfe7 arch/*: remove unused isa_page_to_bus()
isa_page_to_bus() is deprecated and is no longer used anywhere.  Remove
it entirely.

Link: http://lkml.kernel.org/r/20190613161155.16946-1-steve@sk2.org
Signed-off-by: Stephen Kitt <steve@sk2.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16 19:23:22 -07:00
Qian Cai
ec63355869 x86/apic: Silence -Wtype-limits compiler warnings
There are many compiler warnings like this,

In file included from ./arch/x86/include/asm/smp.h:13,
                 from ./arch/x86/include/asm/mmzone_64.h:11,
                 from ./arch/x86/include/asm/mmzone.h:5,
                 from ./include/linux/mmzone.h:969,
                 from ./include/linux/gfp.h:6,
                 from ./include/linux/mm.h:10,
                 from arch/x86/kernel/apic/io_apic.c:34:
arch/x86/kernel/apic/io_apic.c: In function 'check_timer':
./arch/x86/include/asm/apic.h:37:11: warning: comparison of unsigned
expression >= 0 is always true [-Wtype-limits]
   if ((v) <= apic_verbosity) \
           ^~
arch/x86/kernel/apic/io_apic.c:2160:2: note: in expansion of macro
'apic_printk'
  apic_printk(APIC_QUIET, KERN_INFO "..TIMER: vector=0x%02X "
  ^~~~~~~~~~~
./arch/x86/include/asm/apic.h:37:11: warning: comparison of unsigned
expression >= 0 is always true [-Wtype-limits]
   if ((v) <= apic_verbosity) \
           ^~
arch/x86/kernel/apic/io_apic.c:2207:4: note: in expansion of macro
'apic_printk'
    apic_printk(APIC_QUIET, KERN_ERR "..MP-BIOS bug: "
    ^~~~~~~~~~~

APIC_QUIET is 0, so silence them by making apic_verbosity type int.

Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/1562621805-24789-1-git-send-email-cai@lca.pw
2019-07-16 23:13:48 +02:00
Linus Torvalds
5516745311 Merge tag 'platform-drivers-x86-v5.3-1' of git://git.infradead.org/linux-platform-drivers-x86
Pull x86 platform driver updates from Andy Shevchenko:
 "Gathered a bunch of x86 platform driver changes. It's rather big,
  since includes two big refactors and completely new driver:

   - ASUS WMI driver got a big refactoring in order to support the TUF
     Gaming laptops. Besides that, the regression with backlight being
     permanently off on various EeePC laptops has been fixed.

   - Accelerometer on HP ProBook 450 G0 shows wrong measurements due to
     X axis being inverted. This has been fixed.

   - Intel PMC core driver has been extended to be ACPI enumerated if
     the DSDT provides device with _HID "INT33A1". This allows to
     convert the driver to be pure platform and support new hardware
     purely based on ACPI DSDT.

   - From now on the Intel Speed Select Technology is supported thru a
     corresponding driver. This driver provides an access to the
     features of the ISST, such as Performance Profile, Core Power, Base
     frequency and Turbo Frequency.

   - Mellanox platform drivers has been refactored and now extended to
     support more systems, including new coming ones.

   - The OLPC XO-1.75 platform is now supported.

   - CB4063 Beckhoff Automation board is using PMC clocks, provided via
     pmc_atom driver, for ethernet controllers in a way that they can't
     be managed by the clock driver. The quirk has been extended to
     cover this case.

   - Touchscreen on Chuwi Hi10 Plus tablet has been enabled. Meanwhile
     the information of Chuwi Hi10 Air has been fixed to cover more
     models based on the same platform.

   - Xiaomi notebooks have WMI interface enabled. Thus, the driver to
     support it has been provided. It required some extension of the
     generic WMI library, which allows to propagate opaque context to
     the ->probe() of the individual drivers.

  This release includes debugfs clean up from Greg KH for several
  drivers that drop return code check and make debugfs absence or
  failure non-fatal.

  Also miscellaneous fixes here and there, mostly for Acer WMI and
  various Intel drivers"

* tag 'platform-drivers-x86-v5.3-1' of git://git.infradead.org/linux-platform-drivers-x86: (74 commits)
  platform/x86: Fix PCENGINES_APU2 Kconfig warning
  tools/power/x86/intel-speed-select: Add .gitignore file
  platform/x86: mlx-platform: Fix error handling in mlxplat_init()
  platform/x86: intel_pmc_core: Attach using APCI HID "INT33A1"
  platform/x86: intel_pmc_core: transform Pkg C-state residency from TSC ticks into microseconds
  platform/x86: asus-wmi: Use dev_get_drvdata()
  Documentation/ABI: Add new attribute for mlxreg-io sysfs interfaces
  platform/x86: mlx-platform: Add more reset cause attributes
  platform/x86: mlx-platform: Modify DMI matching order
  platform/x86: mlx-platform: Add regmap structure for the next generation systems
  platform/x86: mlx-platform: Change API for i2c-mlxcpld driver activation
  platform/x86: mlx-platform: Move regmap initialization before all drivers activation
  MAINTAINERS: Update for Intel Speed Select Technology
  tools/power/x86: A tool to validate Intel Speed Select commands
  platform/x86: ISST: Restore state on resume
  platform/x86: ISST: Add Intel Speed Select PUNIT MSR interface
  platform/x86: ISST: Add Intel Speed Select mailbox interface via MSRs
  platform/x86: ISST: Add Intel Speed Select mailbox interface via PCI
  platform/x86: ISST: Add Intel Speed Select mmio interface
  platform/x86: ISST: Add IOCTL to Translate Linux logical CPU to PUNIT CPU number
  ...
2019-07-14 16:51:47 -07:00
Linus Torvalds
5f26f11436 Merge tag 'asm-generic-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic updates from Arnd Bergmann:
 "The asm-generic changes for 5.3 consist of a cleanup series to remove
  ptrace.h from Christoph Hellwig, who explains:

    'asm-generic/ptrace.h is a little weird in that it doesn't actually
     implement any functionality, but it provided multiple layers of
     macros that just implement trivial inline functions. We implement
     those directly in the few architectures and be off with a much
     simpler design.'

  at https://lore.kernel.org/lkml/20190624054728.30966-1-hch@lst.de/"

* tag 'asm-generic-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  asm-generic: remove ptrace.h
  x86: don't use asm-generic/ptrace.h
  sh: don't use asm-generic/ptrace.h
  powerpc: don't use asm-generic/ptrace.h
  arm64: don't use asm-generic/ptrace.h
2019-07-12 15:41:33 -07:00
Linus Torvalds
39d7530d74 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
 "ARM:
   - support for chained PMU counters in guests
   - improved SError handling
   - handle Neoverse N1 erratum #1349291
   - allow side-channel mitigation status to be migrated
   - standardise most AArch64 system register accesses to msr_s/mrs_s
   - fix host MPIDR corruption on 32bit
   - selftests ckleanups

  x86:
   - PMU event {white,black}listing
   - ability for the guest to disable host-side interrupt polling
   - fixes for enlightened VMCS (Hyper-V pv nested virtualization),
   - new hypercall to yield to IPI target
   - support for passing cstate MSRs through to the guest
   - lots of cleanups and optimizations

  Generic:
   - Some txt->rST conversions for the documentation"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (128 commits)
  Documentation: virtual: Add toctree hooks
  Documentation: kvm: Convert cpuid.txt to .rst
  Documentation: virtual: Convert paravirt_ops.txt to .rst
  KVM: x86: Unconditionally enable irqs in guest context
  KVM: x86: PMU Event Filter
  kvm: x86: Fix -Wmissing-prototypes warnings
  KVM: Properly check if "page" is valid in kvm_vcpu_unmap
  KVM: arm/arm64: Initialise host's MPIDRs by reading the actual register
  KVM: LAPIC: Retry tune per-vCPU timer_advance_ns if adaptive tuning goes insane
  kvm: LAPIC: write down valid APIC registers
  KVM: arm64: Migrate _elx sysreg accessors to msr_s/mrs_s
  KVM: doc: Add API documentation on the KVM_REG_ARM_WORKAROUNDS register
  KVM: arm/arm64: Add save/restore support for firmware workaround state
  arm64: KVM: Propagate full Spectre v2 workaround state to KVM guests
  KVM: arm/arm64: Support chained PMU counters
  KVM: arm/arm64: Remove pmc->bitmask
  KVM: arm/arm64: Re-create event when setting counter value
  KVM: arm/arm64: Extract duplicated code to own function
  KVM: arm/arm64: Rename kvm_pmu_{enable/disable}_counter functions
  KVM: LAPIC: ARBPRI is a reserved register for x2APIC
  ...
2019-07-12 15:35:14 -07:00
Linus Torvalds
16c97650a5 Merge tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyper-v updates from Sasha Levin:

 - Add a module description to the Hyper-V vmbus module.

 - Rework some vmbus code to separate architecture specifics out to
   arch/x86/. This is part of the work of adding arm64 support to
   Hyper-V.

* tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  Drivers: hv: vmbus: Break out ISA independent parts of mshyperv.h
  drivers: hv: Add a module description line to the hv_vmbus driver
2019-07-12 15:28:38 -07:00
Mike Rapoport
5fba4af445 asm-generic, x86: introduce generic pte_{alloc,free}_one[_kernel]
Most architectures have identical or very similar implementation of
pte_alloc_one_kernel(), pte_alloc_one(), pte_free_kernel() and
pte_free().

Add a generic implementation that can be reused across architectures and
enable its use on x86.

The generic implementation uses

	GFP_KERNEL | __GFP_ZERO

for the kernel page tables and

	GFP_KERNEL | __GFP_ZERO | __GFP_ACCOUNT

for the user page tables.

The "base" functions for PTE allocation, namely __pte_alloc_one_kernel()
and __pte_alloc_one() are intended for the architectures that require
additional actions after actual memory allocation or must use non-default
GFP flags.

x86 is switched to use generic pte_alloc_one_kernel(), pte_free_kernel() and
pte_free().

x86 still implements pte_alloc_one() to allow run-time control of GFP
flags required for "userpte" command line option.

Link: http://lkml.kernel.org/r/1557296232-15361-2-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Guo Ren <ren_guo@c-sky.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sam Creasey <sammy@sammy.net>
Cc: Vincent Chen <deanbo422@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-12 11:05:45 -07:00
Christoph Hellwig
39656e83da mm: lift the x86_32 PAE version of gup_get_pte to common code
The split low/high access is the only non-READ_ONCE version of gup_get_pte
that did show up in the various arch implemenations.  Lift it to common
code and drop the ifdef based arch override.

Link: http://lkml.kernel.org/r/20190625143715.1689-4-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: James Hogan <jhogan@kernel.org>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-12 11:05:44 -07:00
Christoph Hellwig
26f4c32807 mm: simplify gup_fast_permitted
Pass in the already calculated end value instead of recomputing it, and
leave the end > start check in the callers instead of duplicating them in
the arch code.

Link: http://lkml.kernel.org/r/20190625143715.1689-3-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: James Hogan <jhogan@kernel.org>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-12 11:05:44 -07:00
Marco Elver
751ad98d5f asm-generic, x86: add bitops instrumentation for KASAN
This adds a new header to asm-generic to allow optionally instrumenting
architecture-specific asm implementations of bitops.

This change includes the required change for x86 as reference and
changes the kernel API doc to point to bitops-instrumented.h instead.
Rationale: the functions in x86's bitops.h are no longer the kernel API
functions, but instead the arch_ prefixed functions, which are then
instrumented via bitops-instrumented.h.

Other architectures can similarly add support for asm implementations of
bitops.

The documentation text was derived from x86 and existing bitops
asm-generic versions: 1) references to x86 have been removed; 2) as a
result, some of the text had to be reworded for clarity and consistency.

Tested using lib/test_kasan with bitops tests (pre-requisite patch).
Bugzilla ref: https://bugzilla.kernel.org/show_bug.cgi?id=198439

Link: http://lkml.kernel.org/r/20190613125950.197667-4-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-12 11:05:42 -07:00