Commit Graph

31838 Commits

Author SHA1 Message Date
David S. Miller
ec7146db15 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2019-01-29

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Teach verifier dead code removal, this also allows for optimizing /
   removing conditional branches around dead code and to shrink the
   resulting image. Code store constrained architectures like nfp would
   have hard time doing this at JIT level, from Jakub.

2) Add JMP32 instructions to BPF ISA in order to allow for optimizing
   code generation for 32-bit sub-registers. Evaluation shows that this
   can result in code reduction of ~5-20% compared to 64 bit-only code
   generation. Also add implementation for most JITs, from Jiong.

3) Add support for __int128 types in BTF which is also needed for
   vmlinux's BTF conversion to work, from Yonghong.

4) Add a new command to bpftool in order to dump a list of BPF-related
   parameters from the system or for a specific network device e.g. in
   terms of available prog/map types or helper functions, from Quentin.

5) Add AF_XDP sock_diag interface for querying sockets from user
   space which provides information about the RX/TX/fill/completion
   rings, umem, memory usage etc, from Björn.

6) Add skb context access for skb_shared_info->gso_segs field, from Eric.

7) Add support for testing flow dissector BPF programs by extending
   existing BPF_PROG_TEST_RUN infrastructure, from Stanislav.

8) Split BPF kselftest's test_verifier into various subgroups of tests
   in order better deal with merge conflicts in this area, from Jakub.

9) Add support for queue/stack manipulations in bpftool, from Stanislav.

10) Document BTF, from Yonghong.

11) Dump supported ELF section names in libbpf on program load
    failure, from Taeung.

12) Silence a false positive compiler warning in verifier's BTF
    handling, from Peter.

13) Fix help string in bpftool's feature probing, from Prashant.

14) Remove duplicate includes in BPF kselftests, from Yue.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-28 19:38:33 -08:00
Brajeswar Ghosh
fc5014cc55 x86/asm-prototypes: Remove duplicate include <asm/page.h>
Remove asm/page.h which is included more than once.

Signed-off-by: Brajeswar Ghosh <brajeswar.linux@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: jrdr.linux@gmail.com
Cc: sabyasachi.linux@gmail.com
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190128161847.GA2318@hp-pavilion-15-notebook-pc-brajeswar
2019-01-28 18:32:34 +01:00
Masahiro Yamada
afa974b771 kbuild: add real-prereqs shorthand for $(filter-out FORCE,$^)
In Kbuild, if_changed and friends must have FORCE as a prerequisite.

Hence, $(filter-out FORCE,$^) or $(filter-out $(PHONY),$^) is a common
idiom to get the names of all the prerequisites except phony targets.

Add real-prereqs as a shorthand.

Note:
We cannot replace $(filter %.o,$^) in cmd_link_multi-m because $^ may
include auto-generated dependencies from the .*.cmd file when a single
object module is changed into a multi object module. Refer to commit
69ea912fda ("kbuild: remove unneeded link_multi_deps"). I added some
comment to avoid accidental breakage.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Rob Herring <robh@kernel.org>
2019-01-28 09:11:17 +09:00
Linus Torvalds
8a5f06056a Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "A set of fixes for x86:

   - Fix the swapped outb() parameters in the KASLR code

   - Fix the PKEY handling at fork which missed to preserve the pkey
     state for the child. Comes with a test case to validate that.

   - Fix the entry stack handling for XEN PV to respect that XEN PV
     systems enter the function already on the current thread stack and
     not on the trampoline.

   - Fix kexec load failure caused by using a stale value when the
     kexec_buf structure is reused for subsequent allocations.

   - Fix a bogus sizeof() in the memory encryption code

   - Enforce PCI dependency for the Intel Low Power Subsystem

   - Enforce PCI_LOCKLESS_CONFIG when PCI is enabled"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/Kconfig: Select PCI_LOCKLESS_CONFIG if PCI is enabled
  x86/entry/64/compat: Fix stack switching for XEN PV
  x86/kexec: Fix a kexec_file_load() failure
  x86/mm/mem_encrypt: Fix erroneous sizeof()
  x86/selftests/pkeys: Fork() to check for state being preserved
  x86/pkeys: Properly copy pkey state at fork()
  x86/kaslr: Fix incorrect i8254 outb() parameters
  x86/intel/lpss: Make PCI dependency explicit
2019-01-27 12:02:00 -08:00
Linus Torvalds
351e1aa6cb Merge branch 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 timer fixes from Thomas Gleixner:
 "Two commits which were missed to be sent during the merge window.

   - The TSC calibration fix turns out to be more urgent as recent
     Skylake-X systems seem to have massive trouble with calibration
     disturbance. This should go back into stable for that reason and it
     the risk of breakage is rather low.

   - Drop an unused define"

* 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/hpet: Remove unused FSEC_PER_NSEC define
  x86/tsc: Make calibration refinement more robust
2019-01-27 11:57:46 -08:00
Linus Torvalds
1fc7f56db7 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
 "Quite a few fixes for x86: nested virtualization save/restore, AMD
  nested virtualization and virtual APIC, 32-bit fixes, an important fix
  to restore operation on older processors, and a bunch of hyper-v
  bugfixes. Several are marked stable.

  There are also fixes for GCC warnings and for a GCC/objtool interaction"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Mark expected switch fall-throughs
  KVM: x86: fix TRACE_INCLUDE_PATH and remove -I. header search paths
  KVM: selftests: check returned evmcs version range
  x86/kvm/hyper-v: nested_enable_evmcs() sets vmcs_version incorrectly
  KVM: VMX: Move vmx_vcpu_run()'s VM-Enter asm blob to a helper function
  kvm: selftests: Fix region overlap check in kvm_util
  kvm: vmx: fix some -Wmissing-prototypes warnings
  KVM: nSVM: clear events pending from svm_complete_interrupts() when exiting to L1
  svm: Fix AVIC incomplete IPI emulation
  svm: Add warning message for AVIC IPI invalid target
  KVM: x86: WARN_ONCE if sending a PV IPI returns a fatal error
  KVM: x86: Fix PV IPIs for 32-bit KVM host
  x86/kvm/hyper-v: recommend using eVMCS only when it is enabled
  x86/kvm/hyper-v: don't recommend doing reset via synthetic MSR
  kvm: x86/vmx: Use kzalloc for cached_vmcs12
  KVM: VMX: Use the correct field var when clearing VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL
  KVM: x86: Fix single-step debugging
  x86/kvm/hyper-v: don't announce GUEST IDLE MSR support
2019-01-27 09:21:00 -08:00
Jiong Wang
69f827eb6e x32: bpf: implement jitting of JMP32
This patch implements code-gen for new JMP32 instructions on x32.
Also fixed several reverse xmas tree coding style issues as I am there.

Cc: Wang YanQing <udknight@gmail.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-01-26 13:33:02 -08:00
Jiong Wang
3f5d6525f2 x86_64: bpf: implement jitting of JMP32
This patch implements code-gen for new JMP32 instructions on x86_64.

Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-01-26 13:33:01 -08:00
Gustavo A. R. Silva
6fcebf1302 x86/kernel: Mark expected switch-case fall-throughs
In preparation to enable -Wimplicit-fallthrough by default, mark
switch-case statements where fall-through is intentional, explicitly in
order to fix a couple of -Wimplicit-fallthrough warnings.

Warning level 3 was used: -Wimplicit-fallthrough=3.

 [ bp: Massasge and trim commit message. ]

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Cc: David Wang <davidwang@zhaoxin.com>
Cc: Douglas Anderson <dianders@chromium.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Nicolai Stange <nstange@suse.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pu Wen <puwen@hygon.cn>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190125183903.GA4712@embeddedor
2019-01-26 11:19:13 +01:00
Gustavo A. R. Silva
89da344629 x86/insn-eval: Mark expected switch-case fall-through
In preparation to enable -Wimplicit-fallthrough by default, mark
switch-case statements where fall-through is intentional, explicitly.

Thus fix the following warning:

  arch/x86/lib/insn-eval.c: In function ‘resolve_default_seg’:
  arch/x86/lib/insn-eval.c:179:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
     if (insn->addr_bytes == 2)
        ^
  arch/x86/lib/insn-eval.c:182:2: note: here
    case -EDOM:
    ^~~~

Warning level 3 was used: -Wimplicit-fallthrough=3

This is part of the ongoing efforts to enable -Wimplicit-fallthrough by
default.

 [ bp: Massage commit message. ]

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190125205520.GA9602@embeddedor
2019-01-26 10:46:42 +01:00
Gustavo A. R. Silva
b2869f28e1 KVM: x86: Mark expected switch fall-throughs
In preparation to enabling -Wimplicit-fallthrough, mark switch
cases where we are expecting to fall through.

This patch fixes the following warnings:

arch/x86/kvm/lapic.c:1037:27: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/x86/kvm/lapic.c:1876:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/x86/kvm/hyperv.c:1637:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/x86/kvm/svm.c:4396:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/x86/kvm/mmu.c:4372:36: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/x86/kvm/x86.c:3835:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/x86/kvm/x86.c:7938:23: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/x86/kvm/vmx/vmx.c:2015:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/x86/kvm/vmx/vmx.c:1773:6: warning: this statement may fall through [-Wimplicit-fallthrough=]

Warning level 3 was used: -Wimplicit-fallthrough=3

This patch is part of the ongoing efforts to enabling -Wimplicit-fallthrough.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-01-25 19:29:36 +01:00
Masahiro Yamada
5cd5548ff4 KVM: x86: fix TRACE_INCLUDE_PATH and remove -I. header search paths
The header search path -I. in kernel Makefiles is very suspicious;
it allows the compiler to search for headers in the top of $(srctree),
where obviously no header file exists.

The reason of having -I. here is to make the incorrectly set
TRACE_INCLUDE_PATH working.

As the comment block in include/trace/define_trace.h says,
TRACE_INCLUDE_PATH should be a relative path to the define_trace.h

Fix the TRACE_INCLUDE_PATH, and remove the iffy include paths.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-01-25 19:12:37 +01:00
Vitaly Kuznetsov
3a2f5773ba x86/kvm/hyper-v: nested_enable_evmcs() sets vmcs_version incorrectly
Commit e2e871ab2f ("x86/kvm/hyper-v: Introduce nested_get_evmcs_version()
helper") broke EVMCS enablement: to set vmcs_version we now call
nested_get_evmcs_version() but this function checks
enlightened_vmcs_enabled flag which is not yet set so we end up returning
zero.

Fix the issue by re-arranging things in nested_enable_evmcs().

Fixes: e2e871ab2f ("x86/kvm/hyper-v: Introduce nested_get_evmcs_version() helper")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-01-25 19:11:37 +01:00
Sean Christopherson
5ad6ece869 KVM: VMX: Move vmx_vcpu_run()'s VM-Enter asm blob to a helper function
...along with the function's STACK_FRAME_NON_STANDARD tag.  Moving the
asm blob results in a significantly smaller amount of code that is
marked with STACK_FRAME_NON_STANDARD, which makes it far less likely
that gcc will split the function and trigger a spurious objtool warning.
As a bonus, removing STACK_FRAME_NON_STANDARD from vmx_vcpu_run() allows
the bulk of code to be properly checked by objtool.

Because %rbp is not loaded via VMCS fields, vmx_vcpu_run() must manually
save/restore the host's RBP and load the guest's RBP prior to calling
vmx_vmenter().  Modifying %rbp triggers objtool's stack validation code,
and so vmx_vcpu_run() is tagged with STACK_FRAME_NON_STANDARD since it's
impossible to avoid modifying %rbp.

Unfortunately, vmx_vcpu_run() is also a gigantic function that gcc will
split into separate functions, e.g. so that pieces of the function can
be inlined.  Splitting the function means that the compiled Elf file
will contain one or more vmx_vcpu_run.part.* functions in addition to
a vmx_vcpu_run function.  Depending on where the function is split,
objtool may warn about a "call without frame pointer save/setup" in
vmx_vcpu_run.part.* since objtool's stack validation looks for exact
names when whitelisting functions tagged with STACK_FRAME_NON_STANDARD.

Up until recently, the undesirable function splitting was effectively
blocked because vmx_vcpu_run() was tagged with __noclone.  At the time,
__noclone had an unintended side effect that put vmx_vcpu_run() into a
separate optimization unit, which in turn prevented gcc from inlining
the function (or any of its own function calls) and thus eliminated gcc's
motivation to split the function.  Removing the __noclone attribute
allowed gcc to optimize vmx_vcpu_run(), exposing the objtool warning.

Kudos to Qian Cai for root causing that the fnsplit optimization is what
caused objtool to complain.

Fixes: 453eafbe65 ("KVM: VMX: Move VM-Enter + VM-Exit handling to non-inline sub-routines")
Tested-by: Qian Cai <cai@lca.pw>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-01-25 19:11:37 +01:00
Yi Wang
8997f65700 kvm: vmx: fix some -Wmissing-prototypes warnings
We get some warnings when building kernel with W=1:
arch/x86/kvm/vmx/vmx.c:426:5: warning: no previous prototype for ‘kvm_fill_hv_flush_list_func’ [-Wmissing-prototypes]
arch/x86/kvm/vmx/nested.c:58:6: warning: no previous prototype for ‘init_vmcs_shadow_fields’ [-Wmissing-prototypes]

Make them static to fix this.

Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-01-25 19:11:35 +01:00
Vitaly Kuznetsov
619ad846fc KVM: nSVM: clear events pending from svm_complete_interrupts() when exiting to L1
kvm-unit-tests' eventinj "NMI failing on IDT" test results in NMI being
delivered to the host (L1) when it's running nested. The problem seems to
be: svm_complete_interrupts() raises 'nmi_injected' flag but later we
decide to reflect EXIT_NPF to L1. The flag remains pending and we do NMI
injection upon entry so it got delivered to L1 instead of L2.

It seems that VMX code solves the same issue in prepare_vmcs12(), this was
introduced with code refactoring in commit 5f3d579997 ("KVM: nVMX: Rework
event injection and recovery").

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-01-25 19:11:35 +01:00
Suravee Suthikulpanit
bb218fbcfa svm: Fix AVIC incomplete IPI emulation
In case of incomplete IPI with invalid interrupt type, the current
SVM driver does not properly emulate the IPI, and fails to boot
FreeBSD guests with multiple vcpus when enabling AVIC.

Fix this by update APIC ICR high/low registers, which also
emulate sending the IPI.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-01-25 19:11:34 +01:00
Suravee Suthikulpanit
37ef0c4414 svm: Add warning message for AVIC IPI invalid target
Print warning message when IPI target ID is invalid due to one of
the following reasons:
  * In logical mode: cluster > max_cluster (64)
  * In physical mode: target > max_physical (512)
  * Address is not present in the physical or logical ID tables

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-01-25 19:11:34 +01:00
Sean Christopherson
de81c2f912 KVM: x86: WARN_ONCE if sending a PV IPI returns a fatal error
KVM hypercalls return a negative value error code in case of a fatal
error, e.g. when the hypercall isn't supported or was made with invalid
parameters.  WARN_ONCE on fatal errors when sending PV IPIs as any such
error all but guarantees an SMP system will hang due to a missing IPI.

Fixes: aaffcfd1e8 ("KVM: X86: Implement PV IPIs in linux guest")
Cc: stable@vger.kernel.org
Cc: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-01-25 19:11:33 +01:00
Sean Christopherson
1ed199a41c KVM: x86: Fix PV IPIs for 32-bit KVM host
The recognition of the KVM_HC_SEND_IPI hypercall was unintentionally
wrapped in "#ifdef CONFIG_X86_64", causing 32-bit KVM hosts to reject
any and all PV IPI requests despite advertising the feature.  This
results in all KVM paravirtualized guests hanging during SMP boot due
to IPIs never being delivered.

Fixes: 4180bf1b65 ("KVM: X86: Implement "send IPI" hypercall")
Cc: stable@vger.kernel.org
Cc: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-01-25 19:11:32 +01:00
Vitaly Kuznetsov
f1adceaf01 x86/kvm/hyper-v: recommend using eVMCS only when it is enabled
We shouldn't probably be suggesting using Enlightened VMCS when it's not
enabled (not supported from guest's point of view). Hyper-V on KVM seems
to be fine either way but let's be consistent.

Fixes: 2bc39970e9 ("x86/kvm/hyper-v: Introduce KVM_GET_SUPPORTED_HV_CPUID")
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-01-25 19:11:25 +01:00
Vitaly Kuznetsov
1998fd32aa x86/kvm/hyper-v: don't recommend doing reset via synthetic MSR
System reset through synthetic MSR is not recommended neither by genuine
Hyper-V nor my QEMU.

Fixes: 2bc39970e9 ("x86/kvm/hyper-v: Introduce KVM_GET_SUPPORTED_HV_CPUID")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-01-25 18:53:45 +01:00
Tom Roeder
3a33d030da kvm: x86/vmx: Use kzalloc for cached_vmcs12
This changes the allocation of cached_vmcs12 to use kzalloc instead of
kmalloc. This removes the information leak found by Syzkaller (see
Reported-by) in this case and prevents similar leaks from happening
based on cached_vmcs12.

It also changes vmx_get_nested_state to copy out the full 4k VMCS12_SIZE
in copy_to_user rather than only the size of the struct.

Tested: rebuilt against head, booted, and ran the syszkaller repro
  https://syzkaller.appspot.com/text?tag=ReproC&x=174efca3400000 without
  observing any problems.

Reported-by: syzbot+ded1696f6b50b615b630@syzkaller.appspotmail.com
Fixes: 8fcc4b5923
Cc: stable@vger.kernel.org
Signed-off-by: Tom Roeder <tmroeder@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-01-25 18:53:10 +01:00
Sean Christopherson
85ba2b165d KVM: VMX: Use the correct field var when clearing VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL
Fix a recently introduced bug that results in the wrong VMCS control
field being updated when applying a IA32_PERF_GLOBAL_CTRL errata.

Fixes: c73da3fcab ("KVM: VMX: Properly handle dynamic VM Entry/Exit controls")
Reported-by: Harald Arnesen <harald@skogtun.org>
Tested-by: Harald Arnesen <harald@skogtun.org>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-01-25 18:52:54 +01:00
Alexander Popov
5cc244a20b KVM: x86: Fix single-step debugging
The single-step debugging of KVM guests on x86 is broken: if we run
gdb 'stepi' command at the breakpoint when the guest interrupts are
enabled, RIP always jumps to native_apic_mem_write(). Then other
nasty effects follow.

Long investigation showed that on Jun 7, 2017 the
commit c8401dda2f ("KVM: x86: fix singlestepping over syscall")
introduced the kvm_run.debug corruption: kvm_vcpu_do_singlestep() can
be called without X86_EFLAGS_TF set.

Let's fix it. Please consider that for -stable.

Signed-off-by: Alexander Popov <alex.popov@linux.com>
Cc: stable@vger.kernel.org
Fixes: c8401dda2f ("KVM: x86: fix singlestepping over syscall")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-01-25 18:52:53 +01:00
Vitaly Kuznetsov
9699f970de x86/kvm/hyper-v: don't announce GUEST IDLE MSR support
HV_X64_MSR_GUEST_IDLE_AVAILABLE appeared in kvm_vcpu_ioctl_get_hv_cpuid()
by mistake: it announces support for HV_X64_MSR_GUEST_IDLE (0x400000F0)
which we don't support in KVM (yet).

Fixes: 2bc39970e9 ("x86/kvm/hyper-v: Introduce KVM_GET_SUPPORTED_HV_CPUID")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-01-25 18:52:34 +01:00
Arnd Bergmann
0d6040d468 arch: add split IPC system calls where needed
The IPC system call handling is highly inconsistent across architectures,
some use sys_ipc, some use separate calls, and some use both.  We also
have some architectures that require passing IPC_64 in the flags, and
others that set it implicitly.

For the addition of a y2038 safe semtimedop() system call, I chose to only
support the separate entry points, but that requires first supporting
the regular ones with their own syscall numbers.

The IPC_64 is now implied by the new semctl/shmctl/msgctl system
calls even on the architectures that require passing it with the ipc()
multiplexer.

I'm not adding the new semtimedop() or semop() on 32-bit architectures,
those will get implemented using the new semtimedop_time64() version
that gets added along with the other time64 calls.
Three 64-bit architectures (powerpc, s390 and sparc) get semtimedop().

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-01-25 17:22:50 +01:00
Sinan Kaya
625210cfa6 x86/Kconfig: Select PCI_LOCKLESS_CONFIG if PCI is enabled
After commit

  5d32a66541 ("PCI/ACPI: Allow ACPI to be built without CONFIG_PCI set")

dependencies on CONFIG_PCI that previously were satisfied implicitly
through dependencies on CONFIG_ACPI have to be specified directly.

PCI_LOCKLESS_CONFIG depends on PCI but this dependency has not been
mentioned in the Kconfig so add an explicit dependency here and fix

  WARNING: unmet direct dependencies detected for PCI_LOCKLESS_CONFIG
    Depends on [n]: PCI [=n]
    Selected by [y]:
    - X86 [=y]

Fixes: 5d32a66541 ("PCI/ACPI: Allow ACPI to be built without CONFIG_PCI set")
Signed-off-by: Sinan Kaya <okaya@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-acpi@vger.kernel.org
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190121231958.28255-2-okaya@kernel.org
2019-01-22 17:06:28 +01:00
Borislav Petkov
bae54dc4f3 x86/fpu: Get rid of CONFIG_AS_FXSAVEQ
This was a "workaround" to probe for binutils which could generate
FXSAVEQ, apparently gas with min version 2.16. In the meantime, minimal
required gas version is 2.20 so all those workarounds for older binutils
can be dropped.

Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20190117232408.GH5023@zn.tnic
2019-01-22 14:16:39 +01:00
Borislav Petkov
ee35b9b9f6 x86/traps: Have read_cr0() only once in the #NM handler
... instead of twice in the code. In any case, CR0 ends up being read
once anyway:

1. The CONFIG_MATH_EMULATION case does so and exits.
2. The normal case does it once too.

However, read it on function entry instead to make the code even simpler
to follow.

No functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: x86@kernel.org
Link: https://lkml.kernel.org/r/20190117120728.3811-1-bp@alien8.de
2019-01-22 14:13:35 +01:00
Andrew Murray
88dbe3c94e perf/core, arch/x86: Strengthen exclusion checks with PERF_PMU_CAP_NO_EXCLUDE
For x86 PMUs that do not support context exclusion let's advertise the
PERF_PMU_CAP_NO_EXCLUDE capability. This ensures that perf will
prevent us from handling events where any exclusion flags are set.
Let's also remove the now unnecessary check for exclusion flags.

This change means that amd/iommu and amd/uncore will now also
indicate that they do not support exclude_{hv|idle} and intel/uncore
that it does not support exclude_{guest|host}.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: robin.murphy@arm.com
Cc: suzuki.poulose@arm.com
Link: https://lkml.kernel.org/r/1547128414-50693-12-git-send-email-andrew.murray@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-01-21 11:01:29 +01:00
Andrew Murray
2ff4025069 perf/core, arch/x86: Use PERF_PMU_CAP_NO_EXCLUDE for exclusion incapable PMUs
For drivers that do not support context exclusion let's advertise the
PERF_PMU_CAP_NOEXCLUDE capability. This ensures that perf will
prevent us from handling events where any exclusion flags are set.
Let's also remove the now unnecessary check for exclusion flags.

PMU drivers that support at least one exclude flag won't have the
PERF_PMU_CAP_NOEXCLUDE capability set - these PMU drivers should still
check and fail on unsupported exclude flags. These missing tests are
not added in this patch.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: robin.murphy@arm.com
Cc: suzuki.poulose@arm.com
Link: https://lkml.kernel.org/r/1547128414-50693-11-git-send-email-andrew.murray@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-01-21 11:01:27 +01:00
Will Deacon
6e693b3ffe x86: uaccess: Inhibit speculation past access_ok() in user_access_begin()
Commit 594cc251fd ("make 'user_access_begin()' do 'access_ok()'")
makes the access_ok() check part of the user_access_begin() preceding a
series of 'unsafe' accesses.  This has the desirable effect of ensuring
that all 'unsafe' accesses have been range-checked, without having to
pick through all of the callsites to verify whether the appropriate
checking has been made.

However, the consolidated range check does not inhibit speculation, so
it is still up to the caller to ensure that they are not susceptible to
any speculative side-channel attacks for user addresses that ultimately
fail the access_ok() check.

This is an oversight, so use __uaccess_begin_nospec() to ensure that
speculation is inhibited until the access_ok() check has passed.

Reported-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-20 15:33:22 +12:00
Linus Torvalds
e6ec2fda2d Merge tag 'for-linus-5.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:

 - Several fixes for the Xen pvcalls drivers (1 fix for the backend and
   8 for the frontend).

 - A fix for a rather longstanding bug in the Xen sched_clock()
   interface which led to weird time jumps when migrating the system.

 - A fix for avoiding accesses to x2apic MSRs in Xen PV guests.

* tag 'for-linus-5.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: Fix x86 sched_clock() interface for xen
  pvcalls-front: fix potential null dereference
  always clear the X2APIC_ENABLE bit for PV guest
  pvcalls-front: Avoid get_free_pages(GFP_KERNEL) under spinlock
  xen/pvcalls: remove set but not used variable 'intf'
  pvcalls-back: set -ENOTCONN in pvcalls_conn_back_read
  pvcalls-front: don't return error when the ring is full
  pvcalls-front: properly allocate sk
  pvcalls-front: don't try to free unallocated rings
  pvcalls-front: read all data before closing the connection
2019-01-19 05:53:41 +12:00
Jiaxun Yang
0237199186 x86/CPU/AMD: Set the CPB bit unconditionally on F17h
Some F17h models do not have CPB set in CPUID even though the CPU
supports it. Set the feature bit unconditionally on all F17h.

 [ bp: Rewrite commit message and patch. ]

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Sherry Hurwitz <sherry.hurwitz@amd.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20181120030018.5185-1-jiaxun.yang@flygoat.com
2019-01-18 16:44:03 +01:00
Eric Biggers
793ff5ffc1 crypto: x86/aesni-gcm - make 'struct aesni_gcm_tfm_s' static const
Add missing static keywords to fix the following sparse warnings:

    arch/x86/crypto/aesni-intel_glue.c:197:24: warning: symbol 'aesni_gcm_tfm_sse' was not declared. Should it be static?
    arch/x86/crypto/aesni-intel_glue.c:246:24: warning: symbol 'aesni_gcm_tfm_avx_gen2' was not declared. Should it be static?
    arch/x86/crypto/aesni-intel_glue.c:291:24: warning: symbol 'aesni_gcm_tfm_avx_gen4' was not declared. Should it be static?

I also made the affected structures 'const', and adjusted the
indentation in the struct definition to not be insane.

Cc: Dave Watson <davejwatson@fb.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-01-18 18:43:43 +08:00
Jan Beulich
fc24d75a7f x86/entry/64/compat: Fix stack switching for XEN PV
While in the native case entry into the kernel happens on the trampoline
stack, PV Xen kernels get entered with the current thread stack right
away. Hence source and destination stacks are identical in that case,
and special care is needed.

Other than in sync_regs() the copying done on the INT80 path isn't
NMI / #MC safe, as either of these events occurring in the middle of the
stack copying would clobber data on the (source) stack.

There is similar code in interrupt_entry() and nmi(), but there is no fixup
required because those code paths are unreachable in XEN PV guests.

[ tglx: Sanitized subject, changelog, Fixes tag and stable mail address. Sigh ]

Fixes: 7f2590a110 ("x86/entry/64: Use a per-CPU trampoline stack for IDT entries")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Peter Anvin <hpa@zytor.com>
Cc: xen-devel@lists.xenproject.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/5C3E1128020000780020DFAD@prv1-mh.provo.novell.com
2019-01-18 00:39:33 +01:00
Shirish S
30aa3d26ed x86/MCE/AMD: Carve out the MC4_MISC thresholding quirk
The MC4_MISC thresholding quirk needs to be applied during S5 -> S0 and
S3 -> S0 state transitions, which follow different code paths. Carve it
out into a separate function and call it mce_amd_feature_init() where
the two code paths of the state transitions converge.

 [ bp: massage commit message and the carved out function. ]

Signed-off-by: Shirish S <shirish.s@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Yazen Ghannam <yazen.ghannam@amd.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/1547651417-23583-3-git-send-email-shirish.s@amd.com
2019-01-16 19:42:00 +01:00
Juergen Gross
867cefb4cb xen: Fix x86 sched_clock() interface for xen
Commit f94c8d1169 ("sched/clock, x86/tsc: Rework the x86 'unstable'
sched_clock() interface") broke Xen guest time handling across
migration:

[  187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done.
[  187.251137] OOM killer disabled.
[  187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[  187.252299] suspending xenstore...
[  187.266987] xen:grant_table: Grant tables using version 1 layout
[18446743811.706476] OOM killer enabled.
[18446743811.706478] Restarting tasks ... done.
[18446743811.720505] Setting capacity to 16777216

Fix that by setting xen_sched_clock_offset at resume time to ensure a
monotonic clock value.

[boris: replaced pr_info() with pr_info_once() in xen_callback_vector()
 to avoid printing with incorrect timestamp during resume (as we
 haven't re-adjusted the clock yet)]

Fixes: f94c8d1169 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface")
Cc: <stable@vger.kernel.org> # 4.11
Reported-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2019-01-16 13:06:05 -05:00
Gustavo A. R. Silva
2bc217c616 x86/platform/UV: Replace kmalloc() and memset() with k[cz]alloc() calls
Replace kmalloc_node() and memset() with kzalloc_node(), and
kmalloc_array() and memset() with kcalloc().

This code was detected with the help of Coccinelle.

No functional changes.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Andrew Banman <abanman@hpe.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Nicolai Stange <nstange@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Varsha Rao <rvarsha016@gmail.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190115173713.GA31031@embeddedor
2019-01-16 13:01:26 +01:00
Borislav Petkov
093ae8f9a8 x86/TSC: Use RDTSCP
Currently, the kernel uses

  [LM]FENCE; RDTSC

in the timekeeping code, to guarantee monotonicity of time where the
*FENCE is selected based on vendor.

Replace that sequence with RDTSCP which is faster or on-par and gives
the same guarantees.

A microbenchmark on Intel shows that the change is on-par.

On AMD, the change is either on-par with the current LFENCE-prefixed
RDTSC or slightly better with RDTSCP.

The comparison is done with the LFENCE-prefixed RDTSC (and not with the
MFENCE-prefixed one, as one would normally expect) because all modern
AMD families make LFENCE serializing and thus avoid the heavy MFENCE by
effectively enabling X86_FEATURE_LFENCE_RDTSC.

Co-developed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: x86@kernel.org
Link: https://lkml.kernel.org/r/20181119184556.11479-1-bp@alien8.de
2019-01-16 12:43:08 +01:00
Borislav Petkov
71a93c2693 x86/alternatives: Add an ALTERNATIVE_3() macro
Similar to ALTERNATIVE_2(), ALTERNATIVE_3() selects between 3 possible
variants. Will be used for adding RDTSCP to the rdtsc_ordered()
alternatives.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: X86 ML <x86@kernel.org>
Link: https://lkml.kernel.org/r/20181211222326.14581-4-bp@alien8.de
2019-01-16 12:42:50 +01:00
Borislav Petkov
c1d4e4192a x86/alternatives: Print containing function
... in the "debug-alternative" output so that one can find her way
easier when staring at the vmlinux disassembly.

For example:

  apply_alternatives: feat: 3*32+18, old: (read_tsc+0x0/0x10 (ffffffff8101d1c0) len: 5), repl: (ffffffff824e6d33, len: 5)
  					   ^^^^^^^^^^^^^^^^^
  ffffffff8101d1c0: old_insn: 0f 31 90 90 90
  ffffffff824e6d33: rpl_insn: 0f ae e8 0f 31
  ffffffff8101d1c0: final_insn: 0f ae e8 0f 31

No functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: X86 ML <x86@kernel.org>
Link: https://lkml.kernel.org/r/20181211222326.14581-3-bp@alien8.de
2019-01-16 12:41:57 +01:00
Borislav Petkov
1c1ed4731c x86/alternatives: Add macro comments
... so that when one stares at the .s output, one can find her way
around the resulting asm magic.

With it, ALTERNATIVE looks like this now:

          # ALT: oldnstr
  661:
          ...
  662:
          # ALT: padding
  .skip ...
  663:
  .pushsection .altinstructions,"a"

  ...

  .popsection
  .pushsection .altinstr_replacement, "ax"
          # ALT: replacement 1
  6641:
  	...
  6651:
          .popsection

Merge __OLDINSTR() into OLDINSTR(), while at it.

No functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: X86 ML <x86@kernel.org>
Link: https://lkml.kernel.org/r/20181211222326.14581-2-bp@alien8.de
2019-01-16 12:40:07 +01:00
Borislav Petkov
e6d7bc0bdf x86/build: Use the single-argument OUTPUT_FORMAT() linker script command
The various x86 linker scripts use the three-argument linker script
command variant OUTPUT_FORMAT(DEFAULT, BIG, LITTLE) which specifies
three object file formats when the -EL and -EB linker command line
options are used. When -EB is specified, OUTPUT_FORMAT issues the BIG
object file format, when -EL, LITTLE, respectively, and when neither is
specified, DEFAULT.

However, those -E[LB] options are not used by arch/x86/ so switch to the
simple OUTPUT_FORMAT(BFDNAME) macro variant.

No functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Link: https://lkml.kernel.org/r/20190109181531.27513-1-bp@alien8.de
2019-01-16 12:21:53 +01:00
Sinan Kaya
e9820d6b0a x86/intel/lpss: Make PCI dependency explicit
After commit 5d32a66541 (PCI/ACPI: Allow ACPI to be built without
CONFIG_PCI set) dependencies on CONFIG_PCI that previously were
satisfied implicitly through dependencies on CONFIG_ACPI have to be
specified directly.

LPSS code relies on PCI infrastructure but this dependency has not
been called out explicitly yet.

Fixes: 5d32a66541 ("PCI/ACPI: Allow ACPI to be built without CONFIG_PCI set")
Signed-off-by: Sinan Kaya <okaya@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-01-15 23:17:28 +01:00
Shirish S
c95b323dcd x86/MCE/AMD: Turn off MC4_MISC thresholding on all family 0x15 models
MC4_MISC thresholding is not supported on all family 0x15 processors,
hence skip the x86_model check when applying the quirk.

 [ bp: massage commit message. ]

Signed-off-by: Shirish S <shirish.s@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/1547106849-3476-2-git-send-email-shirish.s@amd.com
2019-01-15 18:11:38 +01:00
Dave Young
993a110319 x86/kexec: Fix a kexec_file_load() failure
Commit

  b6664ba42f ("s390, kexec_file: drop arch_kexec_mem_walk()")

changed the behavior of kexec_locate_mem_hole(): it will try to allocate
free memory only when kbuf.mem is initialized to zero.

However, x86's kexec_file_load() implementation reuses a struct
kexec_buf allocated on the stack and its kbuf.mem member gets set by
each kexec_add_buffer() invocation.

The second kexec_add_buffer() will reuse the same kbuf but not
reinitialize kbuf.mem.

Therefore, explictily reset kbuf.mem each time in order for
kexec_locate_mem_hole() to locate a free memory region each time.

 [ bp: massage commit message. ]

Fixes: b6664ba42f ("s390, kexec_file: drop arch_kexec_mem_walk()")
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Philipp Rudo <prudo@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Yannik Sembritzki <yannik@sembritzki.me>
Cc: Yi Wang <wang.yi59@zte.com.cn>
Cc: kexec@lists.infradead.org
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20181228011247.GA9999@dhcp-128-65.nay.redhat.com
2019-01-15 12:12:50 +01:00
Peng Hao
bf7d28c534 x86/mm/mem_encrypt: Fix erroneous sizeof()
Using sizeof(pointer) for determining the size of a memset() only works
when the size of the pointer and the size of type to which it points are
the same. For pte_t this is only true for 64bit and 32bit-NONPAE. On 32bit
PAE systems this is wrong as the pointer size is 4 byte but the PTE entry
is 8 bytes. It's actually not a real world issue as this code depends on
64bit, but it's wrong nevertheless.

Use sizeof(*p) for correctness sake.

Fixes: aad983913d ("x86/mm/encrypt: Simplify sme_populate_pgd() and sme_populate_pgd_large()")
Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: dave.hansen@linux.intel.com
Cc: peterz@infradead.org
Cc: luto@kernel.org
Link: https://lkml.kernel.org/r/1546065252-97996-1-git-send-email-peng.hao2@zte.com.cn
2019-01-15 11:41:58 +01:00
Dave Hansen
a31e184e4f x86/pkeys: Properly copy pkey state at fork()
Memory protection key behavior should be the same in a child as it was
in the parent before a fork.  But, there is a bug that resets the
state in the child at fork instead of preserving it.

The creation of new mm's is a bit convoluted.  At fork(), the code
does:

  1. memcpy() the parent mm to initialize child
  2. mm_init() to initalize some select stuff stuff
  3. dup_mmap() to create true copies that memcpy() did not do right

For pkeys two bits of state need to be preserved across a fork:
'execute_only_pkey' and 'pkey_allocation_map'.

Those are preserved by the memcpy(), but mm_init() invokes
init_new_context() which overwrites 'execute_only_pkey' and
'pkey_allocation_map' with "new" values.

The author of the code erroneously believed that init_new_context is *only*
called at execve()-time.  But, alas, init_new_context() is used at execve()
and fork().

The result is that, after a fork(), the child's pkey state ends up looking
like it does after an execve(), which is totally wrong.  pkeys that are
already allocated can be allocated again, for instance.

To fix this, add code called by dup_mmap() to copy the pkey state from
parent to child explicitly.  Also add a comment above init_new_context() to
make it more clear to the next poor sod what this code is used for.

Fixes: e8c24d3a23 ("x86/pkeys: Allocation/free syscalls")
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@alien8.de
Cc: hpa@zytor.com
Cc: peterz@infradead.org
Cc: mpe@ellerman.id.au
Cc: will.deacon@arm.com
Cc: luto@kernel.org
Cc: jroedel@suse.de
Cc: stable@vger.kernel.org
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Joerg Roedel <jroedel@suse.de>
Link: https://lkml.kernel.org/r/20190102215655.7A69518C@viggo.jf.intel.com
2019-01-15 10:33:45 +01:00