Pull x86 entry updates from Ingo Molnar:
"This contains x32 and compat syscall improvements, the biggest one of
which splits x32 syscalls into their own table, which allows new
syscalls to share the x32 and x86-64 number - which turns the
512-547 special syscall numbers range into a legacy wart that won't be
extended going forward"
* 'x86-entry-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/syscalls: Split the x32 syscalls into their own table
x86/syscalls: Disallow compat entries for all types of 64-bit syscalls
x86/syscalls: Use the compat versions of rt_sigsuspend() and rt_sigprocmask()
x86/syscalls: Make __X32_SYSCALL_BIT be unsigned long
MPX is being removed from the kernel due to a lack of support in the
toolchain going forward (gcc).
This is the smallest possible patch to fix some issues that have been
reported around running the MPX selftests. It it would also have been part
of any removal series, it is offered first.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190705175318.784C233E@viggo.jf.intel.com
For unfortunate historical reasons, the x32 syscalls and the x86_64
syscalls are not all numbered the same. As an example, ioctl() is nr 16 on
x86_64 but 514 on x32.
This has potentially nasty consequences, since it means that there are two
valid RAX values to do ioctl(2) and two invalid RAX values. The valid
values are 16 (i.e. ioctl(2) using the x86_64 ABI) and (514 | 0x40000000)
(i.e. ioctl(2) using the x32 ABI).
The invalid values are 514 and (16 | 0x40000000). 514 will enter the
"COMPAT_SYSCALL_DEFINE3(ioctl, ...)" entry point with in_compat_syscall()
and in_x32_syscall() returning false, whereas (16 | 0x40000000) will enter
the native entry point with in_compat_syscall() and in_x32_syscall()
returning true. Both are bogus, and both will exercise code paths in the
kernel and in any running seccomp filters that really ought to be
unreachable.
Splitting out the x32 syscalls into their own tables, allows both bogus
invocations to return -ENOSYS. I've checked glibc, musl, and Bionic, and
all of them appear to call syscalls with their correct numbers, so this
change should have no effect on them.
There is an added benefit going forward: new syscalls that need special
handling on x32 can share the same number on x32 and x86_64. This means
that the special syscall range 512-547 can be treated as a legacy wart
instead of something that may need to be extended in the future.
Also add a selftest to verify the new behavior.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/208024256b764312598f014ebfb0a42472c19354.1562185330.git.luto@kernel.org
Pull Documentation updates from Jonathan Corbet:
"It's been a relatively busy cycle for docs:
- A fair pile of RST conversions, many from Mauro. These create more
than the usual number of simple but annoying merge conflicts with
other trees, unfortunately. He has a lot more of these waiting on
the wings that, I think, will go to you directly later on.
- A new document on how to use merges and rebases in kernel repos,
and one on Spectre vulnerabilities.
- Various improvements to the build system, including automatic
markup of function() references because some people, for reasons I
will never understand, were of the opinion that
:c:func:``function()`` is unattractive and not fun to type.
- We now recommend using sphinx 1.7, but still support back to 1.4.
- Lots of smaller improvements, warning fixes, typo fixes, etc"
* tag 'docs-5.3' of git://git.lwn.net/linux: (129 commits)
docs: automarkup.py: ignore exceptions when seeking for xrefs
docs: Move binderfs to admin-guide
Disable Sphinx SmartyPants in HTML output
doc: RCU callback locks need only _bh, not necessarily _irq
docs: format kernel-parameters -- as code
Doc : doc-guide : Fix a typo
platform: x86: get rid of a non-existent document
Add the RCU docs to the core-api manual
Documentation: RCU: Add TOC tree hooks
Documentation: RCU: Rename txt files to rst
Documentation: RCU: Convert RCU UP systems to reST
Documentation: RCU: Convert RCU linked list to reST
Documentation: RCU: Convert RCU basic concepts to reST
docs: filesystems: Remove uneeded .rst extension on toctables
scripts/sphinx-pre-install: fix out-of-tree build
docs: zh_CN: submitting-drivers.rst: Remove a duplicated Documentation/
Documentation: PGP: update for newer HW devices
Documentation: Add section about CPU vulnerabilities for Spectre
Documentation: platform: Delete x86-laptop-drivers.txt
docs: Note that :c:func: should no longer be used
...
Pull x86 CPU feature updates from Thomas Gleixner:
"Updates for x86 CPU features:
- Support for UMWAIT/UMONITOR, which allows to use MWAIT and MONITOR
instructions in user space to save power e.g. in HPC workloads
which spin wait on synchronization points.
The maximum time a MWAIT can halt in userspace is controlled by the
kernel and can be adjusted by the sysadmin.
- Speed up the MTRR handling code on CPUs which support cache
self-snooping correctly.
On those CPUs the wbinvd() invocations can be omitted which speeds
up the MTRR setup by a factor of 50.
- Support for the new x86 vendor Zhaoxin who develops processors
based on the VIA Centaur technology.
- Prevent 'cat /proc/cpuinfo' from affecting isolated NOHZ_FULL CPUs
by sending IPIs to retrieve the CPU frequency and use the cached
values instead.
- The addition and late revert of the FSGSBASE support. The revert
was required as it turned out that the code still has hard to
diagnose issues. Yet another engineering trainwreck...
- Small fixes, cleanups, improvements and the usual new Intel CPU
family/model addons"
* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
x86/fsgsbase: Revert FSGSBASE support
selftests/x86/fsgsbase: Fix some test case bugs
x86/entry/64: Fix and clean up paranoid_exit
x86/entry/64: Don't compile ignore_sysret if 32-bit emulation is enabled
selftests/x86: Test SYSCALL and SYSENTER manually with TF set
x86/mtrr: Skip cache flushes on CPUs with cache self-snooping
x86/cpu/intel: Clear cache self-snoop capability in CPUs with known errata
Documentation/ABI: Document umwait control sysfs interfaces
x86/umwait: Add sysfs interface to control umwait maximum time
x86/umwait: Add sysfs interface to control umwait C0.2 state
x86/umwait: Initialize umwait control values
x86/cpufeatures: Enumerate user wait instructions
x86/cpu: Disable frequency requests via aperfmperf IPI for nohz_full CPUs
x86/acpi/cstate: Add Zhaoxin processors support for cache flush policy in C3
ACPI, x86: Add Zhaoxin processors support for NONSTOP TSC
x86/cpu: Create Zhaoxin processors architecture support file
x86/cpu: Split Tremont based Atoms from the rest
Documentation/x86/64: Add documentation for GS/FS addressing mode
x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2
x86/cpu: Enable FSGSBASE on 64bit by default and add a chicken bit
...
This document is used by multiple architectures:
$ echo $(git grep -l pkey_mprotect arch|cut -d'/' -f 2|sort|uniq)
alpha arm arm64 ia64 m68k microblaze mips parisc powerpc s390 sh sparc x86 xtensa
So, let's move it to the core book and adjust the links to it
accordingly.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms and conditions of the gnu general public license
version 2 as published by the free software foundation this program
is distributed in the hope it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 263 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141901.208660670@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Atom-based CPUs trigger stack fault when invoke 32-bit SYSENTER instruction
with invalid register values. So we also need SIGBUS handling in this case.
Following is assembly when the fault exception happens.
(gdb) disassemble $eip
Dump of assembler code for function __kernel_vsyscall:
0xf7fd8fe0 <+0>: push %ecx
0xf7fd8fe1 <+1>: push %edx
0xf7fd8fe2 <+2>: push %ebp
0xf7fd8fe3 <+3>: mov %esp,%ebp
0xf7fd8fe5 <+5>: sysenter
0xf7fd8fe7 <+7>: int $0x80
=> 0xf7fd8fe9 <+9>: pop %ebp
0xf7fd8fea <+10>: pop %edx
0xf7fd8feb <+11>: pop %ecx
0xf7fd8fec <+12>: ret
End of assembler dump.
According to Intel SDM, this could also be a Stack Segment Fault(#SS, 12),
except a normal Page Fault(#PF, 14). Especially, in section 6.9 of Vol.3A,
both stack and page faults are within the 10th(lowest priority) class, and
as it said, "exceptions within each class are implementation-dependent and
may vary from processor to processor". It's expected for processors like
Intel Atom to trigger stack fault(SIGBUS), while we get page fault(SIGSEGV)
from common Core processors.
Signed-off-by: Tong Bo <bo.tong@intel.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Pull x86 fixes from Thomas Gleixner:
"A set of fixes for x86:
- Fix the swapped outb() parameters in the KASLR code
- Fix the PKEY handling at fork which missed to preserve the pkey
state for the child. Comes with a test case to validate that.
- Fix the entry stack handling for XEN PV to respect that XEN PV
systems enter the function already on the current thread stack and
not on the trampoline.
- Fix kexec load failure caused by using a stale value when the
kexec_buf structure is reused for subsequent allocations.
- Fix a bogus sizeof() in the memory encryption code
- Enforce PCI dependency for the Intel Low Power Subsystem
- Enforce PCI_LOCKLESS_CONFIG when PCI is enabled"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/Kconfig: Select PCI_LOCKLESS_CONFIG if PCI is enabled
x86/entry/64/compat: Fix stack switching for XEN PV
x86/kexec: Fix a kexec_file_load() failure
x86/mm/mem_encrypt: Fix erroneous sizeof()
x86/selftests/pkeys: Fork() to check for state being preserved
x86/pkeys: Properly copy pkey state at fork()
x86/kaslr: Fix incorrect i8254 outb() parameters
x86/intel/lpss: Make PCI dependency explicit
There is a spelling mistake eprintf error message, fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Shuah Khan <shuah@kernel.org>
Some toolchains need -no-pie to build all tests, others do not support
the -no-pie flag at all. Therefore, add another test for the
availability of the flag.
This amends commit 3346a6a4e5
("selftests: x86: sysret_ss_attrs doesn't build on a PIE build").
Signed-off-by: Florian Weimer <fweimer@redhat.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
32-bit user code that uses int $80 doesn't care about r8-r11. There is,
however, some 64-bit user code that intentionally uses int $0x80 to invoke
32-bit system calls. From what I've seen, basically all such code assumes
that r8-r15 are all preserved, but the kernel clobbers r8-r11. Since I
doubt that there's any code that depends on int $0x80 zeroing r8-r11,
change the kernel to preserve them.
I suspect that very little user code is broken by the old clobber, since
r8-r11 are only rarely allocated by gcc, and they're clobbered by function
calls, so they only way we'd see a problem is if the same function that
invokes int $0x80 also spills something important to one of these
registers.
The current behavior seems to date back to the historical commit
"[PATCH] x86-64 merge for 2.6.4". Before that, all regs were
preserved. I can't find any explanation of why this change was made.
Update the test_syscall_vdso_32 testcase as well to verify the new
behavior, and it strengthens the test to make sure that the kernel doesn't
accidentally permute r8..r15.
Suggested-by: Denys Vlasenko <dvlasenk@redhat.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Link: https://lkml.kernel.org/r/d4c4d9985fbe64f8c9e19291886453914b48caee.1523975710.git.luto@kernel.org