Commit Graph

2859 Commits

Author SHA1 Message Date
Vasily Gorbik
f3122a79a1 s390/topology: avoid firing events before kobjs are created
arch_update_cpu_topology is first called from:
kernel_init_freeable->sched_init_smp->sched_init_domains

even before cpus has been registered in:
kernel_init_freeable->do_one_initcall->s390_smp_init

Do not trigger kobject_uevent change events until cpu devices are
actually created. Fixes the following kasan findings:

BUG: KASAN: global-out-of-bounds in kobject_uevent_env+0xb40/0xee0
Read of size 8 at addr 0000000000000020 by task swapper/0/1

BUG: KASAN: global-out-of-bounds in kobject_uevent_env+0xb36/0xee0
Read of size 8 at addr 0000000000000018 by task swapper/0/1

CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B
Hardware name: IBM 3906 M04 704 (LPAR)
Call Trace:
([<0000000143c6db7e>] show_stack+0x14e/0x1a8)
 [<0000000145956498>] dump_stack+0x1d0/0x218
 [<000000014429fb4c>] print_address_description+0x64/0x380
 [<000000014429f630>] __kasan_report+0x138/0x168
 [<0000000145960b96>] kobject_uevent_env+0xb36/0xee0
 [<0000000143c7c47c>] arch_update_cpu_topology+0x104/0x108
 [<0000000143df9e22>] sched_init_domains+0x62/0xe8
 [<000000014644c94a>] sched_init_smp+0x3a/0xc0
 [<0000000146433a20>] kernel_init_freeable+0x558/0x958
 [<000000014599002a>] kernel_init+0x22/0x160
 [<00000001459a71d4>] ret_from_fork+0x28/0x30
 [<00000001459a71dc>] kernel_thread_starter+0x0/0x10

Cc: stable@vger.kernel.org
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-09-23 23:27:52 +02:00
Thomas Richter
2cb549a821 s390/cpum_sf: Support ioctl PERF_EVENT_IOC_PERIOD
A perf_event can be set up to deliver overflow notifications
via SIGIO signal.  The setup of the event is:

 1. create event with perf_event_open()
 2. assign it a signal for I/O notification with fcntl()
 3. Install signal handler and consume samples

The initial setup of perf_event_open() determines the
period/frequency time span needed to elapse before each signal
is delivered to the user process.

While the event is active, system call
ioctl(.., PERF_EVENT_IOC_PERIOD, value) can be used the change
the frequency/period time span of the active event.
The remaining signal handler invocations honour the new value.

This does not work on s390. In fact the time span does not change
regardless of ioctl's third argument 'value'. The call succeeds
but the time span does not change.

Support this behavior and make it common with other platforms.
This is achieved by changing the interval value of the sampling
control block accordingly and feed this new value every time
the event is enabled using pmu_event_enable().

Before this change the interval value was set only once at
pmu_event_add() and never changed.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-09-19 12:56:07 +02:00
Linus Torvalds
d590284419 Merge tag 's390-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Vasily Gorbik:

 - Add support for IBM z15 machines.

 - Add SHA3 and CCA AES cipher key support in zcrypt and pkey
   refactoring.

 - Move to arch_stack_walk infrastructure for the stack unwinder.

 - Various kasan fixes and improvements.

 - Various command line parsing fixes.

 - Improve decompressor phase debuggability.

 - Lift no bss usage restriction for the early code.

 - Use refcount_t for reference counters for couple of places in mm
   code.

 - Logging improvements and return code fix in vfio-ccw code.

 - Couple of zpci fixes and minor refactoring.

 - Remove some outdated documentation.

 - Fix secure boot detection.

 - Other various minor code clean ups.

* tag 's390-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (48 commits)
  s390: remove pointless drivers-y in drivers/s390/Makefile
  s390/cpum_sf: Fix line length and format string
  s390/pci: fix MSI message data
  s390: add support for IBM z15 machines
  s390/crypto: Support for SHA3 via CPACF (MSA6)
  s390/startup: add pgm check info printing
  s390/crypto: xts-aes-s390 fix extra run-time crypto self tests finding
  vfio-ccw: fix error return code in vfio_ccw_sch_init()
  s390: vfio-ap: fix warning reset not completed
  s390/base: remove unused s390_base_mcck_handler
  s390/sclp: Fix bit checked for has_sipl
  s390/zcrypt: fix wrong handling of cca cipher keygenflags
  s390/kasan: add kdump support
  s390/setup: avoid using strncmp with hardcoded length
  s390/sclp: avoid using strncmp with hardcoded length
  s390/module: avoid using strncmp with hardcoded length
  s390/pci: avoid using strncmp with hardcoded length
  s390/kaslr: reserve memory for kasan usage
  s390/mem_detect: provide single get_mem_detect_end
  s390/cmma: reuse kstrtobool for option value parsing
  ...
2019-09-17 14:04:43 -07:00
Thomas Richter
03e9e42f79 s390/cpum_sf: Fix line length and format string
Rewrite some lines to match line length and replace
format string 0x%x to %#x.  Add and remove blank line.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-09-16 13:21:51 +02:00
Martin Schwidefsky
a0e2251132 s390: add support for IBM z15 machines
Add detection for machine types 0x8562 and 8x8561 and set the ELF platform
name to z15. Add the miscellaneous-instruction-extension 3 facility to
the list of facilities for z15.

And allow to generate code that only runs on a z15 machine.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-09-13 12:19:14 +02:00
Matthew Garrett
45893a0abe kexec: Fix file verification on S390
I accidentally typoed this #ifdef, so verification would always be
disabled.

Signed-off-by: Matthew Garrett <mjg59@google.com>
Reported-by: Philipp Rudo <prudo@linux.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2019-09-10 13:27:51 +01:00
Vasily Gorbik
512222789c s390/base: remove unused s390_base_mcck_handler
s390_base_mcck_handler was used during system reset if diag308 set was
not available. But after commit d485235b00 ("s390: assume diag308 set
always works") is a dead code and could be removed.

Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-09-03 13:53:56 +02:00
Vasily Gorbik
d0b319843b s390/setup: avoid using strncmp with hardcoded length
Replace strncmp usage in console mode setup code with simple strcmp.
Replace strncmp which is used for prefix comparison with str_has_prefix.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-08-29 15:34:58 +02:00
Vasily Gorbik
54fb07d030 s390/sclp: avoid using strncmp with hardcoded length
"earlyprintk" option documentation does not clearly state which
platform supports which additional values (e.g. ",keep"). Preserve old
option behaviour and reuse str_has_prefix instead of strncmp for prefix
testing.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-08-29 15:34:58 +02:00
Vasily Gorbik
b29cd7c4c4 s390/module: avoid using strncmp with hardcoded length
Reuse str_has_prefix instead of strncmp with hardcoded length to
make the intent of a comparison more obvious.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-08-29 15:34:57 +02:00
Vasily Gorbik
3d64436453 s390/vdso: reuse kstrtobool for option value parsing
"vdso" option setup already recognises integer and textual values. Yet
kstrtobool is a more common way to parse boolean values, reuse it to
unify option value parsing behavior and simplify code a bit.

While at it, __setup value parsing callbacks are expected to return
1 when an option is recognized, and returning any other value won't
trigger any error message currently, so simply return 1.

Also don't change default vdso_enabled value of 1 when "vdso" option
value is invalid.

Reviewed-by: Philipp Rudo <prudo@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-08-26 12:51:17 +02:00
Vasily Gorbik
e991e5bb11 s390/stacktrace: use common arch_stack_walk infrastructure
Use common arch_stack_walk infrastructure to avoid duplicated code and
avoid taking care of the stack storage and filtering.

Common code also uses try_get_task_stack/put_task_stack when needed which
have been missing in our code, which also solves potential problem for us.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-08-21 12:58:53 +02:00
Vasily Gorbik
2c7fa8a11c s390/kasan: avoid report in get_wchan
Reading other running task's stack can be a dangerous endeavor. Kasan
stack memory access instrumentation includes special prologue and epilogue
to mark/remove red zones in shadow memory between stack variables. For
that reason there is always a race between a task reading value in other
task's stack and that other task returning from a function and entering
another one generating different red zones pattern.

To avoid kasan reports simply perform uninstrumented memory reads.

Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-08-21 12:58:53 +02:00
Vasily Gorbik
8769f610fe s390/process: avoid potential reading of freed stack
With THREAD_INFO_IN_TASK (which is selected on s390) task's stack usage
is refcounted and should always be protected by get/put when touching
other task's stack to avoid race conditions with task's destruction code.

Fixes: d5c352cdd0 ("s390: move thread_info into task_struct")
Cc: stable@vger.kernel.org # v4.10+
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-08-21 12:58:52 +02:00
Vasily Gorbik
2e83e0eb85 s390: clean .bss before running uncompressed kernel
Clean uncompressed kernel .bss section in the startup code before
the uncompressed kernel is executed. At this point of time initrd and
certificates have been already rescued. Uncompressed kernel .bss size
is known from vmlinux_info. It is also taken into consideration during
uncompressed kernel positioning by kaslr (so it is safe to clean it).

With that uncompressed kernel is starting with .bss section zeroed and
no .bss section usage restrictions apply. Which makes chkbss checks for
uncompressed kernel objects obsolete and they can be removed.

early_nobss.c is also not needed anymore. Parts of it which are still
relevant are moved to early.c. Kasan initialization code is now called
directly from head64 (early.c is instrumented and should not be
executed before kasan shadow memory is set up).

Reviewed-by: Philipp Rudo <prudo@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-08-21 12:58:52 +02:00
Vasily Gorbik
59793c5ab9 s390: move vmalloc option parsing to startup code
Few other crucial memory setup options are already handled in
the startup code. Those values are needed by kaslr and kasan
implementations. "vmalloc" is the last piece required for future
improvements such as early decision on kernel page levels depth required
for actual memory setup, as well as vmalloc memory area access monitoring
in kasan.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-08-21 12:41:43 +02:00
Jiri Bohac
99d5cadfde kexec_file: split KEXEC_VERIFY_SIG into KEXEC_SIG and KEXEC_SIG_FORCE
This is a preparatory patch for kexec_file_load() lockdown.  A locked down
kernel needs to prevent unsigned kernel images from being loaded with
kexec_file_load().  Currently, the only way to force the signature
verification is compiling with KEXEC_VERIFY_SIG.  This prevents loading
usigned images even when the kernel is not locked down at runtime.

This patch splits KEXEC_VERIFY_SIG into KEXEC_SIG and KEXEC_SIG_FORCE.
Analogous to the MODULE_SIG and MODULE_SIG_FORCE for modules, KEXEC_SIG
turns on the signature verification but allows unsigned images to be
loaded.  KEXEC_SIG_FORCE disallows images without a valid signature.

Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Garrett <mjg59@google.com>
cc: kexec@lists.infradead.org
Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19 21:54:15 -07:00
Heiko Carstens
404861e15b s390/vdso: map vdso also for statically linked binaries
s390 does not map the vdso for statically linked binaries, assuming
that this doesn't make sense. See commit fc5243d98a ("[S390]
arch_setup_additional_pages arguments").

However with glibc commit d665367f596d ("linux: Enable vDSO for static
linking as default (BZ#19767)") and commit 5e855c895401 ("s390: Enable
VDSO for static linking") the vdso is also used for statically linked
binaries - if the kernel would make it available.

Therefore map the vdso always, just like all other architectures.

Reported-by: Stefan Liebler <stli@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-08-09 11:03:28 +02:00
Vasily Gorbik
24350fdadb s390: put _stext and _etext into .text section
Perf relies on _etext and _stext symbols being one of 't', 'T', 'v' or
'V'. Put them into .text section to guarantee that.

Also moves padding to page boundary inside .text which has an effect that
.text section is now padded with nops rather than 0's, which apparently
has been the initial intention for specifying 0x0700 fill expression.

Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Suggested-by: Andreas Krebbel <krebbel@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-08-06 13:58:35 +02:00
Vasily Gorbik
b9f23b7376 s390/head64: cleanup unused labels
Cleanup labels in head64 some of which are not being used since git
recorded history.

Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-08-06 13:58:35 +02:00
Vasily Gorbik
fd0c7435d7 s390/unwind: remove stack recursion warning
Remove pointless stack recursion on stack type ... warning, which
only confuses people. There is no way to make backchain unwinder 100%
reliable. When a task is interrupted in-between stack frame allocation
and backchain write instructions new stack frame backchain pointer is
left uninitialized (there are also sometimes additional instruction
in-between stack frame allocation and backchain write instructions due
to gcc shrink-wrapping). In attempt to unwind such stack the unwinder
would still try to use that invalid backchain value and perform all kind
of sanity checks on it to make sure we are not pointed out of stack. In
some cases that invalid backchain value would be 0 and we would falsely
treat next stackframe as pt_regs and again gprs[15] in those pt_regs
might happen to point at some address within the task's stack.

Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-08-06 13:58:35 +02:00
Vasily Gorbik
218ddd5acf s390/setup: adjust start_code of init_mm to _text
After some investigation it doesn't look like init_mm fields
start_code/end_code are used anywhere besides potentially in dump_mm for
debugging purposes. Originally the value of 0 for start_code reflected
the presence of lowcore and early boot code. But with kaslr in place
start_code/end_code range should not span over unoccupied by the code
segment memory. So, adjust init_mm start_code to point at the beginning
of the code segment like other architectures do it.

Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-08-06 13:58:34 +02:00
Vasily Gorbik
a287a49e67 s390/protvirt: avoid memory sharing for diag 308 set/store
This reverts commit db9492cef4 ("s390/protvirt: add memory sharing for
diag 308 set/store") which due to ultravisor implementation change is
not needed after all.

Fixes: db9492cef4 ("s390/protvirt: add memory sharing for diag 308 set/store")
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-08-06 13:58:34 +02:00
Thiago Jung Bauermann
c8424e776b MODSIGN: Export module signature definitions
IMA will use the module_signature format for append signatures, so export
the relevant definitions and factor out the code which verifies that the
appended signature trailer is valid.

Also, create a CONFIG_MODULE_SIG_FORMAT option so that IMA can select it
and be able to use mod_check_sig() without having to depend on either
CONFIG_MODULE_SIG or CONFIG_MODULES.

s390 duplicated the definition of struct module_signature so now they can
use the new <linux/module_signature.h> header instead.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Acked-by: Jessica Yu <jeyu@kernel.org>
Reviewed-by: Philipp Rudo <prudo@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2019-08-05 18:39:56 -04:00
Vasily Gorbik
1877011a35 s390/kexec: add missing include to machine_kexec_reloc.c
Include <asm/kexec.h> into machine_kexec_reloc.c to expose
arch_kexec_do_relocs declaration and avoid the following sparse warnings:
arch/s390/kernel/machine_kexec_reloc.c:4:5: warning: symbol 'arch_kexec_do_relocs' was not declared. Should it be static?
arch/s390/boot/../kernel/machine_kexec_reloc.c:4:5: warning: symbol 'arch_kexec_do_relocs' was not declared. Should it be static?

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-07-29 18:05:03 +02:00
Vasily Gorbik
06f9895fda s390/perf: make cf_diag_csd static
Since there is really no reason for cf_diag_csd per cpu variable to be
globally visible make it static to avoid the following sparse warning:
arch/s390/kernel/perf_cpum_cf_diag.c:37:1: warning: symbol 'cf_diag_csd' was not declared. Should it be static?

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-07-29 18:05:02 +02:00
Vasily Gorbik
5518aed82d s390: wire up clone3 system call
Tested (64-bit and compat mode) using program from
http://lkml.kernel.org/r/20190604212930.jaaztvkent32b7d3@brauner.io
with the following:
       return syscall(__NR_clone, flags, 0, pidfd, 0, 0);
changed to:
       return syscall(__NR_clone, 0, flags, pidfd, 0, 0);
due to CLONE_BACKWARDS2.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-07-23 10:45:53 +02:00
Matteo Croce
eec4844fae proc/sysctl: add shared variables for range check
In the sysctl code the proc_dointvec_minmax() function is often used to
validate the user supplied value between an allowed range.  This
function uses the extra1 and extra2 members from struct ctl_table as
minimum and maximum allowed value.

On sysctl handler declaration, in every source file there are some
readonly variables containing just an integer which address is assigned
to the extra1 and extra2 members, so the sysctl range is enforced.

The special values 0, 1 and INT_MAX are very often used as range
boundary, leading duplication of variables like zero=0, one=1,
int_max=INT_MAX in different source files:

    $ git grep -E '\.extra[12].*&(zero|one|int_max)' |wc -l
    248

Add a const int array containing the most commonly used values, some
macros to refer more easily to the correct array member, and use them
instead of creating a local one for every object file.

This is the bloat-o-meter output comparing the old and new binary
compiled with the default Fedora config:

    # scripts/bloat-o-meter -d vmlinux.o.old vmlinux.o
    add/remove: 2/2 grow/shrink: 0/2 up/down: 24/-188 (-164)
    Data                                         old     new   delta
    sysctl_vals                                    -      12     +12
    __kstrtab_sysctl_vals                          -      12     +12
    max                                           14      10      -4
    int_max                                       16       -     -16
    one                                           68       -     -68
    zero                                         128      28    -100
    Total: Before=20583249, After=20583085, chg -0.00%

[mcroce@redhat.com: tipc: remove two unused variables]
  Link: http://lkml.kernel.org/r/20190530091952.4108-1-mcroce@redhat.com
[akpm@linux-foundation.org: fix net/ipv6/sysctl_net_ipv6.c]
[arnd@arndb.de: proc/sysctl: make firmware loader table conditional]
  Link: http://lkml.kernel.org/r/20190617130014.1713870-1-arnd@arndb.de
[akpm@linux-foundation.org: fix fs/eventpoll.c]
Link: http://lkml.kernel.org/r/20190430180111.10688-1-mcroce@redhat.com
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-18 17:08:07 -07:00
Christian Brauner
1a271a68e0 arch: mark syscall number 435 reserved for clone3
A while ago Arnd made it possible to give new system calls the same
syscall number on all architectures (except alpha). To not break this
nice new feature let's mark 435 for clone3 as reserved on all
architectures that do not yet implement it.
Even if an architecture does not plan to implement it this ensures that
new system calls coming after clone3 will have the same number on all
architectures.

Signed-off-by: Christian Brauner <christian@brauner.io>
Cc: linux-arch@vger.kernel.org
Cc: linux-alpha@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-m68k@lists.linux-m68k.org
Cc: linux-mips@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Link: https://lore.kernel.org/r/20190714192205.27190-2-christian@brauner.io
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Christian Brauner <christian@brauner.io>
2019-07-15 00:39:33 +02:00
Linus Torvalds
aabfea8dc9 Merge tag 's390-5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull more s390 updates from Vasily Gorbik:

 - Fix integer overflow during stack frame unwind with invalid
   backchain.

 - Cleanup unused symbol export in zcrypt code.

 - Fix MIO addressing control activation in PCI code and expose its
   usage via sysfs.

 - Fix kernel image signature verification report presence detection.

 - Fix irq registration in vfio-ap code.

 - Add CPU measurement counters for newer machines.

 - Add base DASD thin provisioning support and code cleanups.

* tag 's390-5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (21 commits)
  s390/unwind: avoid int overflow in outside_of_stack
  s390/zcrypt: remove the exporting of ap_query_configuration
  s390/pci: add mio_enabled attribute
  s390: fix setting of mio addressing control
  s390/ipl: Fix detection of has_secure attribute
  s390: vfio-ap: fix irq registration
  s390/cpumf: Add extended counter set definitions for model 8561 and 8562
  s390/dasd: Handle out-of-space constraint
  s390/dasd: Add discard support for ESE volumes
  s390/dasd: Use ALIGN_DOWN macro
  s390/dasd: Make dasd_setup_queue() a discipline function
  s390/dasd: Add new ioctl to release space
  s390/dasd: Add dasd_sleep_on_queue_interruptible()
  s390/dasd: Add missing intensity definition
  s390/dasd: Fix whitespace
  s390/dasd: Add dynamic formatting support for ESE volumes
  s390/dasd: Recognise data for ESE volumes
  s390/dasd: Put sub-order definitions in a separate section
  s390/dasd: Make layout analysis ESE compatible
  s390/dasd: Remove old defines and function
  ...
2019-07-12 15:39:22 -07:00
Vasily Gorbik
9a15919041 s390/unwind: avoid int overflow in outside_of_stack
When current task is interrupted in-between stack frame allocation
and backchain write instructions new stack frame backchain pointer
is left uninitialized. That invalid backchain value is passed into
outside_of_stack for sanity check. Make sure int overflow does not happen
by subtracting stack_frame size from the stack "end" rather than adding
it to "random" backchain value.

Fixes: 41b0474c1b1c ("s390/unwind: introduce stack unwind API")
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-07-11 20:40:02 +02:00
Sebastian Ott
9964f396f1 s390: fix setting of mio addressing control
Move enablement of mio addressing control from detect_machine_facilities
to pci_base_init. detect_machine_facilities runs so early that the
static branches have not been toggled yet, thus mio addressing control
was always off. In pci_base_init we have to use the SMP aware
ctl_set_bit though.

Fixes: 833b441ec0 ("s390: enable processes for mio instructions")
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-07-11 20:40:02 +02:00
Philipp Rudo
1b2be2071a s390/ipl: Fix detection of has_secure attribute
Use the correct bit for detection of the machine capability associated
with the has_secure attribute. It is expected that the underlying
platform (including hypervisors) unsets the bit when they don't provide
secure ipl for their guests.

Fixes: c9896acc78 ("s390/ipl: Provide has_secure sysfs attribute")
Cc: stable@vger.kernel.org # 5.2
Signed-off-by: Philipp Rudo <prudo@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-07-11 20:40:02 +02:00
Thomas Richter
820bace734 s390/cpumf: Add extended counter set definitions for model 8561 and 8562
Add the extended counter set definitions for s390 machine types
8561 and  8262. They are identical with machine types 3906 and
3907.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-07-11 20:40:01 +02:00
Linus Torvalds
5450e8a316 Merge tag 'pidfd-updates-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull pidfd updates from Christian Brauner:
 "This adds two main features.

   - First, it adds polling support for pidfds. This allows process
     managers to know when a (non-parent) process dies in a race-free
     way.

     The notification mechanism used follows the same logic that is
     currently used when the parent of a task is notified of a child's
     death. With this patchset it is possible to put pidfds in an
     {e}poll loop and get reliable notifications for process (i.e.
     thread-group) exit.

   - The second feature compliments the first one by making it possible
     to retrieve pollable pidfds for processes that were not created
     using CLONE_PIDFD.

     A lot of processes get created with traditional PID-based calls
     such as fork() or clone() (without CLONE_PIDFD). For these
     processes a caller can currently not create a pollable pidfd. This
     is a problem for Android's low memory killer (LMK) and service
     managers such as systemd.

  Both patchsets are accompanied by selftests.

  It's perhaps worth noting that the work done so far and the work done
  in this branch for pidfd_open() and polling support do already see
  some adoption:

   - Android is in the process of backporting this work to all their LTS
     kernels [1]

   - Service managers make use of pidfd_send_signal but will need to
     wait until we enable waiting on pidfds for full adoption.

   - And projects I maintain make use of both pidfd_send_signal and
     CLONE_PIDFD [2] and will use polling support and pidfd_open() too"

[1] https://android-review.googlesource.com/q/topic:%22pidfd+polling+support+4.9+backport%22
    https://android-review.googlesource.com/q/topic:%22pidfd+polling+support+4.14+backport%22
    https://android-review.googlesource.com/q/topic:%22pidfd+polling+support+4.19+backport%22

[2] aab6e3eb73/src/lxc/start.c (L1753)

* tag 'pidfd-updates-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  tests: add pidfd_open() tests
  arch: wire-up pidfd_open()
  pid: add pidfd_open()
  pidfd: add polling selftests
  pidfd: add polling support
2019-07-10 22:17:21 -07:00
Linus Torvalds
cf2d213e49 Merge tag 'pm-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
 "These update PCI and ACPI power management (improved handling of ACPI
  power resources and PCIe link delays, fixes related to corner cases,
  hibernation handling rework), fix and extend the operating performance
  points (OPP) framework, add new cpufreq drivers for Raspberry Pi and
  imx8m chips, update some other cpufreq drivers, clean up assorted
  pieces of PM code and documentation and update tools.

  Specifics:

   - Improve the handling of shared ACPI power resources in the PCI bus
     type layer (Mika Westerberg).

   - Make the PCI layer take link delays required by the PCIe spec into
     account as appropriate and avoid polling devices in D3cold for PME
     (Mika Westerberg).

   - Fix some corner case issues in ACPI device power management and in
     the PCI bus type layer, optimiza and clean up the handling of
     runtime-suspended PCI devices during system-wide transitions to
     sleep states (Rafael Wysocki).

   - Rework hibernation handling in the ACPI core and the PCI bus type
     to resume runtime-suspended devices before hibernation (which
     allows some functional problems to be avoided) and fix some ACPI
     power management issues related to hiberation (Rafael Wysocki).

   - Extend the operating performance points (OPP) framework to support
     a wider range of devices (Rajendra Nayak, Stehpen Boyd).

   - Fix issues related to genpd_virt_devs and issues with platforms
     using the set_opp() callback in the OPP framework (Viresh Kumar,
     Dmitry Osipenko).

   - Add new cpufreq driver for Raspberry Pi (Nicolas Saenz Julienne).

   - Add new cpufreq driver for imx8m and imx7d chips (Leonard Crestez).

   - Fix and clean up the pcc-cpufreq, brcmstb-avs-cpufreq, s5pv210, and
     armada-37xx cpufreq drivers (David Arcari, Florian Fainelli, Paweł
     Chmiel, YueHaibing).

   - Clean up and fix the cpufreq core (Viresh Kumar, Daniel Lezcano).

   - Fix minor issue in the ACPI system sleep support code and export
     one function from it (Lenny Szubowicz, Dexuan Cui).

   - Clean up assorted pieces of PM code and documentation (Kefeng Wang,
     Andy Shevchenko, Bart Van Assche, Greg Kroah-Hartman, Fuqian Huang,
     Geert Uytterhoeven, Mathieu Malaterre, Rafael Wysocki).

   - Update the pm-graph utility to v5.4 (Todd Brandt).

   - Fix and clean up the cpupower utility (Abhishek Goel, Nick Black)"

* tag 'pm-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (57 commits)
  ACPI: PM: Make acpi_sleep_state_supported() non-static
  PM: sleep: Drop dev_pm_skip_next_resume_phases()
  ACPI: PM: Unexport acpi_device_get_power()
  Documentation: ABI: power: Add missing newline at end of file
  ACPI: PM: Drop unused function and function header
  ACPI: PM: Introduce "poweroff" callbacks for ACPI PM domain and LPSS
  ACPI: PM: Simplify and fix PM domain hibernation callbacks
  PCI: PM: Simplify bus-level hibernation callbacks
  PM: ACPI/PCI: Resume all devices during hibernation
  cpufreq: Avoid calling cpufreq_verify_current_freq() from handle_update()
  cpufreq: Consolidate cpufreq_update_current_freq() and __cpufreq_get()
  kernel: power: swap: use kzalloc() instead of kmalloc() followed by memset()
  cpufreq: Don't skip frequency validation for has_target() drivers
  PCI: PM/ACPI: Refresh all stale power state data in pci_pm_complete()
  PCI / ACPI: Add _PR0 dependent devices
  ACPI / PM: Introduce concept of a _PR0 dependent device
  PCI / ACPI: Use cached ACPI device state to get PCI device power state
  ACPI: PM: Allow transitions to D0 to occur in special cases
  ACPI: PM: Avoid evaluating _PS3 on transitions from D3hot to D3cold
  cpufreq: Use has_target() instead of !setpolicy
  ...
2019-07-09 10:05:22 -07:00
Linus Torvalds
5ad18b2e60 Merge branch 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull force_sig() argument change from Eric Biederman:
 "A source of error over the years has been that force_sig has taken a
  task parameter when it is only safe to use force_sig with the current
  task.

  The force_sig function is built for delivering synchronous signals
  such as SIGSEGV where the userspace application caused a synchronous
  fault (such as a page fault) and the kernel responded with a signal.

  Because the name force_sig does not make this clear, and because the
  force_sig takes a task parameter the function force_sig has been
  abused for sending other kinds of signals over the years. Slowly those
  have been fixed when the oopses have been tracked down.

  This set of changes fixes the remaining abusers of force_sig and
  carefully rips out the task parameter from force_sig and friends
  making this kind of error almost impossible in the future"

* 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (27 commits)
  signal/x86: Move tsk inside of CONFIG_MEMORY_FAILURE in do_sigbus
  signal: Remove the signal number and task parameters from force_sig_info
  signal: Factor force_sig_info_to_task out of force_sig_info
  signal: Generate the siginfo in force_sig
  signal: Move the computation of force into send_signal and correct it.
  signal: Properly set TRACE_SIGNAL_LOSE_INFO in __send_signal
  signal: Remove the task parameter from force_sig_fault
  signal: Use force_sig_fault_to_task for the two calls that don't deliver to current
  signal: Explicitly call force_sig_fault on current
  signal/unicore32: Remove tsk parameter from __do_user_fault
  signal/arm: Remove tsk parameter from __do_user_fault
  signal/arm: Remove tsk parameter from ptrace_break
  signal/nds32: Remove tsk parameter from send_sigtrap
  signal/riscv: Remove tsk parameter from do_trap
  signal/sh: Remove tsk parameter from force_sig_info_fault
  signal/um: Remove task parameter from send_sigtrap
  signal/x86: Remove task parameter from send_sigtrap
  signal: Remove task parameter from force_sig_mceerr
  signal: Remove task parameter from force_sig
  signal: Remove task parameter from force_sigsegv
  ...
2019-07-08 21:48:15 -07:00
Steffen Maier
0328e519a7 docs: s390: unify and update s390dbf kdocs at debug.c
For non-static-inlines, debug.c already had non-compliant function
header docs. So move the pure prototype kdocs of
("s390: include/asm/debug.h add kerneldoc markups")
from debug.h to debug.c and merge them with the old function docs.
Also, I had the impression that kdoc typically is at the implementation
in the compile unit rather than at the prototype in the header file.

While at it, update the short kdoc description to distinguish the
different functions. And a few more consistency cleanups.

Added a new kdoc for debug_set_critical() since debug.h comments it
as part of the API.

Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <1562149189-1417-3-git-send-email-maier@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-07-05 13:42:22 +02:00
Vasily Gorbik
2095574632 s390/kasan: avoid false positives during stack unwind
Avoid kasan false positive when current task is interrupted in-between
stack frame allocation and backchain write instructions leaving new stack
frame backchain invalid. In particular if backchain is 0 the unwinder
tries to read pt_regs from the stack and might hit kasan poisoned bytes,
leading to kasan "stack-out-of-bounds" report.

Disable kasan instrumentation of unwinder stack reads, since this
limitation couldn't be handled otherwise with current backchain unwinder
implementation.

Fixes: 78c98f9074 ("s390/unwind: introduce stack unwind API")
Reported-by: Julian Wiedmann <jwi@linux.ibm.com>
Tested-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-07-02 16:00:27 +02:00
Christian Brauner
7615d9e178 arch: wire-up pidfd_open()
This wires up the pidfd_open() syscall into all arches at once.

Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: David Howells <dhowells@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jann Horn <jannh@google.com>
Cc: Andy Lutomirsky <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-api@vger.kernel.org
Cc: linux-alpha@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-m68k@lists.linux-m68k.org
Cc: linux-mips@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: linux-xtensa@linux-xtensa.org
Cc: linux-arch@vger.kernel.org
Cc: x86@kernel.org
2019-06-28 12:17:55 +02:00
Heiko Carstens
4ecf0a43e7 processor: get rid of cpu_relax_yield
stop_machine is the only user left of cpu_relax_yield. Given that it
now has special semantics which are tied to stop_machine introduce a
weak stop_machine_yield function which architectures can override, and
get rid of the generic cpu_relax_yield implementation.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-06-15 12:25:55 +02:00
Martin Schwidefsky
38f2c691a4 s390: improve wait logic of stop_machine
The stop_machine loop to advance the state machine and to wait for all
affected CPUs to check-in calls cpu_relax_yield in a tight loop until
the last missing CPUs acknowledged the state transition.

On a virtual system where not all logical CPUs are backed by real CPUs
all the time it can take a while for all CPUs to check-in. With the
current definition of cpu_relax_yield a diagnose 0x44 is done which
tells the hypervisor to schedule *some* other CPU. That can be any
CPU and not necessarily one of the CPUs that need to run in order to
advance the state machine. This can lead to a pretty bad diagnose 0x44
storm until the last missing CPU finally checked-in.

Replace the undirected cpu_relax_yield based on diagnose 0x44 with a
directed yield. Each CPU in the wait loop will pick up the next CPU
in the cpumask of stop_machine. The diagnose 0x9c is used to tell the
hypervisor to run this next CPU instead of the current one. If there
is only a limited number of real CPUs backing the virtual CPUs we
end up with the real CPUs passed around in a round-robin fashion.

[heiko.carstens@de.ibm.com]:
    Use cpumask_next_wrap as suggested by Peter Zijlstra.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-06-15 12:25:52 +02:00
Vasily Gorbik
b4e3133b65 s390/traps: simplify data exception handler
Simplify conditions and remove unnecessary variable in data exception
handler.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-06-15 12:25:45 +02:00
Mathieu Malaterre
1ec0cd8286 PM: hibernate: powerpc: Expose pfn_is_nosave() prototype
The declaration for pfn_is_nosave is only available in
kernel/power/power.h. Since this function can be override in arch,
expose it globally. Having a prototype will make sure to avoid warning
(sometime treated as error with W=1) such as:

  arch/powerpc/kernel/suspend.c:18:5: error: no previous prototype for 'pfn_is_nosave' [-Werror=missing-prototypes]

This moves the declaration into a globally visible header file and add
missing include to avoid a warning on powerpc.

Also remove the duplicated prototypes since not required anymore.

Signed-off-by: Mathieu Malaterre <malat@debian.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-06-14 10:48:56 +02:00
Heiko Carstens
2980ba6ae8 s390/kdump: get rid of compile warning
Move the CONFIG_CRASH_DUMP ifdef to get rid of this:

arch/s390/kernel/machine_kexec.c:146:22: warning: 'do_start_kdump' defined but not used [-Wunused-function]

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-06-11 09:48:39 +02:00
Heiko Carstens
6887560c03 s390/jump_label: remove unused structure definition
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-06-07 10:10:10 +02:00
Heiko Carstens
3e8eb22fae s390: enforce CONFIG_HOTPLUG_CPU
x86 and powerpc (partially) enforce already CONFIG_HOTPLUG_CPU. On
s390 it is enabled on all distributions by default since ages.
The only exception is our zfcpdump kernel.

However to simplify testing, enforce HOTPLUG_CPU. This was suggested
by Paul McKenney, since his rcutorture test environments for CONFIG_SMP=y
only support HOTPLUG_CPU=y.

Suggested-by: Paul E. McKenney <paulmck@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-06-07 10:09:42 +02:00
Heiko Carstens
67626fadd2 s390: enforce CONFIG_SMP
There never have been distributions that shiped with CONFIG_SMP=n for
s390. In addition the kernel currently doesn't even compile with
CONFIG_SMP=n for s390. Most likely it wouldn't even work, even if we
fix the compile error, since nobody tests it, since there is no use
case that I can think of.
Therefore simply enforce CONFIG_SMP and get rid of some more or
less unused code.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-06-07 10:09:37 +02:00
Martin Schwidefsky
fc20f0c1d7 s390/disassembler: update opcode table
Sync with binutils and add a couple of missing instructions.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-06-04 15:03:58 +02:00
Martin Schwidefsky
a646ef398e s390/jump_label: replace stop_machine with smp_call_function
The use of stop_machine to replace the mask bits of the jump label branch
is a very heavy-weight operation. This is in fact not necessary, the
mask of the branch can simply be updated, followed by a signal processor
to all the other CPUs to force them to pick up the modified instruction.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[heiko.carstens@de.ibm.com]: Change jump_label_make_nop() so we get
                             brcl 0,offset instead of brcl 0,0. This
                             makes sure that only the mask part of the
                             instruction gets changed when updated.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-06-04 15:03:12 +02:00