Commit Graph

1881 Commits

Author SHA1 Message Date
David Hildenbrand
ce2e4f0b75 KVM: s390: count vcpu wakeups in stat.halt_wakeup
This patch introduces the halt_wakeup counter used by common code and uses it to
count vcpu wakeups done in s390 arch specific code.

Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-10-01 14:42:14 +02:00
Heiko Carstens
cfb0b24143 s390/mm: make use of ipte range facility
Invalidate several pte entries at once if the ipte range facility
is available. Currently this works only for DEBUG_PAGE_ALLOC where
several up to 2 ^ MAX_ORDER may be invalidated at once.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-30 10:19:29 +02:00
Jan Willeke
2a0a5b2299 s390/uprobes: architecture backend for uprobes
Signed-off-by: Jan Willeke <willeke@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:52:17 +02:00
Jan Willeke
975fab1739 s390/uprobes: common library for kprobes and uprobes
This patch moves common functions from kprobes.c to probes.c.
Thus its possible for uprobes to use them without enabling kprobes.

Signed-off-by: Jan Willeke <willeke@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:52:14 +02:00
Martin Schwidefsky
bbae71bf9c s390/rwlock: use the interlocked-access facility 1 instructions
Make use of the load-and-add, load-and-or and load-and-and instructions
to atomically update the read-write lock without a compare-and-swap loop.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:52:13 +02:00
Martin Schwidefsky
2684e73a86 s390/rwlock: remove interrupt-enabling rwlock variant.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:52:10 +02:00
Heiko Carstens
6a5c1482e2 s390/mm: remove change bit override support
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:52:09 +02:00
Martin Schwidefsky
d59b93da5e s390/rwlock: use directed yield for write-locked rwlocks
Add an owner field to the arch_rwlock_t to be able to pass the timeslice
of a virtual CPU with diagnose 0x9c to the lock owner in case the rwlock
is write-locked. The undirected yield in case the rwlock is acquired
writable but the lock is read-locked is removed.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:52:05 +02:00
Ralf Hoppe
8f933b1043 s390/hmcdrv: HMC drive CD/DVD access
This device driver allows accessing a HMC drive CD/DVD-ROM.
It can be used in a LPAR and z/VM environment.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ralf Hoppe <rhoppe@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:52:02 +02:00
Peter Zijlstra
c5c38ef3d7 irq_work: Introduce arch_irq_work_has_interrupt()
The nohz full code needs irq work to trigger its own interrupt so that
the subsystem can work even when the tick is stopped.

Lets introduce arch_irq_work_has_interrupt() that archs can override to
tell about their support for this ability.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2014-09-13 18:38:07 +02:00
Tony Krowiak
5102ee8795 KVM: CPACF: Enable MSA4 instructions for kvm guest
We have to provide a per guest crypto block for the CPUs to
enable MSA4 instructions. According to icainfo on z196 or
later this enables CCM-AES-128, CMAC-AES-128, CMAC-AES-192
and CMAC-AES-256.

Signed-off-by: Tony Krowiak <akrowiak@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael Mueller <mimu@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
[split MSA4/protected key into two patches]
2014-09-10 12:19:05 +02:00
Heiko Carstens
4423028203 s390/spinlock: optimize spin_unlock code
Use a memory barrier + store sequence instead of a load + compare and swap
sequence to unlock a spinlock and an rw lock.
For the spinlock case this saves us two memory reads and a not needed cpu
serialization after the compare and swap instruction stored the new value.

The kernel size (performance_defconfig) gets reduced by ~14k.

Average execution time of a tight inlined spin_unlock loop drops from
5.8ns to 0.7ns on a zEC12 machine.

An artificial stress test case where several counters are protected with
a single spinlock and which are only incremented while holding the spinlock
shows ~30% improvement on a 4 cpu machine.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-09 08:53:30 +02:00
Heiko Carstens
3d1e220d08 s390/ftrace: optimize mcount code
Reduce the number of executed instructions within the mcount block if
function tracing is enabled. We achieve that by using a non-standard
C function call ABI. Since the called function is also written in
assembler this is not a problem.
This also allows to replace the unconditional store at the beginning
of the mcount block with a larl instruction, which doesn't touch
memory.

In theory we could also patch the first instruction of the mcount block
to enable and disable function tracing. However this would break kprobes.
This could be fixed with implementing the "kprobes_on_ftrace" feature;
however keeping the odd jprobes working seems not to be possible without
a lot of code churn. Therefore keep the code easy and simply accept one
wasted 1-cycle "larl" instruction per function prologue.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-09 08:53:30 +02:00
Heiko Carstens
10dec7dbd5 s390/ftrace: add HAVE_DYNAMIC_FTRACE_WITH_REGS support
This code is based on a patch from Vojtech Pavlik.
http://marc.info/?l=linux-s390&m=140438885114413&w=2

The actual implementation now differs significantly:
Instead of adding a second function "ftrace_regs_caller" which would be nearly
identical to the existing ftrace_caller function, the current ftrace_caller
function is now an alias to ftrace_regs_caller and always passes the needed
pt_regs structure and function_trace_op parameters unconditionally.

Besides that also use asm offsets to correctly allocate and access the new
struct pt_regs on the stack.

While at it we can make use of new instruction to get rid of some indirect
loads if compiled for new machines.

The passed struct pt_regs can be changed by the called function and it's new
contents will replace the current contents.

Note: to change the return address the embedded psw member of the pt_regs
structure must be changed. The psw member is right now incomplete, since
the mask part is missing. For all current use cases this should be sufficent.
Providing and restoring a sane mask would mean we need to add an epsw/lpswe
pair to the mcount code. Only these two instruction would cost us ~120 cycles
which currently seems not necessary.

Cc: Vojtech Pavlik <vojtech@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-09 08:53:28 +02:00
Heiko Carstens
2481a87b02 s390/ftrace: optimize function graph caller code
When the function graph tracer is disabled we can skip three additional
instructions. So let's just do this.

So if function tracing is enabled but function graph tracing is
runtime disabled, we get away with a single unconditional branch.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-09 08:53:28 +02:00
Martin Schwidefsky
b7eacb59cd s390/vdso: add vdso support for coarse clocks
Add CLOCK_REALTIME_COARSE and CLOCK_MONOTONIC_COARSE optimization to
the 64-bit and 31-bit vdso.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-09 08:53:27 +02:00
Heiko Carstens
b7d5006de1 s390: remove unused MACHINE_FLAG_RRBM
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-09 08:53:26 +02:00
Linus Torvalds
35af25616c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
 "A bug fix for the vdso code, the loadparm for booting from SCSI is
  added and the access permissions for the dasd module parameters are
  corrected"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/vdso: remove NULL pointer check from clock_gettime
  s390/ipl: Add missing SCSI loadparm attributes to /sys/firmware
  s390/dasd: Make module parameter visible in sysfs
2014-09-08 08:27:00 -07:00
Linus Torvalds
2b12164b55 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "A smattering of bug fixes across most architectures"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  powerpc/kvm/cma: Fix panic introduces by signed shift operation
  KVM: s390/mm: Fix guest storage key corruption in ptep_set_access_flags
  KVM: s390/mm: Fix storage key corruption during swapping
  arm/arm64: KVM: Complete WFI/WFE instructions
  ARM/ARM64: KVM: Nuke Hyp-mode tlbs before enabling MMU
  KVM: s390/mm: try a cow on read only pages for key ops
  KVM: s390: Fix user triggerable bug in dead code
2014-09-06 16:42:12 -07:00
Christian Borntraeger
1951497d90 KVM: s390/mm: Fix guest storage key corruption in ptep_set_access_flags
commit 0944fe3f4a ("s390/mm: implement software referenced bits")
triggered another paging/storage key corruption. There is an
unhandled invalid->valid pte change where we have to set the real
storage key from the pgste.
When doing paging a guest page might be swapcache or swap and when
faulted in it might be read-only and due to a parallel scan old.
An do_wp_page will make it writeable and young. Due to software
reference tracking this page was invalid and now becomes valid.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: stable@vger.kernel.org # v3.12+
2014-09-02 10:30:43 +02:00
Christian Borntraeger
3e03d4c46d KVM: s390/mm: Fix storage key corruption during swapping
Since 3.12 or more precisely  commit 0944fe3f4a ("s390/mm:
implement software referenced bits") guest storage keys get
corrupted during paging. This commit added another valid->invalid
translation for page tables - namely ptep_test_and_clear_young.
We have to transfer the storage key into the pgste in that case.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: stable@vger.kernel.org # v3.12+
2014-09-02 10:30:43 +02:00
Michael Holzheu
6992860167 s390/ipl: Add missing SCSI loadparm attributes to /sys/firmware
Currently the loadparm is only supported for CCW IPL. But also for SCSI
IPL it can be specified either on the HMC load panel respectively
z/VM console or via diagnose 308.

So fix this for SCSI and add the required sysfs attributes for reading the
IPL loadparm and for setting the loadparm for re-IPL.

With this patch the following two sysfs attributes are introduced:

 - /sys/firmware/ipl/loadparm (for system that have been IPLed from SCSI)
 - /sys/firmware/reipl/fcp/loadparm

Because the loadparm is now available for SCSI and CCW it is moved
now from "struct ipl_block_ccw" to the generic "struct ipl_list_hdr".

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-01 09:56:29 +02:00
Radim Krčmář
13a34e067e KVM: remove garbage arg to *hardware_{en,dis}able
In the beggining was on_each_cpu(), which required an unused argument to
kvm_arch_ops.hardware_{en,dis}able, but this was soon forgotten.

Remove unnecessary arguments that stem from this.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-29 16:35:55 +02:00
Radim Krčmář
0865e636ae KVM: static inline empty kvm_arch functions
Using static inline is going to save few bytes and cycles.
For example on powerpc, the difference is 700 B after stripping.
(5 kB before)

This patch also deals with two overlooked empty functions:
kvm_arch_flush_shadow was not removed from arch/mips/kvm/mips.c
  2df72e9bc KVM: split kvm_arch_flush_shadow
and kvm_arch_sched_in never made it into arch/ia64/kvm/kvm-ia64.c.
  e790d9ef6 KVM: add kvm_arch_sched_in

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-29 16:35:55 +02:00
Paolo Bonzini
656473003b KVM: forward declare structs in kvm_types.h
Opaque KVM structs are useful for prototypes in asm/kvm_host.h, to avoid
"'struct foo' declared inside parameter list" warnings (and consequent
breakage due to conflicting types).

Move them from individual files to a generic place in linux/kvm_types.h.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-29 16:35:53 +02:00
Christoph Lameter
eb7e7d7663 s390: Replace __get_cpu_var uses
__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are for storing and retrieving data from the current
processors percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as :

#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))

__get_cpu_var() always only does an address determination. However, store
and retrieve operations could use a segment prefix (or global register on
other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.

This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset.  Thereby address calculations are avoided and less registers
are used when code is generated.

At the end of the patch set all uses of __get_cpu_var have been removed so
the macro is removed too.

The patch set includes passes over all arches as well. Once these operations
are used throughout then specialized macros can be defined in non -x86
arches as well in order to optimize per cpu access by f.e.  using a global
register that may be set to the per cpu base.

Transformations done to __get_cpu_var()

1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);

2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);

3. Retrieve the content of the current processors instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y)

   Converts to

	int x = __this_cpu_read(y);

4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));

5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	this_cpu_write(y, x);

6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	this_cpu_inc(y)

Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
CC: linux390@de.ibm.com
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2014-08-26 13:45:52 -04:00
Martin Schwidefsky
f079e95214 KVM: s390/mm: remove outdated gmap data structures
The radix tree rework removed all code that uses the gmap_rmap
and gmap_pgtable data structures. Remove these outdated definitions.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-08-26 10:09:03 +02:00
Martin Schwidefsky
c6c956b80b KVM: s390/mm: support gmap page tables with less than 5 levels
Add an addressing limit to the gmap address spaces and only allocate
the page table levels that are needed for the given limit. The limit
is fixed and can not be changed after a gmap has been created.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-08-26 10:09:03 +02:00
Martin Schwidefsky
527e30b41d KVM: s390/mm: use radix trees for guest to host mappings
Store the target address for the gmap segments in a radix tree
instead of using invalid segment table entries. gmap_translate
becomes a simple radix_tree_lookup, gmap_fault is split into the
address translation with gmap_translate and the part that does
the linking of the gmap shadow page table with the process page
table.
A second radix tree is used to keep the pointers to the segment
table entries for segments that are mapped in the guest address
space. On unmap of a segment the pointer is retrieved from the
radix tree and is used to carry out the segment invalidation in
the gmap shadow page table. As the radix tree can only store one
pointer, each host segment may only be mapped to exactly one
guest location.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-08-26 10:09:02 +02:00
Thierry Reding
90114d65fe s390: Implement dma_{alloc,free}_attrs()
The S390 architecture advertises support for HAVE_DMA_ATTRS when PCI is
enabled. Patches to unify some of the DMA API would like to rely on the
dma_alloc_attrs() and dma_free_attrs() functions to be provided when an
architecture supports DMA attributes.

Rename dma_alloc_coherent() and dma_free_coherent() to dma_alloc_attrs()
and dma_free_attrs() since they are functionally equivalent and alias
the former to the latter for compatibility.

For consistency with other architectures, also reuse the existing symbol
HAVE_DMA_ATTRS defined in arch/Kconfig instead of providing a duplicate.
Select it when PCI is enabled.

While at it, drop a redundant 'default n' from the PCI Kconfig symbol.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Acked-By: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2014-08-26 07:39:12 +02:00
Martin Schwidefsky
6e0a0431bf KVM: s390/mm: cleanup gmap function arguments, variable names
Make the order of arguments for the gmap calls more consistent,
if the gmap pointer is passed it is always the first argument.
In addition distinguish between guest address and user address
by naming the variables gaddr for a guest address and vmaddr for
a user address.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-08-25 14:35:58 +02:00
Martin Schwidefsky
9da4e38076 KVM: s390/mm: readd address parameter to gmap_do_ipte_notify
Revert git commit c3a23b9874c1 ("remove unnecessary parameter from
gmap_do_ipte_notify").

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-08-25 14:35:57 +02:00
Martin Schwidefsky
55dbbdd9a8 KVM: s390/mm: readd address parameter to pgste_ipte_notify
Revert git commit 1b7fd6952063 ("remove unecessary parameter from
pgste_ipte_notify")

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-08-25 14:35:57 +02:00
David Hildenbrand
b028ee3edd KVM: s390: synchronize more registers with kvm_run
In order to reduce the number of syscalls when dropping to user space, this
patch enables the synchronization of the following "registers" with kvm_run:
- ARCH0: CPU timer, clock comparator, TOD programmable register,
         guest breaking-event register, program parameter
- PFAULT: pfault parameters (token, select, compare)

The registers are grouped to reduce the overhead when syncing.

As this grows the number of sync registers quite a bit, let's move the code
synchronizing registers with kvm_run from kvm_arch_vcpu_ioctl_run() into
separate helper routines.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-08-25 14:35:53 +02:00
Heiko Carstens
7bb1cdbfe2 s390: wire up memfd_create syscall
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-08-12 13:00:08 +02:00
Heiko Carstens
b8a7a990ed s390: wire up seccomp and getrandom syscalls
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-08-12 13:00:06 +02:00
Andy Lutomirski
a6c19dfe39 arm64,ia64,ppc,s390,sh,tile,um,x86,mm: remove default gate area
The core mm code will provide a default gate area based on
FIXADDR_USER_START and FIXADDR_USER_END if
!defined(__HAVE_ARCH_GATE_AREA) && defined(AT_SYSINFO_EHDR).

This default is only useful for ia64.  arm64, ppc, s390, sh, tile, 64-bit
UML, and x86_32 have their own code just to disable it.  arm, 32-bit UML,
and x86_64 have gate areas, but they have their own implementations.

This gets rid of the default and moves the code into ia64.

This should save some code on architectures without a gate area: it's now
possible to inline the gate_area functions in the default case.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Nathan Lynch <nathan_lynch@mentor.com>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [in principle]
Acked-by: Richard Weinberger <richard@nod.at> [for um]
Acked-by: Will Deacon <will.deacon@arm.com> [for arm64]
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Nathan Lynch <Nathan_Lynch@mentor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-08 15:57:27 -07:00
Laura Abbott
308c09f17d lib/scatterlist: make ARCH_HAS_SG_CHAIN an actual Kconfig
Rather than have architectures #define ARCH_HAS_SG_CHAIN in an
architecture specific scatterlist.h, make it a proper Kconfig option and
use that instead.  At same time, remove the header files are are now
mostly useless and just include asm-generic/scatterlist.h.

[sfr@canb.auug.org.au: powerpc files now need asm/dma.h]
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>			[x86]
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	[powerpc]
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-08 15:57:26 -07:00
Linus Torvalds
ebb067d2f4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
 "Mostly cleanups and bug-fixes, with two exceptions.

  The first is lazy flushing of I/O-TLBs for PCI to improve performance,
  the second is software dirty bits in the pmd for the madvise-free
  implementation"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (24 commits)
  s390/locking: Reenable optimistic spinning
  s390/mm: implement dirty bits for large segment table entries
  KVM: s390/mm: Fix page table locking vs. split pmd lock
  s390/dasd: fix camel case
  s390/3215: fix hanging console issue
  s390/irq: improve displayed interrupt order in /proc/interrupts
  s390/seccomp: fix error return for filtered system calls
  s390/pci: introduce lazy IOTLB flushing for DMA unmap
  dasd: fix error recovery for alias devices during format
  dasd: fix list_del corruption during format
  dasd: fix unresponsive device during format
  dasd: use aliases for formatted devices during format
  s390/pci: fix kmsg component
  s390/kdump: Return NOTIFY_OK for all actions other than MEM_GOING_OFFLINE
  s390/watchdog: Fix module name in Kconfig help text
  s390/dasd: replace seq_printf by seq_puts
  s390/dasd: replace pr_warning by pr_warn
  s390/dasd: Move EXPORT_SYMBOL after function/variable
  s390/dasd: remove unnecessary null test before debugfs_remove
  s390/zfcp: use qdio buffer helpers
  ...
2014-08-07 08:41:00 -07:00
Linus Torvalds
ef35ad26f8 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf changes from Ingo Molnar:
 "Kernel side changes:

   - Consolidate the PMU interrupt-disabled code amongst architectures
     (Vince Weaver)

   - misc fixes

  Tooling changes (new features, user visible changes):

   - Add support for pagefault tracing in 'trace', please see multiple
     examples in the changeset messages (Stanislav Fomichev).

   - Add pagefault statistics in 'trace' (Stanislav Fomichev)

   - Add header for columns in 'top' and 'report' TUI browsers (Jiri
     Olsa)

   - Add pagefault statistics in 'trace' (Stanislav Fomichev)

   - Add IO mode into timechart command (Stanislav Fomichev)

   - Fallback to syscalls:* when raw_syscalls:* is not available in the
     perl and python perf scripts.  (Daniel Bristot de Oliveira)

   - Add --repeat global option to 'perf bench' to be used in benchmarks
     such as the existing 'futex' one, that was modified to use it
     instead of a local option.  (Davidlohr Bueso)

   - Fix fd -> pathname resolution in 'trace', be it using /proc or a
     vfs_getname probe point.  (Arnaldo Carvalho de Melo)

   - Add suggestion of how to set perf_event_paranoid sysctl, to help
     non-root users trying tools like 'trace' to get a working
     environment.  (Arnaldo Carvalho de Melo)

   - Updates from trace-cmd for traceevent plugin_kvm plus args cleanup
     (Steven Rostedt, Jan Kiszka)

   - Support S/390 in 'perf kvm stat' (Alexander Yarygin)

  Tooling infrastructure changes:

   - Allow reserving a row for header purposes in the hists browser
     (Arnaldo Carvalho de Melo)

   - Various fixes and prep work related to supporting Intel PT (Adrian
     Hunter)

   - Introduce multiple debug variables control (Jiri Olsa)

   - Add callchain and additional sample information for python scripts
     (Joseph Schuchart)

   - More prep work to support Intel PT: (Adrian Hunter)
     - Polishing 'script' BTS output
     - 'inject' can specify --kallsym
     - VDSO is per machine, not a global var
     - Expose data addr lookup functions previously private to 'script'
     - Large mmap fixes in events processing

   - Include standard stringify macros in power pc code (Sukadev
     Bhattiprolu)

  Tooling cleanups:

   - Convert open coded equivalents to asprintf() (Andy Shevchenko)

   - Remove needless reassignments in 'trace' (Arnaldo Carvalho de Melo)

   - Cache the is_exit syscall test in 'trace) (Arnaldo Carvalho de
     Melo)

   - No need to reimplement err() in 'perf bench sched-messaging', drop
     barf().  (Davidlohr Bueso).

   - Remove ev_name argument from perf_evsel__hists_browse, can be
     obtained from the other parameters.  (Jiri Olsa)

  Tooling fixes:

   - Fix memory leak in the 'sched-messaging' perf bench test.
     (Davidlohr Bueso)

   - The -o and -n 'perf bench mem' options are mutually exclusive, emit
     error when both are specified.  (Davidlohr Bueso)

   - Fix scrollbar refresh row index in the ui browser, problem exposed
     now that headers will be added and will be allowed to be switched
     on/off.  (Jiri Olsa)

   - Handle the num array type in python properly (Sebastian Andrzej
     Siewior)

   - Fix wrong condition for allocation failure (Jiri Olsa)

   - Adjust callchain based on DWARF debug info on powerpc (Sukadev
     Bhattiprolu)

   - Fix a risk for doing free on uninitialized pointer in traceevent
     lib (Rickard Strandqvist)

   - Update attr test with PERF_FLAG_FD_CLOEXEC flag (Jiri Olsa)

   - Enable close-on-exec flag on perf file descriptor (Yann Droneaud)

   - Fix build on gcc 4.4.7 (Arnaldo Carvalho de Melo)

   - Event ordering fixes (Jiri Olsa)"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (123 commits)
  Revert "perf tools: Fix jump label always changing during tracing"
  perf tools: Fix perf usage string leftover
  perf: Check permission only for parent tracepoint event
  perf record: Store PERF_RECORD_FINISHED_ROUND only for nonempty rounds
  perf record: Always force PERF_RECORD_FINISHED_ROUND event
  perf inject: Add --kallsyms parameter
  perf tools: Expose 'addr' functions so they can be reused
  perf session: Fix accounting of ordered samples queue
  perf powerpc: Include util/util.h and remove stringify macros
  perf tools: Fix build on gcc 4.4.7
  perf tools: Add thread parameter to vdso__dso_findnew()
  perf tools: Add dso__type()
  perf tools: Separate the VDSO map name from the VDSO dso name
  perf tools: Add vdso__new()
  perf machine: Fix the lifetime of the VDSO temporary file
  perf tools: Group VDSO global variables into a structure
  perf session: Add ability to skip 4GiB or more
  perf session: Add ability to 'skip' a non-piped event stream
  perf tools: Pass machine to vdso__dso_findnew()
  perf tools: Add dso__data_size()
  ...
2014-08-04 16:09:53 -07:00
Linus Torvalds
8efb90cf1e Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "The main changes in this cycle are:

   - big rtmutex and futex cleanup and robustification from Thomas
     Gleixner
   - mutex optimizations and refinements from Jason Low
   - arch_mutex_cpu_relax() removal and related cleanups
   - smaller lockdep tweaks"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
  arch, locking: Ciao arch_mutex_cpu_relax()
  locking/lockdep: Only ask for /proc/lock_stat output when available
  locking/mutexes: Optimize mutex trylock slowpath
  locking/mutexes: Try to acquire mutex only if it is unlocked
  locking/mutexes: Delete the MUTEX_SHOW_NO_WAITER macro
  locking/mutexes: Correct documentation on mutex optimistic spinning
  rtmutex: Make the rtmutex tester depend on BROKEN
  futex: Simplify futex_lock_pi_atomic() and make it more robust
  futex: Split out the first waiter attachment from lookup_pi_state()
  futex: Split out the waiter check from lookup_pi_state()
  futex: Use futex_top_waiter() in lookup_pi_state()
  futex: Make unlock_pi more robust
  rtmutex: Avoid pointless requeueing in the deadlock detection chain walk
  rtmutex: Cleanup deadlock detector debug logic
  rtmutex: Confine deadlock logic to futex
  rtmutex: Simplify remove_waiter()
  rtmutex: Document pi chain walk
  rtmutex: Clarify the boost/deboost part
  rtmutex: No need to keep task ref for lock owner check
  rtmutex: Simplify and document try_to_take_rtmutex()
  ...
2014-08-04 16:09:06 -07:00
Linus Torvalds
8533ce7271 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM changes from Paolo Bonzini:
 "These are the x86, MIPS and s390 changes; PPC and ARM will come in a
  few days.

  MIPS and s390 have little going on this release; just bugfixes, some
  small, some larger.

  The highlights for x86 are nested VMX improvements (Jan Kiszka),
  optimizations for old processor (up to Nehalem, by me and Bandan Das),
  and a lot of x86 emulator bugfixes (Nadav Amit).

  Stephen Rothwell reported a trivial conflict with the tracing branch"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (104 commits)
  x86/kvm: Resolve shadow warnings in macro expansion
  KVM: s390: rework broken SIGP STOP interrupt handling
  KVM: x86: always exit on EOIs for interrupts listed in the IOAPIC redir table
  KVM: vmx: remove duplicate vmx_mpx_supported() prototype
  KVM: s390: Fix memory leak on busy SIGP stop
  x86/kvm: Resolve shadow warning from min macro
  kvm: Resolve missing-field-initializers warnings
  Replace NR_VMX_MSR with its definition
  KVM: x86: Assertions to check no overrun in MSR lists
  KVM: x86: set rflags.rf during fault injection
  KVM: x86: Setting rflags.rf during rep-string emulation
  KVM: x86: DR6/7.RTM cannot be written
  KVM: nVMX: clean up nested_release_vmcs12 and code around it
  KVM: nVMX: fix lifetime issues for vmcs02
  KVM: x86: Defining missing x86 vectors
  KVM: x86: emulator injects #DB when RFLAGS.RF is set
  KVM: x86: Cleanup of rflags.rf cleaning
  KVM: x86: Clear rflags.rf on emulated instructions
  KVM: x86: popf emulation should not change RF
  KVM: x86: Clearing rflags.rf upon skipped emulated instruction
  ...
2014-08-04 12:16:46 -07:00
Martin Schwidefsky
152125b7a8 s390/mm: implement dirty bits for large segment table entries
The large segment table entry format has block of bits for the
ACC/F values for the large page. These bits are valid only if
another bit (AV bit 0x10000) of the segment table entry is set.
The ACC/F bits do not have a meaning if the AV bit is off.
This allows to put the THP splitting bit, the segment young bit
and the new segment dirty bit into the ACC/F bits as long as
the AV bit stays off. The dirty and young information is only
available if the pmd is large.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-08-01 14:55:05 +02:00
Jan Willeke
dc295880c6 s390/seccomp: fix error return for filtered system calls
The syscall_set_return_value function of s390 negates the error argument
before storing the value to the return register gpr2. This is incorrect,
the seccomp code already passes the negative error value.
Store the unmodified error value to gpr2.

Signed-off-by: Jan Willeke <willeke@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-07-28 10:02:31 +02:00
Ingo Molnar
5030c69755 Merge tag 'v3.16-rc7' into perf/core, to merge in the latest fixes before applying new changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-28 10:00:33 +02:00
Sebastian Ott
5245c924c2 s390/qdio: add helpers to manage qdio buffers
Users of qdio buffers employ different strategies to manage these
buffers. The qeth driver uses huge contiguous buffers which leads
to high order allocations with all their downsides.

This patch provides helpers to allocate, free, and reset arrays of
qdio buffers using non contiguous pages.

Reviewed-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-07-22 09:26:13 +02:00
Christian Borntraeger
e59d120f96 KVM: s390: add ipte to trace event decoding
IPTE intercept can happen, let's decode that.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-07-21 13:22:47 +02:00
David Hildenbrand
ea74c0ea1b KVM: s390: remove the tasklet used by the hrtimer
We can get rid of the tasklet used for waking up a VCPU in the hrtimer
code but wakeup the VCPU directly.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-07-21 13:22:42 +02:00
David Hildenbrand
0759d0681c KVM: s390: cleanup handle_wait by reusing kvm_vcpu_block
This patch cleans up the code in handle_wait by reusing the common code
function kvm_vcpu_block.

signal_pending(), kvm_cpu_has_pending_timer() and kvm_arch_vcpu_runnable() are
sufficient for checking if we need to wake-up that VCPU. kvm_vcpu_block
uses these functions, so no checks are lost.

The flag "timer_due" can be removed - kvm_cpu_has_pending_timer() tests whether
the timer is pending, thus the vcpu is correctly woken up.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-07-21 13:22:16 +02:00
Davidlohr Bueso
3a6bfbc91d arch, locking: Ciao arch_mutex_cpu_relax()
The arch_mutex_cpu_relax() function, introduced by 34b133f, is
hacky and ugly. It was added a few years ago to address the fact
that common cpu_relax() calls include yielding on s390, and thus
impact the optimistic spinning functionality of mutexes. Nowadays
we use this function well beyond mutexes: rwsem, qrwlock, mcs and
lockref. Since the macro that defines the call is in the mutex header,
any users must include mutex.h and the naming is misleading as well.

This patch (i) renames the call to cpu_relax_lowlatency  ("relax, but
only if you can do it with very low latency") and (ii) defines it in
each arch's asm/processor.h local header, just like for regular cpu_relax
functions. On all archs, except s390, cpu_relax_lowlatency is simply cpu_relax,
and thus we can take it out of mutex.h. While this can seem redundant,
I believe it is a good choice as it allows us to move out arch specific
logic from generic locking primitives and enables future(?) archs to
transparently define it, similarly to System Z.

Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Bharat Bhushan <r65777@freescale.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David Howells <dhowells@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Cc: Dominik Dingel <dingel@linux.vnet.ibm.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Joe Perches <joe@perches.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Joseph Myers <joseph@codesourcery.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Nicolas Pitre <nico@linaro.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Qais Yousef <qais.yousef@imgtec.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Rafael Wysocki <rafael.j.wysocki@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Stratos Karafotis <stratosk@semaphore.gr>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vasily Kulikov <segoon@openwall.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Cc: Waiman Long <Waiman.Long@hp.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Wolfram Sang <wsa@the-dreams.de>
Cc: adi-buildroot-devel@lists.sourceforge.net
Cc: linux390@de.ibm.com
Cc: linux-alpha@vger.kernel.org
Cc: linux-am33-list@redhat.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-c6x-dev@linux-c6x.org
Cc: linux-cris-kernel@axis.com
Cc: linux-hexagon@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: linux@lists.openrisc.net
Cc: linux-m32r-ja@ml.linux-m32r.org
Cc: linux-m32r@ml.linux-m32r.org
Cc: linux-m68k@lists.linux-m68k.org
Cc: linux-metag@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: linux-xtensa@linux-xtensa.org
Cc: sparclinux@vger.kernel.org
Link: http://lkml.kernel.org/r/1404079773.2619.4.camel@buesod1.americas.hpqcorp.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-17 12:32:47 +02:00