Commit Graph

4769 Commits

Author SHA1 Message Date
Linus Torvalds
39c1573748 Merge tag 'powerpc-4.9-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
 "Fixes marked for stable:
   - Set missing wakeup bit in LPCR on POWER9
   - Fix the early OPAL console wrappers
   - Fixup kernel read only mapping

  Fixes for code merged this cycle:
   - Fix missing CRCs, add more asm-prototypes.h declarations"

* tag 'powerpc-4.9-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/mm: Fixup kernel read only mapping
  powerpc/boot: Fix the early OPAL console wrappers
  powerpc: Fix missing CRCs, add more asm-prototypes.h declarations
  powerpc: Set missing wakeup bit in LPCR on POWER9
2016-11-26 11:24:03 -08:00
Aneesh Kumar K.V
984d7a1ec6 powerpc/mm: Fixup kernel read only mapping
With commit e58e87adc8 ("powerpc/mm: Update _PAGE_KERNEL_RO") we
started using the ppp value 0b110 to map kernel readonly. But that
facility was only added as part of ISA 2.04. For earlier ISA version
only supported ppp bit value for readonly mapping is 0b011. (This
implies both user and kernel get mapped using the same ppp bit value for
readonly mapping.).
Update the code such that for earlier architecture version we use ppp
value 0b011 for readonly mapping. We don't differentiate between power5+
and power5 here and apply the new ppp bits only from power6 (ISA 2.05).
This keep the changes minimal.

This fixes issue with PS3 spu usage reported at
https://lkml.kernel.org/r/rep.1421449714.geoff@infradead.org

Fixes: e58e87adc8 ("powerpc/mm: Update _PAGE_KERNEL_RO")
Cc: stable@vger.kernel.org # v4.7+
Tested-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-25 14:18:25 +11:00
Michael Ellerman
da58b23cb9 powerpc: Fix __cmpxchg() to take a volatile ptr again
In commit d0563a1297 ("powerpc: Implement {cmp}xchg for u8 and u16")
we removed the volatile from __cmpxchg().

This is leading to warnings such as:

  drivers/gpu/drm/drm_lock.c: In function ‘drm_lock_take’:
  arch/powerpc/include/asm/cmpxchg.h:484:37: warning: passing argument 1
  of ‘__cmpxchg’ discards ‘volatile’ qualifier from pointer target
     (__typeof__(*(ptr))) __cmpxchg((ptr), (unsigned long)_o_,   \

There doesn't seem to be consensus across architectures whether the
argument is volatile or not, so at least for now put the volatile back.

Fixes: d0563a1297 ("powerpc: Implement {cmp}xchg for u8 and u16")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-25 14:07:50 +11:00
Michael Ellerman
ddbefe7e77 Merge branch 'topic/ppc-kvm' into next
Merge the topic branch we're sharing with the kvm-ppc tree.
2016-11-24 22:14:52 +11:00
Paul Mackerras
84f7139c06 KVM: PPC: Book3S HV: Enable hypervisor virtualization interrupts while in guest
The new XIVE interrupt controller on POWER9 can direct external
interrupts to the hypervisor or the guest.  The interrupts directed to
the hypervisor are controlled by an LPCR bit called LPCR_HVICE, and
come in as a "hypervisor virtualization interrupt".  This sets the
LPCR bit so that hypervisor virtualization interrupts can occur while
we are in the guest.  We then also need to cope with exiting the guest
because of a hypervisor virtualization interrupt.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-11-24 09:24:23 +11:00
Paul Mackerras
f725758b89 KVM: PPC: Book3S HV: Use OPAL XICS emulation on POWER9
POWER9 includes a new interrupt controller, called XIVE, which is
quite different from the XICS interrupt controller on POWER7 and
POWER8 machines.  KVM-HV accesses the XICS directly in several places
in order to send and clear IPIs and handle interrupts from PCI
devices being passed through to the guest.

In order to make the transition to XIVE easier, OPAL firmware will
include an emulation of XICS on top of XIVE.  Access to the emulated
XICS is via OPAL calls.  The one complication is that the EOI
(end-of-interrupt) function can now return a value indicating that
another interrupt is pending; in this case, the XIVE will not signal
an interrupt in hardware to the CPU, and software is supposed to
acknowledge the new interrupt without waiting for another interrupt
to be delivered in hardware.

This adapts KVM-HV to use the OPAL calls on machines where there is
no XICS hardware.  When there is no XICS, we look for a device-tree
node with "ibm,opal-intc" in its compatible property, which is how
OPAL indicates that it provides XICS emulation.

In order to handle the EOI return value, kvmppc_read_intr() has
become kvmppc_read_one_intr(), with a boolean variable passed by
reference which can be set by the EOI functions to indicate that
another interrupt is pending.  The new kvmppc_read_intr() keeps
calling kvmppc_read_one_intr() until there are no more interrupts
to process.  The return value from kvmppc_read_intr() is the
largest non-zero value of the returns from kvmppc_read_one_intr().

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-11-24 09:24:23 +11:00
Paul Mackerras
7c5b06cadf KVM: PPC: Book3S HV: Adapt TLB invalidations to work on POWER9
POWER9 adds new capabilities to the tlbie (TLB invalidate entry)
and tlbiel (local tlbie) instructions.  Both instructions get a
set of new parameters (RIC, PRS and R) which appear as bits in the
instruction word.  The tlbiel instruction now has a second register
operand, which contains a PID and/or LPID value if needed, and
should otherwise contain 0.

This adapts KVM-HV's usage of tlbie and tlbiel to work on POWER9
as well as older processors.  Since we only handle HPT guests so
far, we need RIC=0 PRS=0 R=0, which ends up with the same instruction
word as on previous processors, so we don't need to conditionally
execute different instructions depending on the processor.

The local flush on first entry to a guest in book3s_hv_rmhandlers.S
is a loop which depends on the number of TLB sets.  Rather than
using feature sections to set the number of iterations based on
which CPU we're on, we now work out this number at VM creation time
and store it in the kvm_arch struct.  That will make it possible to
get the number from the device tree in future, which will help with
compatibility with future processors.

Since mmu_partition_table_set_entry() does a global flush of the
whole LPID, we don't need to do the TLB flush on first entry to the
guest on each processor.  Therefore we don't set all bits in the
tlb_need_flush bitmap on VM startup on POWER9.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-11-24 09:24:23 +11:00
Paul Mackerras
e9cf1e0856 KVM: PPC: Book3S HV: Add new POWER9 guest-accessible SPRs
This adds code to handle two new guest-accessible special-purpose
registers on POWER9: TIDR (thread ID register) and PSSCR (processor
stop status and control register).  They are context-switched
between host and guest, and the guest values can be read and set
via the one_reg interface.

The PSSCR contains some fields which are guest-accessible and some
which are only accessible in hypervisor mode.  We only allow the
guest-accessible fields to be read or set by userspace.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-11-24 09:24:23 +11:00
Paul Mackerras
bc33b1fc83 Merge remote-tracking branch 'remotes/powerpc/topic/ppc-kvm' into kvm-ppc-next
This merges in the ppc-kvm topic branch to get changes to
arch/powerpc code that are necessary for adding POWER9 KVM support.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-11-24 09:22:28 +11:00
Christophe Leroy
6533b7c16e powerpc: Initial stack protector (-fstack-protector) support
Partialy copied from commit c743f38013 ("ARM: initial stack protector
(-fstack-protector) support")

This is the very basic stuff without the changing canary upon
task switch yet.  Just the Kconfig option and a constant canary
value initialized at boot time.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-23 22:57:15 +11:00
Pan Xinhui
d0563a1297 powerpc: Implement {cmp}xchg for u8 and u16
Implement xchg{u8,u16}{local,relaxed}, and
cmpxchg{u8,u16}{,local,acquire,relaxed}.

It works on all ppc.

remove volatile of first parameter in __cmpxchg_local and __cmpxchg

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Acked-by: Boqun Feng <boqun.feng@gmail.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-23 22:56:26 +11:00
Naveen N. Rao
6cc89bad60 powerpc/kprobes: Invoke handlers directly
Invoke the kprobe handlers directly rather than through notify_die(), to
reduce path taken for handling kprobes. Similar to commit 6f6343f53d
("kprobes/x86: Call exception handlers directly from do_int3/do_debug").

While at it, rename post_kprobe_handler() to kprobe_post_handler() for
more uniform naming.

Reported-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-23 22:56:25 +11:00
Naveen N. Rao
82de5797a2 powerpc: Remove extraneous header from asm-prototypes.h
Commit 03465f899b ("powerpc: Use kprobe blacklist for exception
handlers") removed __kprobes annotation from some of the prototypes,
but left the kprobes header include directive unchanged. Remove it as it
is no longer needed.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-23 22:56:21 +11:00
Ingo Molnar
ec84f00567 Merge branch 'linus' into sched/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-23 10:23:09 +01:00
Michael Neuling
02ed21aeda powerpc/powernv: Define and set POWER9 HFSCR doorbell bit
Define and set the POWER9 HFSCR doorbell bit so that guests can use
msgsndp.

ISA 3.0 calls this MSGP, so name it accordingly in the code.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-23 11:18:22 +11:00
Michael Ellerman
1f0f2e7227 powerpc/reg: Add definition for LPCR_PECE_HVEE
ISA 3.0 defines a new PECE (Power-saving mode Exit Cause Enable) field
in the LPCR (Logical Partitioning Control Register), called
LPCR_PECE_HVEE (Hypervisor Virtualization Exit Enable).

KVM code will need to know about this bit, so add a definition for it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-23 10:32:12 +11:00
Suraj Jitindar Singh
9dd17e8517 powerpc/64: Define new ISA v3.00 logical PVR value and PCR register value
ISA 3.00 adds the logical PVR value 0x0f000005, so add a definition for
this.

Define PCR_ARCH_207 to reflect ISA 2.07 compatibility mode in the processor
compatibility register (PCR).

[paulus@ozlabs.org - moved dummy PCR_ARCH_300 value into next patch]

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-23 10:32:12 +11:00
Paul Mackerras
ffe6d810fe powerpc/powernv: Define real-mode versions of OPAL XICS accessors
This defines real-mode versions of opal_int_get_xirr(), opal_int_eoi()
and opal_int_set_mfrr(), for use by KVM real-mode code.

It also exports opal_int_set_mfrr() so that the modular part of KVM
can use it to send IPIs.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-23 10:32:11 +11:00
Paul Mackerras
9d66195807 powerpc/64: Provide functions for accessing POWER9 partition table
POWER9 requires the host to set up a partition table, which is a
table in memory indexed by logical partition ID (LPID) which
contains the pointers to page tables and process tables for the
host and each guest.

This factors out the initialization of the partition table into
a single function.  This code was previously duplicated between
hash_utils_64.c and pgtable-radix.c.

This provides a function for setting a partition table entry,
which is used in early MMU initialization, and will be used by
KVM whenever a guest is created.  This function includes a tlbie
instruction which will flush all TLB entries for the LPID and
all caches of the partition table entry for the LPID, across the
system.

This also moves a call to memblock_set_current_limit(), which was
in radix_init_partition_table(), but has nothing to do with the
partition table.  By analogy with the similar code for hash, the
call gets moved to near the end of radix__early_init_mmu().  It
now gets called when running as a guest, whereas previously it
would only be called if the kernel is running as the host.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-23 10:32:11 +11:00
Pan Xinhui
41946c8687 locking/core, powerpc: Implement vcpu_is_preempted(cpu)
Optimize spinlock and mutex busy-loops by providing a vcpu_is_preempted(cpu)
function on pSeries. We do not support PowerNV.

All this can be achieved by using lppaca->yield_count, which is zero on PowerNV.

Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: David.Laight@ACULAB.COM
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: benh@kernel.crashing.org
Cc: borntraeger@de.ibm.com
Cc: bsingharora@gmail.com
Cc: dave@stgolabs.net
Cc: jgross@suse.com
Cc: kernellwp@gmail.com
Cc: konrad.wilk@oracle.com
Cc: linuxppc-dev@lists.ozlabs.org
Cc: mpe@ellerman.id.au
Cc: paulmck@linux.vnet.ibm.com
Cc: paulus@samba.org
Cc: pbonzini@redhat.com
Cc: rkrcmar@redhat.com
Cc: virtualization@lists.linux-foundation.org
Cc: will.deacon@arm.com
Cc: xen-devel-request@lists.xenproject.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1478077718-37424-5-git-send-email-xinhui.pan@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-22 12:48:06 +01:00
Ingo Molnar
02cb689b2c Merge branch 'linus' into locking/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-22 12:37:38 +01:00
Nicholas Piggin
9e5f688422 powerpc: Fix missing CRCs, add more asm-prototypes.h declarations
After patch 4efca4ed0 ("kbuild: modversions for EXPORT_SYMBOL() for asm"),
asm exports can get modversions CRCs generated if they have C definitions
in asm-prototypes.h. This patch adds missing definitions for 32 and 64 bit
allmodconfig builds.

Fixes: 9445aa1a30 ("ppc: move exports to definitions")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-22 22:25:41 +11:00
Benjamin Herrenschmidt
7a43906f5c powerpc: Set missing wakeup bit in LPCR on POWER9
There is a new bit, LPCR_PECE_HVEE (Hypervisor Virtualization Exit
Enable), which controls wakeup from STOP states on Hypervisor
Virtualization Interrupts (which happen to also be all external
interrupts in host or bare metal mode).

It needs to be set or we will miss wakeups.

Fixes: 9baaef0a22 ("powerpc/irq: Add support for HV virtualization interrupts")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Rename it to HVEE to match the name in the ISA]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-22 14:53:27 +11:00
Russell Currey
6654c9368a powerpc/eeh: Refactor EEH PE reset functions
eeh_pe_reset and eeh_reset_pe are two different functions in the same
file which do mostly the same thing.  Not only is this confusing, but
potentially causes disrepancies in functionality, notably eeh_reset_pe
as it does not check return values for failure.

Refactor this into the following:

 - eeh_pe_reset(): stays as is, performs a single operation, exported
 - eeh_pe_reset_full(): new, full reset process that calls eeh_pe_reset()
 - eeh_reset_pe(): removed and replaced by eeh_pe_reset_full()
 - eeh_reset_pe_once(): removed

Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-22 11:57:08 +11:00
Paul Mackerras
7fd317f8c3 powerpc/64: Add some more SPRs and SPR bits for POWER9
These definitions will be needed by KVM.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-22 10:19:31 +11:00
Paul Mackerras
0d808df06a KVM: PPC: Book3S HV: Save/restore XER in checkpointed register state
When switching from/to a guest that has a transaction in progress,
we need to save/restore the checkpointed register state.  Although
XER is part of the CPU state that gets checkpointed, the code that
does this saving and restoring doesn't save/restore XER.

This fixes it by saving and restoring the XER.  To allow userspace
to read/write the checkpointed XER value, we also add a new ONE_REG
specifier.

The visible effect of this bug is that the guest may see its XER
value being corrupted when it uses transactions.

Fixes: e4e3812150 ("KVM: PPC: Book3S HV: Add transactional memory support")
Fixes: 0a8eccefcb ("KVM: PPC: Book3S HV: Add missing code for transaction reclaim on guest exit")
Cc: stable@vger.kernel.org # v3.15+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-11-21 15:17:55 +11:00
Yongji Xie
a56ee9f8f0 KVM: PPC: Book3S HV: Add a per vcpu cache for recently page faulted MMIO entries
This keeps a per vcpu cache for recently page faulted MMIO entries.
On a page fault, if the entry exists in the cache, we can avoid some
time-consuming paths, for example, looking up HPT, locking HPTE twice
and searching mmio gfn from memslots, then directly call
kvmppc_hv_emulate_mmio().

In current implenment, we limit the size of cache to four. We think
it's enough to cover the high-frequency MMIO HPTEs in most case.
For example, considering the case of using virtio device, for virtio
legacy devices, one HPTE could handle notifications from up to
1024 (64K page / 64 byte Port IO register) devices, so one cache entry
is enough; for virtio modern devices, we always need one HPTE to handle
notification for each device because modern device would use a 8M MMIO
register to notify host instead of Port IO register, typically the
system's configuration should not exceed four virtio devices per
vcpu, four cache entry is also enough in this case. Of course, if needed,
we could also modify the macro to a module parameter in the future.

Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-11-21 15:17:55 +11:00
Daniel Axtens
ebe4535fbe KVM: PPC: Book3S HV: sparse: prototypes for functions called from assembler
A bunch of KVM functions are only called from assembler.
Give them prototypes in asm-prototypes.h
This reduces sparse warnings.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-11-21 15:17:54 +11:00
Linus Torvalds
f6918382c7 Merge tag 'powerpc-4.9-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
 "Fixes marked for stable:
   - fix system reset interrupt winkle wakeups
   - fix setting of AIL in hypervisor mode

  Fixes for code merged this cycle:
   - fix exception vector build with 2.23 era binutils
   - fix missing update of HID register on secondary CPUs

  Other:
   - fix missing pr_cont()s
   - invalidate ERAT on tlbiel for POWER9 DD1"

* tag 'powerpc-4.9-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/mm: Fix missing update of HID register on secondary CPUs
  powerpc/mm/radix: Invalidate ERAT on tlbiel for POWER9 DD1
  powerpc/64: Fix setting of AIL in hypervisor mode
  powerpc/oops: Fix missing pr_cont()s in instruction dump
  powerpc/oops: Fix missing pr_cont()s in show_regs()
  powerpc/oops: Fix missing pr_cont()s in print_msr_bits() et. al.
  powerpc/oops: Fix missing pr_cont()s in show_stack()
  powerpc: Fix exception vector build with 2.23 era binutils
  powerpc/64s: Fix system reset interrupt winkle wakeups
2016-11-19 11:21:59 -08:00
Christophe Leroy
c05f69a3dc powerpc/64: get rid of MIN_HUGEPTE_SHIFT
MIN_HUGEPTE_SHIFT hasn't been used since commit d1837cba5d
("powerpc/mm: Cleanup initialization of hugepages on powerpc")

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-18 22:40:37 +11:00
Michael Neuling
96ed1fe511 powerpc/mm/radix: Invalidate ERAT on tlbiel for POWER9 DD1
On POWER9 DD1, when we do a local TLB invalidate we also need to explicitly
invalidate the ERAT.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-18 15:12:24 +11:00
Christian Borntraeger
6d0d287891 locking/core: Provide common cpu_relax_yield() definition
No need to duplicate the same define everywhere. Since
the only user is stop-machine and the only provider is
s390, we can use a default implementation of cpu_relax_yield()
in sched.h.

Suggested-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: kvm@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: linux-s390 <linux-s390@vger.kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: sparclinux@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1479298985-191589-1-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-17 08:17:36 +01:00
Michael Ellerman
8f272a5dd6 powerpc/pseries: Move CMO code from plapr_wrappers.h to platforms/pseries
Currently there's some CMO (Cooperative Memory Overcommit) code, in
plpar_wrappers.h. Some of it is #ifdef CONFIG_PSERIES and some of it
isn't. The end result being if a file includes plpar_wrappers.h it won't
build with CONFIG_PSERIES=n.

Fix it by moving the CMO code into platforms/pseries. The two hcall
wrappers can just be moved into their only caller, cmm.c, and the
accessors can go in pseries.h.

Note we need the accessors because cmm.c can be built as a module, so
there needs to be a split between the built-in code vs the module, and
that's achieved by using those accessors.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-17 17:11:46 +11:00
Christian Borntraeger
5bd0b85ba8 locking/core, arch: Remove cpu_relax_lowlatency()
As there are no users left, we can remove cpu_relax_lowlatency()
implementations from every architecture.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Cc: <linux-arch@vger.kernel.org>
Link: http://lkml.kernel.org/r/1477386195-32736-6-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-16 10:15:11 +01:00
Christian Borntraeger
79ab11cdb9 locking/core: Introduce cpu_relax_yield()
For spinning loops people do often use barrier() or cpu_relax().
For most architectures cpu_relax and barrier are the same, but on
some architectures cpu_relax can add some latency.
For example on power,sparc64 and arc, cpu_relax can shift the CPU
towards other hardware threads in an SMT environment.
On s390 cpu_relax does even more, it uses an hypercall to the
hypervisor to give up the timeslice.
In contrast to the SMT yielding this can result in larger latencies.
In some places this latency is unwanted, so another variant
"cpu_relax_lowlatency" was introduced. Before this is used in more
and more places, lets revert the logic and provide a cpu_relax_yield
that can be called in places where yielding is more important than
latency. By default this is the same as cpu_relax on all architectures.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1477386195-32736-2-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-16 10:15:09 +01:00
Paul Mackerras
6b243fcfb5 powerpc/64: Simplify adaptation to new ISA v3.00 HPTE format
This changes the way that we support the new ISA v3.00 HPTE format.
Instead of adapting everything that uses HPTE values to handle either
the old format or the new format, depending on which CPU we are on,
we now convert explicitly between old and new formats if necessary
in the low-level routines that actually access HPTEs in memory.
This limits the amount of code that needs to know about the new
format and makes the conversions explicit.  This is OK because the
old format contains all the information that is in the new format.

This also fixes operation under a hypervisor, because the H_ENTER
hypercall (and other hypercalls that deal with HPTEs) will continue
to require the HPTE value to be supplied in the old format.  At
present the kernel will not boot in HPT mode on POWER9 under a
hypervisor.

This fixes and partially reverts commit 50de596de8
("powerpc/mm/hash: Add support for Power9 Hash", 2016-04-29).

Fixes: 50de596de8 ("powerpc/mm/hash: Add support for Power9 Hash")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-16 15:43:25 +11:00
Stanislaw Gruszka
981ee2d444 sched/cputime, powerpc: Remove cputime_to_scaled()
Currently cputime_to_scaled() just return it's argument on
all implementations, we don't need to call this function.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Paul Mackerras <paulus@ozlabs.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1479175612-14718-3-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-15 09:51:04 +01:00
Stanislaw Gruszka
7008eb997b sched/cputime, powerpc: Remove cputime_last_delta global variable
Since commit:

  cf9efce0ce ("powerpc: Account time using timebase rather than PURR")

cputime_last_delta is not initialized to other value than 0, hence it's
not used except zero check and cputime_to_scaled() just returns
the argument.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul Mackerras <paulus@ozlabs.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1479175612-14718-2-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-15 09:51:04 +01:00
Michael Neuling
29a969b764 powerpc: Revert Load Monitor Register Support
Load monitored is no longer supported on POWER9 so let's remove the
code.

This reverts commit bd3ea317fd ("powerpc: Load Monitor Register
Support").

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-14 20:05:57 +11:00
Anton Blanchard
5246adec59 powerpc/pseries: Use H_CLEAR_HPT to clear MMU hash table during kexec
An hcall was recently added that does exactly what we need during kexec
- it clears the entire MMU hash table, ignoring any VRMA mappings.

Try it and fall back to the old method if we get a failure.

On a POWER8 box with 5TB of memory, this reduces the time it takes to
kexec a new kernel from from 4 minutes to 1 minute.

Signed-off-by: Anton Blanchard <anton@samba.org>
Tested-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
[mpe: Split into separate functions and tweak function naming]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-14 11:11:51 +11:00
Nicholas Piggin
c0a5149105 powerpc: Make _ASM_NOKPROBE_SYMBOL a noop when KPROBES not defined
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-14 11:11:51 +11:00
Nicholas Piggin
5b9ff02785 powerpc: Build-time sort the exception table
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-14 11:11:51 +11:00
Nicholas Piggin
61a92f7031 powerpc: Add support for relative exception tables
This halves the exception table size on 64-bit builds, and it allows
build-time sorting of exception tables to work on relocated kernels.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Minor asm fixups and bits to keep the selftests working]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-14 11:11:51 +11:00
Nicholas Piggin
24bfa6a9e0 powerpc: EX_TABLE macro for exception tables
This macro is taken from s390, and allows more flexibility in
changing exception table format.

mpe: Put it in ppc_asm.h and only define one version using
stringinfy_in_c(). Add some empty definitions and headers to keep the
selftests happy.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-14 11:11:51 +11:00
Michael Ellerman
e3f2c6c393 powerpc/asm: Allow including ppc_asm.h in asm files
There's no reason to #error if we include ppc_asm.h in asm files, the
ifdef already prevents any problems.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-14 11:11:51 +11:00
Nicholas Piggin
f4329f2ecb powerpc/64s: Reduce exception alignment
Exception handlers are aligned to 128 bytes (L1 cache) on 64s, which is
overkill. It can reduce the icache footprint of any individual exception
path. However taken as a whole, the expansion in icache footprint seems
likely to be counter-productive and cause more total misses.

Create IFETCH_ALIGN_SHIFT/BYTES, which should give optimal ifetch
alignment with much more reasonable alignment. This saves 1792 bytes
from head_64.o text with an allmodconfig build.

Other subarchitectures should define appropriate IFETCH_ALIGN_SHIFT
values if this becomes more widely used.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-14 11:11:51 +11:00
Rui Teng
fda0440d82 powerpc: Remove suspect CONFIG_PPC_BOOK3E #ifdefs in nohash/64/pgtable.h
There are three #ifdef CONFIG_PPC_BOOK3E sections in nohash/64/pgtable.h.
And there should be no configurations possible which use nohash/64/pgtable.h
but don't also enable CONFIG_PPC_BOOK3E.

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Rui Teng <rui.teng@linux.vnet.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-14 11:11:51 +11:00
Hugh Dickins
e6740ae631 powerpc: Fix exception vector build with 2.23 era binutils
The changes to use gas sections for constructing the exception vectors
causes a build break when using binutils 2.23:

  arch/powerpc/kernel/exceptions-64s.S:770: Error: operand out of range
  (0xffffffffffff8100 is not between 0x0000000000000000 and 0x000000000000ffff)

And so on.

Reported by Hugh with binutils-2.23.2-8.1.4.ppc64 from openSUSE 13.1 and
also Naveen & Denis using 2.23.52.0.1-26.el7 from RHEL 7. Strangely
binutils 2.22 (what I test with) is not affected.

This is caused by the use of @l in LOAD_HANDLER(). The @l was only
recently added in commit a24553dd02 ("powerpc/pseries: Remove
unnecessary syscall trampoline").

Luckily the gas section changes split out the LOAD_SYSCALL_HANDLER()
macro, which means we actually *don't* need to use @l in LOAD_HANDLER()
any more, only in LOAD_SYSCALL_HANDLER().

So drop the @l from LOAD_HANDLER().

Fixes: 57f266497d ("powerpc: Use gas sections for arranging exception vectors")
Signed-off-by: Hugh Dickins <hughd@google.com>
[mpe: Add gory details to change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-12 20:12:49 +11:00
Nicholas Piggin
f23ed166f2 powerpc/64s: Fix system reset interrupt winkle wakeups
Wakeups from winkle set the low bit of the HSPRG0 register, to
distinguish it from other sleep states. This is also the PACA pointer.
The system reset exception handler fails to mask this bit away before
using this value before using it as the PACA pointer.

Fix this by adding a new type of exception prolog macro where we already
have the PACA set in r13, and have the system reset vector mask it out.
The winkle wakeup handler will store the masked value back into HSPRG0.

Fixes: fb479e44a9 ("powerpc/64s: relocation, register save fixes for system reset interrupt")
Cc: stable@vger.kernel.org # v3.0+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-12 20:12:42 +11:00
Ingo Molnar
4c8ee71620 Merge branch 'linus' into locking/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-11 08:25:07 +01:00