Commit Graph

6340 Commits

Author SHA1 Message Date
Nicholas Piggin
771d4304d0 powerpc/64s/idle: Process interrupts from system reset wakeup
When the CPU wakes from low power state, it begins at the system reset
interrupt with the exception that caused the wakeup encoded in SRR1.

Today, powernv idle wakeup ignores the wakeup reason (except a special
case for HMI), and the regular interrupt corresponding to the
exception will fire after the idle wakeup exits.

Change this to replay the interrupt from the idle wakeup before
interrupts are hard-enabled.

Test on POWER8 of context_switch selftests benchmark with polling idle
disabled (e.g., always nap, giving cross-CPU IPIs) gives the following
results:

                                original         wakeup direct
Different threads, same core:   315k/s           264k/s
Different cores:                235k/s           242k/s

There is a slowdown for doorbell IPI (same core) case because system
reset wakeup does not clear the message and the doorbell interrupt
fires again needlessly.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-19 19:46:27 +10:00
Nicholas Piggin
2201f994a5 powerpc/64s/idle: Move soft interrupt mask logic into C code
This simplifies the asm and fixes irq-off tracing over sleep
instructions.

Also move powersave_nap check for POWER8 into C code, and move
PSSCR register value calculation for POWER9 into C.

Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-19 19:46:26 +10:00
Paul Mackerras
579006944e KVM: PPC: Book3S HV: Virtualize doorbell facility on POWER9
On POWER9, we no longer have the restriction that we had on POWER8
where all threads in a core have to be in the same partition, so
the CPU threads are now independent.  However, we still want to be
able to run guests with a virtual SMT topology, if only to allow
migration of guests from POWER8 systems to POWER9.

A guest that has a virtual SMT mode greater than 1 will expect to
be able to use the doorbell facility; it will expect the msgsndp
and msgclrp instructions to work appropriately and to be able to read
sensible values from the TIR (thread identification register) and
DPDES (directed privileged doorbell exception status) special-purpose
registers.  However, since each CPU thread is a separate sub-processor
in POWER9, these instructions and registers can only be used within
a single CPU thread.

In order for these instructions to appear to act correctly according
to the guest's virtual SMT mode, we have to trap and emulate them.
We cause them to trap by clearing the HFSCR_MSGP bit in the HFSCR
register.  The emulation is triggered by the hypervisor facility
unavailable interrupt that occurs when the guest uses them.

To cause a doorbell interrupt to occur within the guest, we set the
DPDES register to 1.  If the guest has interrupts enabled, the CPU
will generate a doorbell interrupt and clear the DPDES register in
hardware.  The DPDES hardware register for the guest is saved in the
vcpu->arch.vcore->dpdes field.  Since this gets written by the guest
exit code, other VCPUs wishing to cause a doorbell interrupt don't
write that field directly, but instead set a vcpu->arch.doorbell_request
flag.  This is consumed and set to 0 by the guest entry code, which
then sets DPDES to 1.

Emulating reads of the DPDES register is somewhat involved, because
it requires reading the doorbell pending interrupt status of all of the
VCPU threads in the virtual core, and if any of those VCPUs are
running, their doorbell status is only up-to-date in the hardware
DPDES registers of the CPUs where they are running.  In order to get
a reasonable approximation of the current doorbell status, we send
those CPUs an IPI, causing an exit from the guest which will update
the vcpu->arch.vcore->dpdes field.  We then use that value in
constructing the emulated DPDES register value.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-06-19 14:34:37 +10:00
Paul Mackerras
769377f77c KVM: PPC: Book3S HV: Context-switch HFSCR between host and guest on POWER9
This adds code to allow us to use a different value for the HFSCR
(Hypervisor Facilities Status and Control Register) when running the
guest from that which applies in the host.  The reason for doing this
is to allow us to trap the msgsndp instruction and related operations
in future so that they can be virtualized.  We also save the value of
HFSCR when a hypervisor facility unavailable interrupt occurs, because
the high byte of HFSCR indicates which facility the guest attempted to
access.

We save and restore the host value on guest entry/exit because some
bits of it affect host userspace execution.

We only do all this on POWER9, not on POWER8, because we are not
intending to virtualize any of the facilities controlled by HFSCR on
POWER8.  In particular, the HFSCR bit that controls execution of
msgsndp and related operations does not exist on POWER8.  The HFSCR
doesn't exist at all on POWER7.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-06-19 14:08:02 +10:00
Naveen N. Rao
d89ba5353f powerpc/64s: Handle data breakpoints in Radix mode
On Power9, trying to use data breakpoints throws the splat shown
below. This is because the check for a data breakpoint in DSISR is in
do_hash_page(), which is not called when in Radix mode.

  Unable to handle kernel paging request for data at address 0xc000000000e19218
  Faulting instruction address: 0xc0000000001155e8
  cpu 0x0: Vector: 300 (Data Access) at [c0000000ef1e7b20]
  pc: c0000000001155e8: find_pid_ns+0x48/0xe0
  lr: c000000000116ac4: find_task_by_vpid+0x44/0x90
  sp: c0000000ef1e7da0
  msr: 9000000000009033
  dar: c000000000e19218
  dsisr: 400000

Move the check to handle_page_fault() so as to catch data breakpoints
in both Hash and Radix MMU modes.

We have to change the check in do_hash_page() against 0xa410 to use
0xa450, so as to include the value of (DSISR_DABRMATCH << 16).

There are two sites that call handle_page_fault() when in Radix, both
already pass DSISR in r4.

Fixes: caca285e5a ("powerpc/mm/radix: Use STD_MMU_64 to properly isolate hash related code")
Cc: stable@vger.kernel.org # v4.7+
Reported-by: Shriya R. Kulkarni <shriykul@in.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
[mpe: Fix the fall-through case on hash, we need to reload DSISR]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-16 19:49:43 +10:00
Naveen N. Rao
c05b8c4474 powerpc/kprobes: Skip livepatch_handler() for jprobes
ftrace_caller() depends on a modified regs->nip to detect if a certain
function has been livepatched. However, with KPROBES_ON_FTRACE, it is
possible for regs->nip to have been modified by the kprobes pre_handler
(jprobes, for instance). In this case, we do not want to invoke the
livepatch_handler so as not to consume the livepatch stack.

To distinguish between the two (kprobes and livepatch), we check if
there is an active kprobe on the current function. If there is, then we
know for sure that it must have modified the NIP as we don't support
livepatching a kprobe'd function. In this case, we simply skip the
livepatch_handler and branch to the new NIP. Otherwise, the
livepatch_handler is invoked.

Fixes: ead514d5fb ("powerpc/kprobes: Add support for KPROBES_ON_FTRACE")
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-16 19:49:43 +10:00
Naveen N. Rao
a4979a7e71 powerpc/ftrace: Pass the correct stack pointer for DYNAMIC_FTRACE_WITH_REGS
For DYNAMIC_FTRACE_WITH_REGS, we should be passing-in the original set
of registers in pt_regs, to capture the state _before_ ftrace_caller.
However, we are instead passing the stack pointer *after* allocating a
stack frame in ftrace_caller. Fix this by saving the proper value of r1
in pt_regs. Also, use SAVE_10GPRS() to simplify the code.

Fixes: 153086644f ("powerpc/ftrace: Add support for -mprofile-kernel ftrace ABI")
Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-16 19:49:43 +10:00
Naveen N. Rao
a9f8553e93 powerpc/kprobes: Pause function_graph tracing during jprobes handling
This fixes a crash when function_graph and jprobes are used together.
This is essentially commit 237d28db03 ("ftrace/jprobes/x86: Fix
conflict between jprobes and function graph tracing"), but for powerpc.

Jprobes breaks function_graph tracing since the jprobe hook needs to use
jprobe_return(), which never returns back to the hook, but instead to
the original jprobe'd function. The solution is to momentarily pause
function_graph tracing before invoking the jprobe hook and re-enable it
when returning back to the original jprobe'd function.

Fixes: 6794c78243 ("powerpc64: port of the function graph tracer")
Cc: stable@vger.kernel.org # v2.6.30+
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-16 19:49:43 +10:00
Nicholas Piggin
07d2a628bc powerpc/64s: Avoid cpabort in context switch when possible
The ISA v3.0B copy-paste facility only requires cpabort when switching
to a process that has foreign real addresses mapped (direct access to
accelerators), to clear a potential copy buffer filled by a previous
thread. There is no accelerator driver implemented yet, so cpabort can
be removed. It can be be re-added when a driver is implemented.

POWER9 DD1 requires the copy buffer to always be cleared on context
switch, but if accelerators are not in use, then an unpaired copy from
a dummy region is sufficient to clear data out of the copy buffer.

This increases context switch performance by about 5% on POWER9.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-15 16:34:39 +10:00
Nicholas Piggin
9145effd62 powerpc/64: Drop explicit hwsync in context switch
The sync (aka. hwsync, aka. heavyweight sync) in the context switch
code to prevent MMIO access being reordered from the point of view of
a single process if it gets migrated to a different CPU is not
required because there is an hwsync performed earlier in the context
switch path.

Comment this so it's clear enough if anything changes on the scheduler
or the powerpc sides. Remove the hwsync from _switch.

This improves context switch performance by 2-3% on POWER8.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-15 16:34:39 +10:00
Nicholas Piggin
837e72f78a powerpc/64: Drop reservation-clearing ldarx in context switch
There is no need to explicitly break the reservation in _switch,
because we are guaranteed that the context switch path will include a
larx/stcx.

Comment the guarantee and remove the reservation clear from _switch.

This is worth 1-2% in context switch performance.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-15 16:34:39 +10:00
Nicholas Piggin
e4c0fc5f72 powerpc/64s: Leave interrupts hard enabled in context switch for radix
Commit 4387e9ff25 ("[POWERPC] Fix PMU + soft interrupt disable bug")
hard disabled interrupts over the low level context switch, because
the SLB management can't cope with a PMU interrupt accesing the stack
in that window.

Radix based kernel mapping does not use the SLB so it does not require
interrupts hard disabled here.

This is worth 1-2% in context switch performance on POWER9.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-15 16:34:39 +10:00
Nicholas Piggin
bc4f65e4cf powerpc/64: Avoid restore_math call if possible in syscall exit
The syscall exit code that branches to restore_math is quite heavy on
Book3S, consisting of 2 mtmsr instructions. Threads that don't use both
FP and vector can get caught here if the kernel ever uses FP or vector.
Lazy-FP/vec context switching also trips this case.

So check for lazy FP and vector before switching RI for restore_math.
Move most of this case out of line.

For threads that do want to restore math registers, the MSR switches are
still suboptimal. Future direction may be to use a soft-RI bit to avoid
MSR switches in kernel (similar to soft-EE), but for now at least the
no-restore

POWER9 context switch rate increases by about 5% due to sched_yield(2)
return performance. I haven't constructed a test to measure the syscall
cost.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-15 16:34:39 +10:00
Nicholas Piggin
acd7d8cef0 powerpc/64s: Optimize hypercall/syscall entry
After bc3551257a ("powerpc/64: Allow for relocation-on interrupts from
guest to host"), a getppid() system call goes from 307 cycles to 358
cycles (+17%) on POWER8. This is due significantly to the scratch SPR
used by the hypercall check.

It turns out there are a some volatile registers common to both system
call and hypercall (in particular, r12, cr0, ctr), which can be used to
avoid the SPR and some other overheads. This brings getppid to 320 cycles
(+4%).

Testing hcall entry performance by running "sc 1" in guest userspace
before this patch is 854 cycles, afterwards is 826. Also a small win
there.

POWER9 syscall is improved by about the same amount, hcall not tested.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-15 16:34:39 +10:00
Michael Ellerman
0edc2ca9cc Revert "powerpc: Handle simultaneous interrupts at once"
This reverts commit 45cb08f479.

For some reason this is causing IRQ problems on Freescale Book3E
machines, eg on my p5020ds:

  irq 25: nobody cared (try booting with the "irqpoll" option)
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc3-gcc-6.3.1-00037-g45cb08f4791c #624
  Call Trace:
  [c0000000fffdbb10] [c00000000049962c] .dump_stack+0xa8/0xe8 (unreliable)
  [c0000000fffdbba0] [c0000000000babf4] .__report_bad_irq+0x54/0x140
  [c0000000fffdbc40] [c0000000000bb11c] .note_interrupt+0x324/0x380
  [c0000000fffdbd00] [c0000000000b7110] .handle_irq_event_percpu+0x68/0x88
  [c0000000fffdbd90] [c0000000000b718c] .handle_irq_event+0x5c/0xa8
  [c0000000fffdbe10] [c0000000000bc01c] .handle_fasteoi_irq+0xe4/0x298
  [c0000000fffdbe90] [c0000000000b59c4] .generic_handle_irq+0x50/0x74
  [c0000000fffdbf10] [c0000000000075d8] .__do_irq+0x74/0x1f0
  [c0000000fffdbf90] [c0000000000189f8] .call_do_irq+0x14/0x24
  [c0000000f7173060] [c0000000000077e4] .do_IRQ+0x90/0x120
  [c0000000f7173100] [c00000000001d93c] exc_0x500_common+0xfc/0x100
  --- interrupt: 501 at .prepare_to_wait_event+0xc/0x14c
      LR = .fsl_elbc_run_command+0xc8/0x23c
  [c0000000f71734d0] [c00000000065f418] .nand_reset+0xb8/0x168
  [c0000000f7173560] [c00000000065fec4] .nand_scan_ident+0x2b0/0x1638
  [c0000000f7173650] [c000000000666cd8] .fsl_elbc_nand_probe+0x34c/0x5f0
  ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
  [c0000000f7173750] [c0000000005a3c60] .platform_drv_probe+0x64/0xb0
  [c0000000f71737d0] [c0000000005a12e0] .really_probe+0x290/0x334
  [c0000000f7173870] [c0000000005a14a0] .__driver_attach+0x11c/0x120
  [c0000000f7173900] [c00000000059e6a0] .bus_for_each_dev+0x98/0xfc
  [c0000000f71739a0] [c0000000005a0b3c] .driver_attach+0x34/0x4c
  [c0000000f7173a20] [c0000000005a04b0] .bus_add_driver+0x1ac/0x2e0
  [c0000000f7173ac0] [c0000000005a2170] .driver_register+0x94/0x160
  [c0000000f7173b40] [c0000000005a3be0] .__platform_driver_register+0x60/0x7c
  [c0000000f7173bc0] [c000000000d6aab4] .fsl_elbc_nand_driver_init+0x24/0x38
  [c0000000f7173c30] [c000000000001934] .do_one_initcall+0x68/0x1b8
  [c0000000f7173d00] [c000000000d210f8] .kernel_init_freeable+0x260/0x338
  [c0000000f7173db0] [c0000000000021b0] .kernel_init+0x20/0xe70
  [c0000000f7173e30] [c0000000000009bc] .ret_from_kernel_thread+0x58/0x9c
  handlers:
  [<c000000000ed85c8>] .fsl_lbc_ctrl_irq
  Disabling IRQ #25

Ben also had concerns with the implementation being potentially slow on
some PICs, so revert it for now.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-15 16:20:46 +10:00
Aneesh Kumar K.V
92d9dfda8b powerpc/mm/4k: Limit 4k page size config to 64TB virtual address space
Supporting 512TB requires us to do a order 3 allocation for level 1 page
table (pgd). This results in page allocation failures with certain workloads.
For now limit 4k linux page size config to 64TB.

Fixes: f6eedbba7a ("powerpc/mm/hash: Increase VA range to 128TB")
Reported-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-08 20:42:56 +10:00
Michael Ellerman
ba4a648f12 powerpc/numa: Fix percpu allocations to be NUMA aware
In commit 8c27226119 ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID"), we
switched to the generic implementation of cpu_to_node(), which uses a percpu
variable to hold the NUMA node for each CPU.

Unfortunately we neglected to notice that we use cpu_to_node() in the allocation
of our percpu areas, leading to a chicken and egg problem. In practice what
happens is when we are setting up the percpu areas, cpu_to_node() reports that
all CPUs are on node 0, so we allocate all percpu areas on node 0.

This is visible in the dmesg output, as all pcpu allocs being in group 0:

  pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07
  pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15
  pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23
  pcpu-alloc: [0] 24 25 26 27 [0] 28 29 30 31
  pcpu-alloc: [0] 32 33 34 35 [0] 36 37 38 39
  pcpu-alloc: [0] 40 41 42 43 [0] 44 45 46 47

To fix it we need an early_cpu_to_node() which can run prior to percpu being
setup. We already have the numa_cpu_lookup_table we can use, so just plumb it
in. With the patch dmesg output shows two groups, 0 and 1:

  pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07
  pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15
  pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23
  pcpu-alloc: [1] 24 25 26 27 [1] 28 29 30 31
  pcpu-alloc: [1] 32 33 34 35 [1] 36 37 38 39
  pcpu-alloc: [1] 40 41 42 43 [1] 44 45 46 47

We can also check the data_offset in the paca of various CPUs, with the fix we
see:

  CPU 0:  data_offset = 0x0ffe8b0000
  CPU 24: data_offset = 0x1ffe5b0000

And we can see from dmesg that CPU 24 has an allocation on node 1:

  node   0: [mem 0x0000000000000000-0x0000000fffffffff]
  node   1: [mem 0x0000001000000000-0x0000001fffffffff]

Cc: stable@vger.kernel.org # v3.16+
Fixes: 8c27226119 ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-06 21:19:46 +10:00
Nicholas Piggin
90df4bfb4d powerpc/64s: Machine check handle ifetch from foreign real address for POWER9
The i-side 0111b machine check, which is "Instruction Fetch to foreign
address space", was missed by 7b9f71f974 ("powerpc/64s: POWER9 machine
check handler").

    The POWER9 processor core considers host real addresses with a
    nonzero value in RA(8:12) as foreign address space, accessible only
    by the copy and paste instructions. The copy and paste instruction
    pair can be used to invoke the Nest accelerators via the Virtual
    Accelerator Switchboard (VAS).

It is an error for any regular load/store or ifetch to go to a foreign
addresses. When relocation is on, this causes an MMU exception. When
relocation is off, a machine check exception. It is possible to trigger
this machine check by branching to a foreign address with MSR[IR]=0.

Fixes: 7b9f71f974 ("powerpc/64s: POWER9 machine check handler")
Reported-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-06 21:17:15 +10:00
Breno Leitao
7f22ced437 powerpc/kernel: Initialize load_tm on task creation
Currently tsk->thread.load_tm is not initialized in the task creation
and can contain garbage on a new task.

This is an undesired behaviour, since it affects the timing to enable
and disable the transactional memory laziness (disabling and enabling
the MSR TM bit, which affects TM reclaim and recheckpoint in the
scheduling process).

Fixes: 5d176f751e ("powerpc: tm: Enable transactional memory (TM) lazily for userspace")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-06 19:09:22 +10:00
Breno Leitao
1195892c09 powerpc/kernel: Fix FP and vector register restoration
Currently tsk->thread->load_vec and load_fp are not initialized during
task creation, which can lead to garbage values in these variables (non-zero
values).

These variables will be checked later in restore_math() to validate if the
FP and vector registers are being utilized. Since these values might be
non-zero, the restore_math() will continue to save the FP and vectors even if
they were never utilized by the userspace application. load_fp and load_vec
counters will then overflow (they wrap at 255) and the FP and Altivec will be
finally disabled, but before that condition is reached (counter overflow)
several context switches will have restored FP and vector registers without
need, causing a performance degradation.

Fixes: 70fe3d980f ("powerpc: Restore FPU/VEC/VSX if previously used")
Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Gustavo Romero <gusbromero@gmail.com>
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-05 15:55:30 +10:00
Hari Bathini
48a316e350 powerpc/fadump: Set an upper limit for boot memory size
By default, 5% of system RAM is reserved for preserving boot memory.
Alternatively, a user can specify the amount of memory to reserve.
See Documentation/powerpc/firmware-assisted-dump.txt for details. In
addition to the memory reserved for preserving boot memory, some more
memory is reserved, to save HPTE region, CPU state data and ELF core
headers.

Memory Reservation during first kernel looks like below:

  Low memory                                        Top of memory
  0      boot memory size                                       |
  |           |                       |<--Reserved dump area -->|
  V           V                       |   Permanent Reservation V
  +-----------+----------/ /----------+---+----+-----------+----+
  |           |                       |CPU|HPTE|  DUMP     |ELF |
  +-----------+----------/ /----------+---+----+-----------+----+
        |                                           ^
        |                                           |
        \                                           /
         -------------------------------------------
          Boot memory content gets transferred to
          reserved area by firmware at the time of
          crash

This implicitly means that the sum of the sizes of boot memory, CPU
state data, HPTE region, DUMP preserving area and ELF core headers
can't be greater than the total memory size. But currently, a user is
allowed to specify any value as boot memory size. So, the above rule
is violated when a boot memory size around 50% of the total available
memory is specified. As the kernel is not handling this currently, it
may lead to undefined behavior. Fix it by setting an upper limit for
boot memory size to 25% of the total available memory. Also, instead
of using memblock_end_of_DRAM(), which doesn't take the holes, if any,
in the memory layout into account, use memblock_phys_mem_size() to
calculate the percentage of total available memory.

Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-02 20:16:50 +10:00
Hari Bathini
e7467dc694 powerpc/fadump: Update comment about offset where fadump is reserved
With commit f6e6bedb77 ("powerpc/fadump: Reserve memory at an offset
closer to bottom of RAM"), memory for fadump is no longer reserved at
the top of RAM. But there are still a few places which say so. Change
them appropriately.

Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-02 20:16:49 +10:00
Hari Bathini
81d9eca502 powerpc/fadump: Add a warning when 'fadump_reserve_mem=' is used
With commit 11550dc0a0 ("powerpc/fadump: reuse crashkernel parameter
for fadump memory reservation"), 'fadump_reserve_mem=' parameter is
deprecated in favor of 'crashkernel=' parameter. Add a warning if
'fadump_reserve_mem=' is still used.

Fixes: 11550dc0a0 ("powerpc/fadump: reuse crashkernel parameter for fadump memory reservation")
Suggested-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
[mpe: Unsplit long printk strings]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-02 20:16:35 +10:00
Michal Suchanek
98b8cd7f75 powerpc/fadump: Return error when fadump registration fails
- log an error message when registration fails and no error code listed
   in the switch is returned
 - translate the hv error code to posix error code and return it from
   fw_register
 - return the posix error code from fw_register to the process writing
   to sysfs
 - return EEXIST on re-registration
 - return success on deregistration when fadump is not registered
 - return ENODEV when no memory is reserved for fadump

Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Tested-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
[mpe: Use pr_err() to shrink the error print]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-02 19:23:57 +10:00
Christophe Leroy
45cb08f479 powerpc: Handle simultaneous interrupts at once
It often happens to have simultaneous interrupts, for instance
when having double Ethernet attachment. With the current
implementation, we suffer the cost of kernel entry/exit for each
interrupt.

This patch introduces a loop in __do_irq() to handle all interrupts
at once before returning.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-02 19:20:44 +10:00
Christophe Leroy
362957c27e powerpc/40x: Clear MSR_DR in one insn instead of two
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-02 19:20:43 +10:00
Michael Ellerman
0e5e7f5e97 powerpc/64: Reclaim CPU_FTR_SUBCORE
We are running low on CPU feature bits, so we only want to use them when
it's really necessary.

CPU_FTR_SUBCORE is only used in one place, and only in C, so we don't
need it in order to make asm patching work. It can only be set on
"Power8" CPUs, which in practice means POWER8, POWER8E and POWER8NVL.
There are no plans to implement it on future CPUs, but if there ever
were we could retrofit it then.

Although KVM uses subcores, it never looks at the CPU feature, it either
looks at the ISA level or the threads_per_subcore value.

So drop the CPU feature and do a PVR check instead. Drop the device tree
"subcore" feature as we no longer support doing anything with it, and we
will drop it from skiboot too.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-01 19:56:28 +10:00
Nicholas Piggin
a2b05b7aa6 powerpc/64s: Add dt_cpu_ftrs boot time setup option
Provide a dt_cpu_ftrs= cmdline option to disable the dt_cpu_ftrs CPU
feature discovery, and fall back to the "cputable" based version.

Also allow control of advertising unknown features to userspace and
with this parameter, and remove the clunky CONFIG option.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Add explicit early check of bootargs in dt_cpu_ftrs_init()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-01 19:54:33 +10:00
Nicholas Piggin
83a092cf95 powerpc: Link warning for orphan sections
Add --orphan-handling=warn to final link flags. This ensures we can
handle all sections explicitly. This would have caught subtle breakage
such as 7de3b27bac at build-time.

Also bring existing orphan sections into the fold:
- .text.hot and .text.unlikely are compiler generated sections.
- .sdata2, .dynsbss, .plt are used by PPC32
- We previously did not specify DWARF_DEBUG or STABS_DEBUG
- DWARF_DEBUG did not include all DWARF sections that can be emitted
- A number of sections are unused and can be discarded.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-30 14:59:51 +10:00
Nicholas Piggin
c494adefef powerpc/64: Tool to check head sections location sanity
Use a tool to check that the location of "fixed sections" are where
we expected them to be, which catches cases the linker script can't
(stubs being added to start of .text section), and which ends up
being neater.

Sample output:

  ERROR: start_text address is c000000000008100, should be c000000000008000
  ERROR: see comments in arch/powerpc/tools/head_check.sh

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Fold in fix from Nick for 4.6 era toolchains]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-30 14:59:51 +10:00
Nicholas Piggin
951eedebcd powerpc/64: Handle linker stubs in low .text code
Very large kernels may require linker stubs for branches from HEAD
text code. The linker may place these stubs before the HEAD text
sections, which breaks the assumption that HEAD text is located at 0
(or the .text section being located at 0x7000/0x8000 on Book3S
kernels).

Provide an option to create a small section just before the .text
section with an empty 256 - 4 bytes, and adjust the start of the .text
section to match. The linker will tend to put stubs in that section
and not break our relative-to-absolute offset assumptions.

This causes a small waste of space on common kernels, but allows large
kernels to build and boot. For now, it is an EXPERT config option,
defaulting to =n, but a reference is provided for it in the build-time
check for such breakage. This is good enough for allyesconfig and
custom users / hackers.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-30 14:59:51 +10:00
Nicholas Piggin
e8c688251d powerpc/64: Place sfpr section explicitly with the linker script
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-30 14:59:51 +10:00
Ivan Mikhaylov
6e2f03e292 powerpc/[booke|4xx]: Don't clobber TCR[WP] when setting TCR[DIE]
Prevent a kernel panic caused by unintentionally clearing TCR watchdog
bits. At this point in the kernel boot, the watchdog may have already
been enabled by u-boot. The original code's attempt to write to the TCR
register results in an inadvertent clearing of the watchdog
configuration bits, causing the 476 to reset.

Signed-off-by: Ivan Mikhaylov <ivan@de.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-30 14:59:51 +10:00
Gautham R. Shenoy
22c6663dc6 powerpc/powernv/idle: Use Requested Level for restoring state on P9 DD1
On Power9 DD1 due to a hardware bug the Power-Saving Level Status
field (PLS) of the PSSCR for a thread waking up from a deep state can
under-report if some other thread in the core is in a shallow stop
state. The scenario in which this can manifest is as follows:

   1) All the threads of the core are in deep stop.
   2) One of the threads is woken up. The PLS for this thread will
      correctly reflect that it is waking up from deep stop.
   3) The thread that has woken up now executes a shallow stop.
   4) When some other thread in the core is woken, its PLS will reflect
      the shallow stop state.

Thus, the subsequent thread for which the PLS is under-reporting the
wakeup state will not restore the hypervisor resources.

Hence, on DD1 systems, use the Requested Level (RL) field as a
workaround to restore the contents of the hypervisor resources on the
wakeup from the stop state.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-30 14:59:51 +10:00
Gautham R. Shenoy
cb0be7ec03 powerpc/powernv/idle: Restore LPCR on wakeup from deep-stop
On wakeup from a deep stop state which is supposed to lose the
hypervisor state, we don't restore the LPCR to the old value but set
it to a "sane" value via cur_cpu_spec->cpu_restore().

The problem is that the "sane" value doesn't include UPRT and the HR
bits which are required to run correctly in Radix mode.

Fix this on POWER9 onwards by restoring the LPCR value whatever it was
before executing the stop instruction.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-30 14:59:51 +10:00
Gautham R. Shenoy
ec48673552 powerpc/powernv/idle: Decouple Timebase restore & Per-core SPRs restore
On POWER8, in case of
   -  nap: both timebase and hypervisor state is retained.
   -  fast-sleep: timebase is lost. But the hypervisor state is retained.
   -  winkle: timebase and hypervisor state is lost.

Hence, the current code for handling exit from a idle state assumes
that if the timebase value is retained, then so is the hypervisor
state. Thus, the current code doesn't restore per-core hypervisor
state in such cases.

But that is no longer the case on POWER9 where we do have stop states
in which timebase value is retained, but the hypervisor state is
lost. So we have to ensure that the per-core hypervisor state gets
restored in such cases.

Fix this by ensuring that even in the case when timebase is retained,
we explicitly check if we are waking up from a deep stop that loses
per-core hypervisor state (indicated by cr4 being eq or gt), and if
this is the case, we restore the per-core hypervisor state.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-30 14:59:51 +10:00
Nicholas Piggin
a4700a2610 powerpc: Add PPC_FEATURE userspace bits for SCV and DARN instructions
Providing "scv" support to userspace requires kernel support, so it
must be advertised as independently to the base ISA 3 instruction set.

The darn instruction relies on firmware enablement, so it has been
decided to split this out from the core ISA 3 feature as well.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-25 23:07:45 +10:00
Michael Neuling
d957fb4d17 powerpc: Fix booting P9 hash with CONFIG_PPC_RADIX_MMU=N
Currently if you disable CONFIG_PPC_RADIX_MMU you'll crash on boot on
a P9. This is because we still set MMU_FTR_TYPE_RADIX via
ibm,pa-features and MMU_FTR_TYPE_RADIX is what's used for code patching
in much of the asm code (ie. slb_miss_realmode)

This patch fixes the problem by stopping MMU_FTR_TYPE_RADIX from being
set from ibm.pa-features.

We may eventually end up removing the CONFIG_PPC_RADIX_MMU option
completely but until then this fixes the issue.

Fixes: 17a3dd2f5f ("powerpc/mm/radix: Use firmware feature to enable Radix MMU")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-25 23:07:44 +10:00
Thomas Gleixner
a8fcfc1917 powerpc: Adjust system_state check
To enable smp_processor_id() and might_sleep() debug checks earlier, it's
required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING.

Adjust the system_state check in smp_generic_cpu_bootable() to handle the
extra states.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/20170516184735.359536998@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-23 10:01:35 +02:00
Naveen N. Rao
d04c02f8aa powerpc/kprobes: Fix handling of instruction emulation on probe re-entry
Commit 22d8b3dec2 ("powerpc/kprobes: Emulate instructions on kprobe
handler re-entry") enabled emulating instructions on kprobe re-entry,
rather than single-stepping always. However, we didn't update the single
stepping code to only be run if the emulation fails. Also, we missed
re-enabling preemption if the instruction emulation was successful. Fix
those issues.

Fixes: 22d8b3dec2 ("powerpc/kprobes: Emulate instructions on kprobe handler re-entry")
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-16 13:11:07 +10:00
Gautham R. Shenoy
bbb075ddf7 powerpc/powernv: Set NAPSTATELOST after recovering paca on P9 DD1
Commit 17ed4c8f81 ("powerpc/powernv: Recover correct PACA on wakeup
from a stop on P9 DD1") promises to set the NAPSTATELOST bit in paca
after recovering the correct paca for the thread waking up from stop1
on DD1, so that the GPRs can be correctly restored on the stop exit
path. However, it loads the value 1 into r3, but stores the value in
r0 into NAPSTATELOST(r13).

Fix this by correctly set the NAPSTATELOST bit in paca after
recovering the paca on POWER9 DD1.

Fixes: 17ed4c8f81 ("powerpc/powernv: Recover correct PACA on wakeup from a stop on P9 DD1")
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-16 13:03:21 +10:00
Michael Neuling
f48e91e87e powerpc/tm: Fix FP and VMX register corruption
In commit dc3106690b ("powerpc: tm: Always use fp_state and vr_state
to store live registers"), a section of code was removed that copied
the current state to checkpointed state. That code should not have been
removed.

When an FP (Floating Point) unavailable is taken inside a transaction,
we need to abort the transaction. This is because at the time of the
tbegin, the FP state is bogus so the state stored in the checkpointed
registers is incorrect. To fix this, we treclaim (to get the
checkpointed GPRs) and then copy the thread_struct FP live state into
the checkpointed state. We then trecheckpoint so that the FP state is
correctly restored into the CPU.

The copying of the FP registers from live to checkpointed is what was
missing.

This simplifies the logic slightly from the original patch.
tm_reclaim_thread() will now always write the checkpointed FP
state. Either the checkpointed FP state will be written as part of
the actual treclaim (in tm.S), or it'll be a copy of the live
state. Which one we use is based on MSR[FP] from userspace.

Similarly for VMX.

Fixes: dc3106690b ("powerpc: tm: Always use fp_state and vr_state to store live registers")
Cc: stable@vger.kernel.org # 4.9+
Signed-off-by: Michael Neuling <mikey@neuling.org>
Reviewed-by: cyrilbur@gmail.com
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-15 19:31:38 +10:00
Linus Torvalds
dc2a248166 Merge tag 'powerpc-4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull more powerpc updates from Michael Ellerman:
 "The change to the Linux page table geometry was delayed for more
  testing with 16G pages, and there's the new CPU features stuff which
  just needed one more polish before going in. Plus a few changes from
  Scott which came in a bit late. And then various fixes, mostly minor.

  Summary highlights:

   - rework the Linux page table geometry to lower memory usage on
     64-bit Book3S (IBM chips) using the Hash MMU.

   - support for a new device tree binding for discovering CPU features
     on future firmwares.

   - Freescale updates from Scott:
      "Includes a fix for a powerpc/next mm regression on 64e, a fix for
       a kernel hang on 64e when using a debugger inside a relocated
       kernel, a qman fix, and misc qe improvements."

  Thanks to: Christophe Leroy, Gavin Shan, Horia Geantă, LiuHailong,
  Nicholas Piggin, Roy Pledge, Scott Wood, Valentin Longchamp"

* tag 'powerpc-4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64s: Support new device tree binding for discovering CPU features
  powerpc: Don't print cpu_spec->cpu_name if it's NULL
  of/fdt: introduce of_scan_flat_dt_subnodes and of_get_flat_dt_phandle
  powerpc/64s: Fix unnecessary machine check handler relocation branch
  powerpc/mm/book3s/64: Rework page table geometry for lower memory usage
  powerpc: Fix distclean with Makefile.postlink
  powerpc/64e: Don't place the stack beyond TASK_SIZE
  powerpc/powernv: Block PCI config access on BCM5718 during EEH recovery
  powerpc/8xx: Adding support of IRQ in MPC8xx GPIO
  soc/fsl/qbman: Disable IRQs for deferred QBMan work
  soc/fsl/qe: add EXPORT_SYMBOL for the 2 qe_tdm functions
  soc/fsl/qe: only apply QE_General4 workaround on affected SoCs
  soc/fsl/qe: round brg_freq to 1kHz granularity
  soc/fsl/qe: get rid of immrbar_virt_to_phys()
  net: ethernet: ucc_geth: fix MEM_PART_MURAM mode
  powerpc/64e: Fix hang when debugging programs with relocated kernel
2017-05-12 10:04:09 -07:00
Nicholas Piggin
5a61ef74f2 powerpc/64s: Support new device tree binding for discovering CPU features
The ibm,powerpc-cpu-features device tree binding describes CPU features with
ASCII names and extensible compatibility, privilege, and enablement metadata
that allows improved flexibility and compatibility with new hardware.

The interface is described in detail in ibm,powerpc-cpu-features.txt in this
patch.

Currently this code is not enabled by default, and there are no released
firmwares that provide the binding.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-09 23:42:55 +10:00
Nicholas Piggin
75bda95048 powerpc: Don't print cpu_spec->cpu_name if it's NULL
Currently we assume that if the cpu_spec has a pvr_mask then it must also have a
cpu_name. But that will change in a subsequent commit when we do CPU feature
discovery via the device tree, so check explicitly if cpu_name is NULL.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-09 22:56:03 +10:00
Michael Ellerman
0b382fb3d9 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next
Freescale updates from Scott:

"Includes a fix for a powerpc/next mm regression on 64e, a fix for a
kernel hang on 64e when using a debugger inside a relocated kernel, a
qman fix, and misc qe improvements."
2017-05-09 22:54:35 +10:00
Paolo Bonzini
4415b33528 Merge branch 'kvm-ppc-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD
The main thing here is a new implementation of the in-kernel
XICS interrupt controller emulation for POWER9 machines, from Ben
Herrenschmidt.

POWER9 has a new interrupt controller called XIVE (eXternal Interrupt
Virtualization Engine) which is able to deliver interrupts directly
to guest virtual CPUs in hardware without hypervisor intervention.
With this new code, the guest still sees the old XICS interface but
performance is better because the XICS emulation in the host uses the
XIVE directly rather than going through a XICS emulation in firmware.

Conflicts:
	arch/powerpc/kernel/cpu_setup_power.S [cherry-picked fix]
	arch/powerpc/kvm/book3s_xive.c [include asm/debugfs.h]
2017-05-09 11:50:01 +02:00
Nicholas Piggin
6102c005a7 powerpc/64s: Fix unnecessary machine check handler relocation branch
Similarly to commit 2563a70c3b ("powerpc/64s: Remove unnecessary relocation
branch from idle handler"), the machine check handler has a BRANCH_TO from
relocated to relocated code, which is unnecessary.

It has also caused build errors with some toolchains:

  arch/powerpc/kernel/exceptions-64s.S: Assembler messages:
  arch/powerpc/kernel/exceptions-64s.S:395: Error: operand out of range
  (0xffffffffffff8280 is not between 0x0000000000000000 and
  0x000000000000ffff)

Fixes: 1945bc4549 ("powerpc/64s: Fix POWER9 machine check handler from stop state")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reported-and-tested-by : Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-09 19:25:50 +10:00
Linus Torvalds
857f864014 Merge tag 'pci-v4.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:

 - add framework for supporting PCIe devices in Endpoint mode (Kishon
   Vijay Abraham I)

 - use non-postable PCI config space mappings when possible (Lorenzo
   Pieralisi)

 - clean up and unify mmap of PCI BARs (David Woodhouse)

 - export and unify Function Level Reset support (Christoph Hellwig)

 - avoid FLR for Intel 82579 NICs (Sasha Neftin)

 - add pci_request_irq() and pci_free_irq() helpers (Christoph Hellwig)

 - short-circuit config access failures for disconnected devices (Keith
   Busch)

 - remove D3 sleep delay when possible (Adrian Hunter)

 - freeze PME scan before suspending devices (Lukas Wunner)

 - stop disabling MSI/MSI-X in pci_device_shutdown() (Prarit Bhargava)

 - disable boot interrupt quirk for ASUS M2N-LR (Stefan Assmann)

 - add arch-specific alignment control to improve device passthrough by
   avoiding multiple BARs in a page (Yongji Xie)

 - add sysfs sriov_drivers_autoprobe to control VF driver binding
   (Bodong Wang)

 - allow slots below PCI-to-PCIe "reverse bridges" (Bjorn Helgaas)

 - fix crashes when unbinding host controllers that don't support
   removal (Brian Norris)

 - add driver for MicroSemi Switchtec management interface (Logan
   Gunthorpe)

 - add driver for Faraday Technology FTPCI100 host bridge (Linus
   Walleij)

 - add i.MX7D support (Andrey Smirnov)

 - use generic MSI support for Aardvark (Thomas Petazzoni)

 - make Rockchip driver modular (Brian Norris)

 - advertise 128-byte Read Completion Boundary support for Rockchip
   (Shawn Lin)

 - advertise PCI_EXP_LNKSTA_SLC for Rockchip root port (Shawn Lin)

 - convert atomic_t to refcount_t in HV driver (Elena Reshetova)

 - add CPU IRQ affinity in HV driver (K. Y. Srinivasan)

 - fix PCI bus removal in HV driver (Long Li)

 - add support for ThunderX2 DMA alias topology (Jayachandran C)

 - add ThunderX pass2.x 2nd node MCFG quirk (Tomasz Nowicki)

 - add ITE 8893 bridge DMA alias quirk (Jarod Wilson)

 - restrict Cavium ACS quirk only to CN81xx/CN83xx/CN88xx devices
   (Manish Jaggi)

* tag 'pci-v4.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (146 commits)
  PCI: Don't allow unbinding host controllers that aren't prepared
  ARM: DRA7: clockdomain: Change the CLKTRCTRL of CM_PCIE_CLKSTCTRL to SW_WKUP
  MAINTAINERS: Add PCI Endpoint maintainer
  Documentation: PCI: Add userguide for PCI endpoint test function
  tools: PCI: Add sample test script to invoke pcitest
  tools: PCI: Add a userspace tool to test PCI endpoint
  Documentation: misc-devices: Add Documentation for pci-endpoint-test driver
  misc: Add host side PCI driver for PCI test function device
  PCI: Add device IDs for DRA74x and DRA72x
  dt-bindings: PCI: dra7xx: Add DT bindings to enable unaligned access
  PCI: dwc: dra7xx: Workaround for errata id i870
  dt-bindings: PCI: dra7xx: Add DT bindings for PCI dra7xx EP mode
  PCI: dwc: dra7xx: Add EP mode support
  PCI: dwc: dra7xx: Facilitate wrapper and MSI interrupts to be enabled independently
  dt-bindings: PCI: Add DT bindings for PCI designware EP mode
  PCI: dwc: designware: Add EP mode support
  Documentation: PCI: Add binding documentation for pci-test endpoint function
  ixgbe: Use pcie_flr() instead of duplicating it
  IB/hfi1: Use pcie_flr() instead of duplicating it
  PCI: imx6: Fix spelling mistake: "contol" -> "control"
  ...
2017-05-08 19:03:25 -07:00
Linus Torvalds
bf5f89463f Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:

 - the rest of MM

 - various misc things

 - procfs updates

 - lib/ updates

 - checkpatch updates

 - kdump/kexec updates

 - add kvmalloc helpers, use them

 - time helper updates for Y2038 issues. We're almost ready to remove
   current_fs_time() but that awaits a btrfs merge.

 - add tracepoints to DAX

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (114 commits)
  drivers/staging/ccree/ssi_hash.c: fix build with gcc-4.4.4
  selftests/vm: add a test for virtual address range mapping
  dax: add tracepoint to dax_insert_mapping()
  dax: add tracepoint to dax_writeback_one()
  dax: add tracepoints to dax_writeback_mapping_range()
  dax: add tracepoints to dax_load_hole()
  dax: add tracepoints to dax_pfn_mkwrite()
  dax: add tracepoints to dax_iomap_pte_fault()
  mtd: nand: nandsim: convert to memalloc_noreclaim_*()
  treewide: convert PF_MEMALLOC manipulations to new helpers
  mm: introduce memalloc_noreclaim_{save,restore}
  mm: prevent potential recursive reclaim due to clearing PF_MEMALLOC
  mm/huge_memory.c: deposit a pgtable for DAX PMD faults when required
  mm/huge_memory.c: use zap_deposited_table() more
  time: delete CURRENT_TIME_SEC and CURRENT_TIME
  gfs2: replace CURRENT_TIME with current_time
  apparmorfs: replace CURRENT_TIME with current_time()
  lustre: replace CURRENT_TIME macro
  fs: ubifs: replace CURRENT_TIME_SEC with current_time
  fs: ufs: use ktime_get_real_ts64() for birthtime
  ...
2017-05-08 18:17:56 -07:00