The page size and the number of translation levels, and hence the supported
virtual address range, are build-time configurables on arm64 whose optimal
values are use case dependent. However, in the current implementation, if
the system's RAM is located at a very high offset, the virtual address range
needs to reflect that merely because the identity mapping, which is only used
to enable or disable the MMU, requires the extended virtual range to map the
physical memory at an equal virtual offset.
This patch relaxes that requirement, by increasing the number of translation
levels for the identity mapping only, and only when actually needed, i.e.,
when system RAM's offset is found to be out of reach at runtime.
Tested-by: Laura Abbott <lauraa@codeaurora.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Rework of the KVM HYP bounce page from Ard Biesheuvel. Subsequent arm64
idmap rework depends on this, so merge it here with Marc Zyngier's
blessing (kvm-arm co-maintainer).
init_mm isn't a normal mm: it has swapper_pg_dir as its pgd (which
contains kernel mappings) and is used as the active_mm for the idle
thread.
When restoring the pgd after an EFI call, we write current->active_mm
into TTBR0. If the current task is actually the idle thread (e.g. when
initialising the EFI RTC before entering userspace), then the TLB can
erroneously populate itself with junk global entries as a result of
speculative table walks.
When we do eventually return to userspace, the task can end up hitting
these junk mappings leading to lockups, corruption or crashes.
This patch fixes the problem in the same way as the CPU suspend code by
ensuring that we never switch to the init_mm in efi_set_pgd and instead
point TTBR0 at the zero page. A check is also added to cpu_switch_mm to
BUG if we get passed swapper_pg_dir.
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Fixes: f3cdfd239d ("arm64/efi: move SetVirtualAddressMap() to UEFI stub")
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
According to the arm64 boot protocol, registers x1 to x3 should be
zero upon kernel entry, and non-zero values are reserved for future
use. This future use is going to be problematic if we never enforce
the current rules, so start enforcing them now, by emitting a warning
if non-zero values are detected.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This removes the function __calc_phys_offset and all open coded
virtual to physical address translations using the offset kept
in x28.
Instead, just use absolute or PC-relative symbol references as
appropriate when referring to virtual or physical addresses,
respectively.
Tested-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Enabling of the MMU is split into two functions, with an align and
a branch in the middle. On arm64, the entire kernel Image is ID mapped
so this is really not necessary, and we can just merge it into a
single function.
Also replaces an open coded adrp/add reference to __enable_mmu pair
with adr_l.
Tested-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The global processor_id is assigned the MIDR_EL1 value of the boot
CPU in the early init code, but is never referenced afterwards.
As the relevance of the MIDR_EL1 value of the boot CPU is debatable
anyway, especially under big.LITTLE, let's remove it before anyone
starts using it.
Tested-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
struct cpu_table is an artifact left from the (very) early days of
the arm64 port, and its only real use is to allow the most beautiful
"AArch64 Processor" string to be displayed at boot time.
Really? Yes, really.
Let's get rid of it. In order to avoid another BogoMips-gate, the
aforementioned string is preserved.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The perf core implicitly rejects events spanning multiple HW PMUs, as in
these cases the event->ctx will differ. However this validation is
performed after pmu::event_init() is called in perf_init_event(), and
thus pmu::event_init() may be called with a group leader from a
different HW PMU.
The ARM64 PMU driver does not take this fact into account, and when
validating groups assumes that it can call to_arm_pmu(event->pmu) for
any HW event. When the event in question is from another HW PMU this is
wrong, and results in dereferencing garbage.
This patch updates the ARM64 PMU driver to first test for and reject
events from other PMUs, moving the to_arm_pmu and related logic after
this test. Fixes a crash triggered by perf_fuzzer on Linux-4.0-rc2, with
a CCI PMU present:
Bad mode in Synchronous Abort handler detected, code 0x86000006 -- IABT (current EL)
CPU: 0 PID: 1371 Comm: perf_fuzzer Not tainted 3.19.0+ #249
Hardware name: V2F-1XV7 Cortex-A53x2 SMM (DT)
task: ffffffc07c73a280 ti: ffffffc07b0a0000 task.ti: ffffffc07b0a0000
PC is at 0x0
LR is at validate_event+0x90/0xa8
pc : [<0000000000000000>] lr : [<ffffffc000090228>] pstate: 00000145
sp : ffffffc07b0a3ba0
[< (null)>] (null)
[<ffffffc0000907d8>] armpmu_event_init+0x174/0x3cc
[<ffffffc00015d870>] perf_try_init_event+0x34/0x70
[<ffffffc000164094>] perf_init_event+0xe0/0x10c
[<ffffffc000164348>] perf_event_alloc+0x288/0x358
[<ffffffc000164c5c>] SyS_perf_event_open+0x464/0x98c
Code: bad PC value
Also cleans up the code to use the arm_pmu only when we know
that we are dealing with an arm pmu event.
Cc: Will Deacon <will.deacon@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Peter Ziljstra (Intel) <peterz@infradead.org>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The HYP init bounce page is a runtime construct that ensures that the
HYP init code does not cross a page boundary. However, this is something
we can do perfectly well at build time, by aligning the code appropriately.
For arm64, we just align to 4 KB, and enforce that the code size is less
than 4 KB, regardless of the chosen page size.
For ARM, the whole code is less than 256 bytes, so we tweak the linker
script to align at a power of 2 upper bound of the code size
Note that this also fixes a benign off-by-one error in the original bounce
page code, where a bounce page would be allocated unnecessarily if the code
was exactly 1 page in size.
On ARM, it also fixes an issue with very large kernels reported by Arnd
Bergmann, where stub sections with linker emitted veneers could erroneously
trigger the size/alignment ASSERT() in the linker script.
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The arm mmap2 syscall takes the offset in units of 4K, thus with 64K pages
the offset needs to be scaled to units of pages.
Signed-off-by: Andreas Schwab <schwab@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
[will: removed redundant lr parameter, localised PAGE_SHIFT #if check]
Signed-off-by: Will Deacon <will.deacon@arm.com>
We currently don't log the boot mode for arm64 as we do for arm, and
without KVM the user is provided with no indication as to which mode(s)
CPUs were booted in, which can seriously hinder debugging in some cases.
Add logging to the boot path once all CPUs are up. Where CPUs are
mismatched in violation of the boot protocol, WARN and set a taint (as
we do for CPU other CPU feature mismatches) given that the
firmware/bootloader is buggy and should be fixed.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Commit 828e9834e9 ("arm64: head: create a new function for setting
the boot_cpu_mode flag") added BOOT_CPU_MODE_EL1, a nonzero value
replacing uses of zero. However it failed to update __boot_cpu_mode
appropriately.
A CPU booted at EL2 writes BOOT_CPU_MODE_EL2 to __boot_cpu_mode[0], and
a CPU booted at EL1 writes BOOT_CPU_MODE_EL1 to __boot_cpu_mode[1].
Later is_hyp_mode_mismatched() determines there to be a mismatch if
__boot_cpu_mode[0] != __boot_cpu_mode[1].
If all CPUs are booted at EL1, __boot_cpu_mode[0] will be set to
BOOT_CPU_MODE_EL1, but __boot_cpu_mode[1] will retain its initial value
of zero, and is_hyp_mode_mismatched will erroneously determine that the
boot modes are mismatched. This hasn't been a problem so far, but later
patches which will make use of is_hyp_mode_mismatched() expect it to
work correctly.
This patch initialises __boot_cpu_mode[1] to BOOT_CPU_MODE_EL1, fixing
the erroneous mismatch detection when all CPUs are booted at EL1.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Currently we only perform alternative patching for kernels built with
CONFIG_SMP, as we call apply_alternatives_all() in smp.c, which is only
built for CONFIG_SMP. Thus !SMP kernels may not have necessary
alternatives patched in.
This patch ensures that we call apply_alternatives_all() once all CPUs
are booted, even for !SMP kernels, by having the smp_init_cpus() stub
call this for !SMP kernels via up_late_init. A new wrapper,
do_post_cpus_up_work, is added so we can hook other calls here later
(e.g. boot mode logging).
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Fixes: e039ee4ee3 ("arm64: add alternative runtime patching")
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Another one for the big head.S spring cleaning: the label should
be after the .align or it may point to the padding.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
If UEFI Runtime Services are available, they are preferred over direct
PSCI calls or other methods to reset the system.
For the reset case, we need to hook into machine_restart(), as the
arm_pm_restart function pointer may be overwritten by modules.
Tested-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The native (64-bit) sigval_t union contains sival_int (32-bit) and
sival_ptr (64-bit). When a compat application invokes a syscall that
takes a sigval_t value (as part of a larger structure, e.g.
compat_sys_mq_notify, compat_sys_timer_create), the compat_sigval_t
union is converted to the native sigval_t with sival_int overlapping
with either the least or the most significant half of sival_ptr,
depending on endianness. When the corresponding signal is delivered to a
compat application, on big endian the current (compat_uptr_t)sival_ptr
cast always returns 0 since sival_int corresponds to the top part of
sival_ptr. This patch fixes copy_siginfo_to_user32() so that sival_int
is copied to the compat_siginfo_t structure.
Cc: <stable@vger.kernel.org>
Reported-by: Bamvor Jian Zhang <bamvor.zhangjian@huawei.com>
Tested-by: Bamvor Jian Zhang <bamvor.zhangjian@huawei.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Patch 2f896d5866 ("arm64: use fixmap for text patching") changed
the way we patch the kernel text, using a fixmap when the kernel or
modules are flagged as read only.
Unfortunately, a flaw in the logic makes it fall over when patching
modules without CONFIG_DEBUG_SET_MODULE_RONX enabled:
[...]
[ 32.032636] Call trace:
[ 32.032716] [<fffffe00003da0dc>] __copy_to_user+0x2c/0x60
[ 32.032837] [<fffffe0000099f08>] __aarch64_insn_write+0x94/0xf8
[ 32.033027] [<fffffe000009a0a0>] aarch64_insn_patch_text_nosync+0x18/0x58
[ 32.033200] [<fffffe000009c3ec>] ftrace_modify_code+0x58/0x84
[ 32.033363] [<fffffe000009c4e4>] ftrace_make_nop+0x3c/0x58
[ 32.033532] [<fffffe0000164420>] ftrace_process_locs+0x3d0/0x5c8
[ 32.033709] [<fffffe00001661cc>] ftrace_module_init+0x28/0x34
[ 32.033882] [<fffffe0000135148>] load_module+0xbb8/0xfc4
[ 32.034044] [<fffffe0000135714>] SyS_finit_module+0x94/0xc4
[...]
This is triggered by the use of virt_to_page() on a module address,
which ends to pointing to Nowhereland if you're lucky, or corrupt
your precious data if not.
This patch fixes the logic by mimicking what is done on arm:
- If we're patching a module and CONFIG_DEBUG_SET_MODULE_RONX is set,
use vmalloc_to_page().
- If we're patching the kernel and CONFIG_DEBUG_RODATA is set,
use virt_to_page().
- Otherwise, use the provided address, as we can write to it directly.
Tested on 4.0-rc1 as a KVM guest.
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Laura Abbott <lauraa@codeaurora.org>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
An arm64 allmodconfig fails to build with GCC 5 due to __asmeq
assertions in the PSCI firmware calling code firing due to mcount
preambles breaking our assumptions about register allocation of function
arguments:
/tmp/ccDqJsJ6.s: Assembler messages:
/tmp/ccDqJsJ6.s:60: Error: .err encountered
/tmp/ccDqJsJ6.s:61: Error: .err encountered
/tmp/ccDqJsJ6.s:62: Error: .err encountered
/tmp/ccDqJsJ6.s:99: Error: .err encountered
/tmp/ccDqJsJ6.s💯 Error: .err encountered
/tmp/ccDqJsJ6.s:101: Error: .err encountered
This patch fixes the issue by moving the PSCI calls out-of-line into
their own assembly files, which are safe from the compiler's meddling
fingers.
Reported-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The vdso implementation of clock_getres currently returns 0 (success)
whenever a null timespec is provided by the caller, regardless of the
clock id supplied.
This behavior is incorrect. It should fall back to syscall when an
unrecognized clock id is passed, even when the timespec argument is
null. This ensures that clock_getres always returns an error for
invalid clock ids.
Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
ftrace_enable_ftrace_graph_caller and ftrace_disable_ftrace_graph_caller
should replace B(jmp) instruction and not BL(call) instruction.
Commit 9f1ae7596aad("arm64: Correct ftrace calls to
aarch64_insn_gen_branch_imm()") had a typo and used
AARCH64_INSN_BRANCH_LINK instead of AARCH64_INSN_BRANCH_NOLINK.
Either instruction will work, as the link register is saved/restored
across the branch but this better matches the intention of the code.
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
For instrumenting global variables KASan will shadow memory backing memory
for modules. So on module loading we will need to allocate memory for
shadow and map it at address in shadow that corresponds to the address
allocated in module_alloc().
__vmalloc_node_range() could be used for this purpose, except it puts a
guard hole after allocated area. Guard hole in shadow memory should be a
problem because at some future point we might need to have a shadow memory
at address occupied by guard hole. So we could fail to allocate shadow
for module_alloc().
Now we have VM_NO_GUARD flag disabling guard page, so we need to pass into
__vmalloc_node_range(). Add new parameter 'vm_flags' to
__vmalloc_node_range() function.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull KVM update from Paolo Bonzini:
"Fairly small update, but there are some interesting new features.
Common:
Optional support for adding a small amount of polling on each HLT
instruction executed in the guest (or equivalent for other
architectures). This can improve latency up to 50% on some
scenarios (e.g. O_DSYNC writes or TCP_RR netperf tests). This
also has to be enabled manually for now, but the plan is to
auto-tune this in the future.
ARM/ARM64:
The highlights are support for GICv3 emulation and dirty page
tracking
s390:
Several optimizations and bugfixes. Also a first: a feature
exposed by KVM (UUID and long guest name in /proc/sysinfo) before
it is available in IBM's hypervisor! :)
MIPS:
Bugfixes.
x86:
Support for PML (page modification logging, a new feature in
Broadwell Xeons that speeds up dirty page tracking), nested
virtualization improvements (nested APICv---a nice optimization),
usual round of emulation fixes.
There is also a new option to reduce latency of the TSC deadline
timer in the guest; this needs to be tuned manually.
Some commits are common between this pull and Catalin's; I see you
have already included his tree.
Powerpc:
Nothing yet.
The KVM/PPC changes will come in through the PPC maintainers,
because I haven't received them yet and I might end up being
offline for some part of next week"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (130 commits)
KVM: ia64: drop kvm.h from installed user headers
KVM: x86: fix build with !CONFIG_SMP
KVM: x86: emulate: correct page fault error code for NoWrite instructions
KVM: Disable compat ioctl for s390
KVM: s390: add cpu model support
KVM: s390: use facilities and cpu_id per KVM
KVM: s390/CPACF: Choose crypto control block format
s390/kernel: Update /proc/sysinfo file with Extended Name and UUID
KVM: s390: reenable LPP facility
KVM: s390: floating irqs: fix user triggerable endless loop
kvm: add halt_poll_ns module parameter
kvm: remove KVM_MMIO_SIZE
KVM: MIPS: Don't leak FPU/DSP to guest
KVM: MIPS: Disable HTW while in guest
KVM: nVMX: Enable nested posted interrupt processing
KVM: nVMX: Enable nested virtual interrupt delivery
KVM: nVMX: Enable nested apic register virtualization
KVM: nVMX: Make nested control MSRs per-cpu
KVM: nVMX: Enable nested virtualize x2apic mode
KVM: nVMX: Prepare for using hardware MSR bitmap
...
If an attacker can cause a controlled kernel stack overflow, overwriting
the restart block is a very juicy exploit target. This is because the
restart_block is held in the same memory allocation as the kernel stack.
Moving the restart block to struct task_struct prevents this exploit by
making the restart_block harder to locate.
Note that there are other fields in thread_info that are also easy
targets, at least on some architectures.
It's also a decent simplification, since the restart code is more or less
identical on all architectures.
[james.hogan@imgtec.com: metag: align thread_info::supervisor_stack]
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: David Miller <davem@davemloft.net>
Acked-by: Richard Weinberger <richard@nod.at>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull arm64 updates from Catalin Marinas:
"arm64 updates for 3.20:
- reimplementation of the virtual remapping of UEFI Runtime Services
in a way that is stable across kexec
- emulation of the "setend" instruction for 32-bit tasks (user
endianness switching trapped in the kernel, SCTLR_EL1.E0E bit set
accordingly)
- compat_sys_call_table implemented in C (from asm) and made it a
constant array together with sys_call_table
- export CPU cache information via /sys (like other architectures)
- DMA API implementation clean-up in preparation for IOMMU support
- macros clean-up for KVM
- dropped some unnecessary cache+tlb maintenance
- CONFIG_ARM64_CPU_SUSPEND clean-up
- defconfig update (CPU_IDLE)
The EFI changes going via the arm64 tree have been acked by Matt
Fleming. There is also a patch adding sys_*stat64 prototypes to
include/linux/syscalls.h, acked by Andrew Morton"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (47 commits)
arm64: compat: Remove incorrect comment in compat_siginfo
arm64: Fix section mismatch on alloc_init_p[mu]d()
arm64: Avoid breakage caused by .altmacro in fpsimd save/restore macros
arm64: mm: use *_sect to check for section maps
arm64: drop unnecessary cache+tlb maintenance
arm64:mm: free the useless initial page table
arm64: Enable CPU_IDLE in defconfig
arm64: kernel: remove ARM64_CPU_SUSPEND config option
arm64: make sys_call_table const
arm64: Remove asm/syscalls.h
arm64: Implement the compat_sys_call_table in C
syscalls: Declare sys_*stat64 prototypes if __ARCH_WANT_(COMPAT_)STAT64
compat: Declare compat_sys_sigpending and compat_sys_sigprocmask prototypes
arm64: uapi: expose our struct ucontext to the uapi headers
smp, ARM64: Kill SMP single function call interrupt
arm64: Emulate SETEND for AArch32 tasks
arm64: Consolidate hotplug notifier for instruction emulation
arm64: Track system support for mixed endian EL0
arm64: implement generic IOMMU configuration
arm64: Combine coherent and non-coherent swiotlb dma_ops
...
Pull EFI updates from Matt Fleming:
" - Move efivarfs from the misc filesystem section to pseudo filesystem,
since that's a more logical and accurate place - Leif Lindholm
- Update efibootmgr URL in Kconfig help - Peter Jones
- Improve accuracy of EFI guid function names - Borislav Petkov
- Expose firmware platform size in sysfs for the benefit of EFI boot
loader installers and other utilities - Steve McIntyre
- Cleanup __init annotations for arm64/efi code - Ard Biesheuvel
- Mark the UIE as unsupported for rtc-efi - Ard Biesheuvel
- Fix memory leak in error code path of runtime map code - Dan Carpenter
- Improve robustness of get_memory_map() by removing assumptions on the
size of efi_memory_desc_t (which could change in future spec
versions) and querying the firmware instead of guessing about the
memmap size - Ard Biesheuvel
- Remove superfluous guid unparse calls - Ivan Khoronzhuk
- Delete unnecessary chosen@0 DT node FDT code since was duplicated
from code in drivers/of and is entirely unnecessary - Leif Lindholm
There's nothing super scary, mainly cleanups, and a merge from Ricardo who
kindly picked up some patches from the linux-efi mailing list while I
was out on annual leave in December.
Perhaps the biggest risk is the get_memory_map() change from Ard, which
changes the way that both the arm64 and x86 EFI boot stub build the
early memory map. It would be good to have it bake in linux-next for a
while.
"
Signed-off-by: Ingo Molnar <mingo@kernel.org>
ARM64_CPU_SUSPEND config option was introduced to make code providing
context save/restore selectable only on platforms requiring power
management capabilities.
Currently ARM64_CPU_SUSPEND depends on the PM_SLEEP config option which
in turn is set by the SUSPEND config option.
The introduction of CPU_IDLE for arm64 requires that code configured
by ARM64_CPU_SUSPEND (context save/restore) should be compiled in
in order to enable the CPU idle driver to rely on CPU operations
carrying out context save/restore.
The ARM64_CPUIDLE config option (ARM64 generic idle driver) is therefore
forced to select ARM64_CPU_SUSPEND, even if there may be (ie PM_SLEEP)
failed dependencies, which is not a clean way of handling the kernel
configuration option.
For these reasons, this patch removes the ARM64_CPU_SUSPEND config option
and makes the context save/restore dependent on CPU_PM, which is selected
whenever either SUSPEND or CPU_IDLE are configured, cleaning up dependencies
in the process.
This way, code previously configured through ARM64_CPU_SUSPEND is
compiled in whenever a power management subsystem requires it to be
present in the kernel (SUSPEND || CPU_IDLE), which is the behaviour
expected on ARM64 kernels.
The cpu_suspend and cpu_init_idle CPU operations are added only if
CPU_IDLE is selected, since they are CPU_IDLE specific methods and
should be grouped and defined accordingly.
PSCI CPU operations are updated to reflect the introduced changes.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
As with x86, mark the sys_call_table const such that it will be placed
in the .rodata section. This will cause attempts to modify the table
(accidental or deliberate) to fail when strict page permissions are in
place. In the absence of strict page permissions, there should be no
functional change.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch moves the sys_rt_sigreturn_wrapper prototype to
arch/arm64/kernel/sys.c and removes the asm/syscalls.h header.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Unlike the sys_call_table[], the compat one was implemented in sys32.S
making it impossible to notice discrepancies between the number of
compat syscalls and the __NR_compat_syscalls macro, the latter having to
be defined in asm/unistd.h as including asm/unistd32.h would cause
conflicts on __NR_* definitions. With this patch, incorrect
__NR_compat_syscalls values will result in a build-time error.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Commit 9a46ad6d6d "smp: make smp_call_function_many() use logic
similar to smp_call_function_single()" has unified the way to handle
single and multiple cross-CPU function calls. Now only one interrupt
is needed for architecture specific code to support generic SMP function
call interfaces, so kill the redundant single function call interrupt.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Emulate deprecated 'setend' instruction for AArch32 bit tasks.
setend [le/be] - Sets the endianness of EL0
On systems with CPUs which support mixed endian at EL0, the hardware
support for the instruction can be enabled by setting the SCTLR_EL1.SED
bit. Like the other emulated instructions it is controlled by an entry in
/proc/sys/abi/. For more information see :
Documentation/arm64/legacy_instructions.txt
The instruction is emulated by setting/clearing the SPSR_EL1.E bit, which
will be reflected in the PSTATE.E in AArch32 context.
This patch also restores the native endianness for the execution of signal
handlers, since the process could have changed the endianness.
Note: All CPUs on the system must have mixed endian support at EL0. Once the
handler is registered, hotplugging a CPU which doesn't support mixed endian,
could lead to unexpected results/behavior in applications.
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
As of now each insn_emulation has a cpu hotplug notifier that
enables/disables the CPU feature bit for the functionality. This
patch re-arranges the code, such that there is only one notifier
that runs through the list of registered emulation hooks and runs
their corresponding set_hw_mode.
We do nothing when a CPU is dying as we will set the appropriate bits
as it comes back online based on the state of the hooks.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Punit Agrawal <punit.agrawal@arm.com>
[catalin.marinas@arm.com: fix pr_warn compilation error]
[catalin.marinas@arm.com: remove unnecessary "insn" check]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch keeps track of the mixed endian EL0 support across
the system and provides helper functions to export it. The status
is a boolean indicating whether all the CPUs on the system supports
mixed endian at EL0.
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
KVM/ARM changes for v3.20 including GICv3 emulation, dirty page logging, added
trace symbols, and adding an explicit VGIC init device control IOCTL.
Conflicts:
arch/arm64/include/asm/kvm_arm.h
arch/arm64/kvm/handle_exit.c
Now that the create_mapping() code in mm/mmu.c is able to support
setting up kernel page tables at initcall time, we can move the whole
virtmap creation to arm64_enable_runtime_services() instead of having
a distinct stage during early boot. This also allows us to drop the
arm64-specific EFI_VIRTMAP flag.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add page protections for arm64 similar to those in arm.
This is for security reasons to prevent certain classes
of exploits. The current method:
- Map all memory as either RWX or RW. We round to the nearest
section to avoid creating page tables before everything is mapped
- Once everything is mapped, if either end of the RWX section should
not be X, we split the PMD and remap as necessary
- When initmem is to be freed, we change the permissions back to
RW (using stop machine if necessary to flush the TLB)
- If CONFIG_DEBUG_RODATA is set, the read only sections are set
read only.
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Kees Cook <keescook@chromium.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
ICC_SRE_EL1 is a system register allowing msr/mrs accesses to the
GIC CPU interface for EL1 (guests). Currently we force it to 0, but
for proper GICv3 support we have to allow guests to use it (depending
on their selected virtual GIC model).
So add ICC_SRE_EL1 to the list of saved/restored registers on a
world switch, but actually disallow a guest to change it by only
restoring a fixed, once-initialized value.
This value depends on the GIC model userland has chosen for a guest.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
When booting with EFI, we acquire the EFI memory map after parsing the
early params. This unfortuantely renders the option useless as we call
memblock_enforce_memory_limit (which uses memblock_remove_range behind
the scenes) before we've added any memblocks. We end up removing
nothing, then adding all of memory later when efi_init calls
reserve_regions.
Instead, we can log the limit and apply this later when we do the rest
of the memblock work in memblock_init, which should work regardless of
the presence of EFI. At the same time we may as well move the early
parameter into arm64's mm/init.c, close to arm64_memblock_init.
Any memory which must be mapped (e.g. for use by EFI runtime services)
must be mapped explicitly reather than relying on the linear mapping,
which may be truncated as a result of a mem= option passed on the kernel
command line.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
When remapping the UEFI memory map using ioremap_cache(), we
have to deal with potential failure. Note that, even if the
common case is for ioremap_cache() to return the existing linear
mapping of the memory map, we cannot rely on that to be always the
case, e.g., in the presence of a mem= kernel parameter.
At the same time, remove a stale comment and move the memmap code
together.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This ensures all stub component are freed when the kernel proper is
done booting, by prefixing the names of all ELF sections that have
the SHF_ALLOC attribute with ".init". This approach ensures that even
implicitly emitted allocated data (like initializer values and string
literals) are covered.
At the same time, remove some __init annotations in the stub that have
now become redundant, and add the __init annotation to handle_kernel_image
which will now trigger a section mismatch warning without it.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>