Introduce support for CONFIG_PCI_DOMAINS_GENERIC, allowing for platforms
to make use of generic PCI domains instead of the MIPS-specific
implementation. The set_pci_need_domain_info function is introduced to
abstract away the removed need_domain_info field in struct
pci_controller, and pcibios_scanbus is adjusted to use the pci_domain_nr
accessor instead of directly accessing the index field.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14341/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The default i8259 polling function (i8259_irq) is nicely generic but is
fairly costly. Platforms often provide an alternative means of polling
for an i8259 interrupt, and when using the i8259 without device tree
have typically just chained its parent interrupt to their own handler
function. In order to allow for platform-specific polling functions to
be used in cases where the driver is probed via device tree, provide an
i8259_set_poll function that accepts a pointer to an alternative poll
function that will override the default.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14270/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
more victims of indirect include chains - au1200fb
lasat/picvue_proc and watchdog/ath79_wdt
... as well as tb0219, spotted by Sudip Mukherjee
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Probe the UARTs on SEAD3 boards using device tree rather than platform
code, in order to reduce the amount of the latter. This requires that
CONFIG_SERIAL_OF_PLATFORM be enabled, so enable it in sead3_defconfig.
The SEAD3 DT shim code is extended to read bootloader environment
variables to determine the appropriate UART & mode for kernel console
output & set the stdout-path property of the chosen node accordingly.
In contrast to the old platform code, which appears to have only ever
set "console=ttyS0,38400n8r" with the code in console_config never
having an effect, this will honor the "yamontty" environment variable to
select between the 2 UARTs on the board and then check the "modetty0" or
"modetty1" variable as appropriate to determine the UART configuration.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-mips@linux-mips.org
Cc: devicetree@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14048/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
flush_icache_range() is used for both user addresses (i.e.
cacheflush(2)), and kernel addresses (as the API documentation
describes).
This isn't really suitable however for Enhanced Virtual Addressing (EVA)
where cache operations on usermode addresses must use a different
instruction, and the protected cache ops assume user addresses, making
flush_icache_range() ineffective on kernel addresses.
Split out a new __flush_icache_user_range() and
__local_flush_icache_user_range() for users which actually want to flush
usermode addresses (note that flush_icache_user_range() already exists
on various architectures but with different arguments).
The implementation of flush_icache_range() will be changed in an
upcoming commit to use unprotected normal cache ops so as to always work
on the kernel mode address space.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14152/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
When discovering the number of VPEs per core, smp_num_siblings will be
incorrect for kernels built without support for the MIPS MultiThreading
(MT) ASE running on systems which implement said ASE. This leads to
accesses to VPEs in secondary cores being performed incorrectly since
mips_cm_vp_id calculates the wrong ID to write to the local "other"
registers. Fix this by examining the number of VPEs in the core as
reported by the CM.
This patch presumes that the number of VPEs will be the same in each
core of the system. As this path only applies to systems with CM version
2.5 or lower, and this property is true of all such known systems, this
is likely to be fine but is described in a comment for good measure.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14338/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The FTLBP field in Config7 for the I6400 is intended as chicken bits for
debugging rather than as a field that software actually makes use of.
For best performance, FTLBP should be left at its default value of 0
with all TLB writes hitting the FTLB by default.
Additionally, since set_ftlb_enable is called from decode_configs before
decode_config4 which determines the size of the TLBs, this was
previously always setting FTLBP=3 for a 3:1 FTLB:VTLB write ratio which
makes abysmal use of the available FTLB resources.
This effectively reverts b0c4e1b79d8a ("MIPS: Set up FTLB probability
for I6400").
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: b0c4e1b79d8a ("MIPS: Set up FTLB probability for I6400")
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14021/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Generic kernel code implements a weak version of set_orig_insn that
moves cached 'insn' from arch_uprobe to the original code location when
the trap is removed.
MIPS variant used arch_uprobe->orig_inst which was never initialised
properly, so this code only inserted a nop instead of the original
instruction. With that change orig_inst can also be safely removed.
Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Fixes: 40e084a506 ('MIPS: Add uprobes support.')
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14299/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Invalidate host TLB mappings when the guest ASID is changed by
regenerating ASIDs, rather than flushing the entire host TLB except
entries in the guest KSeg0 range.
For the guest kernel mode ASID we regenerate on the spot when the guest
ASID is changed, as that will always take place while the guest is in
kernel mode.
However when the guest invalidates TLB entries the ASID will often by
changed temporarily as part of writing EntryHi without the guest
returning to user mode in between. We therefore regenerate the user mode
ASID lazily before entering the guest in user mode, if and only if the
guest ASID has actually changed since the last guest user mode entry.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Commit 1685ddbe35 ("MIPS: Octeon: Changes to support readq()/writeq()
usage.") added bitwise shift operations that assume that unsigned long
is always 64-bits. This broke the build of VDSO code, as it gets compiled
also in "faked" 32-bit mode. Althought the failing inline functions are
never executed in 32-bit mode, they still need to pass the compilation.
Fix by using 64-bit types explicitly.
The patch fixes the following build failure:
CC arch/mips/vdso/gettimeofday-o32.o
In file included from los/git/devel/linux/arch/mips/include/asm/io.h:32:0,
from los/git/devel/linux/arch/mips/include/asm/page.h:194,
from los/git/devel/linux/arch/mips/vdso/vdso.h:26,
from los/git/devel/linux/arch/mips/vdso/gettimeofday.c:11:
los/git/devel/linux/arch/mips/include/asm/mach-cavium-octeon/mangle-port.h: In function '__should_swizzle_bits':
los/git/devel/linux/arch/mips/include/asm/mach-cavium-octeon/mangle-port.h:19:40: error: right shift count >= width of type [-Werror=shift-count-overflow]
unsigned long did = ((unsigned long)a >> 40) & 0xff;
^~
Fixes: 1685ddbe35 ("MIPS: Octeon: Changes to support readq()/writeq() usage.")
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Acked-by: David Daney <ddaney@caviumnetworks.com>
Cc: David Daney <david.daney@cavium.com>
Cc: Steven J. Hill <steven.hill@cavium.com>
Cc: Alex Smith <alex.smith@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14039/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Pull uaccess fixes from Al Viro:
"Fixes for broken uaccess primitives - mostly lack of proper zeroing
in copy_from_user()/get_user()/__get_user(), but for several
architectures there's more (broken clear_user() on frv and
strncpy_from_user() on hexagon)"
* 'uaccess-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (28 commits)
avr32: fix copy_from_user()
microblaze: fix __get_user()
microblaze: fix copy_from_user()
m32r: fix __get_user()
blackfin: fix copy_from_user()
sparc32: fix copy_from_user()
sh: fix copy_from_user()
sh64: failing __get_user() should zero
score: fix copy_from_user() and friends
score: fix __get_user/get_user
s390: get_user() should zero on failure
ppc32: fix copy_from_user()
parisc: fix copy_from_user()
openrisc: fix copy_from_user()
nios2: fix __get_user()
nios2: copy_from_user() should zero the tail of destination
mn10300: copy_from_user() should zero on access_ok() failure...
mn10300: failing __get_user() and get_user() should zero
mips: copy_from_user() must zero the destination on access_ok() failure
ARC: uaccess: get_user to zero out dest in cause of fault
...
This patch adds two new system calls:
int pkey_alloc(unsigned long flags, unsigned long init_access_rights)
int pkey_free(int pkey);
These implement an "allocator" for the protection keys
themselves, which can be thought of as analogous to the allocator
that the kernel has for file descriptors. The kernel tracks
which numbers are in use, and only allows operations on keys that
are valid. A key which was not obtained by pkey_alloc() may not,
for instance, be passed to pkey_mprotect().
These system calls are also very important given the kernel's use
of pkeys to implement execute-only support. These help ensure
that userspace can never assume that it has control of a key
unless it first asks the kernel. The kernel does not promise to
preserve PKRU (right register) contents except for allocated
pkeys.
The 'init_access_rights' argument to pkey_alloc() specifies the
rights that will be established for the returned pkey. For
instance:
pkey = pkey_alloc(flags, PKEY_DENY_WRITE);
will allocate 'pkey', but also sets the bits in PKRU[1] such that
writing to 'pkey' is already denied.
The kernel does not prevent pkey_free() from successfully freeing
in-use pkeys (those still assigned to a memory range by
pkey_mprotect()). It would be expensive to implement the checks
for this, so we instead say, "Just don't do it" since sane
software will never do it anyway.
Any piece of userspace calling pkey_alloc() needs to be prepared
for it to fail. Why? pkey_alloc() returns the same error code
(ENOSPC) when there are no pkeys and when pkeys are unsupported.
They can be unsupported for a whole host of reasons, so apps must
be prepared for this. Also, libraries or LD_PRELOADs might steal
keys before an application gets access to them.
This allocation mechanism could be implemented in userspace.
Even if we did it in userspace, we would still need additional
user/kernel interfaces to tell userspace which keys are being
used by the kernel internally (such as for execute-only
mappings). Having the kernel provide this facility completely
removes the need for these additional interfaces, or having an
implementation of this in userspace at all.
Note that we have to make changes to all of the architectures
that do not use mman-common.h because we use the new
PKEY_DENY_ACCESS/WRITE macros in arch-independent code.
1. PKRU is the Protection Key Rights User register. It is a
usermode-accessible register that controls whether writes
and/or access to each individual pkey is allowed or denied.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: linux-arch@vger.kernel.org
Cc: Dave Hansen <dave@sr71.net>
Cc: arnd@arndb.de
Cc: linux-api@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: luto@kernel.org
Cc: akpm@linux-foundation.org
Cc: torvalds@linux-foundation.org
Link: http://lkml.kernel.org/r/20160729163015.444FE75F@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
MIPS Enhanced Virtual Addressing (EVA) allows the user mode and kernel
mode address spaces to overlap, breaking the assumption that PAGE_OFFSET
is an appropriate KVM HVA error value, since PAGE_OFFSET may be as low
as zero.
Fix this in the same way that s390 does in commit bf640876e2 ("KVM:
s390: Make KVM_HVA_ERR_BAD usable on s390"), by overriding
KVM_HVA_ERR_[RO_]BAD and kvm_is_error_hva() in asm/kvm_host.h.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
vms and vcpus have statistics associated with them which can be viewed
within the debugfs. Currently it is assumed within the vcpu_stat_get() and
vm_stat_get() functions that all of these statistics are represented as
u32s, however the next patch adds some u64 vcpu statistics.
Change all vcpu statistics to u64 and modify vcpu_stat_get() accordingly.
Since vcpu statistics are per vcpu, they will only be updated by a single
vcpu at a time so this shouldn't present a problem on 32-bit machines
which can't atomically increment 64-bit numbers. However vm statistics
could potentially be updated by multiple vcpus from that vm at a time.
To avoid the overhead of atomics make all vm statistics ulong such that
they are 64-bit on 64-bit systems where they can be atomically incremented
and are 32-bit on 32-bit systems which may not be able to atomically
increment 64-bit numbers. Modify vm_stat_get() to expect ulongs.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Matlack <dmatlack@google.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Pull MIPS updates from Ralf Baechle:
"This is the main pull request for MIPS for 4.8. Also includes is a
minor SSB cleanup as SSB code traditionally is merged through the MIPS
tree:
ATH25:
- MIPS: Add default configuration for ath25
Boot:
- For zboot, copy appended dtb to the end of the kernel
- store the appended dtb address in a variable
BPF:
- Fix off by one error in offset allocation
Cobalt code:
- Fix typos
Core code:
- debugfs_create_file returns NULL on error, so don't use IS_ERR for
testing for errors.
- Fix double locking issue in RM7000 S-cache code. This would only
affect RM7000 ARC systems on reboot.
- Fix page table corruption on THP permission changes.
- Use compat_sys_keyctl for 32 bit userspace on 64 bit kernels.
David says, there are no compatibility issues raised by this fix.
- Move some signal code around.
- Rewrite r4k count/compare clockevent device registration such that
min_delta_ticks/max_delta_ticks files are guaranteed to be
initialized.
- Only register r4k count/compare as clockevent device if we can
assume the clock to be constant.
- Fix MSA asm warnings in control reg accessors
- uasm and tlbex fixes and tweaking.
- Print segment physical address when EU=1.
- Define AT_VECTOR_SIZE_ARCH for ARCH_DLINFO.
- CP: Allow booting by VP other than VP 0
- Cache handling fixes and optimizations for r4k class caches
- Add hotplug support for R6 processors
- Cleanup hotplug bits in kconfig
- traps: return correct si code for accessing nonmapped addresses
- Remove cpu_has_safe_index_cacheops
Lantiq:
- Register IRQ handler for virtual IRQ number
- Fix EIU interrupt loading code
- Use the real EXIN count
- Fix build error.
Loongson 3:
- Increase HPET_MIN_PROG_DELTA and decrease HPET_MIN_CYCLES
Octeon:
- Delete built-in DTB pruning code for D-Link DSR-1000N.
- Clean up GPIO definitions in dlink_dsr-1000n.dts.
- Add more LEDs to the DSR-100n DTS
- Fix off by one in octeon_irq_gpio_map()
- Typo fixes
- Enable SATA by default in cavium_octeon_defconfig
- Support readq/writeq()
- Remove forced mappings of USB interrupts.
- Ensure DMA descriptors are always in the low 4GB
- Improve USB reset code for OCTEON II.
Pistachio:
- Add maintainers entry for pistachio SoC Support
- Remove plat_setup_iocoherency
Ralink:
- Fix pwm UART in spis group pinmux.
SSB:
- Change bare unsigned to unsigned int to suit coding style
Tools:
- Fix reloc tool compiler warnings.
Other:
- Delete use of ARCH_WANT_OPTIONAL_GPIOLIB"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (61 commits)
MIPS: mm: Fix definition of R6 cache instruction
MIPS: tools: Fix relocs tool compiler warnings
MIPS: Cobalt: Fix typo
MIPS: Octeon: Fix typo
MIPS: Lantiq: Fix build failure
MIPS: Use CPHYSADDR to implement mips32 __pa
MIPS: Octeon: Dlink_dsr-1000n.dts: add more leds.
MIPS: Octeon: Clean up GPIO definitions in dlink_dsr-1000n.dts.
MIPS: Octeon: Delete built-in DTB pruning code for D-Link DSR-1000N.
MIPS: store the appended dtb address in a variable
MIPS: ZBOOT: copy appended dtb to the end of the kernel
MIPS: ralink: fix spis group pinmux
MIPS: Factor o32 specific code into signal_o32.c
MIPS: non-exec stack & heap when non-exec PT_GNU_STACK is present
MIPS: Use per-mm page to execute branch delay slot instructions
MIPS: Modify error handling
MIPS: c-r4k: Use SMP calls for CM indexed cache ops
MIPS: c-r4k: Avoid small flush_icache_range SMP calls
MIPS: c-r4k: Local flush_icache_range cache op override
MIPS: c-r4k: Split r4k_flush_kernel_vmap_range()
...
Pull KVM updates from Paolo Bonzini:
- ARM: GICv3 ITS emulation and various fixes. Removal of the
old VGIC implementation.
- s390: support for trapping software breakpoints, nested
virtualization (vSIE), the STHYI opcode, initial extensions
for CPU model support.
- MIPS: support for MIPS64 hosts (32-bit guests only) and lots
of cleanups, preliminary to this and the upcoming support for
hardware virtualization extensions.
- x86: support for execute-only mappings in nested EPT; reduced
vmexit latency for TSC deadline timer (by about 30%) on Intel
hosts; support for more than 255 vCPUs.
- PPC: bugfixes.
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (302 commits)
KVM: PPC: Introduce KVM_CAP_PPC_HTM
MIPS: Select HAVE_KVM for MIPS64_R{2,6}
MIPS: KVM: Reset CP0_PageMask during host TLB flush
MIPS: KVM: Fix ptr->int cast via KVM_GUEST_KSEGX()
MIPS: KVM: Sign extend MFC0/RDHWR results
MIPS: KVM: Fix 64-bit big endian dynamic translation
MIPS: KVM: Fail if ebase doesn't fit in CP0_EBase
MIPS: KVM: Use 64-bit CP0_EBase when appropriate
MIPS: KVM: Set CP0_Status.KX on MIPS64
MIPS: KVM: Make entry code MIPS64 friendly
MIPS: KVM: Use kmap instead of CKSEG0ADDR()
MIPS: KVM: Use virt_to_phys() to get commpage PFN
MIPS: Fix definition of KSEGX() for 64-bit
KVM: VMX: Add VMCS to CPU's loaded VMCSs before VMPTRLD
kvm: x86: nVMX: maintain internal copy of current VMCS
KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE
KVM: PPC: Book3S HV: Pull out TM state save/restore into separate procedures
KVM: arm64: vgic-its: Simplify MAPI error handling
KVM: arm64: vgic-its: Make vgic_its_cmd_handle_mapi similar to other handlers
KVM: arm64: vgic-its: Turn device_id validation into generic ID validation
...
Use CPHYSADDR to implement the __pa macro converting from a virtual to a
physical address for MIPS32, much as is already done for MIPS64 (though
without the complication of having both compatibility & XKPHYS
segments).
This allows for __pa to work regardless of whether the address being
translated is in kseg0 or kseg1, unlike the previous subtraction based
approach which only worked for addresses in kseg0. Working for kseg1
addresses is important if __pa is used on addresses allocated by
dma_alloc_coherent, where on systems with non-coherent I/O we provide
addresses in kseg1. If this address is then used with
dma_map_single_attrs then it is provided to virt_to_page, which in turn
calls virt_to_phys which is a wrapper around __pa. The result is that we
end up with a physical address 0x20000000 bytes (ie. the size of kseg0)
too high.
In addition to providing consistency with MIPS64 & fixing the kseg1 case
above this has the added bonus of generating smaller code for systems
implementing MIPS32r2 & beyond, where a single ext instruction can
extract the physical address rather than needing to load an immediate
into a temp register & subtract it. This results in ~1.3KB savings for a
boston_defconfig kernel adjusted to set CONFIG_32BIT=y.
This patch does not change the EVA case, which may or may not have
similar issues around handling both cached & uncached addresses but is
beyond the scope of this patch.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/13836/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The stack and heap have both been executable by default on MIPS until
now. This patch changes the default to be non-executable, but only for
ELF binaries with a non-executable PT_GNU_STACK header present. This
does apply to both the heap & the stack, despite the name PT_GNU_STACK,
and this matches the behaviour of other architectures like ARM & x86.
Current MIPS toolchains do not produce the PT_GNU_STACK header, which
means that we can rely upon this patch not changing the behaviour of
existing binaries. The new default will only take effect for newly
compiled binaries once toolchains are updated to support PT_GNU_STACK,
and since those binaries are newly compiled they can be compiled
expecting the change in default behaviour. Again this matches the way in
which the ARM & x86 architectures handled their implementations of
non-executable memory.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: Maciej Rozycki <maciej.rozycki@imgtec.com>
Cc: Faraz Shahbazker <faraz.shahbazker@imgtec.com>
Cc: Raghu Gandham <raghu.gandham@imgtec.com>
Cc: Matthew Fortune <matthew.fortune@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13765/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
In some cases the kernel needs to execute an instruction from the delay
slot of an emulated branch instruction. These cases include:
- Emulated floating point branch instructions (bc1[ft]l?) for systems
which don't include an FPU, or upon which the kernel is run with the
"nofpu" parameter.
- MIPSr6 systems running binaries targeting older revisions of the
architecture, which may include branch instructions whose encodings
are no longer valid in MIPSr6.
Executing instructions from such delay slots is done by writing the
instruction to memory followed by a trap, as part of an "emuframe", and
executing it. This avoids the requirement of an emulator for the entire
MIPS instruction set. Prior to this patch such emuframes are written to
the user stack and executed from there.
This patch moves FP branch delay emuframes off of the user stack and
into a per-mm page. Allocating a page per-mm leaves userland with access
to only what it had access to previously, and compared to other
solutions is relatively simple.
When a thread requires a delay slot emulation, it is allocated a frame.
A thread may only have one frame allocated at any one time, since it may
only ever be executing one instruction at any one time. In order to
ensure that we can free up allocated frame later, its index is recorded
in struct thread_struct. In the typical case, after executing the delay
slot instruction we'll execute a break instruction with the BRK_MEMU
code. This traps back to the kernel & leads to a call to do_dsemulret
which frees the allocated frame & moves the user PC back to the
instruction that would have executed following the emulated branch.
In some cases the delay slot instruction may be invalid, such as a
branch, or may trigger an exception. In these cases the BRK_MEMU break
instruction will not be hit. In order to ensure that frames are freed
this patch introduces dsemul_thread_cleanup() and calls it to free any
allocated frame upon thread exit. If the instruction generated an
exception & leads to a signal being delivered to the thread, or indeed
if a signal simply happens to be delivered to the thread whilst it is
executing from the struct emuframe, then we need to take care to exit
the frame appropriately. This is done by either rolling back the user PC
to the branch or advancing it to the continuation PC prior to signal
delivery, using dsemul_thread_rollback(). If this were not done then a
sigreturn would return to the struct emuframe, and if that frame had
meanwhile been used in response to an emulated branch instruction within
the signal handler then we would execute the wrong user code.
Whilst a user could theoretically place something like a compact branch
to self in a delay slot and cause their thread to become stuck in an
infinite loop with the frame never being deallocated, this would:
- Only affect the users single process.
- Be architecturally invalid since there would be a branch in the
delay slot, which is forbidden.
- Be extremely unlikely to happen by mistake, and provide a program
with no more ability to harm the system than a simple infinite loop
would.
If a thread requires a delay slot emulation & no frame is available to
it (ie. the process has enough other threads that all frames are
currently in use) then the thread joins a waitqueue. It will sleep until
a frame is freed by another thread in the process.
Since we now know whether a thread has an allocated frame due to our
tracking of its index, the cookie field of struct emuframe is removed as
we can be more certain whether we have a valid frame. Since a thread may
only ever have a single frame at any given time, the epc field of struct
emuframe is also removed & the PC to continue from is instead stored in
struct thread_struct. Together these changes simplify & shrink struct
emuframe somewhat, allowing twice as many frames to fit into the page
allocated for them.
The primary benefit of this patch is that we are now free to mark the
user stack non-executable where that is possible.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: Maciej Rozycki <maciej.rozycki@imgtec.com>
Cc: Faraz Shahbazker <faraz.shahbazker@imgtec.com>
Cc: Raghu Gandham <raghu.gandham@imgtec.com>
Cc: Matthew Fortune <matthew.fortune@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13764/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The KSEGX() macro is defined to 32-bit sign extend the address argument
and logically AND the result with 0xe0000000, with the final result
usually compared against one of the CKSEG macros. However the literal
0xe0000000 is unsigned as the high bit is set, and is therefore
zero-extended on 64-bit kernels, resulting in the sign extension bits of
the argument being masked to zero. This results in the odd situation
where:
KSEGX(CKSEG) != CKSEG
(0xffffffff80000000 & 0x00000000e0000000) != 0xffffffff80000000)
Fix this by 32-bit sign extending the 0xe0000000 literal using
_ACAST32_.
This will help some MIPS KVM code handling 32-bit guest addresses to
work on 64-bit host kernels, but will also affect KSEGX in
dec_kn01_be_backend() on a 64-bit DECstation kernel, and the SiByte DMA
page ops KSEGX check in clear_page() and copy_page() on 64-bit SB1
kernels, neither of which appear to be designed with 64-bit segments in
mind anyway.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-mips@linux-mips.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When performing SMP calls to foreign cores, exclude sibling CPUs from
the provided map, as we already handle the local core on the current
CPU. This prevents an SMP call from for example core 0, VPE 1 to VPE 0
on the same core.
In the process the cpu_foreign_map cpumask is turned into an array of
cpumasks, so that each CPU has its own version of it which excludes
sibling CPUs. r4k_op_needs_ipi() is also updated to reflect that cache
management SMP calls are not needed when all CPUs are siblings (i.e.
there are no foreign CPUs according to the new cpu_foreign_map[]
semantics which exclude siblings).
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Jayachandran C. <jchandra@broadcom.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13801/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>