The only difference between test_and_set_bit() & test_and_set_bit_lock()
is memory ordering barrier semantics - the former provides a full
barrier whilst the latter only provides acquire semantics.
We can therefore implement test_and_set_bit() in terms of
test_and_set_bit_lock() with the addition of the extra memory barrier.
Do this in order to avoid duplicating logic.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen <chenhc@lemote.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: linux-kernel@vger.kernel.org
The start position for an ins instruction is always encoded as an
immediate, so allowing registers to be used by the inline asm makes no
sense. It should never happen anyway since a bit index should always be
small enough to be treated as an immediate, but remove the nonsensical
"r" for sanity.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen <chenhc@lemote.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: linux-kernel@vger.kernel.org
set_bit() can set bits 0-15 using an ori instruction, rather than
loading the value -1 into a register & then using an ins instruction.
That is, rather than the following:
li t0, -1
ll t1, 0(t2)
ins t1, t0, 4, 1
sc t1, 0(t2)
We can have the simpler:
ll t1, 0(t2)
ori t1, t1, 0x10
sc t1, 0(t2)
The or path already allows immediates to be used, so simply restricting
the ins path to bits that don't fit in immediates is sufficient to take
advantage of this.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen <chenhc@lemote.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: linux-kernel@vger.kernel.org
Reorder conditions in our various bitops functions that check
kernel_uses_llsc such that they handle the !kernel_uses_llsc case first.
This allows us to avoid the need to duplicate the kernel_uses_llsc check
in all the other cases. For functions that don't involve barriers common
to the various implementations, we switch to returning from within each
if block making each case easier to read in isolation.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen <chenhc@lemote.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: linux-kernel@vger.kernel.org
Remove the remaining duplication between 32b & 64b in asm/atomic.h by
making use of an ATOMIC_OPS() macro to generate:
- atomic_read()/atomic64_read()
- atomic_set()/atomic64_set()
- atomic_cmpxchg()/atomic64_cmpxchg()
- atomic_xchg()/atomic64_xchg()
This is consistent with the way all other functions in asm/atomic.h are
generated, and ensures consistency between the 32b & 64b functions.
Of note is that this results in the above now being static inline
functions rather than macros.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen <chenhc@lemote.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: linux-kernel@vger.kernel.org
Unify the definitions of atomic_sub_if_positive() &
atomic64_sub_if_positive() using a macro like we do for most other
atomic functions. This allows us to share the implementation ensuring
consistency between the two. Notably this provides the appropriate
loongson3_war barriers in the atomic64_sub_if_positive() case which were
previously missing.
The code is rearranged a little to handle the !kernel_uses_llsc case
first in order to de-indent the LL/SC case & allow us not to go over 80
characters per line.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen <chenhc@lemote.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: linux-kernel@vger.kernel.org
Use smp_mb__before_atomic() & smp_mb__after_atomic() in
atomic_sub_if_positive() rather than the equivalent
smp_mb__before_llsc() & smp_llsc_mb(). The former are more standard &
this preps us for avoiding redundant duplicate barriers on Loongson3 in
a later patch.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen <chenhc@lemote.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: linux-kernel@vger.kernel.org
Generate the sync instructions required to workaround Loongson3 LL/SC
errata within inline asm blocks, which feels a little safer than doing
it from C where strictly speaking the compiler would be well within its
rights to insert a memory access between the separate asm statements we
previously had, containing sync & ll instructions respectively.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen <chenhc@lemote.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: linux-kernel@vger.kernel.org
Implement __sync() using the new __SYNC() infrastructure, which will
take care of not emitting an instruction for old R3k CPUs that don't
support it. The only behavioral difference is that __sync() will now
provide a compiler barrier on these old CPUs, but that seems like
reasonable behavior anyway.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen <chenhc@lemote.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: linux-kernel@vger.kernel.org
We #ifdef on Cavium Octeon CPUs, but emit the same sync instruction in
both cases. Remove the #ifdef & simply expand to the __sync() macro.
Whilst here indent the strong ordering case definitions to match the
indentation of the weak ordering ones, helping readability.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen <chenhc@lemote.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: linux-kernel@vger.kernel.org
Simplify our definitions of rmb() & wmb() using the new __SYNC()
infrastructure.
The fast_rmb() & fast_wmb() macros are removed, since they only provided
a level of indirection that made the code less readable & weren't
directly used anywhere in the kernel tree.
The Octeon #ifdef'ery is removed, since the "syncw" instruction
previously used is merely an alias for "sync 4" which __SYNC() will emit
for the wmb sync type when the kernel is configured for an Octeon CPU.
Similarly __SYNC() will emit nothing for the rmb sync type in Octeon
configurations.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen <chenhc@lemote.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: linux-kernel@vger.kernel.org
Introduce an asm/sync.h header which provides infrastructure that can be
used to generate sync instructions of various types, and for various
reasons. For example if we need a sync instruction that provides a full
completion barrier but only on systems which have weak memory ordering,
we can generate the appropriate assembly code using:
__SYNC(full, weak_ordering)
When the kernel is configured to run on systems with weak memory
ordering (ie. CONFIG_WEAK_ORDERING is selected) we'll emit a sync
instruction. When the kernel is configured to run on systems with strong
memory ordering (ie. CONFIG_WEAK_ORDERING is not selected) we'll emit
nothing. The caller doesn't need to know which happened - it simply says
what it needs & when, with no concern for checking the kernel
configuration.
There are some scenarios in which we may want to emit code only when we
*didn't* emit a sync instruction. For example, some Loongson3 CPUs
suffer from a bug that requires us to emit a sync instruction prior to
each ll instruction (enabled by CONFIG_CPU_LOONGSON3_WORKAROUNDS). In
cases where this bug workaround is enabled, it's wasteful to then have
more generic code emit another sync instruction to provide barriers we
need in general. A __SYNC_ELSE() macro allows for this, providing an
extra argument that contains code to be assembled only in cases where
the sync instruction was not emitted. For example if we have a scenario
in which we generally want to emit a release barrier but for affected
Loongson3 configurations upgrade that to a full completion barrier, we
can do that like so:
__SYNC_ELSE(full, loongson3_war, __SYNC(rl, always))
The assembly generated by these macros can be used either as inline
assembly or in assembly source files.
Differing types of sync as provided by MIPSr6 are defined, but currently
they all generate a full completion barrier except in kernels configured
for Cavium Octeon systems. There the wmb sync-type is used, and rmb
syncs are omitted, as has been the case since commit 6b07d38aaa
("MIPS: Octeon: Use optimized memory barrier primitives."). Using
__SYNC() with the wmb or rmb types will abstract away the Octeon
specific behavior and allow us to later clean up asm/barrier.h code that
currently includes a plethora of #ifdef's.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen <chenhc@lemote.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: linux-kernel@vger.kernel.org
We currently duplicate the definition of __scbeqz in asm/atomic.h &
asm/cmpxchg.h. Move it to asm/llsc.h & rename it to __SC_BEQZ to fit
better with the existing __SC macro provided there.
We include a tab in the string in order to avoid the need for users to
indent code any further to include whitespace of their own after the
instruction mnemonic.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen <chenhc@lemote.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: linux-kernel@vger.kernel.org
Only build the checks for R4k errata workarounds if we expect that the
kernel might actually run on a system with an R4k CPU - ie.
CONFIG_SYS_HAS_CPU_R4X00=y & we're targeting a pre-MIPSr1 ISA revision.
Rename cpu-bugs64.c to r4k-bugs64.c to indicate the fact that the code
is specific to R4k CPUs.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@vger.kernel.org
Commit 171a9bae68 ("staging/octeon: Allow test build on !MIPS") moved
the inclusion of a bunch of headers by various files in the Octeon
ethernet driver into a common header, but in doing so it changed the
order in which those headers are included.
Prior to the referenced commit drivers/staging/octeon/ethernet.c
included asm/octeon/cvmx-pip.h before asm/octeon/cvmx-ipd.h, which makes
use of the CVMX_PIP_SFT_RST definition pulled in by the former. After
commit 171a9bae68 ("staging/octeon: Allow test build on !MIPS") we
pull in asm/octeon/cvmx-ipd.h first & builds fail with:
In file included from drivers/staging/octeon/octeon-ethernet.h:27,
from drivers/staging/octeon/ethernet.c:22:
arch/mips/include/asm/octeon/cvmx-ipd.h: In function 'cvmx_ipd_free_ptr':
arch/mips/include/asm/octeon/cvmx-ipd.h:330:27: error: storage size of
'pip_sft_rst' isn't known
union cvmx_pip_sft_rst pip_sft_rst;
^~~~~~~~~~~
arch/mips/include/asm/octeon/cvmx-ipd.h:331:36: error: 'CVMX_PIP_SFT_RST'
undeclared (first use in this function); did you mean 'CVMX_CIU_SOFT_RST'?
pip_sft_rst.u64 = cvmx_read_csr(CVMX_PIP_SFT_RST);
^~~~~~~~~~~~~~~~
CVMX_CIU_SOFT_RST
arch/mips/include/asm/octeon/cvmx-ipd.h:331:36: note: each undeclared
identifier is reported only once for each function it appears in
arch/mips/include/asm/octeon/cvmx-ipd.h:330:27: warning: unused variable
'pip_sft_rst' [-Wunused-variable]
union cvmx_pip_sft_rst pip_sft_rst;
^~~~~~~~~~~
make[4]: *** [scripts/Makefile.build:266: drivers/staging/octeon/ethernet.o]
Error 1
make[3]: *** [scripts/Makefile.build:509: drivers/staging/octeon] Error 2
Fix this by having asm/octeon/cvmx-ipd.h include the
asm/octeon/cvmx-pip-defs.h header that it is reliant upon, rather than
requiring its users to pull in that header before it.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Fixes: 171a9bae68 ("staging/octeon: Allow test build on !MIPS")
Cc: David S. Miller <davem@davemloft.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: linux-mips@vger.kernel.org
Cc: David S . Miller <davem@davemloft.net>
Cc: Matthew Wilcox <willy@infradead.org>
The naming of pgtable_page_{ctor,dtor}() seems to have confused a few
people, and until recently arm64 used these erroneously/pointlessly for
other levels of page table.
To make it incredibly clear that these only apply to the PTE level, and to
align with the naming of pgtable_pmd_page_{ctor,dtor}(), let's rename them
to pgtable_pte_page_{ctor,dtor}().
These changes were generated with the following shell script:
----
git grep -lw 'pgtable_page_.tor' | while read FILE; do
sed -i '{s/pgtable_page_ctor/pgtable_pte_page_ctor/}' $FILE;
sed -i '{s/pgtable_page_dtor/pgtable_pte_page_dtor/}' $FILE;
done
----
... with the documentation re-flowed to remain under 80 columns, and
whitespace fixed up in macros to keep backslashes aligned.
There should be no functional change as a result of this patch.
Link: http://lkml.kernel.org/r/20190722141133.3116-1-mark.rutland@arm.com
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge updates from Andrew Morton:
- a few hot fixes
- ocfs2 updates
- almost all of -mm (slab-generic, slab, slub, kmemleak, kasan,
cleanups, debug, pagecache, memcg, gup, pagemap, memory-hotplug,
sparsemem, vmalloc, initialization, z3fold, compaction, mempolicy,
oom-kill, hugetlb, migration, thp, mmap, madvise, shmem, zswap,
zsmalloc)
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (132 commits)
mm/zsmalloc.c: fix a -Wunused-function warning
zswap: do not map same object twice
zswap: use movable memory if zpool support allocate movable memory
zpool: add malloc_support_movable to zpool_driver
shmem: fix obsolete comment in shmem_getpage_gfp()
mm/madvise: reduce code duplication in error handling paths
mm: mmap: increase sockets maximum memory size pgoff for 32bits
mm/mmap.c: refine find_vma_prev() with rb_last()
riscv: make mmap allocation top-down by default
mips: use generic mmap top-down layout and brk randomization
mips: replace arch specific way to determine 32bit task with generic version
mips: adjust brk randomization offset to fit generic version
mips: use STACK_TOP when computing mmap base address
mips: properly account for stack randomization and stack guard gap
arm: use generic mmap top-down layout and brk randomization
arm: use STACK_TOP when computing mmap base address
arm: properly account for stack randomization and stack guard gap
arm64, mm: make randomization selected by generic topdown mmap layout
arm64, mm: move generic mmap layout functions to mm
arm64: consider stack randomization for mmap base only when necessary
...
Both pgtable_cache_init() and pgd_cache_init() are used to initialize kmem
cache for page table allocations on several architectures that do not use
PAGE_SIZE tables for one or more levels of the page table hierarchy.
Most architectures do not implement these functions and use __weak default
NOP implementation of pgd_cache_init(). Since there is no such default
for pgtable_cache_init(), its empty stub is duplicated among most
architectures.
Rename the definitions of pgd_cache_init() to pgtable_cache_init() and
drop empty stubs of pgtable_cache_init().
Link: http://lkml.kernel.org/r/1566457046-22637-1-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Will Deacon <will@kernel.org> [arm64]
Acked-by: Thomas Gleixner <tglx@linutronix.de> [x86]
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "mm: remove quicklist page table caches".
A while ago Nicholas proposed to remove quicklist page table caches [1].
I've rebased his patch on the curren upstream and switched ia64 and sh to
use generic versions of PTE allocation.
[1] https://lore.kernel.org/linux-mm/20190711030339.20892-1-npiggin@gmail.com
This patch (of 3):
Remove page table allocator "quicklists". These have been around for a
long time, but have not got much traction in the last decade and are only
used on ia64 and sh architectures.
The numbers in the initial commit look interesting but probably don't
apply anymore. If anybody wants to resurrect this it's in the git
history, but it's unhelpful to have this code and divergent allocator
behaviour for minor archs.
Also it might be better to instead make more general improvements to page
allocator if this is still so slow.
Link: http://lkml.kernel.org/r/1565250728-21721-2-git-send-email-rppt@linux.ibm.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull MIPS updates from Paul Burton:
"Main MIPS changes:
- boot_mem_map is removed, providing a nice cleanup made possible by
the recent removal of bootmem.
- Some fixes to atomics, in general providing compiler barriers for
smp_mb__{before,after}_atomic plus fixes specific to Loongson CPUs
or MIPS32 systems using cmpxchg64().
- Conversion to the new generic VDSO infrastructure courtesy of
Vincenzo Frascino.
- Removal of undefined behavior in set_io_port_base(), fixing the
behavior of some MIPS kernel configurations when built with recent
clang versions.
- Initial MIPS32 huge page support, functional on at least Ingenic
SoCs.
- pte_special() is now supported for some configurations, allowing
among other things generic fast GUP to be used.
- Miscellaneous fixes & cleanups.
And platform specific changes:
- Major improvements to Ingenic SoC support from Paul Cercueil,
mostly enabled by the inclusion of the new TCU (timer-counter unit)
drivers he's spent a very patient year or so working on. Plus some
fixes for X1000 SoCs from Zhou Yanjie.
- Netgear R6200 v1 systems are now supported by the bcm47xx platform.
- DT updates for BMIPS, Lantiq & Microsemi Ocelot systems"
* tag 'mips_5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (89 commits)
MIPS: Detect bad _PFN_SHIFT values
MIPS: Disable pte_special() for MIPS32 with RiXi
MIPS: ralink: deactivate PCI support for SOC_MT7621
mips: compat: vdso: Use legacy syscalls as fallback
MIPS: Drop Loongson _CACHE_* definitions
MIPS: tlbex: Remove cpu_has_local_ebase
MIPS: tlbex: Simplify r3k check
MIPS: Select R3k-style TLB in Kconfig
MIPS: PCI: refactor ioc3 special handling
mips: remove ioremap_cachable
mips/atomic: Fix smp_mb__{before,after}_atomic()
mips/atomic: Fix loongson_llsc_mb() wreckage
mips/atomic: Fix cmpxchg64 barriers
MIPS: Octeon: remove duplicated include from dma-octeon.c
firmware: bcm47xx_nvram: Allow COMPILE_TEST
firmware: bcm47xx_nvram: Correct size_t printf format
MIPS: Treat Loongson Extensions as ASEs
MIPS: Remove dev_err() usage after platform_get_irq()
MIPS: dts: mscc: describe the PTP ready interrupt
MIPS: dts: mscc: describe the PTP register range
...
Commit 61cbfff4b1 ("MIPS: pte_special()/pte_mkspecial() support")
added a _PAGE_SPECIAL bit to the pgprot bits of our PTEs. Unfortunately
for MIPS32 configurations with RiXi support this pushed the number of
pgprot bits to 13. Since the PFN field in EntryLo begins at bit 12 this
results in us shifting the most significant bit of the physical address
beyond the end of the PTE, leading any mapped access to a physical
address above 2GB to incorrectly access an address 2GB lower than
intended.
For now, disable the pte_special() support for MIPS32 configurations
that support RiXi.
Fixes: 61cbfff4b1 ("MIPS: pte_special()/pte_mkspecial() support")
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: Dmitry Korotin <dkorotin@wavecomp.com>
Cc: linux-mips@vger.kernel.org
_CACHE_CACHABLE_NONCOHERENT is defined as 3<<_CACHE_SHIFT by default, so
there's no need to define it as such specifically for Loongson.
_CACHE_CACHABLE_COHERENT is not used anywhere in the kernel, so there's
no need to define it at all.
Finally the comment found alongside these definitions seems incorrect -
it suggests that we're defining _CACHE_CACHABLE_NONCOHERENT such that it
actually provides coherence, but the opposite seems to be true & instead
the unused _CACHE_CACHABLE_COHERENT is defined as the typically
incoherent value.
Delete the whole thing, which will have no effect on the compiled code
anyway.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Huacai Chen <chenhc@lemote.com>
Cc: linux-mips@vger.kernel.org
The cpu_has_local_ebase macro is, confusingly, not used to indicate
whether the EBase register is local to a CPU or not. Instead it
indicates whether we want to generate the TLB refill exception vector
each time a CPU is brought online. Doing this makes little sense on any
system, since we always use the same value for EBase & thus we cannot
have different TLB refill exception handlers per CPU.
Regenerating the code is not only pointless but also can be actively
harmful, as commit 8759934e2b ("MIPS: Build uasm-generated code only
once to avoid CPU Hotplug problem") described. That commit introduced
cpu_has_local_ebase to disable the handler regeneration for Loongson
machines, but this is by no means a Loongson-specific problem.
Remove cpu_has_local_ebase & simply generate the TLB refill handler once
during boot, just like the rest of the TLB exception handlers.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: linux-mips@vger.kernel.org
Currently areas where we need to determine whether the TLB is R3k-style
need to check for either of CONFIG_CPU_R3000 || CONFIG_CPU_TX39XX.
Introduce a new CONFIG_CPU_R3K_TLB & select it from both of the above,
allowing us to simplify checks for R3k-style TLBs by only checking for
this new Kconfig option.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: linux-mips@vger.kernel.org
Recent probing at the Linux Kernel Memory Model uncovered a
'surprise'. Strongly ordered architectures where the atomic RmW
primitive implies full memory ordering and
smp_mb__{before,after}_atomic() are a simple barrier() (such as MIPS
without WEAK_REORDERING_BEYOND_LLSC) fail for:
*x = 1;
atomic_inc(u);
smp_mb__after_atomic();
r0 = *y;
Because, while the atomic_inc() implies memory order, it
(surprisingly) does not provide a compiler barrier. This then allows
the compiler to re-order like so:
atomic_inc(u);
*x = 1;
smp_mb__after_atomic();
r0 = *y;
Which the CPU is then allowed to re-order (under TSO rules) like:
atomic_inc(u);
r0 = *y;
*x = 1;
And this very much was not intended. Therefore strengthen the atomic
RmW ops to include a compiler barrier.
Reported-by: Andrea Parri <andrea.parri@amarulasolutions.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul Burton <paul.burton@mips.com>
The comment describing the loongson_llsc_mb() reorder case doesn't
make any sense what so ever. Instruction re-ordering is not an SMP
artifact, but rather a CPU local phenomenon. Clarify the comment by
explaining that these issue cause a coherence fail.
For the branch speculation case; if futex_atomic_cmpxchg_inatomic()
needs one at the bne branch target, then surely the normal
__cmpxch_asm() implementation does too. We cannot rely on the
barriers from cmpxchg() because cmpxchg_local() is implemented with
the same macro, and branch prediction and speculation are, too, CPU
local.
Fixes: e02e07e312 ("MIPS: Loongson: Introduce and use loongson_llsc_mb()")
Cc: Huacai Chen <chenhc@lemote.com>
Cc: Huang Pei <huangpei@loongson.cn>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul Burton <paul.burton@mips.com>
There were no memory barriers on the 32bit implementation of
cmpxchg64(). Fix this.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Break up the big ioc3 register struct into functional pieces to
make use in sub-function drivers more straightforward. And while
doing that get rid of all volatile access by using readX/writeX.
Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The code in question is modifying a variable declared const through
pointer manipulation. Such code is explicitly undefined behavior, and
is the lone issue preventing malta_defconfig from booting when built
with Clang:
If an attempt is made to modify an object defined with a const-qualified
type through use of an lvalue with non-const-qualified type, the
behavior is undefined.
LLVM is removing such assignments. A simple fix is to not declare
variables const that you plan on modifying. Limiting the scope would be
a better method of preventing unwanted writes to such a variable.
Further, the code in question mentions "compiler bugs" without any links
to bug reports, so it is difficult to know if the issue is resolved in
GCC. The patch was authored in 2006, which would have been GCC 4.0.3 or
4.1.1. The minimal supported version of GCC in the Linux kernel is
currently 4.6.
For what its worth, there was UB before the commit in question, it just
added a barrier and got lucky IRT codegen. I don't think there's any
actual compiler bugs related, just runtime bugs due to UB.
Link: https://github.com/ClangBuiltLinux/linux/issues/610
Fixes: 966f4406d9 ("[MIPS] Work around bad code generation for <asm/io.h>.")
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Debugged-by: Nathan Chancellor <natechancellor@gmail.com>
Suggested-by: Eli Friedman <efriedma@quicinc.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: ralf@linux-mips.org
Cc: jhogan@kernel.org
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Hassan Naveed <hnaveed@wavecomp.com>
Cc: Stephen Kitt <steve@sk2.org>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: linux-mips@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: clang-built-linux@googlegroups.com
boot_mem_map was introduced very early and cannot handle memory maps
with nid. Nowadays, memblock can exactly replace boot_mem_map.
Detect pfn info and setup resources with memblock maps.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
[paul.burton@mips.com: Fix size calculation in check_kernel_sections_mem]
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@vger.kernel.org
Cc: yasha.che3@gmail.com
Cc: aurelien@aurel32.net
Cc: sfr@canb.auug.org.au
Cc: fancer.lancer@gmail.com
Cc: matt.redfearn@mips.com
Cc: chenhc@lemote.com
Mark switch cases where we are expecting to fall through.
Fix the following warning (Building: cavium_octeon_defconfig mips):
arch/mips/include/asm/octeon/cvmx-sli-defs.h:47:6: warning: this statement
may fall through [-Wimplicit-fallthrough=]
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
clang warns:
arch/mips/include/asm/syscall.h:136:3: error: variable 'ret' is
uninitialized when used here [-Werror,-Wuninitialized]
ret |= mips_get_syscall_arg(args++, task, regs, i++);
^~~
arch/mips/include/asm/syscall.h:129:9: note: initialize the variable
'ret' to silence this warning
int ret;
^
= 0
1 error generated.
It's not wrong; however, it's not an issue in practice because ret is
only assigned to, not read from. ret could just be initialized to zero
but looking into it further, ret has been unused since it was first
added in 2012 so just get rid of it and update mips_get_syscall_arg's
return type since none of the return values are ever checked. If it is
ever needed again, this commit can be reverted and ret can be properly
initialized.
Fixes: c0ff3c53d4 ("MIPS: Enable HAVE_ARCH_TRACEHOOK.")
Link: https://github.com/ClangBuiltLinux/linux/issues/604
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: linux-mips@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: clang-built-linux@googlegroups.com