Move the prototypes for sched_ttwu_pending() and send_call_function_single_ipi()
into the newly created kernel/sched/smp.h header, to make sure they are all
the same, and to architectures happy that use -Wmissing-prototypes.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Instead of warning when mutex_is_locked(), just use the lockdep
framework. The code is smaller and checks could be disabled for
production environments (it is useful only during development).
Put asserts at beginning of function, even before validating arguments.
The behavior of update_devfreq() is now changed because lockdep assert
will only print a warning, not return with EINVAL.
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Fix inconsistent IS_ERR and PTR_ERR in imx_bus_init_icc().
The proper pointer to be passed as argument to PTR_ERR() is
priv->icc_pdev.
This bug was detected with the help of Coccinelle.
Fixes: 16c1d2f1b0bd ("PM / devfreq: imx: Register interconnect device")
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
[cw00.choi: Edit the patch title from 'imx' to 'imx-bus']
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
GCC produces this warning when kernel compiled using `make W=1`:
warning: ‘strncpy’ specified bound 16 equals destination size [-Wstringop-truncation]
772 | strncpy(devfreq->governor_name, governor_name, DEVFREQ_NAME_LEN);
The strncpy doesn't take care of NULL-termination of the destination
buffer, while the strscpy does.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
There is no single device which can represent the imx interconnect.
Instead of adding a virtual one just make the main &noc act as the
global interconnect provider.
The imx interconnect provider driver will scale the NOC and DDRC based
on bandwidth request. More scalable nodes can be added in the future,
for example for audio/display/vpu/gpu NICs.
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Tested-by: Martin Kepplinger <martin.kepplinger@puri.sm>
Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Add initial support for dynamic frequency switching on pieces of the imx
interconnect fabric.
All this driver does is set a clk rate based on an opp table, it does
not map register areas.
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Tested-by: Martin Kepplinger <martin.kepplinger@puri.sm>
Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
The function “platform_get_irq” can log an error already.
Thus omit a redundant message for the exception handling in the
calling function.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
We're taking into account both HW memory-accesses + CPU activity based on
current CPU's frequency. For memory-accesses there is a kind of hysteresis
in a form of "boosting" which is managed by the tegra30-devfreq driver.
If current HW memory activity is higher than activity judged based of the
CPU's frequency, then there is no need to schedule cpufreq_update_work
because the result of the work will be a NO-OP. And thus,
tegra_actmon_cpufreq_contribution() should return 0, meaning that at the
moment CPU frequency doesn't contribute anything to the final decision
about required memory clock rate.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
The recent commit: 90b5363acd ("sched: Clean up scheduler_ipi()")
got smp_call_function_single_async() subtly wrong. Even though it will
return -EBUSY when trying to re-use a csd, that condition is not
atomic and still requires external serialization.
The change in ttwu_queue_remote() got this wrong.
While on first reading ttwu_queue_remote() has an atomic test-and-set
that appears to serialize the use, the matching 'release' is not in
the right place to actually guarantee this serialization.
The actual race is vs the sched_ttwu_pending() call in the idle loop;
that can run the wakeup-list without consuming the CSD.
Instead of trying to chain the lists, merge them.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200526161908.129371594@infradead.org
Currently irq_work_queue_on() will issue an unconditional
arch_send_call_function_single_ipi() and has the handler do
irq_work_run().
This is unfortunate in that it makes the IPI handler look at a second
cacheline and it misses the opportunity to avoid the IPI. Instead note
that struct irq_work and struct __call_single_data are very similar in
layout, so use a few bits in the flags word to encode a type and stick
the irq_work on the call_single_queue list.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200526161908.011635912@infradead.org
The call_single_queue can contain (two) different callbacks,
synchronous and asynchronous. The current interrupt handler runs them
in-order, which means that remote CPUs that are waiting for their
synchronous call can be delayed by running asynchronous callbacks.
Rework the interrupt handler to first run the synchonous callbacks.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200526161907.836818381@infradead.org
The recent commit: 90b5363acd ("sched: Clean up scheduler_ipi()")
got smp_call_function_single_async() subtly wrong. Even though it will
return -EBUSY when trying to re-use a csd, that condition is not
atomic and still requires external serialization.
The change in kick_ilb() got this wrong.
While on first reading kick_ilb() has an atomic test-and-set that
appears to serialize the use, the matching 'release' is not in the
right place to actually guarantee this serialization.
Rework the nohz_idle_balance() trigger so that the release is in the
IPI callback and thus guarantees the required serialization for the
CSD.
Fixes: 90b5363acd ("sched: Clean up scheduler_ipi()")
Reported-by: Qian Cai <cai@lca.pw>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Cc: mgorman@techsingularity.net
Link: https://lore.kernel.org/r/20200526161907.778543557@infradead.org
We are going to rely on the loosening of RCU callback semantics,
introduced by this commit:
806f04e9fd: ("rcu: Allow for smp_call_function() running callbacks from idle")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Current RCU hard relies on smp_call_function() callbacks running from
interrupt context. A pending optimization is going to break that, it
will allow idle CPUs to run the callbacks from the idle loop. This
avoids raising the IPI on the requesting CPU and avoids handling an
exception on the receiving CPU.
Change rcu_is_cpu_rrupt_from_idle() to also accept task context,
provided it is the idle task.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Link: https://lore.kernel.org/r/20200527171236.GC706495@hirez.programming.kicks-ass.net
The zcomp driver uses per-CPU compression. The per-CPU data pointer is
acquired with get_cpu_ptr() which implicitly disables preemption.
It allocates memory inside the preempt disabled region which conflicts
with the PREEMPT_RT semantics.
Replace the implicit preemption control with an explicit local lock.
This allows RT kernels to substitute it with a real per CPU lock, which
serializes the access but keeps the code section preemptible. On non RT
kernels this maps to preempt_disable() as before, i.e. no functional
change.
[bigeasy: Use local_lock(), description, drop reordering]
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20200527201119.1692513-8-bigeasy@linutronix.de
zcomp::stream is a per-CPU pointer, pointing to struct zcomp_strm
which contains two pointers. Having struct zcomp_strm allocated
directly as per-CPU memory would avoid one additional memory
allocation and a pointer dereference. This also simplifies the
addition of a local_lock to struct zcomp_strm.
Allocate zcomp::stream directly as per-CPU memory.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20200527201119.1692513-7-bigeasy@linutronix.de
send_msg() disables preemption to avoid out-of-order messages. As the
code inside the preempt disabled section acquires regular spinlocks,
which are converted to 'sleeping' spinlocks on a PREEMPT_RT kernel and
eventually calls into a memory allocator, this conflicts with the RT
semantics.
Convert it to a local_lock which allows RT kernels to substitute them with
a real per CPU lock. On non RT kernels this maps to preempt_disable() as
before. No functional change.
[bigeasy: Patch description]
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20200527201119.1692513-6-bigeasy@linutronix.de
The squashfs multi CPU decompressor makes use of get_cpu_ptr() to
acquire a pointer to per-CPU data. get_cpu_ptr() implicitly disables
preemption which serializes the access to the per-CPU data.
But decompression can take quite some time depending on the size. The
observed preempt disabled times in real world scenarios went up to 8ms,
causing massive wakeup latencies. This happens on all CPUs as the
decompression is fully parallelized.
Replace the implicit preemption control with an explicit local lock.
This allows RT kernels to substitute it with a real per CPU lock, which
serializes the access but keeps the code section preemptible. On non RT
kernels this maps to preempt_disable() as before, i.e. no functional
change.
[ bigeasy: Use local_lock(), patch description]
Reported-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Julia Cartwright <julia@ni.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20200527201119.1692513-5-bigeasy@linutronix.de
The various struct pagevec per CPU variables are protected by disabling
either preemption or interrupts across the critical sections. Inside
these sections spinlocks have to be acquired.
These spinlocks are regular spinlock_t types which are converted to
"sleeping" spinlocks on PREEMPT_RT enabled kernels. Obviously sleeping
locks cannot be acquired in preemption or interrupt disabled sections.
local locks provide a trivial way to substitute preempt and interrupt
disable instances. On a non PREEMPT_RT enabled kernel local_lock() maps
to preempt_disable() and local_lock_irq() to local_irq_disable().
Create lru_rotate_pvecs containing the pagevec and the locallock.
Create lru_pvecs containing the remaining pagevecs and the locallock.
Add lru_add_drain_cpu_zone() which is used from compact_zone() to avoid
exporting the pvec structure.
Change the relevant call sites to acquire these locks instead of using
preempt_disable() / get_cpu() / get_cpu_var() and local_irq_disable() /
local_irq_save().
There is neither a functional change nor a change in the generated
binary code for non PREEMPT_RT enabled non-debug kernels.
When lockdep is enabled local locks have lockdep maps embedded. These
allow lockdep to validate the protections, i.e. inappropriate usage of a
preemption only protected sections would result in a lockdep warning
while the same problem would not be noticed with a plain
preempt_disable() based protection.
local locks also improve readability as they provide a named scope for
the protections while preempt/interrupt disable are opaque scopeless.
Finally local locks allow PREEMPT_RT to substitute them with real
locking primitives to ensure the correctness of operation in a fully
preemptible kernel.
[ bigeasy: Adopted to use local_lock ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20200527201119.1692513-4-bigeasy@linutronix.de
The radix-tree and idr preload mechanisms use preempt_disable() to protect
the complete operation between xxx_preload() and xxx_preload_end().
As the code inside the preempt disabled section acquires regular spinlocks,
which are converted to 'sleeping' spinlocks on a PREEMPT_RT kernel and
eventually calls into a memory allocator, this conflicts with the RT
semantics.
Convert it to a local_lock which allows RT kernels to substitute them with
a real per CPU lock. On non RT kernels this maps to preempt_disable() as
before, but provides also lockdep coverage of the critical region.
No functional change.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20200527201119.1692513-3-bigeasy@linutronix.de
preempt_disable() and local_irq_disable/save() are in principle per CPU big
kernel locks. This has several downsides:
- The protection scope is unknown
- Violation of protection rules is hard to detect by instrumentation
- For PREEMPT_RT such sections, unless in low level critical code, can
violate the preemptability constraints.
To address this PREEMPT_RT introduced the concept of local_locks which are
strictly per CPU.
The lock operations map to preempt_disable(), local_irq_disable/save() and
the enabling counterparts on non RT enabled kernels.
If lockdep is enabled local locks gain a lock map which tracks the usage
context. This will catch cases where an area is protected by
preempt_disable() but the access also happens from interrupt context. local
locks have identified quite a few such issues over the years, the most
recent example is:
b7d5dc2107 ("random: add a spinlock_t to struct batched_entropy")
Aside of the lockdep coverage this also improves code readability as it
precisely annotates the protection scope.
PREEMPT_RT substitutes these local locks with 'sleeping' spinlocks to
protect such sections while maintaining preemtability and CPU locality.
local locks can replace:
- preempt_enable()/disable() pairs
- local_irq_disable/enable() pairs
- local_irq_save/restore() pairs
They are also used to replace code which implicitly disables preemption
like:
- get_cpu()/put_cpu()
- get_cpu_var()/put_cpu_var()
with PREEMPT_RT friendly constructs.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20200527201119.1692513-2-bigeasy@linutronix.de
The Cypress cy15b104q and cy15v104q are 4Mbit serial SPI F-RAM devices.
Add support for them to the spi-nor driver.
The actual Device ID of this chip is 7f 7f 7f 7f 7f 7f c2 2c 04. That is
six times the continuation code 7f followed by c2 for Ramtron.
Unfortunately the chip sends the Device ID in reversed order, so the
continuation code is not at the beginning, but instead at the end. Even
more unfortunate is that when reading further the chip sends more 7f
codes which means we are not even able to count the continuation codes.
We can only hope that this reversed Device ID will never match any other
devices ID.
Collisions are improbable as of now, the solution from above is good
enough. In case of future collisions one can introduce an INFO9 macro,
with the downsize that struct flash_info would grow and we have lots of
flashes. A more elegant solution would be to introduce dedicated
flash ID tables for each bank in JESP106BA.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
[tudor.ambarus@microchip.com: amend commit description with possible
future solutions in case collisions occur.]
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
s25fs256s was identified as s25fl256s. Differentiate between them by
the Family ID using the INFO6 macro.
Fixes: b199489d37 ("mtd: spi-nor: add the framework for SPI NOR")
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Reviewed-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Add support for Cypress s25fs128s1 flash. Previously the flash is
decoded as s25fl129p1 by mistake.
Add it in the flash info list to correctly decode. The flash also
needs a fixup for s25fs-s family. Further capability of the flash will
be parsed from bfpt.
The flash has been tested under SPI/DUAL/QUAD mode on hisi-sfc-v3xx
controller, all the write/read/erase works well.
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
PCI_IOBASE is used to create VM maps for PCI I/O ports, it is
required by generic PCI drivers to make memory mapped I/O range
work.
To deal with legacy drivers that have fixed I/O ports range we
reserved 0x10000 in PCI_IOBASE, should be enough for i8259 i8042
stuff.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
If CONFIG_MIPS_MALTA is not set but CONFIG_LEGACY_BOARD_SEAD3 is set,
the subdir arch/mips/boot/dts/mti will not be built, so the sead3.dts
which depends on CONFIG_LEGACY_BOARD_SEAD3 in this subdir is also not
built, and then there exists the following build error, fix it.
LD .tmp_vmlinux.kallsyms1
arch/mips/generic/board-sead3.o:(.mips.machines.init+0x4): undefined reference to `__dtb_sead3_begin'
Makefile:1106: recipe for target 'vmlinux' failed
make: *** [vmlinux] Error 1
Additionally, add CONFIG_FIT_IMAGE_FDT_BOSTON check for subdir img to
fix the following build error when CONFIG_MACH_PISTACHIO is not set but
CONFIG_FIT_IMAGE_FDT_BOSTON is set.
FATAL ERROR: Couldn't open "boot/dts/img/boston.dtb": No such file or directory
Reported-by: kbuild test robot <lkp@intel.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Fixes: 41528ba6af ("MIPS: DTS: Only build subdir of current platform")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
In order to be compatible with devices of different versions, V1 in the
accelerator driver is now isolated, and other versions are the previous
V2 processing flow.
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Shukun Tan <tanshukun1@huawei.com>
Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now, in crypto-engine, if hardware queue is full (-ENOSPC),
requeue request regardless of MAY_BACKLOG flag.
If hardware throws any other error code (like -EIO, -EINVAL,
-ENOMEM, etc.) only MAY_BACKLOG requests are enqueued back into
crypto-engine's queue, since the others can be dropped.
The latter case can be fatal error, so those cannot be recovered from.
For example, in CAAM driver, -EIO is returned in case the job descriptor
is broken, so there is no possibility to fix the job descriptor.
Therefore, these errors might be fatal error, so we shouldn’t
requeue the request. This will just be pass back and forth between
crypto-engine and hardware.
Fixes: 6a89f492f8 ("crypto: engine - support for parallel requests based on retry mechanism")
Signed-off-by: Iuliana Prodan <iuliana.prodan@nxp.com>
Reported-by: Horia Geantă <horia.geanta@nxp.com>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add device tree compatible strings and create proper modalias structures
to let this driver load automatically if compiled as module, because
max14577 MFD driver creates MFD cells with such compatible strings.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
This merges the MP2629 battery charge management immutable branch
between MFD, IIO and power-supply due for the v5.8 merge window
into power-supply for-next branch.
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Spansion S25FS-S family has an issue in the Basic Flash Parameter Table
(BFPT): Dword-11 bits 7:4 specify a page size of 512 bytes. Actually
this is configurable in the vendor unique register (CR3V) and even the
factory default setting is to "wrap at 256 bytes", so blindly relying
on BFPT breaks the page writes on these chips. Add the post-BFPT fixup
which restores the default page size of 256 bytes -- to properly read
CR3V this early is quite intrusive and should better be done as a new
feature; Alexander Sverdlin had the patch doing that:
https://patchwork.ozlabs.org/project/linux-mtd/patch/20200227123657.26030-1-alexander.sverdlin@nokia.com/
Fixes: dfd2b74530 ("mtd: spi-nor: add Spansion S25FS512S ID")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Tested-by: Kuldeep Singh <kuldeep.singh@nxp.com>
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
This patch enables AMD Fam17h RAPL support for the Package level metric.
The support is as per AMD Fam17h Model31h (Zen2) and model 00-ffh (Zen1) PPR.
The same output is available via the energy-pkg pseudo event:
$ perf stat -a -I 1000 --per-socket -e power/energy-pkg/
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200527224659.206129-6-eranian@google.com
This patch modifies perf_probe_msr() by allowing passing of
struct perf_msr array where some entries are not populated, i.e.,
they have either an msr address of 0 or no attribute_group pointer.
This helps with certain call paths, e.g., RAPL.
In case the grp is NULL, the default sysfs visibility rule
applies which is to make the group visible. Without the patch,
you would get a kernel crash with a NULL group.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200527224659.206129-5-eranian@google.com