To get a responsive pause operation, pause_cpus() and resume_cpus()
are usually called from a task running at RT priority.
However, only the lazy-pause and lazy-resume portions need to
happen quickly; the rest can happen at leisure. Running that
remaining portion at high priority keeps the CPU away from
important tasks.
Reduce the priority right after the lazy portion, and restore
it before returning, if the caller was at RT priority.
Bug: 203115740
Fixes: 683010f555 ("ANDROID: cpu/hotplug: add pause/resume_cpus interface")
Change-Id: I1f3394eb9b5fa1876330fef6e25a203da0fde670
Signed-off-by: Stephen Dickey <dickey@codeaurora.org>
cpu_down() checks for num_active_cpus() to ensure that at least one
cpu is left active. If there are two online CPUs but one of them
is paused, this check fails, reporting that only one active
CPU is available. This prevents the online but inactive CPU
from being offlined.
Correct cpu_down() to ensure that if there is only one active CPU
and that is the CPU being requested, the offline is blocked, allowing
the second to last CPU that is inactive but online to be offlined.
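The difference between the two checks can be modelled in user space.
This is a minimal sketch under stated assumptions, not the kernel code:
the bitmask representation and the names cpu_state, old_check_blocks()
and new_check_blocks() are hypothetical, and __builtin_popcountl()
assumes GCC or Clang.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 8 /* hypothetical small system */

/* Hypothetical model: bit n set means CPU n is in that mask. */
struct cpu_state {
	unsigned long online;
	unsigned long active; /* subset of online; paused CPUs are cleared */
};

static unsigned int num_active(const struct cpu_state *s)
{
	return (unsigned int)__builtin_popcountl(s->active);
}

/* Old behaviour: refuse whenever only one active CPU remains. */
static bool old_check_blocks(const struct cpu_state *s, unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	return num_active(s) == 1;
}

/* Corrected behaviour: refuse only if the target is that last active CPU. */
static bool new_check_blocks(const struct cpu_state *s, unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	return num_active(s) == 1 && (s->active & (1UL << cpu)) != 0;
}
```

With CPUs 0 and 1 online and CPU 1 paused, the old check refuses to
offline CPU 1 while the new one allows it. Both still refuse CPU 0.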
Bug: 182362445
Change-Id: I5b26cb6c5fdba4f2e69e5201e25bfe987d30c405
Signed-off-by: Stephen Dickey <dickey@codeaurora.org>
It is possible that all the 32 bit CPUs are paused in
the system, which is not ideal for quickly launching
32 bit apps.
Detect if a pause operation is about to pause the
last 32 bit CPU, and prevent it from happening.
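The detection can be pictured as a mask test. The sketch below is a
user-space model only: the function name, the plain unsigned long
masks and the example values are assumptions, not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model: returns true if pausing the CPUs in `to_pause`
 * would leave no active 32-bit capable CPU. `active` and `cap32` are
 * bitmasks (bit n = CPU n).
 */
static bool pause_removes_last_32bit_cpu(unsigned long active,
					 unsigned long cap32,
					 unsigned long to_pause)
{
	unsigned long active32 = active & cap32;

	/* Nothing to protect if no 32-bit CPU is active to begin with. */
	if (!active32)
		return false;

	return (active32 & ~to_pause) == 0;
}

/* Example: CPUs 0-3 active, only CPUs 2 and 3 can run 32-bit tasks. */
static void example(void)
{
	/* Pausing CPU 2 leaves CPU 3, so the request is acceptable. */
	assert(!pause_removes_last_32bit_cpu(0xF, 0xC, 0x4));
	/* Pausing CPUs 2 and 3 removes every 32-bit CPU. */
	assert(pause_removes_last_32bit_cpu(0xF, 0xC, 0xC));
}
```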
Bug: 175896474
Change-Id: I21b4dad7ba9f3ef9be460137098e6fb2c0e336e6
Signed-off-by: Stephen Dickey <dickey@codeaurora.org>
Include a vendor hook for cpu_up and cpu_down to force the
rebuilding of scheduling domains prior to issuing a new
cpu up/down. Include a Kernel Export for
cpuset_wait_for_hotplug such that vendor hooks may refer
to this functionality, to ensure scheduling domains are
complete.
Bug: 176152285
Change-Id: I778dbc5e4f9d613f39b8c61f244c0f33020a3dd3
Signed-off-by: Stephen Dickey <dickey@codeaurora.org>
Add a tracepoint for pause and resume which measures the
duration of time to perform the entire operation, the
cpus acted upon with this event, and the current state
of the active cpu mask. This should be sufficient
for testing pause performance.
Bug: 175959069
Change-Id: I9fc269c7d09ac78ec31612d3c552044b72b0e6e3
Signed-off-by: Stephen Dickey <dickey@codeaurora.org>
Incorporate a vendor hook in the resume cpus path
so that vendor specific activities may take place.
Bug: 161210528
Change-Id: I74d03247491b004e891dbcfe06a478d00a95ba9f
Signed-off-by: Stephen Dickey <dickey@codeaurora.org>
In the resume_cpus() path, cpus cannot be taken
advantage of until the cpus write lock is acquired,
and cpus are activated and domains rebuilt. This
can incur significant delay in the unpause operation.
Additionally, if scheduled through the kworker thread,
the wait time for rebuilding sched domains becomes
large due to a busy system that can prevent the kworker
from executing.
Activate the cpus and call the cpuset_hotplug_workfn
directly within resume_cpus prior to getting the cpus
write lock, thereby eliminating delays associated
with scheduling this activity.
Bug: 161210528
Change-Id: Ie2521f28ed9078b22d421d792f08413016d4dd62
Signed-off-by: Stephen Dickey <dickey@codeaurora.org>
Signed-off-by: Todd Kjos <tkjos@google.com>
pause_cpus intends to force CPUs to go idle as quickly as possible.
Add a migration step to drain the rq of any running task.
Two steps are actually needed. The first one, "lazy", runs before the
cpu_active_mask has been synced. The second one runs after. It is
possible for another CPU to observe an outdated version of that mask and
to enqueue a task on a rq that has just been marked inactive. The second
migration is there to catch any of those spurious moves, while the first
one drains the rq as quickly as possible to let the CPU reach an idle
state.
Bug: 161210528
Change-Id: Ie26c2e4c42665dd61d41a899a84536e56bf2b887
Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
pause_cpus provides a way to force a CPU to go idle and to resume
as quickly as possible, with as little disruption as possible to the
system. This is a way of saving energy or meeting thermal constraints, for
which a full CPU hotunplug is too slow. A paused CPU is simply deactivated
from the scheduler's point of view. This corresponds to the first hotunplug
step.
Each pause operation still needs some heavy synchronization. Allowing
several CPUs to be paused in one go mitigates that issue.
Paused CPUs can be resumed with resume_cpus(), which also takes a cpumask
as an input.
A few limitations:
* It isn't possible to pause a CPU which is running a SCHED_DEADLINE task.
* A paused CPU will be removed from any cpuset it is part of. Resuming
the CPU won't put back this CPU in the cpuset if using cgroup1.
Cgroup2 doesn't have this limitation.
* per-CPU kthreads are still allowed to run on a paused CPU.
Bug: 161210528
Change-Id: I1f5cb28190f8ec979bb8640a89b022f2f7266bcf
Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
In the event of a partial _cpu_down, (i.e. _cpu_down(target) where
target > CPUHP_AP_OFFLINE), the cpu_online_mask won't be aligned with
cpu_active_mask. This is an issue when trying to offline the last CPU
from cpu_active_mask, while num_online_cpus() > 1.
Protect against this case by checking num_active_cpus() instead of
num_online_cpus().
Bug: 161210528
Change-Id: Ibe7d9ef69e5f91e99be0d98076614a7654bda094
Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
Commit c6e5f9d7cf ("ANDROID: cpu-hotplug: Always use real time
scheduling when hotplugging a CPU") tried to speed up hotplug of
SCHED_NORMAL tasks by temporarily elevating them to SCHED_FIFO. But
while at it, it also prevented hotplug from SCHED_IDLE, SCHED_BATCH or
SCHED_DEADLINE for no apparent reason.
Since this is a userspace-visible change, and is unlikely to actually be
needed, change the patch logic to only optimize for SCHED_NORMAL tasks
and leave the others untouched.
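The change in policy handling can be summarised with a small user-space
model. It is only a sketch: the enum and function names are hypothetical
stand-ins, and the "old" behaviour is reconstructed from the description
above rather than from the code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's scheduling policies. */
enum policy { POL_NORMAL, POL_FIFO, POL_RR, POL_BATCH, POL_IDLE, POL_DEADLINE };

/* Behaviour introduced by c6e5f9d7cf, as described above. */
static bool old_hotplug_allowed(enum policy p)
{
	return p != POL_BATCH && p != POL_IDLE && p != POL_DEADLINE;
}

/* Behaviour after this change: no caller is refused because of its policy. */
static bool new_hotplug_allowed(enum policy p)
{
	(void)p;
	return true;
}

/* Unchanged: only SCHED_NORMAL callers are temporarily raised to SCHED_FIFO. */
static bool boost_to_fifo(enum policy p)
{
	return p == POL_NORMAL;
}

static void example(void)
{
	assert(!old_hotplug_allowed(POL_BATCH));
	assert(new_hotplug_allowed(POL_BATCH));
	assert(!boost_to_fifo(POL_BATCH)); /* allowed, but left untouched */
	assert(boost_to_fifo(POL_NORMAL));
}
```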
Bug: 169238689
Fixes: c6e5f9d7cf ("ANDROID: cpu-hotplug: Always use real time
scheduling when hotplugging a CPU")
Signed-off-by: Quentin Perret <qperret@google.com>
Change-Id: I4d9e88b15fee56e7d234826e2eaea306a69328bb
CPU hotplug operations take place in preemptible context. This leaves
the hotplugging thread at the mercy of overall system load and CPU
availability. If the hotplugging thread does not get an opportunity
to execute after it has already begun a hotplug operation, CPUs can
end up being stuck in a quasi online state. In the worst case a CPU
can be stuck in a state where the migration thread is parked while
another task is executing and changing affinity in a loop. This
combination can result in unbounded execution time for the running
task until the hotplugging thread gets the chance to run to complete
the hotplug operation.
Fix this problem by ensuring that hotplug can only occur from
threads belonging to the RT sched class. This gives the hotplugging
thread priority on the CPU no matter what the system load or the
number of available CPUs is. If a SCHED_NORMAL task attempts to
hotplug a CPU, we temporarily elevate its scheduling policy to RT.
Furthermore, we disallow hotplugging operations to begin if the
calling task belongs to the idle and deadline classes or those that
use the SCHED_BATCH policy.
Bug: 169238689
Change-Id: Idbb1384626e6ddff46c0d2ce752eee68396c78af
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
[psodagud@codeaurora.org: Fixed compilation issues]
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
Pull scheduler updates from Ingo Molnar:
"The changes in this cycle are:
- Optimize the task wakeup CPU selection logic, to improve
scalability and reduce wakeup latency spikes
- PELT enhancements
- CFS bandwidth handling fixes
- Optimize the wakeup path by removing rq->wake_list and replacing it
with ->ttwu_pending
- Optimize IPI cross-calls by making flush_smp_call_function_queue()
process sync callbacks first.
- Misc fixes and enhancements"
* tag 'sched-core-2020-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
irq_work: Define irq_work_single() on !CONFIG_IRQ_WORK too
sched/headers: Split out open-coded prototypes into kernel/sched/smp.h
sched: Replace rq::wake_list
sched: Add rq::ttwu_pending
irq_work, smp: Allow irq_work on call_single_queue
smp: Optimize send_call_function_single_ipi()
smp: Move irq_work_run() out of flush_smp_call_function_queue()
smp: Optimize flush_smp_call_function_queue()
sched: Fix smp_call_function_single_async() usage for ILB
sched/core: Offload wakee task activation if it the wakee is descheduling
sched/core: Optimize ttwu() spinning on p->on_cpu
sched: Defend cfs and rt bandwidth quota against overflow
sched/cpuacct: Fix charge cpuacct.usage_sys
sched/fair: Replace zero-length array with flexible-array
sched/pelt: Sync util/runnable_sum with PELT window when propagating
sched/cpuacct: Use __this_cpu_add() instead of this_cpu_ptr()
sched/fair: Optimize enqueue_task_fair()
sched: Make scheduler_ipi inline
sched: Clean up scheduler_ipi()
sched/core: Simplify sched_init()
...
The single user could have called freeze_secondary_cpus() directly.
Since this function was a source of confusion, remove it as it's
just a pointless wrapper.
While at it, rename enable_nonboot_cpus() to thaw_secondary_cpus() to
preserve the naming symmetry.
Done automatically via:
git grep -l enable_nonboot_cpus | xargs sed -i 's/enable_nonboot_cpus/thaw_secondary_cpus/g'
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Link: https://lkml.kernel.org/r/20200430114004.17477-1-qais.yousef@arm.com
The CPU-offline process calls mmdrop() after idle entry and the
subsequent call to cpuhp_report_idle_dead(). Once execution passes the
call to rcu_report_dead(), RCU is ignoring the CPU, which results in
lockdep complaining when mmdrop() uses RCU from either memcg or
debugobjects below.
Fix it by cleaning up the active_mm state from BP instead. Every arch
which has CONFIG_HOTPLUG_CPU should have already called idle_task_exit()
from AP. The only exception is parisc because it switches them to
&init_mm unconditionally (see smp_boot_one_cpu() and smp_cpu_init()),
but the patch will still work there because it calls mmgrab(&init_mm) in
smp_cpu_init() and then should call mmdrop(&init_mm) in finish_cpu().
WARNING: suspicious RCU usage
-----------------------------
kernel/workqueue.c:710 RCU or wq_pool_mutex should be held!
other info that might help us debug this:
RCU used illegally from offline CPU!
Call Trace:
dump_stack+0xf4/0x164 (unreliable)
lockdep_rcu_suspicious+0x140/0x164
get_work_pool+0x110/0x150
__queue_work+0x1bc/0xca0
queue_work_on+0x114/0x120
css_release+0x9c/0xc0
percpu_ref_put_many+0x204/0x230
free_pcp_prepare+0x264/0x570
free_unref_page+0x38/0xf0
__mmdrop+0x21c/0x2c0
idle_task_exit+0x170/0x1b0
pnv_smp_cpu_kill_self+0x38/0x2e0
cpu_die+0x48/0x64
arch_cpu_idle_dead+0x30/0x50
do_idle+0x2f4/0x470
cpu_startup_entry+0x38/0x40
start_secondary+0x7a8/0xa80
start_secondary_resume+0x10/0x14
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Link: https://lkml.kernel.org/r/20200401214033.8448-1-cai@lca.pw
In a quest to make the huge -rc1 merge easier to handle and bisect,
merge the first chunk of 5.7-rc1 patches into android-mainline.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ib54436e9515660a4c0c25c49c21bfb399eb57921
Pull core SMP updates from Thomas Gleixner:
"CPU (hotplug) updates:
- Support for locked CSD objects in smp_call_function_single_async()
which allows to simplify callsites in the scheduler core and MIPS
- Treewide consolidation of CPU hotplug functions which ensures the
consistency between the sysfs interface and kernel state. The low
level functions cpu_up/down() are now confined to the core code and
no longer accessible from random code"
* tag 'smp-core-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
cpu/hotplug: Ignore pm_wakeup_pending() for disable_nonboot_cpus()
cpu/hotplug: Hide cpu_up/down()
cpu/hotplug: Move bringup of secondary CPUs out of smp_init()
torture: Replace cpu_up/down() with add/remove_cpu()
firmware: psci: Replace cpu_up/down() with add/remove_cpu()
xen/cpuhotplug: Replace cpu_up/down() with device_online/offline()
parisc: Replace cpu_up/down() with add/remove_cpu()
sparc: Replace cpu_up/down() with add/remove_cpu()
powerpc: Replace cpu_up/down() with add/remove_cpu()
x86/smp: Replace cpu_up/down() with add/remove_cpu()
arm64: hibernate: Use bringup_hibernate_cpu()
cpu/hotplug: Provide bringup_hibernate_cpu()
arm64: Use reboot_cpu instead of hardconding it to 0
arm64: Don't use disable_nonboot_cpus()
ARM: Use reboot_cpu instead of hardcoding it to 0
ARM: Don't use disable_nonboot_cpus()
ia64: Replace cpu_down() with smp_shutdown_nonboot_cpus()
cpu/hotplug: Create a new function to shutdown nonboot cpus
cpu/hotplug: Add new {add,remove}_cpu() functions
sched/core: Remove rq.hrtick_csd_pending
...
A recent change to freeze_secondary_cpus() which added an early abort if a
wakeup is pending missed the fact that the function is also invoked for
shutdown, reboot and kexec via disable_nonboot_cpus().
In case of disable_nonboot_cpus() the wakeup event needs to be ignored as
the purpose is to terminate the currently running kernel.
Add a 'suspend' argument which is only set when the freeze is in context of
a suspend operation. If not set then an eventually pending wakeup event is
ignored.
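The effect of the new argument can be modelled in user space as below.
This is a sketch under stated assumptions: the wakeup_pending flag
stands in for the kernel's pm_wakeup_pending(), and the array-based CPU
state and the _model suffix are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define NR_CPUS 4 /* hypothetical */

static bool wakeup_pending; /* stand-in for pm_wakeup_pending() */
static bool cpu_online[NR_CPUS] = { true, true, true, true };

/*
 * Offline every CPU except `primary`. A pending wakeup aborts the loop
 * only when `suspend` is set; shutdown, reboot and kexec pass false and
 * therefore always run to completion.
 */
static int freeze_secondary_cpus_model(int primary, bool suspend)
{
	assert(primary >= 0 && primary < NR_CPUS);

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpu == primary || !cpu_online[cpu])
			continue;
		if (suspend && wakeup_pending)
			return -EBUSY;
		cpu_online[cpu] = false;
	}
	return 0;
}
```

With a wakeup pending, a call with suspend set returns -EBUSY and leaves
the secondary CPUs online, while a call with suspend clear offlines them
all.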
Fixes: a66d955e91 ("cpu/hotplug: Abort disabling secondary CPUs if wakeup is pending")
Reported-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Pavankumar Kondeti <pkondeti@codeaurora.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/874kuaxdiz.fsf@nanos.tec.linutronix.de
Use separate functions for the device core to bring a CPU up and down.
Users outside the device core must use add/remove_cpu() which will take
care of extra housekeeping work like keeping sysfs in sync.
Make cpu_up/down() static and replace the extra layer of indirection.
[ tglx: Removed the extra wrapper functions and adjusted function names ]
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200323135110.30522-18-qais.yousef@arm.com
This function will be used later in machine_shutdown() for some
architectures.
disable_nonboot_cpus() is not safe to use when doing machine_down(),
because it relies on freeze_secondary_cpus() which in turn is a
suspend/resume related freeze and could abort if the logic detects any
pending activities that can prevent finishing the offlining process.
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200323135110.30522-3-qais.yousef@arm.com
Baby steps in the 5.6-rc1 merge cycle to make things easier to review
and debug.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I005e68433be6b1d66bd56d7e1c8f44ab8e78bebe
Pull scheduler updates from Ingo Molnar:
"These were the main changes in this cycle:
- More -rt motivated separation of CONFIG_PREEMPT and
CONFIG_PREEMPTION.
- Add more low level scheduling topology sanity checks and warnings
to filter out nonsensical topologies that break scheduling.
- Extend uclamp constraints to influence wakeup CPU placement
- Make the RT scheduler more aware of asymmetric topologies and CPU
capacities, via uclamp metrics, if CONFIG_UCLAMP_TASK=y
- Make idle CPU selection more consistent
- Various fixes, smaller cleanups, updates and enhancements - please
see the git log for details"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits)
sched/fair: Define sched_idle_cpu() only for SMP configurations
sched/topology: Assert non-NUMA topology masks don't (partially) overlap
idle: fix spelling mistake "iterrupts" -> "interrupts"
sched/fair: Remove redundant call to cpufreq_update_util()
sched/psi: create /proc/pressure and /proc/pressure/{io|memory|cpu} only when psi enabled
sched/fair: Fix sgc->{min,max}_capacity calculation for SD_OVERLAP
sched/fair: calculate delta runnable load only when it's needed
sched/cputime: move rq parameter in irqtime_account_process_tick
stop_machine: Make stop_cpus() static
sched/debug: Reset watchdog on all CPUs while processing sysrq-t
sched/core: Fix size of rq::uclamp initialization
sched/uclamp: Fix a bug in propagating uclamp value in new cgroups
sched/fair: Load balance aggressively for SCHED_IDLE CPUs
sched/fair : Improve update_sd_pick_busiest for spare capacity case
watchdog: Remove soft_lockup_hrtimer_cnt and related code
sched/rt: Make RT capacity-aware
sched/fair: Make EAS wakeup placement consider uclamp restrictions
sched/fair: Make task_fits_capacity() consider uclamp restrictions
sched/uclamp: Rename uclamp_util_with() into uclamp_rq_util_with()
sched/uclamp: Make uclamp util helpers use and return UL values
...
When CONFIG_SYSFS is disabled, but CONFIG_HOTPLUG_SMT is enabled,
the kernel fails to link:
arch/x86/power/cpu.o: In function `hibernate_resume_nonboot_cpu_disable':
(.text+0x38d): undefined reference to `cpuhp_smt_enable'
arch/x86/power/hibernate.o: In function `arch_resume_nosmt':
hibernate.c:(.text+0x291): undefined reference to `cpuhp_smt_enable'
hibernate.c:(.text+0x29c): undefined reference to `cpuhp_smt_disable'
Move the exported functions out of the #ifdef section into its
own with the correct conditions.
The patch that caused this is marked for stable backports, so
this one may need to be backported as well.
Fixes: ec527c3180 ("x86/power: Fix 'nosmt' vs hibernation triple fault during resume")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jiri Kosina <jkosina@suse.cz>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20191210195614.786555-1-arnd@arndb.de
Paul reported a very sporadic, rcutorture induced, workqueue failure.
When the planets align, the workqueue rescuer's self-migrate fails and
then triggers a WARN for running a work on the wrong CPU.
Tejun then figured that set_cpus_allowed_ptr()'s stop_one_cpu() call
could be ignored! When stopper->enabled is false, stop_machine will
insta complete the work, without actually doing the work. Worse, it
will not WARN about this (we really should fix this).
It turns out there is a small window where a freshly online'ed CPU is
marked 'online' but doesn't yet have the stopper task running:
  BP                              AP

  bringup_cpu()
    __cpu_up(cpu, idle)    -->    start_secondary()
                                  ...
                                  cpu_startup_entry()
    bringup_wait_for_ap()
      wait_for_ap_thread() <--      cpuhp_online_idle()
                                    while (1)
                                      do_idle()

                                  ... available to run kthreads ...

      stop_machine_unpark()
        stopper->enabled = true;
Close this by moving the stop_machine_unpark() into
cpuhp_online_idle(), such that the stopper thread is ready before we
start the idle loop and schedule.
Reported-by: "Paul E. McKenney" <paulmck@kernel.org>
Debugged-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: "Paul E. McKenney" <paulmck@kernel.org>
Pull locking updates from Ingo Molnar:
"The main changes in this cycle were:
- A comprehensive rewrite of the robust/PI futex code's exit handling
to fix various exit races. (Thomas Gleixner et al)
- Rework the generic REFCOUNT_FULL implementation using
atomic_fetch_* operations so that the performance impact of the
cmpxchg() loops is mitigated for common refcount operations.
With these performance improvements the generic implementation of
refcount_t should be good enough for everybody - and this got
confirmed by performance testing, so remove ARCH_HAS_REFCOUNT and
REFCOUNT_FULL entirely, leaving the generic implementation enabled
unconditionally. (Will Deacon)
- Other misc changes, fixes, cleanups"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
lkdtm: Remove references to CONFIG_REFCOUNT_FULL
locking/refcount: Remove unused 'refcount_error_report()' function
locking/refcount: Consolidate implementations of refcount_t
locking/refcount: Consolidate REFCOUNT_{MAX,SATURATED} definitions
locking/refcount: Move saturation warnings out of line
locking/refcount: Improve performance of generic REFCOUNT_FULL code
locking/refcount: Move the bulk of the REFCOUNT_FULL implementation into the <linux/refcount.h> header
locking/refcount: Remove unused refcount_*_checked() variants
locking/refcount: Ensure integer operands are treated as signed
locking/refcount: Define constants for saturation and max refcount values
futex: Prevent exit livelock
futex: Provide distinct return value when owner is exiting
futex: Add mutex around futex exit
futex: Provide state handling for exec() as well
futex: Sanitize exit state handling
futex: Mark the begin of futex exit explicitly
futex: Set task::futex_state to DEAD right after handling futex exit
futex: Split futex_mm_release() for exit/exec
exit/exec: Seperate mm_release()
futex: Replace PF_EXITPIDONE with a state
...
This is an intermediate (mid-week) merge of Linus's tree into
android-mainline to take all of the "big" security fixes that went into
there into the android-mainline tree to get testing happening sooner.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ie4d7914776ac1f917de0436061e46295ad919ead
A kernel module may need to check the value of the "mitigations=" kernel
command line parameter as part of its setup when the module needs
to perform software mitigations for a CPU flaw.
Uninline and export the helper functions surrounding the cpu_mitigations
enum to allow for their usage from a module.
Lastly, privatize the enum and cpu_mitigations variable since the value of
cpu_mitigations can be checked with the exported helper functions.
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
To make the 5.4-rc1 merge easier, merge at a prerelease point in time
before the final release happens.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I052c6a28528e10cdda89b6a20d320ac7562266b8
KVM needs to know if SMT is theoretically possible, this means it is
supported and not forcefully disabled ('nosmt=force'). Create and
export cpu_smt_possible() answering this question.
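The question being answered reduces to a two-state exclusion. The
sketch below is a user-space model: the enum is a simplified,
hypothetical stand-in for the kernel's SMT control state, and
smt_possible() is not the exported function itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified, hypothetical stand-in for the kernel's SMT control state. */
enum smt_control {
	SMT_ENABLED,
	SMT_DISABLED,       /* can be re-enabled at runtime */
	SMT_FORCE_DISABLED, /* 'nosmt=force' */
	SMT_NOT_SUPPORTED,
};

/* SMT is possible if it is supported and not forcefully disabled. */
static bool smt_possible(enum smt_control ctl)
{
	return ctl != SMT_FORCE_DISABLED && ctl != SMT_NOT_SUPPORTED;
}

static void example(void)
{
	assert(smt_possible(SMT_ENABLED));
	assert(smt_possible(SMT_DISABLED));
	assert(!smt_possible(SMT_FORCE_DISABLED));
	assert(!smt_possible(SMT_NOT_SUPPORTED));
}
```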
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This merges Linus's tree as of commit b41dae061b ("Merge tag
'xfs-5.4-merge-7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux")
into android-mainline.
This "early" merge makes it easier to test and handle merge conflicts
instead of having to wait until the "end" of the merge window and handle
all 10000+ commits at once.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I6bebf55e5e2353f814e3c87f5033607b1ae5d812
Re-evaluating the bitmap weight of the online cpus bitmap in every
invocation of num_online_cpus() over and over is a pretty useless
exercise. Especially when num_online_cpus() is used in code paths
like the IPI delivery of x86 or the membarrier code.
Cache the number of online CPUs in the core and just return the cached
variable. The accessor function provides only a snapshot when used without
protection against concurrent CPU hotplug.
The storage needs to use an atomic_t because the kexec and reboot code
(ab)use set_cpu_online() in their 'shutdown' handlers without any form of
serialization as pointed out by Mathieu. Regular CPU hotplug usage is
properly serialized.
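The caching scheme can be sketched in user space with C11 atomics. This
is a model, not the kernel implementation: the single 64-bit mask, the
variable names and the _model suffixes are assumptions made for the
example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 64 /* hypothetical; fits one unsigned long long */

static _Atomic unsigned long long online_mask;
static atomic_int nr_online; /* cached population count of online_mask */

/* Update the mask and adjust the cached count only on a real transition. */
static void set_cpu_online_model(unsigned int cpu, bool online)
{
	unsigned long long bit, old;

	assert(cpu < NR_CPUS);
	bit = 1ULL << cpu;

	if (online) {
		old = atomic_fetch_or(&online_mask, bit);
		if (!(old & bit))
			atomic_fetch_add(&nr_online, 1);
	} else {
		old = atomic_fetch_and(&online_mask, ~bit);
		if (old & bit)
			atomic_fetch_sub(&nr_online, 1);
	}
}

/* O(1) snapshot; may be stale unless the caller excludes hotplug. */
static int num_online_cpus_model(void)
{
	return atomic_load(&nr_online);
}
```

Because the counter only changes when a bit actually flips, repeated
calls for the same CPU and state leave the cached value untouched.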
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1907091622590.1634@nanos.tec.linutronix.de
The booted once information which is required to deal with the MCE
broadcast issue on X86 correctly is stored in the per cpu hotplug state,
which is perfectly fine for the intended purpose.
X86 needs that information for supporting NMI broadcasting via shortcuts,
but retrieving it from per cpu data is cumbersome.
Move it to a cpumask so the information can be checked against the
cpu_present_mask quickly.
No functional change intended.
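One way such a mask comparison might look is a subset test. The sketch
below is a user-space model with hypothetical names and plain unsigned
long masks. It does not reproduce the x86 code that consumes the mask.

```c
#include <assert.h>
#include <stdbool.h>

static unsigned long booted_once_mask; /* bit n set once CPU n has booted */

/* Record that a CPU has come up at least once; the bit is never cleared. */
static void mark_cpu_booted_once(unsigned int cpu)
{
	assert(cpu < sizeof(booted_once_mask) * 8);
	booted_once_mask |= 1UL << cpu;
}

/* True if every present CPU has been booted at least once. */
static bool all_present_cpus_booted_once(unsigned long present)
{
	return (present & ~booted_once_mask) == 0;
}
```

With the state in a mask, the question "has every present CPU booted
once?" becomes a single comparison instead of a walk over per-CPU data.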
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190722105219.818822855@linutronix.de
Pull SMP/hotplug updates from Thomas Gleixner:
"A small set of updates for SMP and CPU hotplug:
- Abort disabling secondary CPUs in the freezer when a wakeup is
pending instead of evaluating it only after all CPUs have been
offlined.
- Remove the shared annotation for the strict per CPU cfd_data in the
smp function call core code.
- Remove the return values of smp_call_function() and on_each_cpu()
as they are unconditionally 0. Fixup the few callers which actually
bothered to check the return value"
* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
smp: Remove smp_call_function() and on_each_cpu() return values
smp: Do not mark call_function_data as shared
cpu/hotplug: Abort disabling secondary CPUs if wakeup is pending
cpu/hotplug: Fix notify_cpu_starting() reference in bringup_wait_for_ap()