Commit Graph

218 Commits

Author SHA1 Message Date
Greg Kroah-Hartman
075d0ec13d Merge c0842fbc1b ("random32: move the pseudo-random 32-bit definitions to prandom.h") into android-mainline
Baby steps on the way to 5.9-rc1.

Resolves merge issues with:
	arch/arm64/boot/dts/qcom/sdm845-db845c.dts
	drivers/soc/qcom/Kconfig
	kernel/sched/cpufreq_schedutil.c

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ieb3344a22f9cf0d55ec5fc0daebe7602a248ab53
2020-08-07 14:27:37 +02:00
Quentin Perret
10dd8573b0 cpufreq: Register governors at core_initcall
Currently, most CPUFreq governors are registered at the core_initcall
time when the given governor is the default one, and the module_init
time otherwise.

In preparation for letting users specify the default governor on the
kernel command line, change all of them to be registered at the
core_initcall unconditionally, as it is already the case for the
schedutil and performance governors. This will allow us to assume
that builtin governors have been registered before the built-in
CPUFreq drivers probe.

And since all governors have similar init/exit patterns now, introduce
two new macros, cpufreq_governor_{init,exit}(), to factorize the code.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-07-02 13:03:30 +02:00
Greg Kroah-Hartman
a33191d441 Merge 5.8-rc1 into android-mainline
Linux 5.8-rc1

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I00f2168bc9b6fd8e48c7c0776088d2c6cb8e1629
2020-06-25 14:25:32 +02:00
Greg Kroah-Hartman
48ef1614d0 Merge faa392181a ("Merge tag 'drm-next-2020-06-02' of git://anongit.freedesktop.org/drm/drm") into android-mainline
Tiny merge resolutions along the way to 5.8-rc1.

Change-Id: I0036e22af4e398b0bb3b9855ff90d60b9c0227ee
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2020-06-12 15:13:27 +02:00
Quentin Perret
bdfa61448f ANDROID: sched: Allow EAS without schedutil
This effectively reverts 531b5c9f5c ("sched/topology: Make Energy
Aware Scheduling depend on schedutil"), and updates the documentation
accordingly.

In the context of GKI 2.0, some partners need to use a custom cpufreq
govenor, but would like to keep EAS enabled.

Remove the hard dependency between them.

Bug: 157580690
Signed-off-by: Quentin Perret <qperret@google.com>
Change-Id: Ied1c5aba6b692e26776b3e2c050780a67cc5bae1
2020-06-05 16:35:46 +00:00
Xiongfeng Wang
cf6fada715 cpufreq: change '.set_boost' to act on one policy
Macro 'for_each_active_policy()' is defined internally. To avoid some
cpufreq driver needing this macro to iterate over all the policies in
'.set_boost' callback, we redefine '.set_boost' to act on only one
policy and pass the policy as an argument.

'cpufreq_boost_trigger_state()' iterates over all the policies to set
boost for the system.

This is preparation for adding SW BOOST support for CPPC.

To protect Boost enable/disable by sysfs from CPU online/offline,
add 'cpu_hotplug_lock' before calling '.set_boost' for each CPU.

Also move the lock from 'set_boost()' to 'store_cpb()' in
acpi_cpufreq.

Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Suggested-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
[ rjw: Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-06-05 14:20:02 +02:00
Wang Wenhu
2909438d4d cpufreq: fix minor typo in struct cpufreq_driver doc comment
Delete the duplicate "to", possibly double-typed.

Signed-off-by: Wang Wenhu <wenhu.wang@vivo.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-05-14 13:23:09 +02:00
Quentin Perret
c65b550185 Revert "ANDROID: cpufreq: arch_topology: implement max frequency capping"
This reverts commit 0cfe39fe40.

Upstream has an equivalent feature now, drop the Android-specific
version.

Bug: 120440300
Signed-off-by: Quentin Perret <qperret@google.com>
Change-Id: Id1b2ad9249fb3441082cad34872d237b32b19b82
2020-04-07 07:12:50 +00:00
Greg Kroah-Hartman
6aea7b7129 Merge 1a323ea535 ("x86: get rid of 'errret' argument to __get_user_xyz() macross") into android-mainline
In a quest to divide up the 5.7-rc1 merge chunks into reviewable pieces.

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I2e5960415348c06e8f10e10cbefb3ee5c3745e73
2020-04-04 12:17:50 +02:00
Ionela Voinescu
bbce8eaa60 cpufreq: add function to get the hardware max frequency
Add weak function to return the hardware maximum frequency of a CPU,
with the default implementation returning cpuinfo.max_freq, which is
the best information we can generically get from the cpufreq framework.

The default can be overwritten by a strong function in platforms
that want to provide an alternative implementation, with more accurate
information, obtained either from hardware or firmware.

Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-03-06 16:02:50 +00:00
Greg Kroah-Hartman
57ba579f7f Merge 5.6-rc2 into android-mainline
Linux 5.6-rc2

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ia8f34ab9a6f8a2a3b8b759da8d48ab035ac1cf83
2020-02-19 08:28:52 +01:00
Greg Kroah-Hartman
1f77f18bb8 Merge a45ad71e89 ("Merge tag 'rproc-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc") into android-mainline
Another "small" merge point to handle conflicts in a sane way.

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I5dc2f5f11275b29f3c9b5b8d4dd59864ceb6faf9
2020-02-08 13:32:37 +01:00
Yangtao Li
183edb20e6 cpufreq: Make cpufreq_global_kobject static
The cpufreq_global_kobject is only used internally by cpufreq.c
after commit 2361be2366 ("cpufreq: Don't create empty
/sys/devices/system/cpu/cpufreq directory").

Make it static.

Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
[ rjw: Add empty line after cpufreq_global_kobject definition ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-02-03 16:56:48 +01:00
Rafael J. Wysocki
1e4f63aecb cpufreq: Avoid creating excessively large stack frames
In the process of modifying a cpufreq policy, the cpufreq core makes
a copy of it including all of the internals which is stored on the
CPU stack.  Because struct cpufreq_policy is relatively large, this
may cause the size of the stack frame to exceed the 2 KB limit and
so the GCC complains when -Wframe-larger-than= is used.

In fact, it is not necessary to copy the entire policy structure
in order to modify it, however.

First, because cpufreq_set_policy() obtains the min and max policy
limits from frequency QoS now, it is not necessary to pass the limits
to it from the callers.  The only things that need to be passed to it
from there are the new governor pointer or (if there is a built-in
governor in the driver) the "policy" value representing the governor
choice.  They both can be passed as individual arguments, though, so
make cpufreq_set_policy() take them this way and rework its callers
accordingly.  This avoids making copies of cpufreq policies in the
callers of cpufreq_set_policy().

Second, cpufreq_set_policy() still needs to pass the new policy
data to the ->verify() callback of the cpufreq driver whose task
is to sanitize the min and max policy limits.  It still does not
need to make a full copy of struct cpufreq_policy for this purpose,
but it needs to pass a few items from it to the driver in case they
are needed (different drivers have different needs in that respect
and all of them have to be covered).  For this reason, introduce
struct cpufreq_policy_data to hold copies of the members of
struct cpufreq_policy used by the existing ->verify() driver
callbacks and pass a pointer to a temporary structure of that
type to ->verify() (instead of passing a pointer to full struct
cpufreq_policy to it).

While at it, notice that intel_pstate and longrun don't really need
to verify the "policy" value in struct cpufreq_policy, so drop those
check from them to avoid copying "policy" into struct
cpufreq_policy_data (which allows it to be slightly smaller).

Also while at it fix up white space in a couple of places and make
cpufreq_set_policy() static (as it can be so).

Fixes: 3000ce3c52 ("cpufreq: Use per-policy frequency QoS")
Link: https://lore.kernel.org/linux-pm/CAMuHMdX6-jb1W8uC2_237m8ctCpsnGp=JCxqt8pCWVqNXHmkVg@mail.gmail.com
Reported-by: kbuild test robot <lkp@intel.com>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2020-01-27 10:33:33 +01:00
Greg Kroah-Hartman
e4ca73fb69 Merge 5.5-rc3 into android-mainline
Linux 5.5-rc3

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I1735073980c0168a1ad8bc130fea534dc13f6091
2019-12-29 14:16:55 +01:00
Rafael J. Wysocki
85572c2c4a cpufreq: Avoid leaving stale IRQ work items during CPU offline
The scheduler code calling cpufreq_update_util() may run during CPU
offline on the target CPU after the IRQ work lists have been flushed
for it, so the target CPU should be prevented from running code that
may queue up an IRQ work item on it at that point.

Unfortunately, that may not be the case if dvfs_possible_from_any_cpu
is set for at least one cpufreq policy in the system, because that
allows the CPU going offline to run the utilization update callback
of the cpufreq governor on behalf of another (online) CPU in some
cases.

If that happens, the cpufreq governor callback may queue up an IRQ
work on the CPU running it, which is going offline, and the IRQ work
may not be flushed after that point.  Moreover, that IRQ work cannot
be flushed until the "offlining" CPU goes back online, so if any
other CPU calls irq_work_sync() to wait for the completion of that
IRQ work, it will have to wait until the "offlining" CPU is back
online and that may not happen forever.  In particular, a system-wide
deadlock may occur during CPU online as a result of that.

The failing scenario is as follows.  CPU0 is the boot CPU, so it
creates a cpufreq policy and becomes the "leader" of it
(policy->cpu).  It cannot go offline, because it is the boot CPU.
Next, other CPUs join the cpufreq policy as they go online and they
leave it when they go offline.  The last CPU to go offline, say CPU3,
may queue up an IRQ work while running the governor callback on
behalf of CPU0 after leaving the cpufreq policy because of the
dvfs_possible_from_any_cpu effect described above.  Then, CPU0 is
the only online CPU in the system and the stale IRQ work is still
queued on CPU3.  When, say, CPU1 goes back online, it will run
irq_work_sync() to wait for that IRQ work to complete and so it
will wait for CPU3 to go back online (which may never happen even
in principle), but (worse yet) CPU0 is waiting for CPU1 at that
point too and a system-wide deadlock occurs.

To address this problem notice that CPUs which cannot run cpufreq
utilization update code for themselves (for example, because they
have left the cpufreq policies that they belonged to), should also
be prevented from running that code on behalf of the other CPUs that
belong to a cpufreq policy with dvfs_possible_from_any_cpu set and so
in that case the cpufreq_update_util_data pointer of the CPU running
the code must not be NULL as well as for the CPU which is the target
of the cpufreq utilization update in progress.

Accordingly, change cpufreq_this_cpu_can_update() into a regular
function in kernel/sched/cpufreq.c (instead of a static inline in a
header file) and make it check the cpufreq_update_util_data pointer
of the local CPU if dvfs_possible_from_any_cpu is set for the target
cpufreq policy.

Also update the schedutil governor to do the
cpufreq_this_cpu_can_update() check in the non-fast-switch
case too to avoid the stale IRQ work issues.

Fixes: 99d14d0e16 ("cpufreq: Process remote callbacks from any CPU if the platform permits")
Link: https://lore.kernel.org/linux-pm/20191121093557.bycvdo4xyinbc5cb@vireshk-i7/
Reported-by: Anson Huang <anson.huang@nxp.com>
Tested-by: Anson Huang <anson.huang@nxp.com>
Cc: 4.14+ <stable@vger.kernel.org> # 4.14+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Peng Fan <peng.fan@nxp.com> (i.MX8QXP-MEK)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-12-12 17:59:43 +01:00
Greg Kroah-Hartman
444da424c1 Merge 5.4-rc5 into android-common
Linux 5.4-rc5

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ib61881c64a2725c6229c26d2ce63f107b7215c47
2019-10-28 13:11:46 +01:00
Rafael J. Wysocki
3000ce3c52 cpufreq: Use per-policy frequency QoS
Replace the CPU device PM QoS used for the management of min and max
frequency constraints in cpufreq (and its users) with per-policy
frequency QoS to avoid problems with cpufreq policies covering
more then one CPU.

Namely, a cpufreq driver is registered with the subsys interface
which calls cpufreq_add_dev() for each CPU, starting from CPU0, so
currently the PM QoS notifiers are added to the first CPU in the
policy (i.e. CPU0 in the majority of cases).

In turn, when the cpufreq driver is unregistered, the subsys interface
doing that calls cpufreq_remove_dev() for each CPU, starting from CPU0,
and the PM QoS notifiers are only removed when cpufreq_remove_dev() is
called for the last CPU in the policy, say CPUx, which as a rule is
not CPU0 if the policy covers more than one CPU.  Then, the PM QoS
notifiers cannot be removed, because CPUx does not have them, and
they are still there in the device PM QoS notifiers list of CPU0,
which prevents new PM QoS notifiers from being registered for CPU0
on the next attempt to register the cpufreq driver.

The same issue occurs when the first CPU in the policy goes offline
before unregistering the driver.

After this change it does not matter which CPU is the policy CPU at
the driver registration time and whether or not it is online all the
time, because the frequency QoS is per policy and not per CPU.

Fixes: 67d874c3b2 ("cpufreq: Register notifiers with the PM QoS framework")
Reported-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Reported-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Diagnosed-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lore.kernel.org/linux-pm/5ad2624194baa2f53acc1f1e627eb7684c577a19.1562210705.git.viresh.kumar@linaro.org/T/#md2d89e95906b8c91c15f582146173dce2e86e99f
Link: https://lore.kernel.org/linux-pm/20191017094612.6tbkwoq4harsjcqv@vireshk-i7/T/#m30d48cc23b9a80467fbaa16e30f90b3828a5a29b
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2019-10-21 02:05:21 +02:00
Dietmar Eggemann
0cfe39fe40 ANDROID: cpufreq: arch_topology: implement max frequency capping
Implements the Max Frequency Capping Engine (MFCE) getter function
topology_get_max_freq_scale() to provide the scheduler with a
maximum frequency scaling correction factor for more accurate cpu
capacity handling by being able to deal with max frequency capping.

This scaling factor describes the influence of running a cpu with a
current maximum frequency (policy) lower than the maximum possible
frequency (cpuinfo).

The factor is:

  policy_max_freq(cpu) << SCHED_CAPACITY_SHIFT / cpuinfo_max_freq(cpu)

It also implements the MFCE setter function arch_set_max_freq_scale()
which is called from cpufreq_set_policy().

Change-Id: I59e52861ee260755ab0518fe1f7183a2e4e3d0fc
Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
[Trivial cherry-pick issue in cpufreq.c]
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
2019-10-01 10:36:33 +01:00
Viresh Kumar
df0eea4488 cpufreq: Remove CPUFREQ_ADJUST and CPUFREQ_NOTIFY policy notifier events
No driver makes reference to these events now, remove them and the code
related to them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-09-02 22:44:04 +02:00
Viresh Kumar
6a1490367c cpufreq: Add policy create/remove notifiers back
This effectively reverts some changes made by commit f9f41e3ef9
("cpufreq: Remove policy create/remove notifiers").

We have a new use case for policy create/remove notifiers (for
allocating/freeing QoS requests per policy), so add them back.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[ rjw: Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-08-10 14:05:48 +02:00
Rafael J. Wysocki
918e162e6a Merge branch 'pm-cpufreq'
* pm-cpufreq:
  cpufreq: Make cpufreq_generic_init() return void
  cpufreq: imx-cpufreq-dt: Add i.MX8MN support
  cpufreq: Add QoS requests for userspace constraints
  cpufreq: intel_pstate: Reuse refresh_frequency_limits()
  cpufreq: Register notifiers with the PM QoS framework
  PM / QoS: Add support for MIN/MAX frequency constraints
  PM / QOS: Pass request type to dev_pm_qos_read_value()
  PM / QOS: Rename __dev_pm_qos_read_value() and dev_pm_qos_raw_read_value()
  PM / QOS: Pass request type to dev_pm_qos_{add|remove}_notifier()
2019-07-18 09:49:30 +02:00
Viresh Kumar
c4dcc8a162 cpufreq: Make cpufreq_generic_init() return void
It always returns 0 (success) and its return type should really be void.

Over that, many drivers have added error handling code based on its
return value, which is not required at all.

Change its return type to void and update all the callers.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-07-16 10:20:11 +02:00
Viresh Kumar
18c49926c4 cpufreq: Add QoS requests for userspace constraints
This implements QoS requests to manage userspace configuration of min
and max frequency.

Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: syzbot <syzbot+de771ae9390dffed7266@syzkaller.appspotmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-07-08 23:56:39 +02:00
Viresh Kumar
c57b25bdf7 cpufreq: intel_pstate: Reuse refresh_frequency_limits()
The implementation of intel_pstate_update_max_freq() is quite similar to
refresh_frequency_limits(), lets reuse it.

Finding minimum of policy->user_policy.max and policy->cpuinfo.max_freq
in intel_pstate_update_max_freq() is redundant as cpufreq_set_policy()
will call the ->verify() callback of intel-pstate driver, which will do
this comparison anyway and so dropping it from
intel_pstate_update_max_freq() doesn't harm.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-07-08 23:56:39 +02:00
Viresh Kumar
67d874c3b2 cpufreq: Register notifiers with the PM QoS framework
Register notifiers for min/max frequency constraints with the PM QoS
framework. The constraints are also taken into consideration in
cpufreq_set_policy().

This also relocates cpufreq_policy_put_kobj() as it is required to be
called from cpufreq_policy_alloc() now.

refresh_frequency_limits() is updated to avoid calling
cpufreq_set_policy() for inactive policies and handle_update() is
updated to have proper locking in place.

No constraints are added until now though.

Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-07-08 23:56:13 +02:00
Rafael J. Wysocki
586a07dca8 Merge branch 'pm-cpufreq'
* pm-cpufreq:
  cpufreq: Avoid calling cpufreq_verify_current_freq() from handle_update()
  cpufreq: Consolidate cpufreq_update_current_freq() and __cpufreq_get()
  cpufreq: Don't skip frequency validation for has_target() drivers
  cpufreq: Use has_target() instead of !setpolicy
  cpufreq: Remove redundant !setpolicy check
  cpufreq: Move the IS_ENABLED(CPU_THERMAL) macro into a stub
  cpufreq: s5pv210: Don't flood kernel log after cpufreq change
  cpufreq: pcc-cpufreq: Fail initialization if driver cannot be registered
  cpufreq: add driver for Raspberry Pi
  cpufreq: Switch imx7d to imx-cpufreq-dt for speed grading
  cpufreq: imx-cpufreq-dt: Remove global platform match list
  cpufreq: brcmstb-avs-cpufreq: Fix types for voltage/frequency
  cpufreq: brcmstb-avs-cpufreq: Fix initial command check
  cpufreq: armada-37xx: Remove set but not used variable 'freq'
  cpufreq: imx-cpufreq-dt: Fix no OPPs available on unfused parts
  dt-bindings: imx-cpufreq-dt: Document opp-supported-hw usage
  cpufreq: Add imx-cpufreq-dt driver
2019-07-08 11:00:02 +02:00
Daniel Lezcano
bcc6156999 cpufreq: Move the IS_ENABLED(CPU_THERMAL) macro into a stub
cpufreq_online() and cpufreq_offline() [un]register the driver as
a cooling device. This is done if the driver is flagged as a cooling
device in addition with an IS_ENABLED() check to compile out the branching
code.

Group this test in a stub function added in the cpufreq header instead
of having the IS_ENABLED() in the code.

Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-06-26 10:59:57 +02:00
Thomas Gleixner
d2912cb15b treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500
Based on 2 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation #

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 4122 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19 17:09:55 +02:00
Viresh Kumar
df24014abe cpufreq: Call transition notifier only once for each policy
Currently, the notifiers are called once for each CPU of the policy->cpus
cpumask. It would be more optimal if the notifier can be called only
once and all the relevant information be provided to it. Out of the 23
drivers that register for the transition notifiers today, only 4 of them
do per-cpu updates and the callback for the rest can be called only once
for the policy without any impact.

This would also avoid multiple function calls to the notifier callbacks
and reduce multiple iterations of notifier core's code (which does
locking as well).

This patch adds pointer to the cpufreq policy to the struct
cpufreq_freqs, so the notifier callback has all the information
available to it with a single call. The five drivers which perform
per-cpu updates are updated to use the cpufreq policy. The freqs->cpu
field is redundant now and is removed.

Acked-by: David S. Miller <davem@davemloft.net> (sparc)
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-05-10 12:20:36 +02:00
Rafael J. Wysocki
9083e49861 cpufreq: intel_pstate: Update max frequency on global turbo changes
While the cpuinfo.max_freq value doesn't really matter for
intel_pstate in the active mode, in the passive mode it is used by
governors as the maximum physical frequency of the CPU and the
results of governor computations generally depend on it.  Also it
is made available to user space via sysfs and it should match the
current HW configuration.

For this reason, make intel_pstate update cpuinfo.max_freq for all
CPUs if it detects a global change of turbo frequency settings from
"disable" to "enable" or the other way associated with a _PPC change
notification from the platform firmware.

Note that policy_is_inactive(),  cpufreq_cpu_acquire(),
cpufreq_cpu_release(), and cpufreq_set_policy() need to be made
available to it for this purpose.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=200759
Reported-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Tested-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2019-04-08 11:26:09 +02:00
Rafael J. Wysocki
5a25e3f7cc cpufreq: intel_pstate: Driver-specific handling of _PPC updates
In some cases, the platform firmware disables or enables turbo
frequencies for all CPUs globally before triggering a _PPC change
notification for one of them.  Obviously, that global change affects
all CPUs, not just the notified one, and it needs to be acted upon by
cpufreq.

The intel_pstate driver is able to detect such global changes of
the settings, but it also needs to update policy limits for all
CPUs if that happens, in particular if turbo frequencies are
enabled globally - to allow them to be used.

For this reason, introduce a new cpufreq driver callback to be
invoked on _PPC notifications, if present, instead of simply
calling cpufreq_update_policy() for the notified CPU and make
intel_pstate use it to trigger policy updates for all CPUs
in the system if global settings change.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=200759
Reported-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Tested-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2019-04-01 23:43:05 +02:00
Viresh Kumar
91a12e91dc cpufreq: Allow light-weight tear down and bring up of CPUs
The cpufreq core doesn't remove the cpufreq policy anymore on CPU
offline operation, rather that happens when the CPU device gets
unregistered from the kernel. This allows faster recovery when the CPU
comes back online. This is also very useful during system wide
suspend/resume where we offline all non-boot CPUs during suspend and
then bring them back on resume.

This commit takes the same idea a step ahead to allow drivers to do
light weight tear-down and bring-up during CPU offline and online
operations.

A new set of callbacks is introduced, online/offline(). online() gets
called when the first CPU of an inactive policy is brought up and
offline() gets called when all the CPUs of a policy are offlined.

The existing init/exit() callback get called on policy
creation/destruction. They also get called instead of online/offline()
callbacks if the online/offline() callbacks aren't provided.

This also moves around some code to get executed only for the new-policy
case going forward.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-02-12 23:47:42 +01:00
Amit Kucheria
5c238a8b59 cpufreq: Auto-register the driver as a thermal cooling device if asked
All cpufreq drivers do similar things to register as a cooling device.
Provide a cpufreq driver flag so drivers can just ask the cpufreq core
to register the cooling device on their behalf. This allows us to get
rid of duplicated code in the drivers.

In order to allow this, we add a struct thermal_cooling_device pointer
to struct cpufreq_policy so that drivers don't need to store it in a
private data structure.

Suggested-by: Stephen Boyd <swboyd@chromium.org>
Suggested-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Tested-by: Matthias Kaehlcke <mka@chromium.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-01-30 23:02:26 +01:00
Viresh Kumar
625c85a62c cpufreq: Use struct kobj_attribute instead of struct global_attr
The cpufreq_global_kobject is created using kobject_create_and_add()
helper, which assigns the kobj_type as dynamic_kobj_ktype and show/store
routines are set to kobj_attr_show() and kobj_attr_store().

These routines pass struct kobj_attribute as an argument to the
show/store callbacks. But all the cpufreq files created using the
cpufreq_global_kobject expect the argument to be of type struct
attribute. Things work fine currently as no one accesses the "attr"
argument. We may not see issues even if the argument is used, as struct
kobj_attribute has struct attribute as its first element and so they
will both get same address.

But this is logically incorrect and we should rather use struct
kobj_attribute instead of struct global_attr in the cpufreq core and
drivers and the show/store callbacks should take struct kobj_attribute
as argument instead.

This bug is caught using CFI CLANG builds in android kernel which
catches mismatch in function prototypes for such callbacks.

Reported-by: Donghee Han <dh.han@samsung.com>
Reported-by: Sangkyu Kim <skwith.kim@samsung.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-01-29 11:44:30 +01:00
Amit Kucheria
8321be6a9d cpufreq: Replace open-coded << with BIT()
Minor clean-up to use BIT() and keep checkpatch happy. Clean up the
comment formatting while we're at it to make it easier to read.

Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-01-21 11:02:09 +01:00
Quentin Perret
531b5c9f5c sched/topology: Make Energy Aware Scheduling depend on schedutil
Energy Aware Scheduling (EAS) is designed with the assumption that
frequencies of CPUs follow their utilization value. When using a CPUFreq
governor other than schedutil, the chances of this assumption being true
are small, if any. When schedutil is being used, EAS' predictions are at
least consistent with the frequency requests. Although those requests
have no guarantees to be honored by the hardware, they should at least
guide DVFS in the right direction and provide some hope in regards to the
EAS model being accurate.

To make sure EAS is only used in a sane configuration, create a strong
dependency on schedutil being used. Since having sugov compiled-in does
not provide that guarantee, make CPUFreq call a scheduler function on
governor changes hence letting it rebuild the scheduling domains, check
the governors of the online CPUs, and enable/disable EAS accordingly.

Signed-off-by: Quentin Perret <quentin.perret@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: adharmap@codeaurora.org
Cc: chris.redpath@arm.com
Cc: currojerez@riseup.net
Cc: dietmar.eggemann@arm.com
Cc: edubezval@gmail.com
Cc: gregkh@linuxfoundation.org
Cc: javi.merino@kernel.org
Cc: joel@joelfernandes.org
Cc: juri.lelli@redhat.com
Cc: morten.rasmussen@arm.com
Cc: patrick.bellasi@arm.com
Cc: pkondeti@codeaurora.org
Cc: skannan@codeaurora.org
Cc: smuckle@google.com
Cc: srinivas.pandruvada@linux.intel.com
Cc: thara.gopinath@linaro.org
Cc: tkjos@google.com
Cc: valentin.schneider@arm.com
Cc: vincent.guittot@linaro.org
Cc: viresh.kumar@linaro.org
Link: https://lkml.kernel.org/r/20181203095628.11858-9-quentin.perret@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-12-11 15:17:00 +01:00
Viresh Kumar
036399782b cpufreq: Rename cpufreq_can_do_remote_dvfs()
This routine checks if the CPU running this code belongs to the policy
of the target CPU or if not, can it do remote DVFS for it remotely. But
the current name of it implies as if it is only about doing remote
updates.

Rename it to make it more relevant.

Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-23 10:37:08 +02:00
Viresh Kumar
2dd0df8472 cpufreq: Drop cpufreq_table_validate_and_show()
This isn't used anymore. Remove the helper and update documentation
accordingly.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-04-10 08:40:45 +02:00
Viresh Kumar
d417e0691a cpufreq: Validate frequency table in the core
By design, cpufreq drivers are responsible for calling
cpufreq_frequency_table_cpuinfo() from their ->init()
callbacks to validate the frequency table.

However, if a cpufreq driver is buggy and fails to do so properly, it
lead to unexpected behavior of the driver or the cpufreq core at a
later point in time.  It would be better if the core could
validate the frequency table during driver initialization.

To that end, introduce cpufreq_table_validate_and_sort() and make
the cpufreq core call it right after invoking the ->init() callback
of the driver and destroy the cpufreq policy if the table is invalid.

For the time being the validation of the table happens twice, once
from the driver and then from the core.  The individual drivers will
be updated separately to drop table validation if they don't need it
for other reasons.

The frequency table is marked "sorted" or "unsorted" by the new helper
now instead of in cpufreq_table_validate_and_show(), as it should only
be done after validating the table (which the drivers won't do going
forward).

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[ rjw: Subject/changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-02-27 18:22:12 +01:00
Dominik Brodowski
ffd81dcfef cpufreq: Add and use cpufreq_for_each_{valid_,}entry_idx()
Pointer subtraction is slow and tedious. Therefore, replace all instances
where cpufreq_for_each_{valid_,}entry loops contained such substractions
with an iteration macro providing an index to the frequency_table entry.

Suggested-by: Al Viro <viro@ZenIV.linux.org.uk>
Link: http://lkml.kernel.org/r/20180120020237.GM13338@ZenIV.linux.org.uk
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-02-08 10:21:39 +01:00
Rafael J. Wysocki
7d5905dc14 x86 / CPU: Always show current CPU frequency in /proc/cpuinfo
After commit 890da9cf09 (Revert "x86: do not use cpufreq_quick_get()
for /proc/cpuinfo "cpu MHz"") the "cpu MHz" number in /proc/cpuinfo
on x86 can be either the nominal CPU frequency (which is constant)
or the frequency most recently requested by a scaling governor in
cpufreq, depending on the cpufreq configuration.  That is somewhat
inconsistent and is different from what it was before 4.13, so in
order to restore the previous behavior, make it report the current
CPU frequency like the scaling_cur_freq sysfs file in cpufreq.

To that end, modify the /proc/cpuinfo implementation on x86 to use
aperfmperf_snapshot_khz() to snapshot the APERF and MPERF feedback
registers, if available, and use their values to compute the CPU
frequency to be reported as "cpu MHz".

However, do that carefully enough to avoid accumulating delays that
lead to unacceptable access times for /proc/cpuinfo on systems with
many CPUs.  Run aperfmperf_snapshot_khz() once on all CPUs
asynchronously at the /proc/cpuinfo open time, add a single delay
upfront (if necessary) at that point and simply compute the current
frequency while running show_cpuinfo() for each individual CPU.

Also, to avoid slowing down /proc/cpuinfo accesses too much, reduce
the default delay between consecutive APERF and MPERF reads to 10 ms,
which should be sufficient to get large enough numbers for the
frequency computation in all cases.

Fixes: 890da9cf09 (Revert "x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz"")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
2017-11-15 19:46:50 +01:00
Dietmar Eggemann
e7d5459dfa cpufreq: provide default frequency-invariance setter function
Frequency-invariant accounting support based on the ratio of current
frequency and maximum supported frequency is an optional feature an arch
can implement.

Since there are cpufreq drivers (e.g. cpufreq-dt) which can be build for
different arch's a default implementation of the frequency-invariance
setter function arch_set_freq_scale() is needed.

This default implementation is an empty weak function which will be
overwritten by a strong function in case the arch provides one.

The setter function passes the cpumask of related (to the frequency
change) cpus (online and offline cpus), the (new) current frequency and
the maximum supported frequency.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-10-03 02:37:53 +02:00
Rafael J. Wysocki
08a10002be Merge branch 'pm-cpufreq-sched'
* pm-cpufreq-sched:
  cpufreq: schedutil: Always process remote callback with slow switching
  cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily
  cpufreq: Return 0 from ->fast_switch() on errors
  cpufreq: Simplify cpufreq_can_do_remote_dvfs()
  cpufreq: Process remote callbacks from any CPU if the platform permits
  sched: cpufreq: Allow remote cpufreq callbacks
  cpufreq: schedutil: Use unsigned int for iowait boost
  cpufreq: schedutil: Make iowait boost more energy efficient
2017-09-04 00:05:22 +02:00
Rafael J. Wysocki
d6344d4b56 cpufreq: Simplify cpufreq_can_do_remote_dvfs()
The if () in cpufreq_can_do_remote_dvfs() is superfluous, so drop
it and simply return the value of the expression under it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-08-08 17:09:02 +02:00
Viresh Kumar
99d14d0e16 cpufreq: Process remote callbacks from any CPU if the platform permits
On many platforms, CPUs can do DVFS across cpufreq policies. i.e CPU
from policy-A can change frequency of CPUs belonging to policy-B.

This is quite common in case of ARM platforms where we don't
configure any per-cpu register.

Add a flag to identify such platforms and update
cpufreq_can_do_remote_dvfs() to allow remote callbacks if this flag is
set.

Also enable the flag for cpufreq-dt driver which is used only on ARM
platforms currently.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Saravana Kannan <skannan@codeaurora.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-08-01 14:24:54 +02:00
Viresh Kumar
674e75411f sched: cpufreq: Allow remote cpufreq callbacks
With Android UI and benchmarks the latency of cpufreq response to
certain scheduling events can become very critical. Currently, callbacks
into cpufreq governors are only made from the scheduler if the target
CPU of the event is the same as the current CPU. This means there are
certain situations where a target CPU may not run the cpufreq governor
for some time.

One testcase to show this behavior is where a task starts running on
CPU0, then a new task is also spawned on CPU0 by a task on CPU1. If the
system is configured such that the new tasks should receive maximum
demand initially, this should result in CPU0 increasing frequency
immediately. But because of the above mentioned limitation though, this
does not occur.

This patch updates the scheduler core to call the cpufreq callbacks for
remote CPUs as well.

The schedutil, ondemand and conservative governors are updated to
process cpufreq utilization update hooks called for remote CPUs where
the remote CPU is managed by the cpufreq policy of the local CPU.

The intel_pstate driver is updated to always reject remote callbacks.

This is tested with couple of usecases (Android: hackbench, recentfling,
galleryfling, vellamo, Ubuntu: hackbench) on ARM hikey board (64 bit
octa-core, single policy). Only galleryfling showed minor improvements,
while others didn't had much deviation.

The reason being that this patch only targets a corner case, where
following are required to be true to improve performance and that
doesn't happen too often with these tests:

- Task is migrated to another CPU.
- The task has high demand, and should take the target CPU to higher
  OPPs.
- And the target CPU doesn't call into the cpufreq governor until the
  next tick.

Based on initial work from Steve Muckle.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Saravana Kannan <skannan@codeaurora.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-08-01 14:24:53 +02:00
Viresh Kumar
fe829ed8ef cpufreq: Add CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING cpufreq driver flag
The policy->transition_latency field is used for multiple purposes
today and its not straight forward at all. This is how it is used:

A. Set the correct transition_latency value.

B. Set it to CPUFREQ_ETERNAL because:
   1. We don't want automatic dynamic switching (with
      ondemand/conservative) to happen at all.
   2. We don't know the transition latency.

This patch handles the B.1. case in a more readable way. A new flag for
the cpufreq drivers is added to disallow use of cpufreq governors which
have dynamic_switching flag set.

All the current cpufreq drivers which are setting transition_latency
unconditionally to CPUFREQ_ETERNAL are updated to use it. They don't
need to set transition_latency anymore.

There shouldn't be any functional change after this patch.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-26 00:15:46 +02:00
Viresh Kumar
ed4676e254 cpufreq: Replace "max_transition_latency" with "dynamic_switching"
There is no limitation in the ondemand or conservative governors which
disallow the transition_latency to be greater than 10 ms.

The max_transition_latency field is rather used to disallow automatic
dynamic frequency switching for platforms which didn't wanted these
governors to run.

Replace max_transition_latency with a boolean (dynamic_switching) and
check for transition_latency == CPUFREQ_ETERNAL along with that. This
makes it pretty straight forward to read/understand now.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-26 00:15:45 +02:00
Viresh Kumar
aa7519af45 cpufreq: Use transition_delay_us for legacy governors as well
The policy->transition_delay_us field is used only by the schedutil
governor currently, and this field describes how fast the driver wants
the cpufreq governor to change CPUs frequency. It should rather be a
common thing across all governors, as it doesn't have any schedutil
dependency here.

Create a new helper cpufreq_policy_transition_delay_us() to get the
transition delay across all governors.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-22 02:25:20 +02:00