* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
[CPUFREQ] Fix on resume, now preserves user policy min/max.
[CPUFREQ] Add Celeron Core support to p4-clockmod.
[CPUFREQ] add to speedstep-lib additional fsb values for core processors
[CPUFREQ] Disable sysfs ui for p4-clockmod.
[CPUFREQ] p4-clockmod: reduce noise
[CPUFREQ] clean up speedstep-centrino and reduce cpumask_t usage
Impact: use new cpumask API to reduce memory and stack usage
Allocate the following local cpumasks based on the number of cpus that
are present. References will use new cpumask API. (Currently only
modified for x86_64, x86_32 continues to use the *_map variants.)
cpu_callin_mask
cpu_callout_mask
cpu_initialized_mask
cpu_sibling_setup_mask
Provide the following accessor functions:
struct cpumask *cpu_sibling_mask(int cpu)
struct cpumask *cpu_core_mask(int cpu)
Other changes are when setting or clearing the cpu online, possible
or present maps, use the accessor functions.
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'cpus4096-for-linus-3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (77 commits)
x86: setup_per_cpu_areas() cleanup
cpumask: fix compile error when CONFIG_NR_CPUS is not defined
cpumask: use alloc_cpumask_var_node where appropriate
cpumask: convert shared_cpu_map in acpi_processor* structs to cpumask_var_t
x86: use cpumask_var_t in acpi/boot.c
x86: cleanup some remaining usages of NR_CPUS where s/b nr_cpu_ids
sched: put back some stack hog changes that were undone in kernel/sched.c
x86: enable cpus display of kernel_max and offlined cpus
ia64: cpumask fix for is_affinity_mask_valid()
cpumask: convert RCU implementations, fix
xtensa: define __fls
mn10300: define __fls
m32r: define __fls
h8300: define __fls
frv: define __fls
cris: define __fls
cpumask: CONFIG_DISABLE_OBSOLETE_CPUMASK_FUNCTIONS
cpumask: zero extra bits in alloc_cpumask_var_node
cpumask: replace for_each_cpu_mask_nr with for_each_cpu in kernel/time/
cpumask: convert mm/
...
Impact: Reduce inter-node memory traffic.
Reduces inter-node memory traffic (offloading the global system bus)
by allocating referenced struct cpumasks on the same node as the
referring struct.
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: Reduce memory usage, use new API.
This is part of an effort to reduce structure sizes for machines
configured with large NR_CPUS. cpumask_t gets replaced by
cpumask_var_t, which is either struct cpumask[1] (small NR_CPUS) or
struct cpumask * (large NR_CPUS).
(Changes to powernow-k* by <travis>.)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: Reduce future system panics due to cpumask operations using NR_CPUS
Insure that code does not look at bits >= nr_cpu_ids as when cpumasks are
allocated based on nr_cpu_ids, these extra bits will not be defined.
Also some other minor updates:
* change in to use cpu accessor function set_cpu_present() instead of
directly accessing cpu_present_map w/cpu_clear() [arch/x86/kernel/reboot.c]
* use cpumask_of() instead of &cpumask_of_cpu() [arch/x86/kernel/reboot.c]
* optimize some cpu_mask_to_apicid_and functions.
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'cpus4096-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (66 commits)
x86: export vector_used_by_percpu_irq
x86: use logical apicid in x2apic_cluster's x2apic_cpu_mask_to_apicid_and()
sched: nominate preferred wakeup cpu, fix
x86: fix lguest used_vectors breakage, -v2
x86: fix warning in arch/x86/kernel/io_apic.c
sched: fix warning in kernel/sched.c
sched: move test_sd_parent() to an SMP section of sched.h
sched: add SD_BALANCE_NEWIDLE at MC and CPU level for sched_mc>0
sched: activate active load balancing in new idle cpus
sched: bias task wakeups to preferred semi-idle packages
sched: nominate preferred wakeup cpu
sched: favour lower logical cpu number for sched_mc balance
sched: framework for sched_mc/smt_power_savings=N
sched: convert BALANCE_FOR_xx_POWER to inline functions
x86: use possible_cpus=NUM to extend the possible cpus allowed
x86: fix cpu_mask_to_apicid_and to include cpu_online_mask
x86: update io_apic.c to the new cpumask code
x86: Introduce topology_core_cpumask()/topology_thread_cpumask()
x86: xen: use smp_call_function_many()
x86: use work_on_cpu in x86/kernel/cpu/mcheck/mce_amd_64.c
...
Fixed up trivial conflict in kernel/time/tick-sched.c manually
For KVM can reuse the type define, and need them to support shadow MTRR.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Impact: fix section mismatch warning
Commit b2bb855491 ("x86: Remove cpumask games
in x86/kernel/cpu/intel_cacheinfo.c") introduced get_cpu_leaves(), which
references __cpuinit cpuid4_cache_lookup().
Mark get_cpu_leaves() with a __cpuinit annotation.
Signed-off-by: Sergio Luis <sergio@larces.uece.br>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (241 commits)
sched, trace: update trace_sched_wakeup()
tracing/ftrace: don't trace on early stage of a secondary cpu boot, v3
Revert "x86: disable X86_PTRACE_BTS"
ring-buffer: prevent false positive warning
ring-buffer: fix dangling commit race
ftrace: enable format arguments checking
x86, bts: memory accounting
x86, bts: add fork and exit handling
ftrace: introduce tracing_reset_online_cpus() helper
tracing: fix warnings in kernel/trace/trace_sched_switch.c
tracing: fix warning in kernel/trace/trace.c
tracing/ring-buffer: remove unused ring_buffer size
trace: fix task state printout
ftrace: add not to regex on filtering functions
trace: better use of stack_trace_enabled for boot up code
trace: add a way to enable or disable the stack tracer
x86: entry_64 - introduce FTRACE_ frame macro v2
tracing/ftrace: add the printk-msg-only option
tracing/ftrace: use preempt_enable_no_resched_notrace in ring_buffer_time_stamp()
x86, bts: correctly report invalid bts records
...
Fixed up trivial conflict in scripts/recordmcount.pl due to SH bits
being already partly merged by the SH merge.
Impact: cleanup
enable_mtrr_cleanup is static, and is never set to anything but 0 or 1.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Impact: fix a crash/hard-reboot on certain configs while enabling cpu runtime
On some archs, the boot of a secondary cpu can have an early fragile state.
On x86-64, the pda is not initialized on the first stage of a cpu boot but
it is needed to get the cpu number and the current task pointer. This data
is needed during tracing. As they were dereferenced at this stage, we got a
crash while tracing a cpu being enabled at runtime.
Some other archs like ia64 can have such kind of issue too.
Changes on v2:
We dropped the previous solution of a per-arch called function to guess the
current state of a cpu. That could slow down the tracing.
This patch removes the -pg flag on arch/x86/kernel/cpu/common.c where
the low level cpu boot functions exist, on start_secondary() and a helper
function used at this stage.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: extend performance counter support on x86 Intel CPUs
Modern Intel CPUs have 3 "fixed-function" performance counters, which
count these hardware events:
Instr_Retired.Any
CPU_CLK_Unhalted.Core
CPU_CLK_Unhalted.Ref
Add support for them to the performance counters subsystem.
Their use is transparent to user-space: the counter scheduler is
extended to automatically recognize the cases where a fixed-function
PMC can be utilized instead of a generic PMC. In such cases the
generic PMC is kept available for more counters.
The above fixed-function events map to these generic counter hw events:
PERF_COUNT_INSTRUCTIONS
PERF_COUNT_CPU_CYCLES
PERF_COUNT_BUS_CYCLES
(The 'bus' cycles are in reality often CPU-ish cycles, just with a fixed
frequency.)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Generalize "bus cycles" hw events - and map them to CPU_CLK_Unhalted.Ref
on x86. (which is a good enough approximation)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Allow lowlevel ->enable() op to return an error if a counter can not be
added. This can be used to handle counter constraints.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Enumerate fixed-mode PMCs based on CPUID, and feed that into the
perfcounter code.
Does not use fixed-mode PMCs yet.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: refactor the x86 code for fixed-mode PMCs
Extend the data structures and rename the existing facilities
to allow for a 'generic' versus 'fixed' counter distinction.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: rename include file
We'll be providing an asm/perf_counter.h to the generic perfcounter code,
so use the already existing x86 file for this purpose and rename it.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup, avoid sparse warnings, reduce kernel size a bit
Fixes these sparse warnings:
arch/x86/kernel/cpu/common.c:869:6: warning: symbol 'boot_cpu_stack' was not declared. Should it be static?
arch/x86/kernel/cpu/common.c:910:6: warning: symbol 'boot_exception_stacks' was not declared. Should it be static?
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup, avoid sparse warnings, reduce kernel size a bit
Fixes these sparse warnings:
arch/x86/kernel/cpu/perf_counter.c:44:11: warning: symbol 'intel_perfmon_event_map' was not declared. Should it be static?
arch/x86/kernel/cpu/perf_counter.c:54:11: warning: symbol 'max_intel_perfmon_events' was not declared. Should it be static?
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix wrong cache sharing detection on platforms supporting > 8 bit apicid's
In the presence of extended topology eumeration leaf 0xb provided
by cpuid, 32bit extended initial_apicid in cpuinfo_x86 struct will be
updated by detect_extended_topology(). At this instance, we should also
reinit the apicid (which could also potentially be extended to 32bit).
With out this there will potentially be duplicate apicid's populated in the
per cpu's cpuinfo_x86 struct, resulting in wrong cache sharing topology etc
detected by init_intel_cacheinfo().
Reported-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>
Impact: remove cpumask_t from stack.
We should not try to save and restore cpus_allowed on current.
We can't use work_on_cpu() here, since it's in the hotplug cpu path
(if anyone else tries to get the hotplug lock from a workqueue we
could deadlock against them).
Fortunately, we can just use smp_call_function_single() since the
function can run from an interrupt.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Impact: fix disabled MCE after resume
Don't prevent multiple initialization of MCEs.
Back from early prehistory mcheck_init() has a reentry check. Presumably
that was needed in very old kernels to prevent it entering twice.
But as Andreas points out this prevents CPU hotplug (and therefore resume)
to correctly reinitialize MCEs when a AP boots again after being
offlined.
Just drop the check.
Reported-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: reward non-stop TSCs with good TSC-based clocksources, etc.
Add support for CPUID_0x80000007_Bit8 on Intel CPUs as well. This bit means
that the TSC is invariant with C/P/T states and always runs at constant
frequency.
With Intel CPUs, we have 3 classes
* CPUs where TSC runs at constant rate and does not stop n C-states
* CPUs where TSC runs at constant rate, but will stop in deep C-states
* CPUs where TSC rate will vary based on P/T-states and TSC will stop in deep
C-states.
To cover these 3, one feature bit (CONSTANT_TSC) is not enough. So, add a
second bit (NONSTOP_TSC). CONSTANT_TSC indicates that the TSC runs at
constant frequency irrespective of P/T-states, and NONSTOP_TSC indicates
that TSC does not stop in deep C-states.
CPUID_0x8000000_Bit8 indicates both these feature bit can be set.
We still have CONSTANT_TSC _set_ and NONSTOP_TSC _not_set_ on some older Intel
CPUs, based on model checks. We can use TSC on such CPUs for time, as long as
those CPUs do not support/enter deep C-states.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>