At run time (when 'perf' is starting up), locate the specific table of
PMU events that corresponds to the current CPU. Using that table, create
aliases for each of the PMU events in the CPU. Then use these aliases
to parse the user-specified perf event.
In short, this allows the user to specify events using their aliases
rather than raw event codes.
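The lookup described above can be sketched roughly as follows. This is an illustration only, not the perf implementation (which is C): the table contents, the CPU id strings, and the function names are all hypothetical.

```python
# Hypothetical per-CPU tables: cpuid string -> {alias: raw event encoding}.
# The ids, alias names and encodings are made up for illustration.
PMU_EVENT_TABLES = {
    "cpu-model-a": {"cache_miss_example": "event=0x2e,umask=0x41"},
    "cpu-model-b": {"cache_miss_example": "event=0x1f"},
}


def find_table(cpuid):
    """Return the alias table for this CPU, or an empty one if unknown."""
    return PMU_EVENT_TABLES.get(cpuid, {})


def resolve_event(cpuid, user_event):
    """Map a user-specified event to its raw encoding via the alias table.

    Events that have no alias are passed through unchanged, so raw
    encodings keep working.
    """
    return find_table(cpuid).get(user_event, user_event)
```

With this shape, the same alias can resolve to a different raw encoding on each CPU model, and an unknown CPU simply falls back to whatever the user typed.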
Based on input and some earlier patches from Andi Kleen, Jiri Olsa.
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1473978296-20712-4-git-send-email-sukadev@linux.vnet.ibm.com
[ Make pmu_add_cpu_aliases() return void, since it was returning just '0' and
furthermore, even that was being discarded via an explicit (void) cast ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch stores the cpu socket_id and core_id in a perf.data header,
and reads them into the perf_env struct when processing perf.data files.
The change modifies the CPU_TOPOLOGY section, making sure it is
backward/forward compatible.
The patch checks the section size before reading the core and socket ids.
It never reads data crossing the section boundary. An old perf binary
without this patch can also correctly read the perf.data from a new perf
with this patch.
Because the new info is added at the end of the cpu_topology section, an
old perf tool ignores the extra data.
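The size check can be sketched as below. This is a minimal illustration in Python, not the actual C reader: the function name is hypothetical, and the layout (two little-endian 32-bit ids per CPU appended after the old payload) is an assumption made for the example.

```python
import struct


def read_core_socket_ids(section, offset, nr_cpus):
    """Read (core_id, socket_id) pairs that may follow the old payload.

    `section` is the whole section as bytes and `offset` is where the
    old payload ends. Returns None when the section is too short, which
    is the case for files written by a tool without the extra data.
    """
    needed = nr_cpus * 8  # two 32-bit ids per CPU (assumed layout)
    if offset + needed > len(section):
        return None  # never read past the section boundary
    ids = []
    for cpu in range(nr_cpus):
        core_id, socket_id = struct.unpack_from("<II", section, offset + cpu * 8)
        ids.append((core_id, socket_id))
    return ids
```

A None result corresponds to the "information is not available" case in example 1 below, while an old reader that stops at `offset` never looks at the appended bytes.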
Examples:
1. New perf with this patch reads perf.data from an old perf without the
patch:
$ perf_new report -i perf_old.data --header-only -I
......
# sibling threads : 33
# sibling threads : 34
# sibling threads : 35
# Core ID and Socket ID information is not available
# node0 meminfo : total = 32823872 kB, free = 29315548 kB
# node0 cpu list : 0-17,36-53
......
2. Old perf without the patch reads perf.data from a new perf with the
patch:
$ perf_old report -i perf_new.data --header-only -I
......
# sibling threads : 33
# sibling threads : 34
# sibling threads : 35
# node0 meminfo : total = 32823872 kB, free = 29190932 kB
# node0 cpu list : 0-17,36-53
......
3. New perf reads new perf.data:
$ perf_new report -i perf_new.data --header-only -I
......
# sibling threads : 33
# sibling threads : 34
# sibling threads : 35
# CPU 0: Core ID 0, Socket ID 0
# CPU 1: Core ID 1, Socket ID 0
......
# CPU 61: Core ID 10, Socket ID 1
# CPU 62: Core ID 11, Socket ID 1
# CPU 63: Core ID 16, Socket ID 1
# node0 meminfo : total = 32823872 kB, free = 29190932 kB
# node0 cpu list : 0-17,36-53
Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1441115893-22006-2-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make perf build for x86 once the UAPI disintegration patches for that arch
have been applied by adding the appropriate -I flags - in the right order -
and then converting some #includes that use ../.. notation to find main kernel
header files to use <asm/foo.h> and <linux/foo.h> instead.
Note that -Iarch/foo/include/uapi is present _before_ -Iarch/foo/include.
This makes sure we get the userspace version of the pt_regs struct. Ideally,
we wouldn't have the latter -I flag at all, but unfortunately we want
asm/svm.h and asm/vmx.h in builtin-kvm.c and these aren't part of the UAPI -
at least not for x86. I wonder if the bits outside of the __KERNEL__ guards
*should* be transferred there.
I note also that perf seems to do its dependency handling manually by listing
all the header files it might want to use in LIB_H in the Makefile. Can this
be changed to use -MD?
Note that to make this work, we need to export and UAPI disintegrate
linux/hw_breakpoint.h, which I think should've been exported previously so that
perf can access the bits. We have to do this in the same patch to maintain
bisectability.
Signed-off-by: David Howells <dhowells@redhat.com>
The UAPI commits forgot to test tooling builds such as tools/perf/,
and this fixes the fallout.
Manual conversion.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add 'perf kvm stat' support to analyze kvm vmexit/mmio/ioport smartly
Usage:
- kvm stat
run a command and gather performance counter statistics; it is an alias of
perf stat
- trace kvm events:
perf kvm stat record, or, if other tracepoints are interesting as well, we
can append the events like this:
perf kvm stat record -e timer:* -a
If many guests are running, we can track the specified guest by using -p or
--pid, -a is used to track events generated by all guests.
- show the result:
perf kvm stat report
Example output:
13005
13059
total 2 guests are running on the host
Then, track the guest whose pid is 13059:
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.253 MB perf.data.guest (~11065 samples) ]
See the vmexit events:
Analyze events for all VCPUs:
VM-EXIT Samples Samples% Time% Avg time
APIC_ACCESS 460 70.55% 0.01% 22.44us ( +- 1.75% )
HLT 93 14.26% 99.98% 832077.26us ( +- 10.42% )
EXTERNAL_INTERRUPT 64 9.82% 0.00% 35.35us ( +- 14.21% )
PENDING_INTERRUPT 24 3.68% 0.00% 9.29us ( +- 31.39% )
CR_ACCESS 7 1.07% 0.00% 8.12us ( +- 5.76% )
IO_INSTRUCTION 3 0.46% 0.00% 18.00us ( +- 11.79% )
EXCEPTION_NMI 1 0.15% 0.00% 5.83us ( +- -nan% )
Total Samples:652, Total events handled time:77396109.80us.
See the mmio events:
Analyze events for all VCPUs:
MMIO Access Samples Samples% Time% Avg time
0xfee00380:W 387 84.31% 79.28% 8.29us ( +- 3.32% )
0xfee00300:W 24 5.23% 9.96% 16.79us ( +- 1.97% )
0xfee00300:R 24 5.23% 7.83% 13.20us ( +- 3.00% )
0xfee00310:W 24 5.23% 2.93% 4.94us ( +- 3.84% )
Total Samples:459, Total events handled time:4044.59us.
See the ioport event:
Analyze events for all VCPUs:
IO Port Access Samples Samples% Time% Avg time
0xc050:POUT 3 100.00% 100.00% 13.75us ( +- 10.83% )
Total Samples:3, Total events handled time:41.26us.
And, --vcpu is used to track the specified vcpu and --key is used to sort the
result:
Analyze events for VCPU 0:
VM-EXIT Samples Samples% Time% Avg time
HLT 27 13.85% 99.97% 405790.24us ( +- 12.70% )
EXTERNAL_INTERRUPT 13 6.67% 0.00% 27.94us ( +- 22.26% )
APIC_ACCESS 146 74.87% 0.03% 21.69us ( +- 2.91% )
IO_INSTRUCTION 2 1.03% 0.00% 17.77us ( +- 20.56% )
CR_ACCESS 2 1.03% 0.00% 8.55us ( +- 6.47% )
PENDING_INTERRUPT 5 2.56% 0.00% 6.27us ( +- 3.94% )
Total Samples:195, Total events handled time:10959950.90us.
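For readers who want to check the columns, the Samples%, Time% and Avg time values in the tables above can be derived from per-event durations roughly as in this sketch. It is written in Python for brevity and is not the code in the patch; the input format and the function name are assumptions.

```python
from collections import defaultdict


def summarize(events):
    """Aggregate (key, duration_us) pairs into per-key statistics.

    Returns one row per key, sorted by sample count (highest first).
    """
    count = defaultdict(int)
    time_us = defaultdict(float)
    for key, duration in events:
        count[key] += 1
        time_us[key] += duration

    total_samples = sum(count.values())
    total_time = sum(time_us.values())

    rows = []
    for key in count:
        rows.append({
            "key": key,
            "samples": count[key],
            "samples_pct": 100.0 * count[key] / total_samples,
            "time_pct": 100.0 * time_us[key] / total_time if total_time else 0.0,
            "avg_us": time_us[key] / count[key],
        })
    rows.sort(key=lambda r: r["samples"], reverse=True)
    return rows
```

As a sanity check against the first table, 460 APIC_ACCESS samples out of 652 total gives the 70.55% shown there.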
Signed-off-by: Dong Hao <haodong@linux.vnet.ibm.com>
Signed-off-by: Runzhen Wang <runzhen@linux.vnet.ibm.com>
[ Dong Hao <haodong@linux.vnet.ibm.com>
Runzhen Wang <runzhen@linux.vnet.ibm.com>:
- rebase it on current acme's tree
- fix the compiling-error on i386 ]
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: kvm@vger.kernel.org
Cc: Runzhen Wang <runzhen@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1347870675-31495-4-git-send-email-haodong@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Storing data for VDSO shared object, because we need it for the post
unwind processing.
The VDSO shared object is the same for all processes on a running system, so
it makes no difference when we store it inside the tracer - perf.
When [vdso] map memory is hit, we retrieve the [vdso] DSO image and store
it into a temporary file.
During the build-id processing phase, the [vdso] DSO image is stored in
the build-id db, and a build-id reference is made inside perf.data. The
build-id vdso file object is called '[vdso]'. We don't use the temporary
file name, because that file gets removed when record is finished.
During report phase the vdso build-id object is treated as any other
build-id DSO object.
Adding following API for vdso object:
bool is_vdso_map(const char *filename)
- returns true if the filename matches vdso map name
struct dso *vdso__dso_findnew(struct list_head *head)
- find/create proper vdso DSO object
vdso__exit(void)
- removes temporary VDSO image if there's any
This change makes backtrace dwarf post unwind possible from [vdso] maps.
Following output is current report of [vdso] sample dwarf backtrace:
# Overhead Command Shared Object Symbol
# ........ ....... ................. .............................
#
99.52% ex [vdso] [.] 0x00007fff3ace89af
|
--- 0x7fff3ace89af
Following output is new report of [vdso] sample dwarf backtrace:
# Overhead Command Shared Object Symbol
# ........ ....... ................. .............................
#
99.52% ex [vdso] [.] 0x00000000000009af
|
--- 0x7fff3ace89af
main
__libc_start_main
_start
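The difference between the two symbol addresses above (0x00007fff3ace89af versus 0x00000000000009af) is consistent with translating the sampled address into an offset inside the [vdso] image. The sketch below illustrates that arithmetic in Python. The map start address 0x7fff3ace8000 is an assumption chosen to match the example, not a value taken from the report, and the helper dso_relative() is hypothetical; only is_vdso_map() mirrors a name from the API list above.

```python
VDSO_MAP_NAME = "[vdso]"


def is_vdso_map(filename):
    """Return True if the map name is the vdso map name."""
    return filename == VDSO_MAP_NAME


def dso_relative(ip, map_start, pgoff=0):
    """Translate a sampled address into an offset within the DSO image.

    map_start is where the map begins in the process address space and
    pgoff is the file offset the map starts at (0 here).
    """
    return ip - map_start + pgoff
```

With the assumed map start, dso_relative(0x7fff3ace89af, 0x7fff3ace8000) gives 0x9af, the value shown in the new report.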
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1347295819-23177-5-git-send-email-jolsa@redhat.com
[ committer note: s/ALIGN/PERF_ALIGN/g to cope with the android build changes ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With dynamic pmu allocation there are also dynamically assigned pmu ids.
These ids are used in event->attr.type to describe the pmu to be used
for that event. The information is available in sysfs, e.g:
/sys/bus/event_source/devices/breakpoint/type: 5
/sys/bus/event_source/devices/cpu/type: 4
/sys/bus/event_source/devices/ibs_fetch/type: 6
/sys/bus/event_source/devices/ibs_op/type: 7
/sys/bus/event_source/devices/software/type: 1
/sys/bus/event_source/devices/tracepoint/type: 2
These mappings are needed to know which samples belong to which pmu. If
a pmu is added dynamically like for ibs_fetch or ibs_op the type value
may vary.
Now, when decoding samples from perf.data this information in sysfs
might be no longer available or may have changed. We need to store it in
perf.data, using the header for this. The header information printed
by perf report now contains an additional section looking like this:
# pmu mappings: ibs_op = 7, ibs_fetch = 6, cpu = 4, breakpoint = 5, tracepoint = 2, software = 1
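A rough sketch of collecting these mappings from sysfs and formatting them like the line above is shown here. It is in Python for illustration only, and both function names are hypothetical; the patch itself implements this in C inside the perf header code.

```python
import os


def read_pmu_mappings(sysfs_dir="/sys/bus/event_source/devices"):
    """Return {pmu_name: type_id} read from <sysfs_dir>/<pmu>/type."""
    try:
        names = sorted(os.listdir(sysfs_dir))
    except OSError:
        return {}  # no sysfs here; nothing to record
    mappings = {}
    for name in names:
        type_path = os.path.join(sysfs_dir, name, "type")
        try:
            with open(type_path) as f:
                mappings[name] = int(f.read().strip())
        except (OSError, ValueError):
            continue  # skip entries without a readable numeric type
    return mappings


def format_pmu_mappings(mappings):
    """Format the mappings in the style of the header line."""
    pairs = ", ".join(f"{name} = {type_id}" for name, type_id in mappings.items())
    return f"# pmu mappings: {pairs}"
```

Because the ids are captured at record time, a later report does not depend on the sysfs contents of the machine doing the analysis.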
Signed-off-by: Robert Richter <robert.richter@amd.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1345144224-27280-9-git-send-email-robert.richter@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The goal of this patch is to include more information about the host
environment into the perf.data so it is more self-descriptive. Over time,
profiles are captured on various machines and it becomes hard to track
what was recorded, on what machine and when.
This patch provides a way to solve this by extending the perf.data file
with basic information about the host machine. To add those extensions,
we leverage the feature bits capabilities of the perf.data format. The
change is backward compatible with existing perf.data files.
We define the following useful new extensions:
- HEADER_HOSTNAME: the hostname
- HEADER_OSRELEASE: the kernel release number
- HEADER_ARCH: the hw architecture
- HEADER_CPUDESC: generic CPU description
- HEADER_NRCPUS: number of online/avail cpus
- HEADER_CMDLINE: perf command line
- HEADER_VERSION: perf version
- HEADER_TOPOLOGY: cpu topology
- HEADER_EVENT_DESC: full event description (attrs)
- HEADER_CPUID: easy-to-parse low level CPU identification
The small granularity for the entries is to make it easier to extend
without breaking backward compatibility. Many entries are provided as
ASCII strings.
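The feature-bit idea can be sketched as follows: each extension owns one bit, the writer sets the bits for the sections it emits, and a reader only processes the bits it knows about. This is a Python illustration, not the perf code; the bit positions and helper names are made up for the example.

```python
# Bit positions are illustrative only, not the values perf uses.
HEADER_HOSTNAME = 0
HEADER_OSRELEASE = 1
HEADER_ARCH = 2


def set_feat(bitmap, feat):
    """Return the bitmap with the given feature bit set."""
    return bitmap | (1 << feat)


def has_feat(bitmap, feat):
    """Return True if the given feature bit is set."""
    return bool(bitmap & (1 << feat))


def read_known_sections(bitmap, readers):
    """Run the reader for every set bit this tool understands.

    `readers` maps a feature bit to a zero-argument callable. Set bits
    without a reader are skipped, which is what lets an older tool
    open a file written by a newer one.
    """
    results = {}
    bit = 0
    while bitmap >> bit:
        if has_feat(bitmap, bit) and bit in readers:
            results[bit] = readers[bit]()
        bit += 1
    return results
```

Keeping one small entry per bit means a new extension only needs a new bit and a new reader, and existing entries stay untouched.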
Perf report/script have been modified to print the basic information as
easy-to-parse ASCII strings. Extended information about CPU and NUMA
topology may be requested with the -I option.
Thanks to David Ahern for reviewing and testing the many versions of
this patch.
$ perf report --stdio
# ========
# captured on : Mon Sep 26 15:22:14 2011
# hostname : quad
# os release : 3.1.0-rc4-tip
# perf version : 3.1.0-rc4
# arch : x86_64
# nrcpus online : 4
# nrcpus avail : 4
# cpudesc : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
# cpuid : GenuineIntel,6,15,11
# total memory : 8105360 kB
# cmdline : /home/eranian/perfmon/official/tip/build/tools/perf/perf record date
# event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 29, 30, 31,
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# ========
#
...
$ perf report --stdio -I
# ========
# captured on : Mon Sep 26 15:22:14 2011
# hostname : quad
# os release : 3.1.0-rc4-tip
# perf version : 3.1.0-rc4
# arch : x86_64
# nrcpus online : 4
# nrcpus avail : 4
# cpudesc : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
# cpuid : GenuineIntel,6,15,11
# total memory : 8105360 kB
# cmdline : /home/eranian/perfmon/official/tip/build/tools/perf/perf record date
# event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 29, 30, 31,
# sibling cores : 0-3
# sibling threads : 0
# sibling threads : 1
# sibling threads : 2
# sibling threads : 3
# node0 meminfo : total = 8320608 kB, free = 7571024 kB
# node0 cpu list : 0-3
# ========
#
...
Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/20110930134040.GA5575@quad
Signed-off-by: Stephane Eranian <eranian@google.com>
[ committer notes: Use --show-info in the tools as was in the docs, rename
perf_header_fprintf_info to perf_file_section__fprintf_info, fixup
conflict with f69b64f7 "perf: Support setting the disassembler style" ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So that we can reuse things like the id to attr lookup routine
(perf_evlist__id2evsel) that uses a hash table instead of the linear
lookup done in the older perf_header_attr routines, etc.
Also to make evsels/evlist a more pervasive API, simplifying the use of
the emerging perf lib.
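The benefit of a hashed id lookup over a linear one can be illustrated with the sketch below. It is Python rather than the C used in perf, the dict-based evsel representation and all function names are hypothetical, and it only shows the shape of the two approaches.

```python
def id2evsel_linear(evsels, sample_id):
    """Older style: scan every evsel and its ids until one matches."""
    for evsel in evsels:
        if sample_id in evsel["ids"]:
            return evsel
    return None


def build_id_index(evsels):
    """Build the id -> evsel hash table once, up front."""
    return {i: evsel for evsel in evsels for i in evsel["ids"]}


def id2evsel_hashed(index, sample_id):
    """Hashed style: one table lookup per sample id."""
    return index.get(sample_id)
```

Both return the same evsel for a given id; the hashed variant avoids rescanning the list for every sample.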
Cc: Arun Sharma <arun@sharma-home.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>