In order for perf buildid-list to work with pipe-mode files, it needs to
process buildids and event attr structs.
$ perf record -o - noploop 2 | ./perf inject -b | perf buildid-list -i - -H
noploop for 2 seconds
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.084 MB - (~3678 samples) ]
0000000000000000000000000000000000000000 [kernel.kallsyms]
3a0d0629efe74a8da3eeba372cdbd74ad9b8f5d5 /usr/local/bin/noploop
The reason [kernel.kallsyms] shows a 0 build-id comes from the
way buildids are injected in the stream.
The buildid for the kernel is provided by a BUILD_ID record. The
[kernel.kallsyms] is provided by a MMAP record. There is no clean and
obvious way to link the two, unfortunately.
In regular mode, the kernel buildid is generated from reading the ELF
image or kallsyms and perf knows to associate [kernel.kallsyms] to it.
Later on, when perf processes the [kernel.kallsyms] MMAP record, it will
already have a dso for it.
So for now, make sure perf buildid-list shows the buildids for
everything but the kernel image.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1337081295-10303-6-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In __perf_session__process_pipe_events(), there was a risk we would read
more than what a union perf_event struct can hold. this could happen in
case, perf is reading a file which contains new record types it does not
know about and which are larger than anything it knows about.
In general, perf is supposed to skip records it does not understand, but
in pipe mode, those have to be read and ignored. The fixed size header
contains the size of the record, but that size may be larger than union
perf_event, yet it was used as the backing to the read in:
union perf_event event;
void *p;
size = event->header.size;
p = &event;
p += sizeof(struct perf_event_header);
if (size - sizeof(struct perf_event_header)) {
err = readn(self->fd, p, size - sizeof(struct perf_event_header));
We fix this by allocating a buffer based on the size reported in the
header. We reuse the buffer as much as we can. We realloc in case it
becomes too small. In the common case, the performance impact is
negligible.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1337081295-10303-3-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf inject -b was broken. It would not inject any build_id into the
stream. Furthermore, it would strip samples from the stream.
The reason was a missing initialization of the event attribute
structure. The perf_tool.tool.attr() callback was pointing to a simple
repipe. But there was no initialization of the internal data structures
to keep track of events and event ids. That later caused event id
lookups to fail, and sample would get removed.
The patch simply adds back the call to perf_event__process_attr() to
initialize the evlist structure and now build_ids are again injected.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1337081295-10303-2-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When the perf data file is read cross architectures, the
perf_event__attr_swap function takes care about endianness of all the
struct fields except the bitfield flags.
The bitfield flags need to be transformed as well, since the bitfield
binary storage differs for both endians.
ABI says:
Bit-fields are allocated from right to left (least to most significant)
on little-endian implementations and from left to right (most to least
significant) on big-endian implementations.
The above seems to be byte specific, so we need to reverse each byte of
the bitfield. 'Internet' also says this might be implementation specific
and we probably need proper fix and carry perf_event_attr bitfield flags
in separate data file FEAT_ section. Thought this seems to work for now.
Note, running following to test perf endianity handling:
test 1)
- origin system:
# perf record -a -- sleep 10 (any perf record will do)
# perf report > report.origin
# perf archive perf.data
- copy the perf.data, report.origin and perf.data.tar.bz2
to a target system and run:
# tar xjvf perf.data.tar.bz2 -C ~/.debug
# perf report > report.target
# diff -u report.origin report.target
- the diff should produce no output
(besides some white space stuff and possibly different
date/TZ output)
test 2)
- origin system:
# perf record -ag -fo /tmp/perf.data -- sleep 1
- mount origin system root to the target system on /mnt/origin
- target system:
# perf script --symfs /mnt/origin -I -i /mnt/origin/tmp/perf.data \
--kallsyms /mnt/origin/proc/kallsyms
- complete perf.data header is displayed
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1337151548-2396-3-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
While migrating to the libtraceevent, the perl scripting engine
missed this structure rename.
This fixes:
util/scripting-engines/trace-event-perl.c: In function "find_cache_event":
util/scripting-engines/trace-event-perl.c:244: error: assignment from incompatible pointer type
util/scripting-engines/trace-event-perl.c:248: error: dereferencing pointer to incomplete type
util/scripting-engines/trace-event-perl.c:248: error: dereferencing pointer to incomplete type
util/scripting-engines/trace-event-perl.c:250: error: dereferencing pointer to incomplete type
util/scripting-engines/trace-event-perl.c: In function "perl_process_tracepoint":
util/scripting-engines/trace-event-perl.c:286: error: dereferencing pointer to incomplete type
util/scripting-engines/trace-event-perl.c:286: error: dereferencing pointer to incomplete type
util/scripting-engines/trace-event-perl.c:307: error: dereferencing pointer to incomplete type
util/scripting-engines/trace-event-perl.c: In function "perl_generate_script":
util/scripting-engines/trace-event-perl.c:498: error: passing argument 1 of "trace_find_next_event" from incompatible pointer type
util/scripting-engines/../trace-event.h:56: note: expected "struct event_format *" but argument is of type "struct event *"
util/scripting-engines/trace-event-perl.c:498: error: assignment from incompatible pointer type
util/scripting-engines/trace-event-perl.c:499: error: dereferencing pointer to incomplete type
util/scripting-engines/trace-event-perl.c:499: error: dereferencing pointer to incomplete type
util/scripting-engines/trace-event-perl.c:513: error: dereferencing pointer to incomplete type
util/scripting-engines/trace-event-perl.c:532: error: dereferencing pointer to incomplete type
util/scripting-engines/trace-event-perl.c:556: error: dereferencing pointer to incomplete type
util/scripting-engines/trace-event-perl.c:569: error: dereferencing pointer to incomplete type
util/scripting-engines/trace-event-perl.c:570: error: dereferencing pointer to incomplete type
util/scripting-engines/trace-event-perl.c:579: error: dereferencing pointer to incomplete type
util/scripting-engines/trace-event-perl.c:580: error: dereferencing pointer to incomplete type
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/1337697049-30251-2-git-send-email-fweisbec@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Handle the print argument types brought by the new libparsevent in perl
scripting engine.
PRINT_BSTRING and PRINT_DYNAMIC_ARRAY are treated just like strings
and thus don't require specific processing.
But PRINT_FUNC need specific plugins which are not yet handled, lets
warn if we meet this case.
This fixes:
util/scripting-engines/trace-event-perl.c: In function define_event_symbol:
util/scripting-engines/trace-event-perl.c:188: error: enumeration value PRINT_BSTRING not handled in switch
util/scripting-engines/trace-event-perl.c:188: error: enumeration value PRINT_DYNAMIC_ARRAY not handled in switch
util/scripting-engines/trace-event-perl.c:188: error: enumeration value PRINT_FUNC not handled in switch
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/1337697049-30251-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding a new hardcoded term 'name' allowing to specify a name for the
pmu event. The term is defined along with standard pmu terms. If no
'name' term is given, the event name follows following template:
"raw 0x<perf_event_attr::config>"
running:
perf stat -e cpu/config=1,name=krava1/u ls
will produce following output:
...
Performance counter stats for 'ls':
0 krava1
...
running:
perf stat -e cpu/config=1/u ls
will produce following output:
...
Performance counter stats for 'ls':
0 raw 0x1
...
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1337584373-2741-6-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Conflicts:
tools/perf/Makefile
This tree from Frederic unifies the perf and trace-cmd trace event format
parsing code into a single library.
Powertop and other tools will also be able to make use of it.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
764e16a changed perf-record to create events disabled by default and
enable them once perf initializations are done. This setting was dropped
by 0f82ebc. Now perf events are once again generated during perf's
initialization phase (e.g., generating maps).
As an example, perf opens a lot of files at startup. Unpatched:
perf record -e syscalls:sys_enter_open -ga -fo /tmp/perf.data -- sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.087 MB /tmp/perf.data (~3798 samples) ]
Using perf-script to look at the samples shows the perf command generating
563 of the 566 total events.
Patched:
perf record -e syscalls:sys_enter_open -ga -fo /tmp/perf.data -- sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.028 MB /tmp/perf.data (~1206 samples) ]
Using perf-script to look at the samples does not show perf command.
Signed-off-by: David Ahern <dsahern@gmail.com>
Link: http://lkml.kernel.org/r/1336968088-11531-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Merge reason: We are going to queue up a dependent patch:
"perf tools: Move parse event automated tests to separated object"
That depends on:
commit e7c72d8
perf tools: Add 'G' and 'H' modifiers to event parsing
Conflicts:
tools/perf/builtin-stat.c
Conflicted with the recent 'perf_target' patches when checking the
result of perf_evsel open routines to see if a retry is needed to cope
with older kernels where the exclude guest/host perf_event_attr bits
were not used.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The callchain address is stored as u64. Current code uses following
format string to display callchain address:
"%p\n", (void *)(long)chain->ip
This way we lose upper 32 bits if we report 64 bit addresses in 32 bit
environment. Fixing this to always display whole 64 bits.
Note, running following to test perf endianity handling:
test 1)
- origin system:
# perf record -a -- sleep 10 (any perf record will do)
# perf report > report.origin
# perf archive perf.data
- copy the perf.data, report.origin and perf.data.tar.bz2
to a target system and run:
# tar xjvf perf.data.tar.bz2 -C ~/.debug
# perf report > report.target
# diff -u report.origin report.target
- the diff should produce no output
(besides some white space stuff and possibly different
date/TZ output)
test 2)
- origin system:
# perf record -ag -fo /tmp/perf.data -- sleep 1
- mount origin system root to the target system on /mnt/origin
- target system:
# perf script --symfs /mnt/origin -I -i /mnt/origin/tmp/perf.data \
--kallsyms /mnt/origin/proc/kallsyms
- complete perf.data header is displayed
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1337151548-2396-8-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixes and improvements for perf/core:
- perf_target: abstraction for --uid, --pid, --tid, --cpu, --all-cpus handling,
eliminating code duplicated in the tools, having constraints that apply to
all of them, from Namhyung Kim
- Fixes for handling fallback to cpu-clock on PPC, from David Ahern
- Fix for processing events with unknown size, from Jiri Olsa
- Compilation fix on 32-bit, from Jiri Olsa
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
perf stat on PPC currently fails to run:
$ perf stat -- sleep 1
Error: open_counter returned with 6 (No such device or address). /bin/dmesg may provide additional information.
Fatal: Not all events could be opened.
The problem is that until 2.6.37 (behavior changed with commit b0a873e)
perf on PPC returns ENXIO when hw_perf_event_init() fails. With this
patch we get the expected behavior:
$ perf stat -v -- sleep 1
cycles event is not supported by the kernel.
stalled-cycles-frontend event is not supported by the kernel.
stalled-cycles-backend event is not supported by the kernel.
instructions event is not supported by the kernel.
branches event is not supported by the kernel.
branch-misses event is not supported by the kernel.
...
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1336490956-57145-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'perf top' command falls back to cpu-clock if the H/W cycles event
is not supported, but the event name is not updated leading to a
misleading header:
PerfTop: 8 irqs/sec kernel:75.0% exact: 0.0% [1000Hz cycles], ...
Update the event name when the event type is changed so that the
header displays correctly:
PerfTop: 794 irqs/sec kernel:100.0% exact: 0.0% [1000Hz cpu-clock], ...
Signed-off-by: David Ahern <dsahern@gmail.com>
Link: http://lkml.kernel.org/r/1336495789-58420-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf stat on PPC currently fails to run:
$ perf stat -- sleep 1
Error: open_counter returned with 6 (No such device or address). /bin/dmesg may provide additional information.
Fatal: Not all events could be opened.
The problem is that until 2.6.37 (behavior changed with commit b0a873e)
perf on PPC returns ENXIO when hw_perf_event_init() fails. With this
patch we get the expected behavior:
$ perf stat -v -- sleep 1
cycles event is not supported by the kernel.
stalled-cycles-frontend event is not supported by the kernel.
stalled-cycles-backend event is not supported by the kernel.
instructions event is not supported by the kernel.
branches event is not supported by the kernel.
branch-misses event is not supported by the kernel.
...
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1336490956-57145-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf-record on PPC is not falling back to cpu-clock:
$ perf record -ag -fo /tmp/perf.data -- sleep 1
Error: sys_perf_event_open() syscall returned with 6 (No such device or address). /bin/dmesg may provide additional information.
Fatal: No CONFIG_PERF_EVENTS=y kernel support configured?
The problem is that until 2.6.37 (behavior changed with commit b0a873e)
perf on PPC returns ENXIO when hw_perf_event_init() fails. With this
patch we get the expected behavior:
$ perf record -ag -fo /tmp/perf.data -v -- sleep 1
Old kernel, cannot exclude guest or host samples.
The cycles event is not supported, trying to fall back to cpu-clock-ticks
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.151 MB /tmp/perf.data (~6592 samples) ]
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1336490937-57106-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Perf annotate browser improvements:
- Get back the line separating the overheads from the disassembly, requested by
Peter Zijlstra, Linus agreed now that it is a solid line and more column real
state was harvested. Also it has the jump->arrow lines separated from it by
the address/jump target column.
- Don't change asm line color when toggling source code view. Requested by
Peter Zijlstra.
Current snapshot:
avtab_search_node
│ push %rbp
│ mov %rsp,%rbp
│ → callq mcount
│ movzwl 0x6(%rsi),%edx
│ and $0x7fff,%dx
│ test %rdi,%rdi
│ ↓ jne 20
0.42 │17:┌─→xor %eax,%eax
│19:│ leaveq
0.42 │ │← retq
│ │ nopl 0x0(%rax,%rax,1)
│20:│ mov (%rdi),%rax
0.08 │ │ test %rax,%rax
│ └──je 17
│ movzwl (%rsi),%ecx
│ movzwl 0x2(%rsi),%r9d
│ movzwl 0x4(%rsi),%r8d
│ movzwl %cx,%esi
│ movzwl %r9w,%r10d
│ shl $0x9,%esi
│ lea (%rsi,%r10,4),%esi
│ lea (%r8,%rsi,1),%esi
│ and 0x10(%rdi),%si
│ movzwl %si,%esi
│ mov (%rax,%rsi,8),%rax
1.01 │ test %rax,%rax
│ ↑ je 19
│ nopw 0x0(%rax,%rax,1)
3.19 │60: cmp %cx,(%rax)
│ ↓ jne 7e
0.08 │ cmp %r9w,0x2(%rax)
│ ↓ jne 7e
│ cmp %r8w,0x4(%rax)
│ ↓ jne 79
│ test %dx,0x6(%rax)
│ ↑ jne 19
│79: cmp %r8w,0x4(%rax)
83.45 │7e: ↑ ja 17
3.36 │ mov 0x10(%rax),%rax
7.98 │ test %rax,%rax
│ ↑ jne 60
│ leaveq
│ ← retq
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>