Conflicts:
tools/perf/Makefile
This tree from Frederic unifies the perf and trace-cmd trace event format
parsing code into a single library.
Powertop and other tools will also be able to make use of it.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
764e16a changed perf-record to create events disabled by default and
enable them once perf initializations are done. This setting was dropped
by 0f82ebc. Now perf events are once again generated during perf's
initialization phase (e.g., generating maps).
As an example, perf opens a lot of files at startup. Unpatched:
perf record -e syscalls:sys_enter_open -ga -fo /tmp/perf.data -- sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.087 MB /tmp/perf.data (~3798 samples) ]
Using perf-script to look at the samples shows the perf command generating
563 of the 566 total events.
Patched:
perf record -e syscalls:sys_enter_open -ga -fo /tmp/perf.data -- sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.028 MB /tmp/perf.data (~1206 samples) ]
Using perf-script to look at the samples no longer shows the perf command.
Signed-off-by: David Ahern <dsahern@gmail.com>
Link: http://lkml.kernel.org/r/1336968088-11531-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Merge reason: We are going to queue up a dependent patch:
"perf tools: Move parse event automated tests to separated object"
That depends on:
commit e7c72d8
perf tools: Add 'G' and 'H' modifiers to event parsing
Conflicts:
tools/perf/builtin-stat.c
Conflicted with the recent 'perf_target' patches when checking the
result of perf_evsel open routines to see if a retry is needed to cope
with older kernels where the exclude guest/host perf_event_attr bits
were not used.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The callchain address is stored as u64. The current code uses the following
format string to display the callchain address:
"%p\n", (void *)(long)chain->ip
This way we lose the upper 32 bits if we report 64-bit addresses in a 32-bit
environment. Fix this to always display the whole 64 bits.
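The truncation described above can be reproduced in isolation. The following is a minimal sketch, not the patch itself: the helper names are hypothetical, and the uint32_t cast stands in for the (long) cast on a 32-bit build so that the effect is visible on a 64-bit host as well. Printing through a PRIx64 conversion is one portable way to keep all 64 bits.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format a 64-bit callchain address in full. */
static int format_ip_full(char *buf, size_t len, uint64_t ip)
{
	assert(buf != NULL);
	/* PRIx64 expands to the correct conversion for uint64_t on any ABI. */
	return snprintf(buf, len, "%016" PRIx64, ip);
}

/*
 * Hypothetical helper: emulate the narrowing that (void *)(long)ip causes
 * when long is 32 bits wide. The uint32_t cast is an assumption used to
 * model a 32-bit long; the "0x" prefix that "%p" may print is left out.
 */
static int format_ip_truncated(char *buf, size_t len, uint64_t ip)
{
	assert(buf != NULL);
	return snprintf(buf, len, "%08" PRIx32, (uint32_t)ip);
}
```

With a kernel text address such as 0xffffffff8104eab0, the first helper yields all sixteen hex digits, while the second keeps only the low eight.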
Note: the following was run to test perf endianness handling:
test 1)
- origin system:
# perf record -a -- sleep 10 (any perf record will do)
# perf report > report.origin
# perf archive perf.data
- copy the perf.data, report.origin and perf.data.tar.bz2
to a target system and run:
# tar xjvf perf.data.tar.bz2 -C ~/.debug
# perf report > report.target
# diff -u report.origin report.target
- the diff should produce no output
(besides some white space stuff and possibly different
date/TZ output)
test 2)
- origin system:
# perf record -ag -fo /tmp/perf.data -- sleep 1
- mount origin system root to the target system on /mnt/origin
- target system:
# perf script --symfs /mnt/origin -I -i /mnt/origin/tmp/perf.data \
--kallsyms /mnt/origin/proc/kallsyms
- complete perf.data header is displayed
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1337151548-2396-8-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixes and improvements for perf/core:
- perf_target: abstraction for --uid, --pid, --tid, --cpu, --all-cpus handling,
eliminating code duplicated in the tools, having constraints that apply to
all of them, from Namhyung Kim
- Fixes for handling fallback to cpu-clock on PPC, from David Ahern
- Fix for processing events with unknown size, from Jiri Olsa
- Compilation fix on 32-bit, from Jiri Olsa
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Perf annotate browser improvements:
- Bring back the line separating the overheads from the disassembly, as
requested by Peter Zijlstra. Linus agreed, now that it is a solid line and
more column real estate was harvested. The jump->arrow lines are also
separated from it by the address/jump target column.
- Don't change asm line color when toggling source code view. Requested by
Peter Zijlstra.
Current snapshot:
avtab_search_node
│ push %rbp
│ mov %rsp,%rbp
│ → callq mcount
│ movzwl 0x6(%rsi),%edx
│ and $0x7fff,%dx
│ test %rdi,%rdi
│ ↓ jne 20
0.42 │17:┌─→xor %eax,%eax
│19:│ leaveq
0.42 │ │← retq
│ │ nopl 0x0(%rax,%rax,1)
│20:│ mov (%rdi),%rax
0.08 │ │ test %rax,%rax
│ └──je 17
│ movzwl (%rsi),%ecx
│ movzwl 0x2(%rsi),%r9d
│ movzwl 0x4(%rsi),%r8d
│ movzwl %cx,%esi
│ movzwl %r9w,%r10d
│ shl $0x9,%esi
│ lea (%rsi,%r10,4),%esi
│ lea (%r8,%rsi,1),%esi
│ and 0x10(%rdi),%si
│ movzwl %si,%esi
│ mov (%rax,%rsi,8),%rax
1.01 │ test %rax,%rax
│ ↑ je 19
│ nopw 0x0(%rax,%rax,1)
3.19 │60: cmp %cx,(%rax)
│ ↓ jne 7e
0.08 │ cmp %r9w,0x2(%rax)
│ ↓ jne 7e
│ cmp %r8w,0x4(%rax)
│ ↓ jne 79
│ test %dx,0x6(%rax)
│ ↑ jne 19
│79: cmp %r8w,0x4(%rax)
83.45 │7e: ↑ ja 17
3.36 │ mov 0x10(%rax),%rax
7.98 │ test %rax,%rax
│ ↑ jne 60
│ leaveq
│ ← retq
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently, 'perf record -- sleep 1' creates a cpu map for all online
cpus because it ends up calling cpu_map__new(NULL). Fix it.
Also, calling perf_target__validate() guarantees that cpu_list is NULL
if a PID/TID is given, so we can make the conditional bit simpler.
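The decision described above can be sketched as a small pure function. This is an illustration under stated assumptions, not perf's code: struct target_sketch, enum cpu_map_kind and choose_cpu_map() are hypothetical names, and the sketch assumes validation has already cleared cpu_list whenever a PID/TID was given.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the target description; not perf's struct. */
struct target_sketch {
	const char *pid;      /* non-NULL when a PID was requested */
	const char *tid;      /* non-NULL when a TID was requested */
	const char *cpu_list; /* non-NULL when a CPU list was requested */
	bool system_wide;     /* true for all-CPUs mode */
};

enum cpu_map_kind {
	CPU_MAP_DUMMY,      /* placeholder map: events follow the task(s) */
	CPU_MAP_FROM_LIST,  /* only the CPUs named in cpu_list */
	CPU_MAP_ALL_ONLINE, /* every online CPU */
};

/*
 * Assumes the target was validated first, so cpu_list is NULL whenever
 * pid or tid is set. A plain workload (no pid, tid, cpu_list or
 * system_wide) must not end up with a map of all online CPUs.
 */
static enum cpu_map_kind choose_cpu_map(const struct target_sketch *t)
{
	assert(t != NULL);
	if (t->cpu_list != NULL)
		return CPU_MAP_FROM_LIST;
	if (t->system_wide)
		return CPU_MAP_ALL_ONLINE;
	return CPU_MAP_DUMMY;
}
```

Because validation already rules out a CPU list combined with a PID/TID, the function never needs to test pid or tid directly, which is the simplification the text refers to.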
This also fixes perf test 7 (Validate) failure on my 6 core machine:
$ cat /sys/devices/system/cpu/online
0-11
$ ./perf test -v 7
7: Validate PERF_RECORD_* events & perf_sample fields:
--- start ---
perf_evlist__mmap: Operation not permitted
---- end ----
Validate PERF_RECORD_* events & perf_sample fields: FAILED!
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1336367344-28071-3-git-send-email-namhyung.kim@lge.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Include header fixes for
... bool:
util/parse-events.h:31: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘have_tracepoints’
... and types.h:
util/parse-events.h:28: error: expected ‘)’ before ‘config’
util/parse-events.h:34: error: expected declaration specifiers or ‘...’ before ‘u64’
util/parse-events.h:45: error: expected ‘)’ before ‘type’
This happens if no other include files are included before
util/parse-events.h.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1333643188-26895-2-git-send-email-robert.richter@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Annotation improvements:
Now the default annotate browser uses a much more compact format, implementing
suggestions made by several people, notably Linus.
Here is part of the new __list_del_entry() annotation:
__list_del_entry
8.47 │ push %rbp
8.47 │ mov (%rdi),%rdx
20.34 │ mov $0xdead000000100100,%rcx
3.39 │ mov 0x8(%rdi),%rax
0.00 │ mov %rsp,%rbp
1.69 │ cmp %rcx,%rdx
0.00 │ je 43
1.69 │ mov $0xdead000000200200,%rcx
3.39 │ cmp %rcx,%rax
0.00 │ je a3
5.08 │ mov (%rax),%r8
18.64 │ cmp %r8,%rdi
0.00 │ jne 84
1.69 │ mov 0x8(%rdx),%r8
25.42 │ cmp %r8,%rdi
0.00 │ jne 65
1.69 │ mov %rax,0x8(%rdx)
0.00 │ mov %rdx,(%rax)
0.00 │ leaveq
0.00 │ retq
0.00 │ 43: mov %rdx,%r8
0.00 │ mov %rdi,%rcx
0.00 │ mov $0xffffffff817cd6a8,%rdx
0.00 │ mov $0x31,%esi
0.00 │ mov $0xffffffff817cd6e0,%rdi
0.00 │ xor %eax,%eax
0.00 │ callq ffffffff8104eab0 <warn_slowpath_fmt>
0.00 │ leaveq
0.00 │ retq
0.00 │ 65: mov %rdi,%rcx
0.00 │ mov $0xffffffff817cd780,%rdx
0.00 │ mov $0x3a,%esi
0.00 │ mov $0xffffffff817cd6e0,%rdi
0.00 │ xor %eax,%eax
0.00 │ callq ffffffff8104eab0 <warn_slowpath_fmt>
0.00 │ leaveq
0.00 │ retq
The infrastructure is there to provide formatters for any instruction,
like the one I'll do for call functions to elide the address.
Further fixes on top of the first iteration:
- Sometimes a jump points to an offset with no instructions. Make the
mark jump targets function handle that, for now by just ignoring such
jump targets. More investigation is needed to figure out how to cope
with that.
- Handle jump targets that are outside the function. For now, just don't
try to draw the connector arrow. The right thing seems to be to mark this
jump with a -> (right arrow) and handle it like a callq.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The event parsing code in perf was originally copied from trace-cmd
but was never kept up to date with the changes that were made there.
The trace-cmd libtraceevent.a code is much more mature than what is
currently in perf.
This updates the code to use wrappers to handle the calls to the
new event parsing code. The new code requires a handle to be passed
around, which removes the global event variables and allows
more than one event structure to be read from different files
(and different machines).
But perf still has the old global events and the code throughout
perf does not yet have a nice way to pass around a handle.
A global 'pevent' has been made for perf and the old calls have
been created as wrappers to the new event parsing code that uses
the global pevent.
With this change, perf can later incorporate the pevent handle into
the perf structures and allow more than one file, each containing
different events, to be read and compared.
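The wrapper approach described above can be sketched as follows. This is a minimal illustration with hypothetical names (struct parser_handle, parser_register(), legacy_register() and so on); it is not the libtraceevent or perf API. It only shows how handle-based calls and a single global handle let old-style callers keep working.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_EVENTS 8

/* Hypothetical handle: all parser state lives here, not in globals. */
struct parser_handle {
	const char *names[MAX_EVENTS];
	int nr_events;
};

/* New-style API: every call takes the handle explicitly. */
static int parser_register(struct parser_handle *h, const char *name)
{
	assert(h != NULL && name != NULL);
	if (h->nr_events >= MAX_EVENTS)
		return -1;
	h->names[h->nr_events] = name;
	return h->nr_events++;   /* the id is the slot index */
}

static int parser_find_id(const struct parser_handle *h, const char *name)
{
	assert(h != NULL && name != NULL);
	for (int i = 0; i < h->nr_events; i++)
		if (strcmp(h->names[i], name) == 0)
			return i;
	return -1;               /* not registered in this handle */
}

/* One global handle keeps callers of the old interface working. */
static struct parser_handle global_handle;

/* Old-style calls become thin wrappers around the handle-based ones. */
static int legacy_register(const char *name)
{
	return parser_register(&global_handle, name);
}

static int legacy_find_id(const char *name)
{
	return parser_find_id(&global_handle, name);
}
```

A second handle declared next to the global one holds its own, independent set of events, which is what makes reading and comparing several files possible once callers pass handles around.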
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Arun Sharma <asharma@fb.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Move the trace-event-parse.c code that originally came from trace-cmd into
its own files. The new file will be called trace-parse-events.c, because
trace-cmd's file was named parse-events.c, which conflicts with the
parse-events.c file in perf that parses the command line.
This tries to update the code with minimal changes.
Perf specific code stays in the trace-event-parse.[ch] files and
the common parsing code is now in trace-parse-events.c and
trace-parse-events.h.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Arun Sharma <asharma@fb.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>