So far we used the libtraceevent printing routines when showing
tracepoint arguments, but since 'perf trace' has a lot of beautifiers
for syscall arguments, and since some of those can be used to augment
tracepoint arguments, add a routine to make use of those beautifiers
and allow the user to choose which one to use.
The default now is to use the same beautifiers used for the strace-like
sys_enter+sys_exit lines, but the user can choose the libtraceevent ones
by either using the:
perf trace --libtraceevent_print
command line option, or by setting:
# cat ~/.perfconfig
[trace]
tracepoint_beautifiers = libtraceevent
For instance, here are some examples:
# perf trace -e sched:*switch,*sleep,sched:*wakeup,exit*,sched:*exit sleep 1
0.000 sched:sched_wakeup(comm: "perf", pid: 5273 (perf), prio: 120, success: 1, target_cpu: 6)
0.621 nanosleep(rqtp: 0x7ffdd06d1140, rmtp: NULL) ...
0.628 sched:sched_switch(prev_comm: "sleep", prev_pid: 5273 (sleep), prev_prio: 120, prev_state: 1, next_comm: "swapper/6", next_pid: 0, next_prio: 120)
1000.879 sched:sched_wakeup(comm: "sleep", pid: 5273 (sleep), prio: 120, success: 1, target_cpu: 6)
0.621 ... [continued]: nanosleep()) = 0
1001.026 exit_group(error_code: 0) = ?
1001.216 sched:sched_process_exit(comm: "sleep", pid: 5273 (sleep), prio: 120)
#
And then using libtraceevent, as before:
# perf trace --libtraceevent_print -e sched:*switch,*sleep,sched:*wakeup,exit*,sched:*exit sleep 1
0.000 sched:sched_wakeup(comm=perf pid=5288 prio=120 target_cpu=001)
0.739 nanosleep(rqtp: 0x7ffeba6c2f40, rmtp: NULL) ...
0.747 sched:sched_switch(prev_comm=sleep prev_pid=5288 prev_prio=120 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120)
1000.902 sched:sched_wakeup(comm=sleep pid=5288 prio=120 target_cpu=001)
0.739 ... [continued]: nanosleep()) = 0
1001.012 exit_group(error_code: 0) = ?
#
The new default allocates an array of 'struct syscall_arg_fmt' for the
tracepoint arguments and, just like with syscall arguments, tries to
find suitable syscall_arg__scnprintf_NAME() routines to augment those
tracepoint arguments based on their type (as in the tracefs "format"
file), or even in their name + type, for instance arguntents with names
ending in "fd" with type "int" get the fd scnprintf beautifier attached,
etc.
Soon this will take advantage of the kernel BTF information to augment
enumerations based on the tracefs "format" type info.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-o8qdluotkcb3b1x2gjqrejcl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Just like 'perf trace' and 'perf script', should be useful for instance
to only consider samples after the initialization phase of some
workload.
The man page has some examples and considerations about its current
interface, that still doesn't handle the on/off events in a special way,
behaving just like when multiple events are specified, i.e.:
- In non-group mode (when the event list is not enclosed in {}) show a
a menu to allow choosing which event the user wants to see in the
histograms browser
- In group mode, be it using {} or asking for --group, show one column
per event.
Try for instance:
# perf top -e '{cycles,instructions,probe:icmp_rcv}' --switch-on=probe:icmp_rcv
Replace probe:icmp_rcv, that I put in place using:
# perf probe icmp_rcv:59
To hit when broadcast packets arrive, with a probe installed after an
initialization phase is over or after some other point of interest, some
garbage collection, etc, and also use --switch-off, for instance, on a
probe installed after said garbage collection is over.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/n/tip-c7q7qjeqtyvc9mkeipxza6ne@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It is sometimes useful to generate a snapshot when perf record exits;
I've been using a wrapper script around the workload that would do a
killall -USR2 perf when the workload exits.
This patch makes it easier and also works when perf record is attached
to a pre-existing task. A new snapshot option 'e' can be specified in
-S to enable this behavior:
root@elsewhere:~# perf record -e intel_pt// -Se sleep 1
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.085 MB perf.data ]
Co-developed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190806144101.62892-1-alexander.shishkin@linux.intel.com
[ Fixed up !HAVE_AUXTRACE_SUPPORT build in builtin-record.c, adding 2 missing __maybe_unused ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
perf.data:
Alexey Budankov:
- Fix loading of compressed data split across adjacent records
Jiri Olsa:
- Fix buffer size setting for processing CPU topology perf.data header.
perf stat:
Jiri Olsa:
- Fix segfault for event group in repeat mode
Cong Wang:
- Always separate "stalled cycles per insn" line, it was being appended to
the "instructions" line.
perf script:
Andi Kleen:
- Fix --max-blocks man page description.
- Improve man page description of metrics.
- Fix off by one in brstackinsn IPC computation.
perf probe:
Arnaldo Carvalho de Melo:
- Avoid calling freeing routine multiple times for same pointer.
perf build:
- Do not use -Wshadow on gcc < 4.8, avoiding too strict warnings
treated as errors, breaking the build.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull tracing updates from Steven Rostedt:
"The main changes in this release include:
- Add user space specific memory reading for kprobes
- Allow kprobes to be executed earlier in boot
The rest are mostly just various clean ups and small fixes"
* tag 'trace-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (33 commits)
tracing: Make trace_get_fields() global
tracing: Let filter_assign_type() detect FILTER_PTR_STRING
tracing: Pass type into tracing_generic_entry_update()
ftrace/selftest: Test if set_event/ftrace_pid exists before writing
ftrace/selftests: Return the skip code when tracing directory not configured in kernel
tracing/kprobe: Check registered state using kprobe
tracing/probe: Add trace_event_call accesses APIs
tracing/probe: Add probe event name and group name accesses APIs
tracing/probe: Add trace flag access APIs for trace_probe
tracing/probe: Add trace_event_file access APIs for trace_probe
tracing/probe: Add trace_event_call register API for trace_probe
tracing/probe: Add trace_probe init and free functions
tracing/uprobe: Set print format when parsing command
tracing/kprobe: Set print format right after parsed command
kprobes: Fix to init kprobes in subsys_initcall
tracepoint: Use struct_size() in kmalloc()
ring-buffer: Remove HAVE_64BIT_ALIGNED_ACCESS
ftrace: Enable trampoline when rec count returns back to one
tracing/kprobe: Do not run kprobe boot tests if kprobe_event is on cmdline
tracing: Make a separate config for trace event self tests
...
It is useful to aggregate counts per die. E.g. Uncore becomes die-scope
on Xeon Cascade Lake-AP.
Introduce a new option "--per-die" to support per-die aggregation.
The global id for each core has been changed to socket + die id + core
id. The global id for each die is socket + die id.
Add die information for per-core aggregation. The output of per-core
aggregation will be changed from "S0-C0" to "S0-D0-C0". Any scripts
which rely on the output format of per-core aggregation probably be
broken.
For 'perf stat record/report', there is no die information when
processing the old perf.data. The per-die result will be the same as
per-socket.
Committer notes:
Renamed 'die' variable to 'die_id' to fix the build in some systems:
CC /tmp/build/perf/builtin-script.o
cc1: warnings being treated as errors
builtin-stat.c: In function 'perf_env__get_die':
builtin-stat.c:963: error: declaration of 'die' shadows a global declaration
util/util.h:19: error: shadowed declaration is here
mv: cannot stat `/tmp/build/perf/.builtin-stat.o.tmp': No such file or directory
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/n/tip-bsnhx7vgsuu6ei307mw60mbj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With the new CPUID.1F, a new level type of CPU topology, 'die', is
introduced. The 'die' information in CPU topology should be added in
perf header.
To be compatible with old perf.data, the patch checks the section size
before reading the die information. The new info is added at the end of
the cpu_topology section, the old perf tool ignores the extra data. It
never reads data crossing the section boundary.
The new perf tool with the patch can be used on legacy kernel. Add a new
function has_die_topology() to check if die topology information is
supported by kernel. The function only check X86 and CPU 0. Assuming
other CPUs have same topology.
Use similar method for core and socket to support die id and sibling
dies string.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1559688644-106558-2-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
One can just record callchains in the kernel or user space with this new
options.
We can use it together with "--all-kernel" options.
This two options is used just like print_stack(sys) or print_ustack(usr)
for systemtap.
Shown below is the usage of this new option combined with "--all-kernel"
options:
1. Configure all used events to run in kernel space and just collect
kernel callchains.
$ perf record -a -g --all-kernel --kernel-callchains
2. Configure all used events to run in kernel space and just collect
user callchains.
$ perf record -a -g --all-kernel --user-callchains
Committer notes:
Improved documentation to state that asking for kernel callchains really
is asking for excluding user callchains, and vice versa.
Further mentioned that using both won't get both, but nothing, as both
will be excluded.
Signed-off-by: yuzhoujian <yuzhoujian@didichuxing.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1559222962-22891-1-git-send-email-ufo19890607@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add field 'ipc' to display instructions-per-cycle.
Example:
perf record -e intel_pt/cyc/u ls
perf script --insn-trace --xed -F+ipc,-dso,-cpu,-tid
ls 2670177.697113434: 7f0dfdbcd090 _start+0x0 mov %rsp, %rdi IPC: 0.00 (1/877)
ls 2670177.697113434: 7f0dfdbcd093 _start+0x3 callq 0x7f0dfdbce030
ls 2670177.697113434: 7f0dfdbce030 _dl_start+0x0 pushq %rbp
ls 2670177.697113434: 7f0dfdbce031 _dl_start+0x1 mov %rsp, %rbp
ls 2670177.697113434: 7f0dfdbce034 _dl_start+0x4 pushq %r15
ls 2670177.697113434: 7f0dfdbce036 _dl_start+0x6 pushq %r14
ls 2670177.697113434: 7f0dfdbce038 _dl_start+0x8 pushq %r13
ls 2670177.697113434: 7f0dfdbce03a _dl_start+0xa pushq %r12
ls 2670177.697113434: 7f0dfdbce03c _dl_start+0xc mov %rdi, %r12
ls 2670177.697113434: 7f0dfdbce03f _dl_start+0xf pushq %rbx
ls 2670177.697113434: 7f0dfdbce040 _dl_start+0x10 sub $0x38, %rsp
ls 2670177.697113434: 7f0dfdbce044 _dl_start+0x14 rdtsc
ls 2670177.697113434: 7f0dfdbce046 _dl_start+0x16 mov %eax, %eax
ls 2670177.697113434: 7f0dfdbce048 _dl_start+0x18 shl $0x20, %rdx
ls 2670177.697113434: 7f0dfdbce04c _dl_start+0x1c or %rax, %rdx
ls 2670177.697114471: 7f0dfdbce04f _dl_start+0x1f movq 0x27e22(%rip), %rax IPC: 0.00 (15/1685)
ls 2670177.697116177: 7f0dfdbce056 _dl_start+0x26 movq %rdx, 0x27683(%rip) IPC: 0.00 (1/881)
Note, the IPC values are low due to page faults at the beginning of
execution. The additional cycles are due to the time to enter the
kernel, not the actual kernel page fault handler.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add the --show-bpf-events command line option to show the eBPF related events:
PERF_RECORD_KSYMBOL
PERF_RECORD_BPF_EVENT
Usage:
# perf record -a
...
# perf script --show-bpf-events
...
swapper 0 [000] 0.000000: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0ef971d len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174
swapper 0 [000] 0.000000: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 36
...
Committer testing:
# perf script --show-bpf-events | egrep -i 'PERF_RECORD_(BPF|KSY)'
0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc029a6c3 len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba
0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 47
0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc029c1ae len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174
0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 48
0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc02ddd1c len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba
0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 49
0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc02dfc11 len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174
0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 50
0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc045da0a len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba
0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 51
0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc04ef4b4 len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174
0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 52
0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc09e15da len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba
0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 53
0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0d2b1a3 len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174
0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 54
0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0fd9850 len 381 type 1 flags 0x0 name bpf_prog_819967866022f1e1_sys_enter
0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 179
0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0feb1ec len 191 type 1 flags 0x0 name bpf_prog_c1bd85c092d6e4aa_sys_exit
0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 180
^C[root@quaco pt]# perf evlist
intel_pt//ku
dummy:u
#
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stanislav Fomichev <sdf@google.com>
Link: http://lkml.kernel.org/r/20190508132010.14512-11-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add user memory access attribute for kprobe event arguments.
If a given 'local variable' is in user-space, User can
specify memory access method by '@user' suffix. This is
not only for string but also for data structure.
If we access a field of data structure in user memory from
kernel on some arch, it will fail. e.g.
perf probe -a "sched_setscheduler param->sched_priority"
This will fail to access the "param->sched_priority" because
the param is __user pointer. Instead, we can now specify
@user suffix for such argument.
perf probe -a "sched_setscheduler param->sched_priority@user"
Note that kernel memory access with "@user" must always fail
on any arch.
Link: http://lkml.kernel.org/r/155789874562.26965.10836126971405890891.stgit@devnote2
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Add SPDX license identifiers to all Make/Kconfig files which:
- Have no license information of any form
These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:
GPL-2.0-only
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
With this patch, we can use the 'percore' event qualifier in perf-stat.
root@skl:/tmp# perf stat -e cpu/event=0,umask=0x3,percore=1/,cpu/event=0,umask=0x3/ -a -A -I1000
1.000773050 S0-C0 98,352,832 cpu/event=0,umask=0x3,percore=1/ (50.01%)
1.000773050 S0-C1 103,763,057 cpu/event=0,umask=0x3,percore=1/ (50.02%)
1.000773050 S0-C2 196,776,995 cpu/event=0,umask=0x3,percore=1/ (50.02%)
1.000773050 S0-C3 176,493,779 cpu/event=0,umask=0x3,percore=1/ (50.02%)
1.000773050 CPU0 47,699,641 cpu/event=0,umask=0x3/ (50.02%)
1.000773050 CPU1 49,052,451 cpu/event=0,umask=0x3/ (49.98%)
1.000773050 CPU2 102,771,422 cpu/event=0,umask=0x3/ (49.98%)
1.000773050 CPU3 100,784,662 cpu/event=0,umask=0x3/ (49.98%)
1.000773050 CPU4 43,171,342 cpu/event=0,umask=0x3/ (49.98%)
1.000773050 CPU5 54,152,158 cpu/event=0,umask=0x3/ (49.98%)
1.000773050 CPU6 93,618,410 cpu/event=0,umask=0x3/ (49.98%)
1.000773050 CPU7 74,477,589 cpu/event=0,umask=0x3/ (49.99%)
In this example, we count the event 'ref-cycles' per-core and per-CPU in
one perf stat command-line. From the output, we can see:
S0-C0 = CPU0 + CPU4
S0-C1 = CPU1 + CPU5
S0-C2 = CPU2 + CPU6
S0-C3 = CPU3 + CPU7
So the result is expected (tiny difference is ignored).
Note that, the 'percore' event qualifier needs to use with option '-A'.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1555077590-27664-4-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a 'percore' event qualifier, like cpu/event=0,umask=0x3,percore=1/,
that sums up the event counts for both hardware threads in a core.
We can already do this with --per-core, but it's often useful to do
this together with other metrics that are collected per hardware thread.
So we need to support this per-core counting on a event level.
This can be implemented in only the user tool, no kernel support needed.
v4:
---
1. Add Arnaldo's patch which updates the documentation for
this new qualifier.
2. Rebase to latest perf/core branch
v3:
---
Simplify the code according to Jiri's comments.
Before:
"return term->val.percore ? true : false;"
Now:
"return term->val.percore;"
v2:
---
Change the qualifier name from 'coresum' to 'percore' according to
comments from Jiri and Andi.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1555077590-27664-2-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>