The perf kvm still use the legacy interface.
Switch to the new perf_mmap__read_event() interface for perf kvm.
No functional change.
Committer notes:
Tested before and after running:
# perf kvm stat record
On a machine with a kvm guest, then used:
# perf kvm stat report
Before/after results match and look like:
# perf kvm stat record -a sleep 5
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.132 MB perf.data.guest (1828 samples) ]
# perf kvm stat report
Analyze events for all VMs, all VCPUs:
VM-EXIT Samples Samples% Time% Min Time Max Time Avg time
IO_INSTRUCTION 258 40.06% 0.08% 3.51us 122.54us 14.87us (+- 6.76%)
MSR_WRITE 178 27.64% 0.01% 0.47us 6.34us 2.18us (+- 4.80%)
EPT_MISCONFIG 148 22.98% 0.03% 3.76us 65.60us 11.22us (+- 8.14%)
HLT 47 7.30% 99.88% 181.69us 249988.06us 102061.36us (+-13.49%)
PAUSE_INSTRUCTION 5 0.78% 0.00% 0.38us 0.79us 0.47us (+-17.05%)
MSR_READ 4 0.62% 0.00% 1.14us 3.33us 2.67us (+-19.35%)
EXTERNAL_INTERRUPT 2 0.31% 0.00% 2.15us 2.17us 2.16us (+- 0.30%)
PENDING_INTERRUPT 1 0.16% 0.00% 2.56us 2.56us 2.56us (+- 0.00%)
PREEMPTION_TIMER 1 0.16% 0.00% 3.21us 3.21us 3.21us (+- 0.00%)
Total Samples:644, Total events handled time:4802790.72us.
#
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1519945751-37786-1-git-send-email-kan.liang@linux.intel.com
[ Changed bool parameters from 0 to 'false', as per Jiri comment ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently we can crash perf record when running in pipe mode, like:
$ perf record ls | perf report
# To display the perf.data header info, please use --header/--header-only options.
#
perf: Segmentation fault
Error:
The - file has no samples!
The callstack of the crash is:
0x0000000000515242 in perf_event__synthesize_event_update_name
3513 ev = event_update_event__new(len + 1, PERF_EVENT_UPDATE__NAME, evsel->id[0]);
(gdb) bt
#0 0x0000000000515242 in perf_event__synthesize_event_update_name
#1 0x00000000005158a4 in perf_event__synthesize_extra_attr
#2 0x0000000000443347 in record__synthesize
#3 0x00000000004438e3 in __cmd_record
#4 0x000000000044514e in cmd_record
#5 0x00000000004cbc95 in run_builtin
#6 0x00000000004cbf02 in handle_internal_command
#7 0x00000000004cc054 in run_argv
#8 0x00000000004cc422 in main
The reason of the crash is that the evsel does not have ids array
allocated and the pipe's synthesize code tries to access it.
We don't force evsel ids allocation when we have single event, because
it's not needed. However we need it when we are in pipe mode even for
single event as a key for evsel update event.
Fixing this by forcing evsel ids allocation event for single event, when
we are in pipe mode.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180302161354.30192-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
# perf record -F 200000 sleep 1
warning: Maximum frequency rate (15,000 Hz) exceeded, throttling from 200,000 Hz to 15,000 Hz.
The limit can be raised via /proc/sys/kernel/perf_event_max_sample_rate.
The kernel will lower it when perf's interrupts take too long.
Use --strict-freq to disable this throttling, refusing to record.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.019 MB perf.data (15 samples) ]
# perf evlist -v
cycles:ppp: size: 112, { sample_period, sample_freq }: 15000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
For those wanting that it fails if the desired frequency can't be used:
# perf record --strict-freq -F 200000 sleep 1
error: Maximum frequency rate (15,000 Hz) exceeded.
Please use -F freq option with a lower value or consider
tweaking /proc/sys/kernel/perf_event_max_sample_rate.
#
Suggested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-oyebruc44nlja499nqkr1nzn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There's a problem with relying on backtrace data from 'perf trace' the
way the trace+probe_libc_inet_pton does. This test inserts uprobe within
ping binary and checks that it gets its sample using 'perf trace'.
It also checks it gets proper backtrace from sample and that's where the
issue is.
The 'perf trace' does not sort events (by definition) so it can happen
that it processes the event sample before the ping binary memory map
event. This can (very rarely) happen as proved by this events dump
output (from custom added debug output):
...
7680/7680: [0x7f4e29718000(0x204000) @ 0 fd:00 33611321 4230892504]: r-xp /usr/lib64/libdl-2.17.so
7680/7680: [0x7f4e29502000(0x216000) @ 0 fd:00 33617257 2606846872]: r-xp /usr/lib64/libz.so.1.2.7
(IP, 0x2): 7680/7680: 0x7f4e29c2ed60 period: 1 addr: 0
7680/7680: [0x564842ef0000(0x233000) @ 0 fd:00 83 1989280200]: r-xp /usr/bin/ping
7680/7680: [0x7f4e2aca2000(0x224000) @ 0 fd:00 33611308 1219144940]: r-xp /usr/lib64/ld-2.17.so
...
In this case 'perf trace' fails to resolve the last callchain IP (within
the ping binary) because it does not know about the ping binary memory
map yet and the test fails like this:
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.037 ms
--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.037/0.037/0.037/0.000 ms
0.000 probe_libc:inet_pton:(7f4e29c2ed60))
__GI___inet_pton (/usr/lib64/libc-2.17.so)
getaddrinfo (/usr/lib64/libc-2.17.so)
[0] ([unknown])
FAIL: expected backtrace entry 8 ".*\(.*/bin/ping.*\)$" got "[0] ([unknown])"
Switching the test to use 'perf record' and 'perf script' instead of
'perf trace'.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180301165215.6780-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This first happened with a gcc function, _cpp_lex_token, that has the
usual jumps:
│1159e6c: ↓ jne 115aa32 <_cpp_lex_token@@Base+0xf92>
I.e. jumps to a label inside that function (_cpp_lex_token), and those
works, but also this kind:
│1159e8b: ↓ jne c469be <cpp_named_operator2name@@Base+0xa72>
I.e. jumps to another function, outside _cpp_lex_token, which are not
being correctly handled generating as a side effect references to
ab->offset[] entries that are set to NULL, so to make this code more
robust, check that here.
A proper fix for will be put in place, looking at the function name
right after the '<' token and probably treating this like a 'call'
instruction.
For now just don't draw the arrow.
Reported-by: Ingo Molnar <mingo@kernel.org>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Link: https://lkml.kernel.org/n/tip-5tzvb875ep2sel03aeefgmud@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If we execute 'perf stat --per-thread' with non-root account (even set
kernel.perf_event_paranoid = -1 yet), it reports the error:
jinyao@skl:~$ perf stat --per-thread
Error:
You may not have permission to collect system-wide stats.
Consider tweaking /proc/sys/kernel/perf_event_paranoid,
which controls use of the performance events system by
unprivileged users (without CAP_SYS_ADMIN).
The current value is 2:
-1: Allow use of (almost) all events by all users
Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
>= 0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN
Disallow raw tracepoint access by users without CAP_SYS_ADMIN
>= 1: Disallow CPU event access by users without CAP_SYS_ADMIN
>= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN
To make this setting permanent, edit /etc/sysctl.conf too, e.g.:
kernel.perf_event_paranoid = -1
Perhaps the ptrace rule doesn't allow to trace some processes. But anyway
the global --per-thread mode had better ignore such errors and continue
working on other threads.
This patch will record the index of error thread in perf_evsel__open()
and remove this thread before retrying.
For example (run with non-root, kernel.perf_event_paranoid isn't set):
jinyao@skl:~$ perf stat --per-thread
^C
Performance counter stats for 'system wide':
vmstat-3458 6.171984 cpu-clock:u (msec) # 0.000 CPUs utilized
perf-3670 0.515599 cpu-clock:u (msec) # 0.000 CPUs utilized
vmstat-3458 1,163,643 cycles:u # 0.189 GHz
perf-3670 40,881 cycles:u # 0.079 GHz
vmstat-3458 1,410,238 instructions:u # 1.21 insn per cycle
perf-3670 3,536 instructions:u # 0.09 insn per cycle
vmstat-3458 288,937 branches:u # 46.814 M/sec
perf-3670 936 branches:u # 1.815 M/sec
vmstat-3458 15,195 branch-misses:u # 5.26% of all branches
perf-3670 76 branch-misses:u # 8.12% of all branches
12.651675247 seconds time elapsed
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1516117388-10120-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On older (e.g. v4.4) kernels, an annoying fallback message can be
observed in 'perf top':
┌─Warning:──────────────────────┐
│fall back to non-overwrite mode│
│ │
│ │
│Press any key... │
└───────────────────────────────┘
The 'perf top' utility has been changed to overwrite mode since commit
ebebbf0823 ("perf top: Switch default mode to overwrite mode").
For older kernels which don't have overwrite mode support, 'perf top'
will fall back to non-overwrite mode and print out the fallback message
using ui__warning(), which needs user's input to close.
The fallback message is not critical for end users. Turning it to debug
message which is printed when running with -vv.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Fixes: ebebbf0823 ("perf top: Switch default mode to overwrite mode")
Link: http://lkml.kernel.org/r/1519669030-176549-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When using -G with one cgroup and -e with multiple events, only the
first event gets the correct cgroup setting, all events from the second
onwards will track system-wide events.
If the user wants to track multiple events for a specific cgroup, the
user must give parameters like the following:
$ perf stat -e e1 -e e2 -e e3 -G test,test,test
This patch simplify this case, just type one cgroup:
$ perf stat -e e1 -e e2 -e e3 -G test
$ mkdir -p /sys/fs/cgroup/perf_event/empty_cgroup
$ perf stat -e cycles -e cache-misses -a -I 1000 -G empty_cgroup
Before:
1.001007226 <not counted> cycles empty_cgroup
1.001007226 7,506 cache-misses
After:
1.000834097 <not counted> cycles empty_cgroup
1.000834097 <not counted> cache-misses empty_cgroup
Signed-off-by: weiping zhang <zhangweiping@didichuxing.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180129154805.GA6284@localhost.didichuxing.com
[ Improved the doc text a bit, providing an example for cgroup + system wide counting ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently a number of Makefiles break when used with toolchains that
pass extra flags in CC and other cross-compile related variables (such
as --sysroot).
Thus we get this error when we use a toolchain that puts --sysroot in
the CC var:
~/src/linux/tools$ make iio
[snip]
iio_event_monitor.c:18:10: fatal error: unistd.h: No such file or directory
#include <unistd.h>
^~~~~~~~~~
This occurs because we clobber several env vars related to
cross-compiling with lines like this:
CC = $(CROSS_COMPILE)gcc
Although this will point to a valid cross-compiler, we lose any extra
flags that might exist in the CC variable, which can break toolchains
that rely on them (for example, those that use --sysroot).
This easily shows up using a Yocto SDK:
$ . [snip]/sdk/environment-setup-cortexa8hf-neon-poky-linux-gnueabi
$ echo $CC
arm-poky-linux-gnueabi-gcc -march=armv7-a -mfpu=neon -mfloat-abi=hard
-mcpu=cortex-a8
--sysroot=[snip]/sdk/sysroots/cortexa8hf-neon-poky-linux-gnueabi
$ echo $CROSS_COMPILE
arm-poky-linux-gnueabi-
$ echo ${CROSS_COMPILE}gcc
krm-poky-linux-gnueabi-gcc
Although arm-poky-linux-gnueabi-gcc is a cross-compiler, we've lost the
--sysroot and other flags that enable us to find the right libraries to
link against, so we can't find unistd.h and other libraries and headers.
Normally with the --sysroot flag we would find unistd.h in the sdk
directory in the sysroot:
$ find [snip]/sdk/sysroots -path '*/usr/include/unistd.h'
[snip]/sdk/sysroots/cortexa8hf-neon-poky-linux-gnueabi/usr/include/unistd.h
The perf Makefile adds CC = $(CROSS_COMPILE)gcc if and only if CC is not
already set, and it compiles correctly with the above toolchain.
So, generalize the logic that perf uses in the common Makefile and
remove the manual CC = $(CROSS_COMPILE)gcc lines from each Makefile.
Note that this patch does not fix cross-compile for all the tools (some
have other bugs), but it does fix it for all except usb and acpi, which
still have other unrelated issues.
I tested both with and without the patch on native and cross-build and
there appear to be no regressions.
Link: http://lkml.kernel.org/r/20180107214028.23771-1-martin@martingkelly.com
Signed-off-by: Martin Kelly <martin@martingkelly.com>
Acked-by: Mark Brown <broonie@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: Pali Rohar <pali.rohar@gmail.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Jacek Anaszewski <jacek.anaszewski@gmail.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Robert Moore <robert.moore@intel.com>
Cc: Lv Zheng <lv.zheng@intel.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Valentina Manea <valentina.manea.m@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Mario Limonciello <mario.limonciello@dell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Before this change, the '--graph-funcs', '--nograph-funcs' and
'--trace-funcs' options didn't work as expected when the <func> doesn't
exist. Because the kernel side hid possible errors.
$ sudo ./perf ftrace -a --graph-depth 1 --graph-funcs abcdefg
0) 0.140 us | rcu_all_qs();
3) 0.304 us | mutex_unlock();
0) 0.153 us | find_vma();
3) 0.088 us | __fsnotify_parent();
0) 6.145 us | handle_mm_fault();
3) 0.089 us | fsnotify();
3) 0.161 us | __sb_end_write();
3) 0.710 us | SyS_close();
3) 7.848 us | exit_to_usermode_loop();
On the example above, I specified the function filter 'abcdefg' but all
functions are enabled. The expected result is for all functions to be
filtered, since there is no such function ('abcdefg')
The original fix is to make the kernel support '\0' as end of string:
https://lkml.org/lkml/2018/1/16/116
But above fix cannot be compatible with old kernels. Then Namhyung Kim
suggest adding a space after function name.
This patch will append an '\n' when write tracing file. After this fix,
the perf will report correct error state. Also let it print an error if
reset_tracing_files() fails.
Committer testing:
Now it prints:
# perf ftrace -a --graph-depth 1 --graph-funcs abcdefg
failed to set tracing filters
#
And for an existing function:
# perf ftrace -a --graph-depth 1 --graph-funcs SyS_open
3) | SyS_open() {
3) ! 494.899 us | }
0) + 23.910 us | SyS_open();
1) + 17.115 us | SyS_open();
1) + 13.900 us | SyS_open();
------------------------------------------
3) qemu-sy-2817 => pickup-1290
------------------------------------------
3) + 20.021 us | SyS_open();
#
Signed-off-by: Changbin Du <changbin.du@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1519007609-14551-1-git-send-email-changbin.du@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In some situations the vfs_getname is being added both as requested and
with a _1 suffix (inlines?):
probe:vfs_getname_1 (on getname_flags:63@acme/git/linux/fs/namei.c with pathname)
This ends up making the cleanup to miss that one, as it removes just
'probe:vfs_getname', which makes the second test to use this probe point
to fail, since it finds that leftover from the first test, use a
wildcard to remove both.
Before:
# perf test 60 61 62 63
60: Use vfs_getname probe to get syscall args filenames : FAILED!
61: probe libc's inet_pton & backtrace it with ping : Ok
62: Check open filename arg using perf trace + vfs_getname: FAILED!
63: Add vfs_getname probe to get syscall args filenames : Ok
After:
# perf test 60 61 62 63
60: Use vfs_getname probe to get syscall args filenames : Ok
61: probe libc's inet_pton & backtrace it with ping : Ok
62: Check open filename arg using perf trace + vfs_getname: Ok
63: Add vfs_getname probe to get syscall args filenames : Ok
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-2k5kutwr4ds36adiakyb4yvy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using Fedora 27 and latest Linux kernel the test case
trace+probe_libc_inet_pton.sh fails again on s390. This time is the
inlining of functions which does not match. After an update of the
glibc (from 2.26-16 to 2.26-24) the output is different
The expected output is:
__inet_pton (/usr/lib64/libc-2.26.so)
gaih_inet (inlined)
....
The actual output is:
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.061/0.061/0.061/0.000 ms
0.000 probe_libc:inet_pton:(3ffb2140448))
__inet_pton (inlined)
gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
...
Fix this by being less strict on 'inlined' verses library name and
accept both
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180214070303.55757-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The function get_cpuid_str() is called by perf_pmu__getcpuid() and on
s390 returns a complete description of the CPU and its capabilities,
which is a comma separated list.
To map the CPU type with the value defined in the
pmu-events/arch/s390/mapfile.csv, introduce an architecture specific
cpuid compare function named strcmp_cpuid_str()
The currently used regex algorithm is defined as the weak default and
will be used if no platform specific one is defined. This matches the
current behavior.
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180213151419.80737-3-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When perf record ... is setup to record data, the s390 cpu information
was a fixed string "IBM/S390".
Replace this string with one containing more information about the
machine. The information included in the cpuid is a comma separated
list:
manufacturer,type,model-capacity,model[,version,authorization]
with
- manufacturer: up to 16 byte name of the manufacturer (IBM).
- type: a four digit number refering to the machine
generation.
- model-capacitiy: up to 16 characters describing number
of cpus etc.
- model: up to 16 characters describing model.
- version: the CPU-MF counter facility version number,
available on LPARs only, omitted on z/VM guests.
- authorization: the CPU-MF counter facility authorization level,
available on LPARs only, omitted on z/VM guests.
Before:
[root@s8360047 perf]# ./perf record -- sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB perf.data (4 samples) ]
[root@s8360047 perf]# ./perf report --header | fgrep cpuid
# cpuid : IBM/S390
[root@s8360047 perf]#
After:
[root@s35lp76 perf]# ./perf report --header|fgrep cpuid
# cpuid : IBM,3906,704,M03,3.5,002f
[root@s35lp76 perf]#
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180213151419.80737-1-tmricht@linux.vnet.ibm.com
[ Use scnprintf instead of strncat to fix build errors on gcc GNU C99 5.4.0 20160609 -march=zEC12 -m64 -mzarch -ggdb3 -O6 -std=gnu99 -fPIC -fno-omit-frame-pointer -funwind-tables -fstack-protector-all ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jin Yao reported memory corrupton in perf report with
branch info used for stack trace:
> Following command lines will cause perf crash.
> perf record -j call -g -a <application>
> perf report --branch-history
>
> *** Error in `perf': double free or corruption (!prev): 0x00000000104aa040 ***
> ======= Backtrace: =========
> /lib/x86_64-linux-gnu/libc.so.6(+0x77725)[0x7f6b37254725]
> /lib/x86_64-linux-gnu/libc.so.6(+0x7ff4a)[0x7f6b3725cf4a]
> /lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f6b37260abc]
> perf[0x51b914]
> perf(hist_entry_iter__add+0x1e5)[0x51f305]
> perf[0x43cf01]
> perf[0x4fa3bf]
> perf[0x4fa923]
> perf[0x4fd396]
> perf[0x4f9614]
> perf(perf_session__process_events+0x89e)[0x4fc38e]
> perf(cmd_report+0x15d2)[0x43f202]
> perf[0x4a059f]
> perf(main+0x631)[0x427b71]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f6b371fd830]
> perf(_start+0x29)[0x427d89]
For the cumulative output, we allocate the he_cache array based on the
--max-stack option value and populate it with data from 'callchain_cursor'.
The --max-stack option value does not ensure now the limit for number of
callchain_cursor nodes, so the cumulative iter code will allocate smaller array
than it's actually needed and cause above corruption.
I think the --max-stack limit does not apply here anyway, because we add
callchain data as normal hist entries, while the --max-stack control the limit
of single entry callchain depth.
Using the callchain_cursor.nr as he_cache array count to fix this. Also
removing struct hist_entry_iter::max_stack, because there's no longer any use
for it.
We need more fixes to ensure that the branch stack code follows properly the
logic of --max-stack, which is not the case at the moment.
Original-patch-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reported-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180216123619.GA9945@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When we use perf report interactive annotate view, we can see
the position of jump arrow is not correct. For example,
1. perf record -b ...
2. perf report
3. In interactive mode, select Annotate 'function'
Percent│ IPC Cycle
│ if (flag)
1.37 │0.4┌── 1 ↓ je 82
│ │ x += x / y + y / x;
0.00 │0.4│ 1310 movsd (%rsp),%xmm0
0.00 │0.4│ 565 movsd 0x8(%rsp),%xmm4
│0.4│ movsd 0x8(%rsp),%xmm1
│0.4│ movsd (%rsp),%xmm3
│0.4│ divsd %xmm4,%xmm0
0.00 │0.4│ 579 divsd %xmm3,%xmm1
│0.4│ movsd (%rsp),%xmm2
│0.4│ addsd %xmm1,%xmm0
│0.4│ addsd %xmm2,%xmm0
0.00 │0.4│ movsd %xmm0,(%rsp)
│ │ volatile double x = 1212121212, y = 121212;
│ │
│ │ s_randseed = time(0);
│ │ srand(s_randseed);
│ │
│ │ for (i = 0; i < 2000000000; i++) {
1.37 │0.4└─→ 82: sub $0x1,%ebx
28.21 │0.48 17 ↑ jne 38
The jump arrow in above example is not correct. It should add the
width of IPC and Cycle.
With this patch, the result is:
Percent│ IPC Cycle
│ if (flag)
1.37 │0.48 1 ┌──je 82
│ │ x += x / y + y / x;
0.00 │0.48 1310 │ movsd (%rsp),%xmm0
0.00 │0.48 565 │ movsd 0x8(%rsp),%xmm4
│0.48 │ movsd 0x8(%rsp),%xmm1
│0.48 │ movsd (%rsp),%xmm3
│0.48 │ divsd %xmm4,%xmm0
0.00 │0.48 579 │ divsd %xmm3,%xmm1
│0.48 │ movsd (%rsp),%xmm2
│0.48 │ addsd %xmm1,%xmm0
│0.48 │ addsd %xmm2,%xmm0
0.00 │0.48 │ movsd %xmm0,(%rsp)
│ │ volatile double x = 1212121212, y = 121212;
│ │
│ │ s_randseed = time(0);
│ │ srand(s_randseed);
│ │
│ │ for (i = 0; i < 2000000000; i++) {
1.37 │0.48 82:└─→sub $0x1,%ebx
28.21 │0.48 17 ↑ jne 38
Committer notes:
Please note that only from LBRv5 (according to Jiri) onwards, i.e. >=
Skylake is that we'll have the cycles counts in each branch record
entry, so to see the Cycles and IPC columns, and be able to test this
patch, one need a capable hardware.
While applying this I first tested it on a Broadwell class machine and
couldn't get those columns, will add code to the annotate browser to
warn the user about that, i.e. you have branch records, but no cycles,
use a more recent hardware to get the cycles and IPC columns.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1517223473-14750-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Added user space perf functionality to translate CoreSight traces into
instruction events with branch stack.
To invoke the new functionality, use the perf inject tool with
--itrace=il. For example, to translate the ETM trace from perf.data into
last branch records in a new inj.data file:
$ perf inject --itrace=i100000il128 -i perf.data -o perf.data.new
The 'i' parameter to itrace generates periodic instruction events. The
period between instruction events can be specified as a number of
instructions suffixed by i (default 100000).
The parameter to 'l' specifies the number of entries in the branch stack
attached to instruction events.
The 'b' parameter to itrace generates events on taken branches.
This patch also fixes the contents of the branch events used in perf
report - previously branch events were generated for each contiguous
range of instructions executed. These are fixed to generate branch
events between the last address of a range ending in an executed branch
instruction and the start address of the next range.
Based on patches by Sebastian Pop <s.pop@samsung.com> with additional fixes
and support for specifying the instruction period.
Originally-by: Sebastian Pop <s.pop@samsung.com>
Signed-off-by: Robert Walker <robert.walker@arm.com>
Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1518607481-4059-2-git-send-email-robert.walker@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Mathieu Poirier reports issue in commit ("73c0ca1eee3d perf thread_map:
Enumerate all threads from /proc") that it has negative impact on 'perf
record --per-thread'. It has the effect of creating a kernel event for
each thread in the system for 'perf record --per-thread'.
Mathieu Poirier's patch ("perf util: Do not reuse target->per_thread flag")
can fix this issue by creating a new target->all_threads flag.
This patch is based on Mathieu Poirier's patch but it doesn't use a new
target->all_threads flag. This patch just uses 'target->per_thread &&
target->system_wide' as a condition to check for all threads case.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Fixes: 73c0ca1eee ("perf thread_map: Enumerate all threads from /proc")
Link: http://lkml.kernel.org/r/1518467557-18505-3-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
[Fixed checkpatch warning about line over 80 characters]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The symbol search called by machine__find_kernel_symbol_by_name is using
internally arch__compare_symbol_names function to compare 2 symbol
names, because different archs have different ways of comparing symbols.
Mostly for skipping '.' prefixes and similar.
In test 1 when we try to find matching symbols in kallsyms and vmlinux,
by address and by symbol name. When either is found we compare the pair
symbol names by simple strcmp, which is not good enough for reasons
explained in previous paragraph.
On powerpc this can cause lockup, because even thought we found the
pair, the compared names are different and don't match simple strcmp.
Following code path is executed, that leads to lockup:
- we find the pair in kallsyms by sym->start
next_pair:
- we compare the names and it fails
- we find the pair by sym->name
- the pair addresses match so we call goto next_pair
because we assume the names match in this case
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 031b84c407 ("perf probe ppc: Enable matching against dot symbols automatically")
Link: http://lkml.kernel.org/r/20180215122635.24029-10-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>