Perf stat doesn't count the uncore event aliases from the same uncore
block in a group, for example:
perf stat -e '{unc_m_cas_count.all,unc_m_clockticks}' -a -I 1000
# time counts unit events
1.000447342 <not counted> unc_m_cas_count.all
1.000447342 <not counted> unc_m_clockticks
2.000740654 <not counted> unc_m_cas_count.all
2.000740654 <not counted> unc_m_clockticks
The output is very misleading. It gives a wrong impression that the
uncore event doesn't work.
An uncore block could be composed by several PMUs. An uncore event alias
is a joint name which means the same event runs on all PMUs of a block.
Perf doesn't support mixed events from different PMUs in the same group.
It is wrong to put uncore event aliases in a big group.
The right way is to split the big group into multiple small groups which
only include the events from the same PMU.
Only uncore event aliases from the same uncore block should be specially
handled here. It doesn't make sense to mix the uncore events with other
uncore events from different blocks or even core events in a group.
With the patch:
# time counts unit events
1.001557653 140,833 unc_m_cas_count.all
1.001557653 1,330,231,332 unc_m_clockticks
2.002709483 85,007 unc_m_cas_count.all
2.002709483 1,429,494,563 unc_m_clockticks
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Agustin Vega-Frias <agustinv@codeaurora.org>
Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/1525727623-19768-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
- Updates to the handling of expedited grace periods, perhaps most
notably parallelizing their initialization. Other changes
include fixes from Boqun Feng.
- Miscellaneous fixes. These include an nvme fix from Nitzan Carmi
that I am carrying because it depends on a new SRCU function
cleanup_srcu_struct_quiesced(). This branch also includes fixes
from Byungchul Park and Yury Norov.
- Updates to reduce lock contention in the rcu_node combining tree.
These are in preparation for the consolidation of RCU-bh,
RCU-preempt, and RCU-sched into a single flavor, which was
requested by Linus Torvalds in response to a security flaw
whose root cause included confusion between the multiple flavors
of RCU.
- Torture-test updates that save their users some time and effort.
Conflicts:
drivers/nvme/host/core.c
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently, kvm-find-errors.sh looks only for build errors ("error:"),
so this commit makes it also locate build warnings ("warning:").
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Nicholas Piggin <npiggin@gmail.com>
With the addition of the end-of-test state, it is not uncommon for the
kvm.sh summary lines to overflow 80 characters. This commit therefore
applies abbreviations in order to make the line fit into 80 characters
with high probability.
And yes, I did make heavy use of punched cards back in the day, so 80
columns it is for my xterms! ;-)
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Nicholas Piggin <npiggin@gmail.com>
This commit adds the end-of-test test, if present in the console output,
to the kvm.sh test summary that is printed by kvm-recheck.sh. Note that
this only applies to rcutorture console output.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Nicholas Piggin <npiggin@gmail.com>
The rcutorture scripting scans the console output twice, once to look
for various sorts of hangs and again to find warnings and panics.
Unfortunately, only the output of the second scan gets written to the
console.log.diags file, which can cause hangs to be overlooked.
This commit therefore folds the parse-torture.sh script (which looks
for hangs) into the parse-console.sh script (which looks for warnings
and panics). This allows both types of failure information to be
added to console.log.diags, while still reliably removing this file
when it proves to be empty.
This also fixes a long-standing bug where rcuperf log files would
unconditionally complain about a hang.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Nicholas Piggin <npiggin@gmail.com>
This commit adds a script that allows viewing the build and/or
console output from failed rcutorture, locktorture, or rcuperf runs.
This replaces a time-honored but inefficient manual procedure that uses
cut and paste.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Nicholas Piggin <npiggin@gmail.com>
So that kprobe definitions become:
int probe(function, variables)(void *ctx, int err, var1, var2, ...)
The existing 5sec.c, got converted and goes from:
SEC("func=hrtimer_nanosleep rqtp->tv_sec")
int func(void *ctx, int err, long sec)
{
}
To:
int probe(hrtimer_nanosleep, rqtp->tv_sec)(void *ctx, int err, long sec)
{
}
If we decide to add tv_nsec as well, then it becomes:
$ cat tools/perf/examples/bpf/5sec.c
#include <bpf.h>
int probe(hrtimer_nanosleep, rqtp->tv_sec rqtp->tv_nsec)(void *ctx, int err, long sec, long nsec)
{
return sec == 5;
}
license(GPL);
$
And if we run it, system wide as before and run some 'sleep' with values
for the tv_nsec field, we get:
# perf trace --no-syscalls -e tools/perf/examples/bpf/5sec.c
0.000 perf_bpf_probe:hrtimer_nanosleep:(ffffffff9811b5f0) tv_sec=5 tv_nsec=100000000
9641.650 perf_bpf_probe:hrtimer_nanosleep:(ffffffff9811b5f0) tv_sec=5 tv_nsec=123450001
^C#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-1v9r8f6ds5av0w9pcwpeknyl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Description:
. Disable strace like syscall tracing (--no-syscalls), or try tracing
just some (-e *sleep).
. Attach a filter function to a kernel function, returning when it should
be considered, i.e. appear on the output:
$ cat tools/perf/examples/bpf/5sec.c
#include <bpf.h>
SEC("func=hrtimer_nanosleep rqtp->tv_sec")
int func(void *ctx, int err, long sec)
{
return sec == 5;
}
char _license[] SEC("license") = "GPL";
int _version SEC("version") = LINUX_VERSION_CODE;
$
. Run it system wide, so that any sleep of >= 5 seconds and < than 6
seconds gets caught.
. Ask for callgraphs using DWARF info, so that userspace can be unwound
. While this is running, run something like "sleep 5s".
# perf trace --no-syscalls -e tools/perf/examples/bpf/5sec.c/call-graph=dwarf/
0.000 perf_bpf_probe:func:(ffffffff9811b5f0) tv_sec=5
hrtimer_nanosleep ([kernel.kallsyms])
__x64_sys_nanosleep ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
entry_SYSCALL_64 ([kernel.kallsyms])
__GI___nanosleep (/usr/lib64/libc-2.26.so)
rpl_nanosleep (/usr/bin/sleep)
xnanosleep (/usr/bin/sleep)
main (/usr/bin/sleep)
__libc_start_main (/usr/lib64/libc-2.26.so)
_start (/usr/bin/sleep)
^C#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-2nmxth2l2h09f9gy85lyexcq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So, the first helper is the one shortening a variable/function section
attribute, from, for instance:
char _license[] __attribute__((section("license"), used)) = "GPL";
to:
char _license[] SEC("license") = "GPL";
Convert empty.c to that and it becomes:
# cat ~acme/lib/examples/perf/bpf/empty.c
#include <bpf.h>
char _license[] SEC("license") = "GPL";
int _version SEC("version") = LINUX_VERSION_CODE;
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-zmeg52dlvy51rdlhyumfl5yf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The first one is the bare minimum that bpf infrastructure accepts before
it expects actual events to be set up:
$ cat tools/perf/examples/bpf/empty.c
char _license[] __attribute__((section("license"), used)) = "GPL";
int _version __attribute__((section("version"), used)) = LINUX_VERSION_CODE;
$
If you remove that "version" line, then it will be refused with:
# perf trace -e tools/perf/examples/bpf/empty.c
event syntax error: 'tools/perf/examples/bpf/empty.c'
\___ Failed to load tools/perf/examples/bpf/empty.c from source: 'version' section incorrect or lost
(add -v to see detail)
Run 'perf list' for a list of valid events
Usage: perf trace [<options>] [<command>]
or: perf trace [<options>] -- <command> [<options>]
or: perf trace record [<options>] [<command>]
or: perf trace record [<options>] -- <command> [<options>]
-e, --event <event> event/syscall selector. use 'perf list' to list available events
#
The next ones will, step by step, show simple filters, then the needs
for headers will be made clear, it will be put in place and tested with
new examples, rinse, repeat.
Back to using this first one to test the perf+bpf infrastructure:
If we run it will fail, as no functions are present connecting with,
say, a tracepoint or a function using the kprobes or uprobes
infrastructure:
# perf trace -e tools/perf/examples/bpf/empty.c
WARNING: event parser found nothing
invalid or unsupported event: 'tools/perf/examples/bpf/empty.c'
Run 'perf list' for a list of valid events
Usage: perf trace [<options>] [<command>]
or: perf trace [<options>] -- <command> [<options>]
or: perf trace record [<options>] [<command>]
or: perf trace record [<options>] -- <command> [<options>]
-e, --event <event> event/syscall selector. use 'perf list' to list available events
#
But, if we set things up to dump the generated object file to a file,
and do this after having run 'make install', still on the developer's
$HOME directory:
# cat ~/.perfconfig
[llvm]
dump-obj = true
#
# perf trace -e ~acme/lib/examples/perf/bpf/empty.c
LLVM: dumping /home/acme/lib/examples/perf/bpf/empty.o
WARNING: event parser found nothing
invalid or unsupported event: '/home/acme/lib/examples/perf/bpf/empty.c'
<SNIP>
#
We can look at the dumped object file:
# ls -la ~acme/lib/examples/perf/bpf/empty.o
-rw-r--r--. 1 root root 576 May 4 12:10 /home/acme/lib/examples/perf/bpf/empty.o
# file ~acme/lib/examples/perf/bpf/empty.o
/home/acme/lib/examples/perf/bpf/empty.o: ELF 64-bit LSB relocatable, *unknown arch 0xf7* version 1 (SYSV), not stripped
# readelf -sw ~acme/lib/examples/perf/bpf/empty.o
Symbol table '.symtab' contains 3 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 NOTYPE GLOBAL DEFAULT 3 _license
2: 0000000000000000 0 NOTYPE GLOBAL DEFAULT 4 _version
#
# tools/bpf/bpftool/bpftool --pretty ~acme/lib/examples/perf/bpf/empty.o
null
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-y7dkhakejz3013o0w21n98xd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
--build-id may not be a default linker config.
Make sure it's used when linking urandom_read test program.
Otherwise test_stacktrace_build_id[_nmi] tests will be failling.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Fix segfault when processing unknown threads in cs-etm (Leo Yan)
- Fix "perf test inet_pton" on s390 failing due to missing inline (Thomas Richter)
- Display all available events on 'perf annotate --stdio' (Jin Yao)
- Add missing newline when parsing empty BPF proggie (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Typically a switch table can be found by detecting a .rodata access
followed an indirect jump:
1969: 4a 8b 0c e5 00 00 00 mov 0x0(,%r12,8),%rcx
1970: 00
196d: R_X86_64_32S .rodata+0x438
1971: e9 00 00 00 00 jmpq 1976 <dispc_runtime_suspend+0xb6a>
1972: R_X86_64_PC32 __x86_indirect_thunk_rcx-0x4
Randy Dunlap reported a case (seen with GCC 4.8) where the .rodata
access uses RIP-relative addressing:
19bd: 48 8b 3d 00 00 00 00 mov 0x0(%rip),%rdi # 19c4 <dispc_runtime_suspend+0xbb8>
19c0: R_X86_64_PC32 .rodata+0x45c
19c4: e9 00 00 00 00 jmpq 19c9 <dispc_runtime_suspend+0xbbd>
19c5: R_X86_64_PC32 __x86_indirect_thunk_rdi-0x4
In this case the relocation addend needs to be adjusted accordingly in
order to find the location of the switch table.
The fix is for case 3 (as described in the comments), but also make the
existing case 1 & 2 checks more precise by only adjusting the addend for
R_X86_64_PC32 relocations.
This fixes the following warnings:
drivers/video/fbdev/omap2/omapfb/dss/dispc.o: warning: objtool: dispc_runtime_suspend()+0xbb8: sibling call from callable instruction with modified stack frame
drivers/video/fbdev/omap2/omapfb/dss/dispc.o: warning: objtool: dispc_runtime_resume()+0xcc5: sibling call from callable instruction with modified stack frame
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/b6098294fd67afb69af8c47c9883d7a68bf0f8ea.1526305958.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add test cases where we combine semi-random imm values, mainly for testing
JITs when they have different encoding options for 64 bit immediates in
order to reduce resulting image size.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This new test captures stackmap with build_id with hardware event
PERF_COUNT_HW_CPU_CYCLES.
Because we only support one ips-to-build_id lookup per cpu in NMI
context, stack_amap will not be able to do the lookup in this test.
Therefore, we didn't do compare_stack_ips(), as it will alwasy fail.
urandom_read.c is extended to run configurable cycles so that it can be
caught by the perf event.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The exec_target binary could segfault calling _exit(2) because r13
is not set up properly (and libc looks at that when performing a
syscall). Call SYS_exit using syscall(2) which doesn't seem to
have this problem.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>