'perf stat' displays miss ratio of L1-dcache, L1-icache, dTLB cache,
iTLB cache and LL-cache. Take L1-dcache for example, miss ratio is
caculated as "L1-dcache-load-misses/L1-dcache-loads". So "of all
L1-dcache hits" is unsuitable to describe it, and "of all L1-dcache
accesses" seems better.
The comments of L1-icache, dTLB cache, iTLB cache and LL-cache are
fixed in the same way.
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1600253331-10535-1-git-send-email-liuqi115@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The following leaks were detected by ASAN:
Indirect leak of 360 byte(s) in 9 object(s) allocated from:
#0 0x7fecc305180e in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10780e)
#1 0x560578f6dce5 in perf_pmu__new_format util/pmu.c:1333
#2 0x560578f752fc in perf_pmu_parse util/pmu.y:59
#3 0x560578f6a8b7 in perf_pmu__format_parse util/pmu.c:73
#4 0x560578e07045 in test__pmu tests/pmu.c:155
#5 0x560578de109b in run_test tests/builtin-test.c:410
#6 0x560578de109b in test_and_print tests/builtin-test.c:440
#7 0x560578de401a in __cmd_test tests/builtin-test.c:661
#8 0x560578de401a in cmd_test tests/builtin-test.c:807
#9 0x560578e49354 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
#10 0x560578ce71a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
#11 0x560578ce71a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
#12 0x560578ce71a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
#13 0x7fecc2b7acc9 in __libc_start_main ../csu/libc-start.c:308
Fixes: cff7f956ec ("perf tests: Move pmu tests into separate object")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-12-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The metricgroup__add_metric() can find multiple match for a metric group
and it's possible to fail. Also it can fail in the middle like in
resolve_metric() even for single metric.
In those cases, the intermediate list and ids will be leaked like:
Direct leak of 3 byte(s) in 1 object(s) allocated from:
#0 0x7f4c938f40b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5)
#1 0x55f7e71c1bef in __add_metric util/metricgroup.c:683
#2 0x55f7e71c31d0 in add_metric util/metricgroup.c:906
#3 0x55f7e71c3844 in metricgroup__add_metric util/metricgroup.c:940
#4 0x55f7e71c488d in metricgroup__add_metric_list util/metricgroup.c:993
#5 0x55f7e71c488d in parse_groups util/metricgroup.c:1045
#6 0x55f7e71c60a4 in metricgroup__parse_groups_test util/metricgroup.c:1087
#7 0x55f7e71235ae in __compute_metric tests/parse-metric.c:164
#8 0x55f7e7124650 in compute_metric tests/parse-metric.c:196
#9 0x55f7e7124650 in test_recursion_fail tests/parse-metric.c:318
#10 0x55f7e7124650 in test__parse_metric tests/parse-metric.c:356
#11 0x55f7e70be09b in run_test tests/builtin-test.c:410
#12 0x55f7e70be09b in test_and_print tests/builtin-test.c:440
#13 0x55f7e70c101a in __cmd_test tests/builtin-test.c:661
#14 0x55f7e70c101a in cmd_test tests/builtin-test.c:807
#15 0x55f7e7126214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
#16 0x55f7e6fc41a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
#17 0x55f7e6fc41a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
#18 0x55f7e6fc41a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
#19 0x7f4c93492cc9 in __libc_start_main ../csu/libc-start.c:308
Fixes: 83de0b7d53 ("perf metric: Collect referenced metrics in struct metric_ref_node")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-9-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The test_generic_metric() missed to release entries in the pctx. Asan
reported following leak (and more):
Direct leak of 128 byte(s) in 1 object(s) allocated from:
#0 0x7f4c9396980e in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10780e)
#1 0x55f7e748cc14 in hashmap_grow (/home/namhyung/project/linux/tools/perf/perf+0x90cc14)
#2 0x55f7e748d497 in hashmap__insert (/home/namhyung/project/linux/tools/perf/perf+0x90d497)
#3 0x55f7e7341667 in hashmap__set /home/namhyung/project/linux/tools/perf/util/hashmap.h:111
#4 0x55f7e7341667 in expr__add_ref util/expr.c:120
#5 0x55f7e7292436 in prepare_metric util/stat-shadow.c:783
#6 0x55f7e729556d in test_generic_metric util/stat-shadow.c:858
#7 0x55f7e712390b in compute_single tests/parse-metric.c:128
#8 0x55f7e712390b in __compute_metric tests/parse-metric.c:180
#9 0x55f7e712446d in compute_metric tests/parse-metric.c:196
#10 0x55f7e712446d in test_dcache_l2 tests/parse-metric.c:295
#11 0x55f7e712446d in test__parse_metric tests/parse-metric.c:355
#12 0x55f7e70be09b in run_test tests/builtin-test.c:410
#13 0x55f7e70be09b in test_and_print tests/builtin-test.c:440
#14 0x55f7e70c101a in __cmd_test tests/builtin-test.c:661
#15 0x55f7e70c101a in cmd_test tests/builtin-test.c:807
#16 0x55f7e7126214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
#17 0x55f7e6fc41a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
#18 0x55f7e6fc41a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
#19 0x55f7e6fc41a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
#20 0x7f4c93492cc9 in __libc_start_main ../csu/libc-start.c:308
Fixes: 6d432c4c8a ("perf tools: Add test_generic_metric function")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-8-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The evsel->unit borrows a pointer of pmu event or alias instead of
owns a string. But tool event (duration_time) passes a result of
strdup() caused a leak.
It was found by ASAN during metric test:
Direct leak of 210 byte(s) in 70 object(s) allocated from:
#0 0x7fe366fca0b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5)
#1 0x559fbbcc6ea3 in add_event_tool util/parse-events.c:414
#2 0x559fbbcc6ea3 in parse_events_add_tool util/parse-events.c:1414
#3 0x559fbbd8474d in parse_events_parse util/parse-events.y:439
#4 0x559fbbcc95da in parse_events__scanner util/parse-events.c:2096
#5 0x559fbbcc95da in __parse_events util/parse-events.c:2141
#6 0x559fbbc28555 in check_parse_id tests/pmu-events.c:406
#7 0x559fbbc28555 in check_parse_id tests/pmu-events.c:393
#8 0x559fbbc28555 in check_parse_cpu tests/pmu-events.c:415
#9 0x559fbbc28555 in test_parsing tests/pmu-events.c:498
#10 0x559fbbc0109b in run_test tests/builtin-test.c:410
#11 0x559fbbc0109b in test_and_print tests/builtin-test.c:440
#12 0x559fbbc03e69 in __cmd_test tests/builtin-test.c:695
#13 0x559fbbc03e69 in cmd_test tests/builtin-test.c:807
#14 0x559fbbc691f4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
#15 0x559fbbb071a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
#16 0x559fbbb071a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
#17 0x559fbbb071a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
#18 0x7fe366b68cc9 in __libc_start_main ../csu/libc-start.c:308
Fixes: f0fbb114e3 ("perf stat: Implement duration_time as a proper event")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The aliases were never released causing the following leaks:
Indirect leak of 1224 byte(s) in 9 object(s) allocated from:
#0 0x7feefb830628 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x107628)
#1 0x56332c8f1b62 in __perf_pmu__new_alias util/pmu.c:322
#2 0x56332c8f401f in pmu_add_cpu_aliases_map util/pmu.c:778
#3 0x56332c792ce9 in __test__pmu_event_aliases tests/pmu-events.c:295
#4 0x56332c792ce9 in test_aliases tests/pmu-events.c:367
#5 0x56332c76a09b in run_test tests/builtin-test.c:410
#6 0x56332c76a09b in test_and_print tests/builtin-test.c:440
#7 0x56332c76ce69 in __cmd_test tests/builtin-test.c:695
#8 0x56332c76ce69 in cmd_test tests/builtin-test.c:807
#9 0x56332c7d2214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
#10 0x56332c6701a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
#11 0x56332c6701a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
#12 0x56332c6701a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
#13 0x7feefb359cc9 in __libc_start_main ../csu/libc-start.c:308
Fixes: 956a78356c ("perf test: Test pmu-events aliases")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-11-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If events in a group explicitly set a frequency or period with leader
sampling, don't disable the samples on those events.
Prior to 5.8:
perf record -e '{cycles/period=12345000/,instructions/period=6789000/}:S'
would clear the attributes then apply the config terms. In commit
5f34278867 leader sampling configuration was moved to after applying the
config terms, in the example, making the instructions' event have its period
cleared.
This change makes it so that sampling is only disabled if configuration
terms aren't present.
Committer testing:
Before:
# perf record -e '{cycles/period=1/,instructions/period=2/}:S' sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.051 MB perf.data (6 samples) ]
#
# perf evlist -v
cycles/period=1/: size: 120, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|READ|ID, read_format: ID|GROUP, disabled: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
instructions/period=2/: size: 120, config: 0x1, sample_type: IP|TID|TIME|READ|ID, read_format: ID|GROUP, sample_id_all: 1, exclude_guest: 1
#
After:
# perf record -e '{cycles/period=1/,instructions/period=2/}:S' sleep 0.0001
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.052 MB perf.data (4 samples) ]
# perf evlist -v
cycles/period=1/: size: 120, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|READ|ID, read_format: ID|GROUP, disabled: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
instructions/period=2/: size: 120, config: 0x1, { sample_period, sample_freq }: 2, sample_type: IP|TID|TIME|READ|ID, read_format: ID|GROUP, sample_id_all: 1, exclude_guest: 1
#
Fixes: 5f34278867 ("perf evlist: Move leader-sampling configuration")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yonghong Song <yhs@fb.com>
Link: http://lore.kernel.org/lkml/20200912025655.1337192-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When compiling with DEBUG=1 on Fedora 32 I'm getting crash for 'perf
test signal':
Program received signal SIGSEGV, Segmentation fault.
0x0000000000c68548 in __test_function ()
(gdb) bt
#0 0x0000000000c68548 in __test_function ()
#1 0x00000000004d62e9 in test_function () at tests/bp_signal.c:61
#2 0x00000000004d689a in test__bp_signal (test=0xa8e280 <generic_ ...
#3 0x00000000004b7d49 in run_test (test=0xa8e280 <generic_tests+1 ...
#4 0x00000000004b7e7f in test_and_print (t=0xa8e280 <generic_test ...
#5 0x00000000004b8927 in __cmd_test (argc=1, argv=0x7fffffffdce0, ...
...
It's caused by the symbol __test_function being in the ".bss" section:
$ readelf -a ./perf | less
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
...
[28] .bss NOBITS 0000000000c356a0 008346a0
00000000000511f8 0000000000000000 WA 0 0 32
$ nm perf | grep __test_function
0000000000c68548 B __test_function
I guess most of the time we're just lucky the inline asm ended up in the
".text" section, so making it specific explicit with push and pop
section clauses.
$ readelf -a ./perf | less
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
...
[13] .text PROGBITS 0000000000431240 00031240
0000000000306faa 0000000000000000 AX 0 0 16
$ nm perf | grep __test_function
00000000004d62c8 T __test_function
Committer testing:
$ readelf -wi ~/bin/perf | grep producer -m1
<c> DW_AT_producer : (indirect string, offset: 0x254a): GNU C99 10.2.1 20200723 (Red Hat 10.2.1-1) -mtune=generic -march=x86-64 -ggdb3 -std=gnu99 -fno-omit-frame-pointer -funwind-tables -fstack-protector-all
^^^^^
^^^^^
^^^^^
$
Before:
$ perf test signal
20: Breakpoint overflow signal handler : FAILED!
$
After:
$ perf test signal
20: Breakpoint overflow signal handler : Ok
$
Fixes: 8fd34e1cce ("perf test: Improve bp_signal")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lore.kernel.org/lkml/20200911130005.1842138-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To address these errors found when cross building from x86_64 to MIPS
little endian 32-bit:
CC /tmp/build/perf/util/parse-events-bison.o
util/parse-events.y: In function 'parse_events_parse':
util/parse-events.y:514:6: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
514 | (void *) $2, $6, $4);
| ^
util/parse-events.y:531:7: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
531 | (void *) $2, NULL, $4)) {
| ^
util/parse-events.y:547:6: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
547 | (void *) $2, $4, 0);
| ^
util/parse-events.y:564:7: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
564 | (void *) $2, NULL, 0)) {
| ^
Fixes: cabbf26821 ("perf parse: Before yyabort-ing free components")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yonghong Song <yhs@fb.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For a while we need to have a dummy event for doing things like
receiving PERF_RECORD_COMM, PERF_RECORD_EXEC, etc for threads being
created and dying while we synthesize the pre-existing ones at tool
start.
This 'dummy' event is needed for keeping track of thread lifetime events
early in the session but are uninteresting otherwise, i.e. no need to
have it in a initial events menu for the non-grouped case, i.e. for:
# perf top -e cycles,instructions
or even for plain:
# perf top
When 'cycles' and that 'dummy' event are in place.
The code to remove that 'dummy' event ended up creating an endless loop
for the grouped case, i.e.:
# perf top -e '{cycles,instructions}'
Fix it.
Fixes: bee9ca1c8a ("perf report TUI: Remove needless 'dummy' event from menu")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix a compile error on F32 and gcc version 10.1 on s390 in file
utils/stat-display.c. The error does not show up with make DEBUG=y. In
fact the issue shows up when using both compiler options -O6 and
-D_FORTIFY_SOURCE=2 (which are omitted with DEBUG=Y).
This is the offending call chain:
print_counter_aggr()
printout(config, -1, 0, ...) with 2nd parm id set to -1
aggr_printout(config, x, id --> -1, ...) which leads to this code:
case AGGR_NONE:
if (evsel->percore && !config->percore_show_thread) {
....
} else {
fprintf(config->output, "CPU%*d%s",
config->csv_output ? 0 : -7,
evsel__cpus(evsel)->map[id],
^^ id is -1 !!!!
config->csv_sep);
}
This is a compiler inlining issue which is detected on s390 but not on
other plattforms.
Output before:
# make util/stat-display.o
.....
util/stat-display.c: In function ‘perf_evlist__print_counters’:
util/stat-display.c:121:4: error: array subscript -1 is below array
bounds of ‘int[]’ [-Werror=array-bounds]
121 | fprintf(config->output, "CPU%*d%s",
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
122 | config->csv_output ? 0 : -7,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
123 | evsel__cpus(evsel)->map[id],
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
124 | config->csv_sep);
| ~~~~~~~~~~~~~~~~
In file included from util/evsel.h:13,
from util/evlist.h:13,
from util/stat-display.c:9:
/root/linux/tools/lib/perf/include/internal/cpumap.h:10:7:
note: while referencing ‘map’
10 | int map[];
| ^~~
cc1: all warnings being treated as errors
mv: cannot stat 'util/.stat-display.o.tmp': No such file or directory
make[3]: *** [/root/linux/tools/build/Makefile.build:97: util/stat-display.o]
Error 1
make[2]: *** [Makefile.perf:716: util/stat-display.o] Error 2
make[1]: *** [Makefile.perf:231: sub-make] Error 2
make: *** [Makefile:110: util/stat-display.o] Error 2
[root@t35lp46 perf]#
Output after:
# make util/stat-display.o
.....
CC util/stat-display.o
[root@t35lp46 perf]#
Committer notes:
Removed the removal of {} enclosing the multiline else block, as pointed
out by Jiri Olsa.
Suggested-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200825063304.77733-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Linux 5.9 introduced perf test case "Parse and process metrics" and
on s390 this test case always dumps core:
[root@t35lp67 perf]# ./perf test -vvvv -F 67
67: Parse and process metrics :
--- start ---
metric expr inst_retired.any / cpu_clk_unhalted.thread for IPC
parsing metric: inst_retired.any / cpu_clk_unhalted.thread
Segmentation fault (core dumped)
[root@t35lp67 perf]#
I debugged this core dump and gdb shows this call chain:
(gdb) where
#0 0x000003ffabc3192a in __strnlen_c_1 () from /lib64/libc.so.6
#1 0x000003ffabc293de in strcasestr () from /lib64/libc.so.6
#2 0x0000000001102ba2 in match_metric(list=0x1e6ea20 "inst_retired.any",
n=<optimized out>)
at util/metricgroup.c:368
#3 find_metric (map=<optimized out>, map=<optimized out>,
metric=0x1e6ea20 "inst_retired.any")
at util/metricgroup.c:765
#4 __resolve_metric (ids=0x0, map=<optimized out>, metric_list=0x0,
metric_no_group=<optimized out>, m=<optimized out>)
at util/metricgroup.c:844
#5 resolve_metric (ids=0x0, map=0x0, metric_list=0x0,
metric_no_group=<optimized out>)
at util/metricgroup.c:881
#6 metricgroup__add_metric (metric=<optimized out>,
metric_no_group=metric_no_group@entry=false, events=<optimized out>,
events@entry=0x3ffd84fb878, metric_list=0x0,
metric_list@entry=0x3ffd84fb868, map=0x0)
at util/metricgroup.c:943
#7 0x00000000011034ae in metricgroup__add_metric_list (map=0x13f9828 <map>,
metric_list=0x3ffd84fb868, events=0x3ffd84fb878,
metric_no_group=<optimized out>, list=<optimized out>)
at util/metricgroup.c:988
#8 parse_groups (perf_evlist=perf_evlist@entry=0x1e70260,
str=str@entry=0x12f34b2 "IPC", metric_no_group=<optimized out>,
metric_no_merge=<optimized out>,
fake_pmu=fake_pmu@entry=0x1462f18 <perf_pmu.fake>,
metric_events=0x3ffd84fba58, map=0x1)
at util/metricgroup.c:1040
#9 0x0000000001103eb2 in metricgroup__parse_groups_test(
evlist=evlist@entry=0x1e70260, map=map@entry=0x13f9828 <map>,
str=str@entry=0x12f34b2 "IPC",
metric_no_group=metric_no_group@entry=false,
metric_no_merge=metric_no_merge@entry=false,
metric_events=0x3ffd84fba58)
at util/metricgroup.c:1082
#10 0x00000000010c84d8 in __compute_metric (ratio2=0x0, name2=0x0,
ratio1=<synthetic pointer>, name1=0x12f34b2 "IPC",
vals=0x3ffd84fbad8, name=0x12f34b2 "IPC")
at tests/parse-metric.c:159
#11 compute_metric (ratio=<synthetic pointer>, vals=0x3ffd84fbad8,
name=0x12f34b2 "IPC")
at tests/parse-metric.c:189
#12 test_ipc () at tests/parse-metric.c:208
.....
..... omitted many more lines
This test case was added with
commit 218ca91df4 ("perf tests: Add parse metric test for frontend metric").
When I compile with make DEBUG=y it works fine and I do not get a core dump.
It turned out that the above listed function call chain worked on a struct
pmu_event array which requires a trailing element with zeroes which was
missing. The marco map_for_each_event() loops over that array tests for members
metric_expr/metric_name/metric_group being non-NULL. Adding this element fixes
the issue.
Output after:
[root@t35lp46 perf]# ./perf test 67
67: Parse and process metrics : Ok
[root@t35lp46 perf]#
Committer notes:
As Ian remarks, this is not s390 specific:
<quote Ian>
This also shows up with address sanitizer on all architectures
(perhaps change the patch title) and perhaps add a "Fixes: <commit>"
tag.
=================================================================
==4718==ERROR: AddressSanitizer: global-buffer-overflow on address
0x55c93b4d59e8 at pc 0x55c93a1541e2 bp 0x7ffd24327c60 sp
0x7ffd24327c58
READ of size 8 at 0x55c93b4d59e8 thread T0
#0 0x55c93a1541e1 in find_metric tools/perf/util/metricgroup.c:764:2
#1 0x55c93a153e6c in __resolve_metric tools/perf/util/metricgroup.c:844:9
#2 0x55c93a152f18 in resolve_metric tools/perf/util/metricgroup.c:881:9
#3 0x55c93a1528db in metricgroup__add_metric
tools/perf/util/metricgroup.c:943:9
#4 0x55c93a151996 in metricgroup__add_metric_list
tools/perf/util/metricgroup.c:988:9
#5 0x55c93a1511b9 in parse_groups tools/perf/util/metricgroup.c:1040:8
#6 0x55c93a1513e1 in metricgroup__parse_groups_test
tools/perf/util/metricgroup.c:1082:9
#7 0x55c93a0108ae in __compute_metric tools/perf/tests/parse-metric.c:159:8
#8 0x55c93a010744 in compute_metric tools/perf/tests/parse-metric.c:189:9
#9 0x55c93a00f5ee in test_ipc tools/perf/tests/parse-metric.c:208:2
#10 0x55c93a00f1e8 in test__parse_metric
tools/perf/tests/parse-metric.c:345:2
#11 0x55c939fd7202 in run_test tools/perf/tests/builtin-test.c:410:9
#12 0x55c939fd6736 in test_and_print tools/perf/tests/builtin-test.c:440:9
#13 0x55c939fd58c3 in __cmd_test tools/perf/tests/builtin-test.c:661:4
#14 0x55c939fd4e02 in cmd_test tools/perf/tests/builtin-test.c:807:9
#15 0x55c939e4763d in run_builtin tools/perf/perf.c:313:11
#16 0x55c939e46475 in handle_internal_command tools/perf/perf.c:365:8
#17 0x55c939e4737e in run_argv tools/perf/perf.c:409:2
#18 0x55c939e45f7e in main tools/perf/perf.c:539:3
0x55c93b4d59e8 is located 0 bytes to the right of global variable
'pme_test' defined in 'tools/perf/tests/parse-metric.c:17:25'
(0x55c93b4d54a0) of size 1352
SUMMARY: AddressSanitizer: global-buffer-overflow
tools/perf/util/metricgroup.c:764:2 in find_metric
Shadow bytes around the buggy address:
0x0ab9a7692ae0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ab9a7692af0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ab9a7692b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ab9a7692b10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ab9a7692b20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0ab9a7692b30: 00 00 00 00 00 00 00 00 00 00 00 00 00[f9]f9 f9
0x0ab9a7692b40: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
0x0ab9a7692b50: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
0x0ab9a7692b60: f9 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
0x0ab9a7692b70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ab9a7692b80: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
</quote>
I'm also adding the missing "Fixes" tag and setting just .name to NULL,
as doing it that way is more compact (the compiler will zero out
everything else) and the table iterators look for .name being NULL as
the sentinel marking the end of the table.
Fixes: 0a507af9c6 ("perf tests: Add parse metric test for ipc metric")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200825071211.16959-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently if we run 'perf record -e cycles:u', exclude_guest=0.
But it doesn't make sense in most cases that we request for
user-space counting but we also get the guest report.
Of course, we also need to consider 'perf kvm' usage case that
authorized perf users on the host may only want to count guest user
space events. For example,
# perf kvm --guest record -e cycles:u
When we have 'exclude_guest=1' for 'perf kvm' usage, we may get nothing
from guest events.
To keep perf semantics consistent and clear, this patch sets
exclude_guest=1 for user-space counting but except for 'perf kvm' usage.
Before:
perf record -e cycles:u ./div
perf evlist -v
cycles:u: ..., exclude_kernel: 1, exclude_hv: 1, ...
After:
perf record -e cycles:u ./div
perf evlist -v
cycles:u: ..., exclude_kernel: 1, exclude_hv: 1, exclude_guest: 1, ...
Before:
perf kvm --guest record -e cycles:u -vvv
perf_event_attr:
size 120
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|ID|CPU|PERIOD
read_format ID
disabled 1
inherit 1
exclude_kernel 1
exclude_hv 1
freq 1
sample_id_all 1
After:
perf kvm --guest record -e cycles:u -vvv
perf_event_attr:
size 120
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|ID|CPU|PERIOD
read_format ID
disabled 1
inherit 1
exclude_kernel 1
exclude_hv 1
freq 1
sample_id_all 1
For Before/After, exclude_guest are both 0 for perf kvm usage.
perf test 6
6: Parse event definition strings : Ok
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Like Xu <like.xu@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200814012120.16647-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The "mwait_idle_with_hints" one was already there, some compiler
artifact now adds this ".constprop.0" suffix, cover that one too.
At some point we need to put these in a special bucket and show it
somewhere on the screen.
Noticed building the kernel on a fedora:32 system using:
gcc version 10.2.1 20200723 (Red Hat 10.2.1-1) (GCC)
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When I execute 'perf top' without HAVE_LIBBPF_SUPPORT, there exists the
following segmentation fault, skip the side-band event setup to fix it,
this is similar with commit 1101c872c8 ("perf record: Skip side-band
event setup if HAVE_LIBBPF_SUPPORT is not set").
[yangtiezhu@linux perf]$ ./perf top
<SNIP>
perf: Segmentation fault
Obtained 6 stack frames.
./perf(sighandler_dump_stack+0x5c) [0x12011b604]
[0xffffffc010]
./perf(perf_mmap__read_init+0x3e) [0x1201feeae]
./perf() [0x1200d715c]
/lib64/libpthread.so.0(+0xab9c) [0xffee10ab9c]
/lib64/libc.so.6(+0x128f4c) [0xffedc08f4c]
Segmentation fault
[yangtiezhu@linux perf]$
I use git bisect to find commit b38d85ef49 ("perf bpf: Decouple
creating the evlist from adding the SB event") is the first bad commit,
so also add the Fixes tag.
Committer testing:
First build perf explicitely disabling libbpf:
$ make NO_LIBBPF=1 O=/tmp/build/perf -C tools/perf install-bin && perf test python
Now make sure it isn't linked:
$ perf -vv | grep -w bpf
bpf: [ OFF ] # HAVE_LIBBPF_SUPPORT
$
$ nm ~/bin/perf | grep libbpf
$
And now try to run 'perf top':
# perf top
perf: Segmentation fault
-------- backtrace --------
perf[0x5bcd6d]
/lib64/libc.so.6(+0x3ca6f)[0x7fd0f5a66a6f]
perf(perf_mmap__read_init+0x1e)[0x5e1afe]
perf[0x4cc468]
/lib64/libpthread.so.0(+0x9431)[0x7fd0f645a431]
/lib64/libc.so.6(clone+0x42)[0x7fd0f5b2b912]
#
Applying this patch fixes the issue.
Fixes: b38d85ef49 ("perf bpf: Decouple creating the evlist from adding the SB event")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xuefeng Li <lixuefeng@loongson.cn>
Link: http://lore.kernel.org/lkml/1597753837-16222-1-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
BPF basic filtering test fails on s390x (when vmlinux debuginfo is
utilized instead of /proc/kallsyms)
Info:
- bpf_probe_load installs the bpf code at do_epoll_wait.
- For s390x, do_epoll_wait resolves to 3 functions including inlines.
found inline addr: 0x43769e
Probe point found: __s390_sys_epoll_wait+6
found inline addr: 0x437290
Probe point found: do_epoll_wait+0
found inline addr: 0x4375d6
Probe point found: __se_sys_epoll_wait+6
- add_bpf_event creates evsel for every probe in a BPF object. This
results in 3 evsels.
Solution:
- Expected result = 50% of the samples to be collected from epoll_wait *
number of entries present in the evlist.
Committer testing:
# perf test 42
42: BPF filter :
42.1: Basic BPF filtering : Ok
42.2: BPF pinning : Ok
42.3: BPF prologue generation : Ok
42.4: BPF relocation checker : Ok
#
Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: bpf@vger.kernel.org
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
LPU-Reference: 20200817072754.58344-1-sumanthk@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull more perf tools updates from Arnaldo Carvalho de Melo:
"Fixes:
- Fixes for 'perf bench numa'.
- Always memset source before memcpy in 'perf bench mem'.
- Quote CC and CXX for their arguments to fix build in environments
using those variables to pass more than just the compiler names.
- Fix module symbol processing, addressing regression detected via
"perf test".
- Allow multiple probes in record+script_probe_vfs_getname.sh 'perf
test' entry.
Improvements:
- Add script to autogenerate socket family name id->string table from
copy of kernel header, used so far in 'perf trace'.
- 'perf ftrace' improvements to provide similar options for this
utility so that one can go from 'perf record', 'perf trace', etc to
'perf ftrace' just by changing the name of the subcommand.
- Prefer new "sched:sched_waking" trace event when it exists in 'perf
sched' post processing.
- Update POWER9 metrics to utilize other metrics.
- Fall back to querying debuginfod if debuginfo not found locally.
Miscellaneous:
- Sync various kvm headers with kernel sources"
* tag 'perf-tools-2020-08-14' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (40 commits)
perf ftrace: Make option description initials all capital letters
perf build-ids: Fall back to debuginfod query if debuginfo not found
perf bench numa: Remove dead code in parse_nodes_opt()
perf stat: Update POWER9 metrics to utilize other metrics
perf ftrace: Add change log
perf: ftrace: Add set_tracing_options() to set all trace options
perf ftrace: Add option --tid to filter by thread id
perf ftrace: Add option -D/--delay to delay tracing
perf: ftrace: Allow set graph depth by '--graph-opts'
perf ftrace: Add support for trace option tracing_thresh
perf ftrace: Add option 'verbose' to show more info for graph tracer
perf ftrace: Add support for tracing option 'irq-info'
perf ftrace: Add support for trace option funcgraph-irqs
perf ftrace: Add support for trace option sleep-time
perf ftrace: Add support for tracing option 'func_stack_trace'
perf tools: Add general function to parse sublevel options
perf ftrace: Add option '--inherit' to trace children processes
perf ftrace: Show trace column header
perf ftrace: Add option '-m/--buffer-size' to set per-cpu buffer size
perf ftrace: Factor out function write_tracing_file_int()
...
During a perf-record, use the -ldebuginfod API to query a debuginfod
server, should the debug data not be found in the usual system
locations. If successful, the usual $HOME/.debug dir is populated.
Tested with:
$ find .
.
./ctags-debuginfo-5.8-26.fc31.x86_64.rpm
./usr
./usr/lib
./usr/lib/debug
./usr/lib/debug/.build-id
./usr/lib/debug/.build-id/ca
./usr/lib/debug/.build-id/ca/46f6ae6a0cee57d85abc1d461c49074248908d
./usr/lib/debug/.build-id/ca/46f6ae6a0cee57d85abc1d461c49074248908d.debug
./usr/lib/debug/usr
./usr/lib/debug/usr/bin
./usr/lib/debug/usr/bin/ctags-5.8-26.fc31.x86_64.debug
$ debuginfod -F .
...
$ rm -rf ~/.debug/ ; mkdir ~/.debug
$ perf record make tags
BUILD: Doing 'make -j8' parallel build
GEN tags
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.107 MB perf.data (1483 samples) ]
$ find ~/.debug | grep ctags
/home/jolsa/.debug/usr/bin/ctags
/home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d
/home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d/elf
/home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d/probes
$ rm -rf ~/.debug/ ; mkdir ~/.debug
$ DEBUGINFOD_URLS=http://localhost:8002 perf record make tags
BUILD: Doing 'make -j8' parallel build
GEN tags
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.108 MB perf.data (1531 samples) ]
$ find ~/.debug | grep ctag
/home/jolsa/.debug/usr/bin/ctags
/home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d
/home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d/debug
/home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d/elf
/home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d/probes
Note the 'debug' file is created in the last run.
Note that currently the debuginfo data are downloaded only on record path,
we still need add this support to perf build-id/report.. and test ;-)
Tested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Frank Ch. Eigler <fche@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>