Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
perf record:
Alexey Budankov:
- Fix binding of AIO user space buffers to nodes
maps:
Dominik b. Czarnota:
- Fix off by one in strncpy() size argument.
Arnaldo Carvalho de Melo:
- Use strstarts() to look for Android libraries.
Ian Rogers:
- Give synthetic mmap events an inode generation.
man pages:
Ian Rogers:
- Set man page date to last git commit.
perf test:
Ian Rogers:
- Print if shell directory isn't present.
perf report:
Jin Yao:
- Fix no branch type statistics report issue.
perf expr:
Jiri Olsa:
- Fix copy/paste mistake
vendor events:
Kan Liang:
- Support metric constraints.
vendor events intel:
Kan Liang:
- Add NO_NMI_WATCHDOG metric constraint.
vendor events s390:
Thomas Richter:
- Add new deflate counters for IBM z15.
ARM cs-etm:
Leo Yan:
- Last branch improvements.
intel-pt:
Adrian Hunter:
- Update intel-pt.txt file with new location of the documentation.
- Add Intel PT man page references.
- Rename intel-pt.txt and put it in man page format.
perl scripting:
Michael Petlan:
- Add common_callchain to fix argument order.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Conflicts:
tools/perf/util/map.c
Previously we could get the report of branch type statistics.
For example:
# perf record -j any,save_type ...
# t perf report --stdio
#
# Branch Statistics:
#
COND_FWD: 40.6%
COND_BWD: 4.1%
CROSS_4K: 24.7%
CROSS_2M: 12.3%
COND: 44.7%
UNCOND: 0.0%
IND: 6.1%
CALL: 24.5%
RET: 24.7%
But now for the recent perf, it can't report the branch type statistics.
It's a regression issue caused by commit 40c39e3046 ("perf report: Fix
a no annotate browser displayed issue"), which only counts the branch
type statistics for browser mode.
This patch moves the branch_type_count() outside of ui__has_annotation()
checking, then branch type statistics can work for stdio mode.
Fixes: 40c39e3046 ("perf report: Fix a no annotate browser displayed issue")
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200313134607.12873-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When mmap2 events are synthesized the ino_generation field isn't being
set leading to uninitialized memory being compared.
Caught with clang's -fsanitize=memory:
==124733==WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x55a96a6a65cc in __dso_id__cmp tools/perf/util/dsos.c:23:6
#1 0x55a96a6a81d5 in dso_id__cmp tools/perf/util/dsos.c:38:9
#2 0x55a96a6a717f in __dso__cmp_long_name tools/perf/util/dsos.c:74:15
#3 0x55a96a6a6c4c in __dsos__findnew_link_by_longname_id tools/perf/util/dsos.c:106:12
#4 0x55a96a6a851e in __dsos__findnew_by_longname_id tools/perf/util/dsos.c:178:9
#5 0x55a96a6a7798 in __dsos__find_id tools/perf/util/dsos.c:191:9
#6 0x55a96a6a7b57 in __dsos__findnew_id tools/perf/util/dsos.c:251:20
#7 0x55a96a6a7a57 in dsos__findnew_id tools/perf/util/dsos.c:259:17
#8 0x55a96a7776ae in machine__findnew_dso_id tools/perf/util/machine.c:2709:9
#9 0x55a96a77dfcf in map__new tools/perf/util/map.c:193:10
#10 0x55a96a77240a in machine__process_mmap2_event tools/perf/util/machine.c:1670:8
#11 0x55a96a7741a3 in machine__process_event tools/perf/util/machine.c:1882:9
#12 0x55a96a6aee39 in perf_event__process tools/perf/util/event.c:454:9
#13 0x55a96a87d633 in perf_tool__process_synth_event tools/perf/util/synthetic-events.c:63:9
#14 0x55a96a87f131 in perf_event__synthesize_mmap_events tools/perf/util/synthetic-events.c:403:7
#15 0x55a96a8815d6 in __event__synthesize_thread tools/perf/util/synthetic-events.c:548:9
#16 0x55a96a882bff in __perf_event__synthesize_threads tools/perf/util/synthetic-events.c:681:3
#17 0x55a96a881ec2 in perf_event__synthesize_threads tools/perf/util/synthetic-events.c:750:9
#18 0x55a96a562b26 in synth_all tools/perf/tests/mmap-thread-lookup.c:136:9
#19 0x55a96a5623b1 in mmap_events tools/perf/tests/mmap-thread-lookup.c:174:8
#20 0x55a96a561fa0 in test__mmap_thread_lookup tools/perf/tests/mmap-thread-lookup.c:230:2
#21 0x55a96a52c182 in run_test tools/perf/tests/builtin-test.c:378:9
#22 0x55a96a52afc1 in test_and_print tools/perf/tests/builtin-test.c:408:9
#23 0x55a96a52966e in __cmd_test tools/perf/tests/builtin-test.c:603:4
#24 0x55a96a52855d in cmd_test tools/perf/tests/builtin-test.c:747:9
#25 0x55a96a2844d4 in run_builtin tools/perf/perf.c:312:11
#26 0x55a96a282bd0 in handle_internal_command tools/perf/perf.c:364:8
#27 0x55a96a284097 in run_argv tools/perf/perf.c:408:2
#28 0x55a96a282223 in main tools/perf/perf.c:538:3
Uninitialized value was stored to memory at
#1 0x55a96a6a18f7 in dso__new_id tools/perf/util/dso.c:1230:14
#2 0x55a96a6a78ee in __dsos__addnew_id tools/perf/util/dsos.c:233:20
#3 0x55a96a6a7bcc in __dsos__findnew_id tools/perf/util/dsos.c:252:21
#4 0x55a96a6a7a57 in dsos__findnew_id tools/perf/util/dsos.c:259:17
#5 0x55a96a7776ae in machine__findnew_dso_id tools/perf/util/machine.c:2709:9
#6 0x55a96a77dfcf in map__new tools/perf/util/map.c:193:10
#7 0x55a96a77240a in machine__process_mmap2_event tools/perf/util/machine.c:1670:8
#8 0x55a96a7741a3 in machine__process_event tools/perf/util/machine.c:1882:9
#9 0x55a96a6aee39 in perf_event__process tools/perf/util/event.c:454:9
#10 0x55a96a87d633 in perf_tool__process_synth_event tools/perf/util/synthetic-events.c:63:9
#11 0x55a96a87f131 in perf_event__synthesize_mmap_events tools/perf/util/synthetic-events.c:403:7
#12 0x55a96a8815d6 in __event__synthesize_thread tools/perf/util/synthetic-events.c:548:9
#13 0x55a96a882bff in __perf_event__synthesize_threads tools/perf/util/synthetic-events.c:681:3
#14 0x55a96a881ec2 in perf_event__synthesize_threads tools/perf/util/synthetic-events.c:750:9
#15 0x55a96a562b26 in synth_all tools/perf/tests/mmap-thread-lookup.c:136:9
#16 0x55a96a5623b1 in mmap_events tools/perf/tests/mmap-thread-lookup.c:174:8
#17 0x55a96a561fa0 in test__mmap_thread_lookup tools/perf/tests/mmap-thread-lookup.c:230:2
#18 0x55a96a52c182 in run_test tools/perf/tests/builtin-test.c:378:9
#19 0x55a96a52afc1 in test_and_print tools/perf/tests/builtin-test.c:408:9
Uninitialized value was stored to memory at
#0 0x55a96a7725af in machine__process_mmap2_event tools/perf/util/machine.c:1646:25
#1 0x55a96a7741a3 in machine__process_event tools/perf/util/machine.c:1882:9
#2 0x55a96a6aee39 in perf_event__process tools/perf/util/event.c:454:9
#3 0x55a96a87d633 in perf_tool__process_synth_event tools/perf/util/synthetic-events.c:63:9
#4 0x55a96a87f131 in perf_event__synthesize_mmap_events tools/perf/util/synthetic-events.c:403:7
#5 0x55a96a8815d6 in __event__synthesize_thread tools/perf/util/synthetic-events.c:548:9
#6 0x55a96a882bff in __perf_event__synthesize_threads tools/perf/util/synthetic-events.c:681:3
#7 0x55a96a881ec2 in perf_event__synthesize_threads tools/perf/util/synthetic-events.c:750:9
#8 0x55a96a562b26 in synth_all tools/perf/tests/mmap-thread-lookup.c:136:9
#9 0x55a96a5623b1 in mmap_events tools/perf/tests/mmap-thread-lookup.c:174:8
#10 0x55a96a561fa0 in test__mmap_thread_lookup tools/perf/tests/mmap-thread-lookup.c:230:2
#11 0x55a96a52c182 in run_test tools/perf/tests/builtin-test.c:378:9
#12 0x55a96a52afc1 in test_and_print tools/perf/tests/builtin-test.c:408:9
#13 0x55a96a52966e in __cmd_test tools/perf/tests/builtin-test.c:603:4
#14 0x55a96a52855d in cmd_test tools/perf/tests/builtin-test.c:747:9
#15 0x55a96a2844d4 in run_builtin tools/perf/perf.c:312:11
#16 0x55a96a282bd0 in handle_internal_command tools/perf/perf.c:364:8
#17 0x55a96a284097 in run_argv tools/perf/perf.c:408:2
#18 0x55a96a282223 in main tools/perf/perf.c:538:3
Uninitialized value was created by a heap allocation
#0 0x55a96a22f60d in malloc llvm/llvm-project/compiler-rt/lib/msan/msan_interceptors.cpp:925:3
#1 0x55a96a882948 in __perf_event__synthesize_threads tools/perf/util/synthetic-events.c:655:15
#2 0x55a96a881ec2 in perf_event__synthesize_threads tools/perf/util/synthetic-events.c:750:9
#3 0x55a96a562b26 in synth_all tools/perf/tests/mmap-thread-lookup.c:136:9
#4 0x55a96a5623b1 in mmap_events tools/perf/tests/mmap-thread-lookup.c:174:8
#5 0x55a96a561fa0 in test__mmap_thread_lookup tools/perf/tests/mmap-thread-lookup.c:230:2
#6 0x55a96a52c182 in run_test tools/perf/tests/builtin-test.c:378:9
#7 0x55a96a52afc1 in test_and_print tools/perf/tests/builtin-test.c:408:9
#8 0x55a96a52966e in __cmd_test tools/perf/tests/builtin-test.c:603:4
#9 0x55a96a52855d in cmd_test tools/perf/tests/builtin-test.c:747:9
#10 0x55a96a2844d4 in run_builtin tools/perf/perf.c:312:11
#11 0x55a96a282bd0 in handle_internal_command tools/perf/perf.c:364:8
#12 0x55a96a284097 in run_argv tools/perf/perf.c:408:2
#13 0x55a96a282223 in main tools/perf/perf.c:538:3
SUMMARY: MemorySanitizer: use-of-uninitialized-value tools/perf/util/dsos.c:23:6 in __dso_id__cmp
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20200313053129.131264-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since common_callchain has been added to the argument array, we need to
reflect it in perl-based scripts, because otherwise the following args
would be shifted and thus incorrect. E.g. rw-by-pid and calculation of
read and written bytes:
Before:
read counts by pid:
pid comm # reads bytes_requested bytes_read
------ -------------------- ----------- ---------- ----------
19301 dd 4 424510450039736 0
After:
read counts by pid:
pid comm # reads bytes_requested bytes_read
------ -------------------- ----------- ---------- ----------
19301 dd 4 9536 4341
Committer testing:
To see before after first do:
# perf script record rw-by-pid
^C
Now you'll have a perf.data file to report on, then do before and after
using:
# perf script report rw-by-pid
Anbd notice the bytes_request/bytes_read, as above.
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Benjamin Salon <bsalon@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
LPU-Reference: 20200311132836.12693-1-mpetlan@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently the man page dates reflect the date the man pages were built.
This patch adjusts the date so that the date is when then man page
last had a commit against it. The date is generated using 'git log'.
Committer testing:
$ git log -1 --pretty="format:%cd" --date=short tools/perf/Documentation/perf-top.txt
2020-01-14
Before:
rm -rf /tmp/build/perf
mkdir -p /tmp/build/perf
make -C tools/perf O=/tmp/build/perf/ install
$ date
Wed 11 Mar 2020 10:21:19 AM -03
$ man perf-top | tail -1
perf 03/11/2020 PERF-TOP(1)
$
After:
rm -rf /tmp/build/perf
mkdir -p /tmp/build/perf
make -C tools/perf O=/tmp/build/perf/ install
$ date
$ date
Wed 11 Mar 2020 10:24:06 AM -03
$ man perf-top | tail -1
perf 2020-01-14 PERF-TOP(1)
$
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masanari Iida <standby24x7@gmail.com>
Cc: Mukesh Ojha <mojha@codeaurora.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lore.kernel.org/lkml/20200311052110.23132-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When 'etm->instructions_sample_period' is less than
'tidq->period_instructions', the function cs_etm__sample() cannot handle
this case properly with its logic.
Let's see below flow as an example:
- If we set itrace option '--itrace=i4', then function cs_etm__sample()
has variables with initialized values:
tidq->period_instructions = 0
etm->instructions_sample_period = 4
- When the first packet is coming:
packet->instr_count = 10; the number of instructions executed in this
packet is 10, thus update period_instructions as below:
tidq->period_instructions = 0 + 10 = 10
instrs_over = 10 - 4 = 6
offset = 10 - 6 - 1 = 3
tidq->period_instructions = instrs_over = 6
- When the second packet is coming:
packet->instr_count = 10; in the second pass, assume 10 instructions
in the trace sample again:
tidq->period_instructions = 6 + 10 = 16
instrs_over = 16 - 4 = 12
offset = 10 - 12 - 1 = -3 -> the negative value
tidq->period_instructions = instrs_over = 12
So after handle these two packets, there have below issues:
The first issue is that cs_etm__instr_addr() returns the address within
the current trace sample of the instruction related to offset, so the
offset is supposed to be always unsigned value. But in fact, function
cs_etm__sample() might calculate a negative offset value (in handling
the second packet, the offset is -3) and pass to cs_etm__instr_addr()
with u64 type with a big positive integer.
The second issue is it only synthesizes 2 samples for sample period = 4.
In theory, every packet has 10 instructions so the two packets have
total 20 instructions, 20 instructions should generate 5 samples
(4 x 5 = 20). This is because cs_etm__sample() only calls once
cs_etm__synth_instruction_sample() to generate instruction sample per
range packet.
This patch fixes the logic in function cs_etm__sample(); the basic
idea for handling coming packet is:
- To synthesize the first instruction sample, it combines the left
instructions from the previous packet and the head of the new
packet; then generate continuous samples with sample period;
- At the tail of the new packet, if it has the rest instructions,
these instructions will be left for the sequential sample.
Suggested-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Walker <robert.walker@arm.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight ml <coresight@lists.linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200219021811.20067-4-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Every time synthesize instruction sample, the last branch recording will
be reset. This is fine if the instruction period is big enough, for
example if use the option '--itrace=i100000', the last branch array is
reset for every sample with 100000 instructions per period; before
generate the next instruction sample, there has the sufficient packets
coming to fill the last branch array.
On the other hand, if set a very small period, the packets will be
significantly reduced between two continuous instruction samples, thus
the last branch array is almost empty for new instruction sample by
frequently resetting.
To allow the last branches to work properly for any instruction periods,
this patch avoids to reset the last branch for every instruction sample
and only reset it when flush the trace data. The last branches will be
reset only for two cases, one is for trace starting, another case is for
discontinuous trace; other cases can keep recording last branches for
continuous instruction samples.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Walker <robert.walker@arm.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight ml <coresight@lists.linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200219021811.20067-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some metric groups have metric constraints. A metric group can be
scheduled as a group only when some constraints are applied. For
example, Page_Walks_Utilization has a metric constraint,
"NO_NMI_WATCHDOG".
When NMI watchdog is disabled, the metric group can be scheduled as a
group. Otherwise, splitting the metric group into standalone metrics.
Add a new function, metricgroup__has_constraint(), to check whether all
constraints are applied. If not, splitting the metric group into
standalone metrics.
Currently, only one constraint, "NO_NMI_WATCHDOG", is checked. Print a
warning for the metric group with the constraint, when NMI WATCHDOG is
enabled.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lore.kernel.org/lkml/1582581564-184429-5-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add support for new deflate counters:
- Counter 247: cycles CPU spent obtaining access to Deflate unit
- Counter 252: cycles CPU is using Deflate unit
- Counter 264: Increments by one for every DEFLATE CONVERSION CALL
instruction executed.
- Counter 265: Increments by one for every DEFLATE CONVERSION CALL
instruction executed that ended in Condition Codes
0, 1 or 2.
Also adjust the some crypto counter description to latest documentation.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200310142937.32045-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
A new branch sample type PERF_SAMPLE_BRANCH_HW_INDEX has been introduced
in latest kernel.
Enable HW_INDEX by default in LBR call stack mode.
If kernel doesn't support the sample type, switching it off.
Add HW_INDEX in attr_fprintf as well. User can check whether the branch
sample type is set via debug information or header.
Committer testing:
First collect some samples with LBR callchains, system wide, for a few
seconds:
# perf record --call-graph lbr -a sleep 5
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.625 MB perf.data (224 samples) ]
#
Now lets use 'perf evlist -v' to look at the branch_sample_type:
# perf evlist -v
cycles: size: 120, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD|BRANCH_STACK, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: USER|CALL_STACK|NO_FLAGS|NO_CYCLES|HW_INDEX
#
So the machine has the kernel feature, and it was correctly added to
perf_event_attr.branch_sample_type, for the default 'cycles' event.
If we do it in another machine, where the kernel lacks the HW_INDEX
feature, we get:
# perf record --call-graph lbr -a sleep 2s
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.690 MB perf.data (499 samples) ]
# perf evlist -v
cycles: size: 120, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD|BRANCH_STACK, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: USER|CALL_STACK|NO_FLAGS|NO_CYCLES
#
No HW_INDEX in attr.branch_sample_type.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pavel Gerasimov <pavel.gerasimov@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vitaly Slobodskoy <vitaly.slobodskoy@intel.com>
Link: http://lore.kernel.org/lkml/20200228163011.19358-3-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The low level index of raw branch records for the most recent branch can
be recorded in a sample with PERF_SAMPLE_BRANCH_HW_INDEX
branch_sample_type. Extend struct branch_stack to support it.
However, if the PERF_SAMPLE_BRANCH_HW_INDEX is not applied, only nr and
entries[] will be output by kernel. The pointer of entries[] could be
wrong, since the output format is different with new struct
branch_stack. Add a variable no_hw_idx in struct perf_sample to
indicate whether the hw_idx is output. Add get_branch_entry() to return
corresponding pointer of entries[0].
To make dummy branch sample consistent as new branch sample, add hw_idx
in struct dummy_branch_stack for cs-etm and intel-pt.
Apply the new struct branch_stack for synthetic events as well.
Extend test case sample-parsing to support new struct branch_stack.
Committer notes:
Renamed get_branch_entries() to perf_sample__branch_entries() to have
proper namespacing and pave the way for this to be moved to libperf,
eventually.
Add 'static' to that inline as it is in a header.
Add 'hw_idx' to 'struct dummy_branch_stack' in cs-etm.c to fix the build
on arm64.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pavel Gerasimov <pavel.gerasimov@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vitaly Slobodskoy <vitaly.slobodskoy@intel.com>
Link: http://lore.kernel.org/lkml/20200228163011.19358-2-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When we put an event with multiple probes, perf-probe fails to delete
with filters. This comes from a failure to list up the event name
because of overwrapping its name.
To fix this issue, skip to list up the event which has same name.
Without this patch:
# perf probe -l \*
probe_perf:map__map_ip (on perf_sample__fprintf_brstackoff:21@
probe_perf:map__map_ip (on perf_sample__fprintf_brstackoff:25@
probe_perf:map__map_ip (on append_inlines:12@util/machine.c in
probe_perf:map__map_ip (on unwind_entry:19@util/machine.c in /
probe_perf:map__map_ip (on map__map_ip@util/map.h in /home/mhi
probe_perf:map__map_ip (on map__map_ip@util/map.h in /home/mhi
# perf probe -d \*
"*" does not hit any event.
Error: Failed to delete events. Reason: No such file or directory (Code: -2)
With it:
# perf probe -d \*
Removed event: probe_perf:map__map_ip
#
Fixes: 72363540c0 ("perf probe: Support multiprobe event")
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Reported-by: He Zhe <zhe.he@windriver.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/158287666197.16697.7514373548551863562.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When I tried to compile tools/perf from the top directory with the -C
option, the O= option didn't work correctly if I passed a relative path:
$ make O=BUILD -C tools/perf/
make: Entering directory '/home/mhiramat/ksrc/linux/tools/perf'
BUILD: Doing 'make -j8' parallel build
../scripts/Makefile.include:4: *** O=/home/mhiramat/ksrc/linux/tools/perf/BUILD does not exist. Stop.
make: *** [Makefile:70: all] Error 2
make: Leaving directory '/home/mhiramat/ksrc/linux/tools/perf'
The O= directory existence check failed because the check script ran in
the build target directory instead of the directory where I ran the make
command.
To fix that, once change directory to $(PWD) and check O= directory,
since the PWD is set to where the make command runs.
Fixes: c883122acc ("perf tools: Let O= makes handle relative paths")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/158351957799.3363.15269768530697526765.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since commit 3b2323c2c1 ("perf bench futex: Use cpumaps") the default
number of threads the benchmark uses got changed from number of online
CPUs to zero:
$ perf bench futex wake
# Running 'futex/wake' benchmark:
Run summary [PID 15930]: blocking on 0 threads (at [private] futex 0x558b8ee4bfac), waking up 1 at a time.
[Run 1]: Wokeup 0 of 0 threads in 0.0000 ms
[...]
[Run 10]: Wokeup 0 of 0 threads in 0.0000 ms
Wokeup 0 of 0 threads in 0.0004 ms (+-40.82%)
Restore the old behavior by grabbing the number of online CPUs via
cpu->nr:
$ perf bench futex wake
# Running 'futex/wake' benchmark:
Run summary [PID 18356]: blocking on 8 threads (at [private] futex 0xb3e62c), waking up 1 at a time.
[Run 1]: Wokeup 8 of 8 threads in 0.0260 ms
[...]
[Run 10]: Wokeup 8 of 8 threads in 0.0270 ms
Wokeup 8 of 8 threads in 0.0419 ms (+-24.35%)
Fixes: 3b2323c2c1 ("perf bench futex: Use cpumaps")
Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lore.kernel.org/lkml/20200305083714.9381-3-tommi.t.rantala@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since glibc 2.28 when running 'perf top --stdio', input handling no
longer works, but hitting any key always just prints the "Mapped keys"
help text.
To fix it, call clearerr() in the display_thread() loop to clear any EOF
sticky errors, as instructed in the glibc NEWS file
(https://sourceware.org/git/?p=glibc.git;a=blob;f=NEWS):
* All stdio functions now treat end-of-file as a sticky condition. If you
read from a file until EOF, and then the file is enlarged by another
process, you must call clearerr or another function with the same effect
(e.g. fseek, rewind) before you can read the additional data. This
corrects a longstanding C99 conformance bug. It is most likely to affect
programs that use stdio to read interactive input from a terminal.
(Bug #1190.)
Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200305083714.9381-2-tommi.t.rantala@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
clang warns:
util/block-info.c:298:18: error: result of comparison against a string
literal is unspecified (use an explicit string comparison function
instead) [-Werror,-Wstring-compare]
if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
^ ~~~~~~~~~~~~~~~
util/block-info.c:298:51: error: result of comparison against a string
literal is unspecified (use an explicit string comparison function
instead) [-Werror,-Wstring-compare]
if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
^ ~~~~~~~~~~~~~~~
util/block-info.c:298:18: error: result of comparison against a string
literal is unspecified (use an explicit string
comparison function instead) [-Werror,-Wstring-compare]
if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
^ ~~~~~~~~~~~~~~~
util/block-info.c:298:51: error: result of comparison against a string
literal is unspecified (use an explicit string comparison function
instead) [-Werror,-Wstring-compare]
if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
^ ~~~~~~~~~~~~~~~
util/map.c:434:15: error: result of comparison against a string literal
is unspecified (use an explicit string comparison function instead)
[-Werror,-Wstring-compare]
if (srcline != SRCLINE_UNKNOWN)
^ ~~~~~~~~~~~~~~~
Reviewer Notes:
Looks good to me. Some more context:
https://clang.llvm.org/docs/DiagnosticsReference.html#wstring-compare
The spec says:
J.1 Unspecified behavior
The following are unspecified:
.. Whether two string literals result in distinct arrays (6.4.5).
Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Keeping <john@metanate.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: clang-built-linux@googlegroups.com
Link: https://github.com/ClangBuiltLinux/linux/issues/900
Link: http://lore.kernel.org/lkml/20200223193456.25291-1-nick.desaulniers@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>