Attempting to build the perf tool for an arm64 target results in the
following failure:
arch/arm64/util/unwind-libunwind.c: In function 'libunwind__arch_reg_id':
arch/arm64/util/unwind-libunwind.c:77:3: error: implicit declaration of function 'pr_err'
pr_err("unwind: invalid reg id %d\n", regnum);
^
arch/arm64/util/unwind-libunwind.c:77:3: error: nested extern declaration of 'pr_err'
This is due to commit 84f5d36f48 ("perf tools: Move pr_* debug macros
into debug object") moving the pr_* macros into a new header file, but
failing to update architectures other than x86.
This patch adds the missing include, and fixes the build again.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1412076432-22045-1-git-send-email-will.deacon@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With workload that spawns and destroys many threads and processes, it
was found that perf-mem could took a long time to post-process the perf
data after the target workload had completed its operation.
The performance bottleneck was found to be the lookup and insertion of
the new DSO structures (thousands of them in this case).
In a dual-socket Ivy-Bridge E7-4890 v2 machine (30-core, 60-thread), the
perf profile below shows what perf was doing after the profiled AIM7
shared workload completed:
- 83.94% perf libc-2.11.3.so [.] __strcmp_sse42
- __strcmp_sse42
- 99.82% map__new
machine__process_mmap_event
perf_session_deliver_event
perf_session__process_event
__perf_session__process_events
cmd_record
cmd_mem
run_builtin
main
__libc_start_main
- 13.17% perf perf [.] __dsos__findnew
__dsos__findnew
map__new
machine__process_mmap_event
perf_session_deliver_event
perf_session__process_event
__perf_session__process_events
cmd_record
cmd_mem
run_builtin
main
__libc_start_main
So about 97% of CPU times were spent in the map__new() function trying
to insert new DSO entry into the DSO linked list. The whole
post-processing step took about 9 minutes.
The DSO structures are currently searched linearly. So the total
processing time will be proportional to n^2.
To overcome this performance problem, the DSO code is modified to also
put the DSO structures in a RB tree sorted by its long name in
additional to being in a simple linked list. With this change, the
processing time will become proportional to n*log(n) which will be much
quicker for large n. However, the short name will still be searched
using the old linear searching method. With that patch in place, the
same perf-mem post-processing step took less than 30 seconds to
complete.
Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Douglas Hatch <doug.hatch@hp.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Scott J Norton <scott.norton@hp.com>
Link: http://lkml.kernel.org/r/1412098575-27863-3-git-send-email-Waiman.Long@hp.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When given the number of threads to requeue at once by user input,
there's always the risk of this value being larger than the total number
of threads. This doesn't make any sense, and the kernel can easily deal
with such sort of situations, hence no big deal. We should however
prevent bogus output such as:
./perf bench --repeat 2 futex requeue -q 10
Run summary [PID 22210]: Requeuing 4 threads (from [private] 0x99ef3c to 0x99ef38), 10 at a time.
[Run 1]: Requeued 10 of 4 threads in 0.0040 ms
[Run 2]: Requeued 10 of 4 threads in 0.0030 ms
Requeued 10 of 4 threads in 0.0035 ms (+-14.29%)
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Link: http://lkml.kernel.org/r/1412008868-22328-2-git-send-email-dave@stgolabs.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using 'perf trace' for mmap is truncating return values by stripping the
top 32 bits, actually printing only the lower 32 bits.
This was because the ret value was of an 'int' type and not a 'long'
type.
The Problem:
991258501.244 ( 0.004 ms): mmap(len: 40001536, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS, fd: -1) = 0x56691000
991258501.257 ( 0.000 ms): minfault [_int_malloc+0x1038] => //anon@0x7fa056691008 //(d.)
The first line shows an mmap, which succeeds and returns 0x56691000.
However the next line shows a memory access to that virtual memory area,
specifically to 0x7fa056691008. The upper 32 bit is lost due to the
problem mentioned above, and thus mmap's return value didn't have the
upper 0x7fa0.
Tested on 3.17-rc5 from the linus's tree, and the HEAD of tip/master
Signed-off-by: Chang Hyun Park <heartinpiece@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1411736041-8017-1-git-send-email-heartinpiece@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On systems with more than one socket perf stat --per-core would either
segfault or stop before outputting all cores.
The problem was that the output code referenced the id including the
socket number in the higher bits, which is far beyond any per cpu array.
Mask out the socket number before referencing cpus in abs_printout.
I also renamed the variable in nsec_printout to be clear what it is,
even though it doesn't reference cpus.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Stephane Eranian <eranian@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1411591846-32736-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We will use this in perf's evlist class so that it can, at
fdarray__filter() time, to unmap the associated ring buffer.
We may need to have further info associated with each fdarray entry, in
that case we'll make that int array a 'union fdarray_priv' one and put a
pointer there so that users can stash whatever they want there. For now,
an int is enough tho.
v2: Add clarification to the per array entry priv area, as well as make
it a union, which makes usage a bit longer, but if/when we make it
use more space by allowing per entry pointers existing users source
code will not have to be changed, just rebuilt.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/tip-0p00bn83quck3fio3kcs9vca@git.kernel.org
The extensible file description array that grew in the perf_evlist class
can be useful for other tools, as it is not something that only evlists
need, so move it to tools/lib/api/fd to ease sharing it.
v2: Don't use {} like in:
libapi_dirs:
$(QUIET_MKDIR)mkdir -p $(OUTPUT){fs,fd}/
in Makefiles, as it will not work in some systems, as in ubuntu13.10.
v3: Add fd/*.[ch] to LIBAPIKFS_SOURCES (Fix from Jiri Olsa)
v4: Leave the fcntl(fd, O_NONBLOCK) in the evlist layer, remains to
be checked if it is really needed there, but has no place in the
fdarray class (Fix from Jiri Olsa)
v5: Remove evlist details from fdarray grow/filter tests. Improve it a
bit doing more tests about expected internal state.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-kleuni3hckbc3s0lu6yb9x40@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We want to know when the fd went away, like when a monitored thread
exits.
If we do not monitor such events, then the tools will wait forever on
events from a vanished thread, like when running:
$ sleep 5s &
$ perf record -p `pidof sleep`
This builds upon the kernel patch by Jiri Olsa that actually makes a
poll on those file descriptors to return POLLHUP.
It is also needed to change the tools to use
perf_evlist__filter_pollfd() to check if there are remainings fds to
monitor or if all are gone, in which case they will exit the
poll/mmap/read loop.
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-a4fslwspov0bs69nj825hqpq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Do not access kallsyms to show available variables and show source lines
in user binaries.
This behavior always requires the root privilege when sysctl sets
kernel.kptr_restrict=1, but we don't need it just for analyzing user
binaries.
Without this patch (by normal user, kptr_restrict=1):
----
$ perf probe -x ./perf -V add_cmdname
Failed to init vmlinux path.
Error: Failed to show vars.
$ perf probe -x ./perf -L add_cmdname
Failed to init vmlinux path.
Error: Failed to show lines.
----
With this patch:
----
$ perf probe -x ./perf -V add_cmdname
Available variables at add_cmdname
@<perf_unknown_cmd_config+144>
(No matched variables)
@<list_commands_in_dir+160>
(No matched variables)
@<add_cmdname+0>
char* name
size_t len
struct cmdnames* cmds
$ perf probe -x ./perf -L add_cmdname
<add_cmdname@/home/fedora/ksrc/linux-3/tools/perf/util/help.c:0>
0 void add_cmdname(struct cmdnames *cmds, const char *name, size_t len)
1 {
2 struct cmdname *ent = malloc(sizeof(*ent) + len + 1);
4 ent->len = len;
5 memcpy(ent->name, name, len);
6 ent->name[len] = 0;
...
----
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: david lerner <dlernerdroid@gmail.com>
Cc: linux-perf-user@vger.kernel.org
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20140917084054.3722.73975.stgit@kbuild-f20.novalocal
[ Added missing 'bool user' argument to the !DWARF show_line_range() stub ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If a vmlinux is stripped, perf will use it and ignore kallsyms. We
end up with useless profiles where everything maps to a few
runtime symbols:
63.39% swapper [kernel.kallsyms] [k] hcall_real_table
4.90% beam.smp [kernel.kallsyms] [k] hcall_real_table
4.44% beam.smp [kernel.kallsyms] [k] __sched_text_start
3.72% beam.smp [kernel.kallsyms] [k] __run_at_kexec
Detect this case and fallback to using kallsyms. This fixes the issue:
62.81% swapper [kernel.kallsyms] [k] snooze_loop
4.44% beam.smp [kernel.kallsyms] [k] __schedule
0.91% beam.smp [kernel.kallsyms] [k] _switch
0.73% beam.smp [kernel.kallsyms] [k] put_prev_entity
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20140909085929.4a5a81f0@kryten
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Decoding an Intel PT trace of the kernel requires an accurate kernel
object image. This is provided by making a copy of kcore. However the
copy needs to be made under the same conditions as the original
recording, and then it needs to be associated with the perf.data file.
The perf-with-kcore script does that.
The script also checks the permissions on the buildid cache and can be
used to fix them. That is needed for distributions where root does not
have a home directory and consequently writes to the same buildid cache
as the user, resulting in cached files that the user does not have
access to.
Example:
$ ./perf-with-kcore
Usage: perf-with-kcore <perf sub-command> <perf.data directory> [<sub-command options> [ -- <workload>]]
<perf sub-command> can be record, script, report or inject
or: perf-with-kcore fix_buildid_cache_permissions
$ ./perf-with-kcore record pt_uname -e intel_pt// -- uname
Recording
Using /home/ahunter/bin/perf
perf version 3.15.rc3.g4549ba
/home/ahunter/bin/perf record -o pt_uname/perf.data -e intel_pt// -- uname
Linux
[ perf record: Woken up 3 times to write data ]
[ perf record: Captured and wrote 0.023 MB pt_uname/perf.data ]
Copying kcore
[sudo] password for ahunter:
Done
$ tools/perf/perf-with-kcore.sh script pt_uname | head
Using /home/ahunter/bin/perf
perf version 3.15.rc3.g4549ba
/home/ahunter/bin/perf script -i pt_uname/perf.data --kallsyms=pt_uname/kcore_dir/kallsyms
swapper 0 [002] 161533.969666: sched:sched_switch: swapper/2:0 [120] R ==> perf:11316 [120]
:11315 11315 [003] 161533.969704: sched:sched_switch: perf:11315 [120] S ==> swapper/3:0 [120]
:11316 11316 [002] 161533.969783: sched:sched_switch: perf:11316 [120] R ==> migration/2:33 [0]
:33 33 [002] 161533.969791: sched:sched_switch: migration/2:33 [0] S ==> swapper/2:0 [120]
swapper 0 [003] 161533.969792: sched:sched_switch: swapper/3:0 [120] R ==> perf:11316 [120]
:11316 11316 [003] 161533.970062: branches: 0 [unknown] ([unknown]) => ffffffff810532fa native_write_msr_safe ([kernel.kallsyms])
:11316 11316 [003] 161533.970062: branches: ffffffff810532fd native_write_msr_safe ([kernel.kallsyms]) => ffffffff81035b31 pt_config_start ([kernel.kallsyms])
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1406786474-9306-30-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This enables a PMU event to be specified in the form:
pmu//
which is effectively the same as:
pmu/config=0/
This patch is a precursor to defining default config for a PMU.
Further explanation extracted from lkml thread:
Imagine that the 'tsc' term did not exist.
Intel PT trace data would not contain TSC packets, and the decoder would
not know how to decode them.
Then imagine that a new version of the hardware adds 'tsc'.
It is such a useful feature that we want it by default, but older
versions of the tools don't know how to decode it, so the kernel cannot
turn it on by default.
It is similar to why the kernel does not select perf_event_attr.mmap2 by
default.
The kernel doesn't know whether the tool supports it.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1408129739-17368-6-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>