If a process is in a mountns and has symbols in /tmp/perf-<pid>.map,
look first in the namespace using the tgid for the pidns that the
process might be in. If the map isn't found there, try looking in the
mountns where perf is running, and use the tgid that's appropriate for
perf's pid namespace. If all else fails, use the original pid.
This allows us to locate a symbol map file in the mount namespace, if it
was generated there. However, we also try the tool's /tmp in case it's
there instead.
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1499305693-1599-3-git-send-email-kjlx@templeofstupid.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Symbol versioning, as in glibc, results in symbols being defined as:
<real symbol>@[@]<version>
(Note that "@@" identifies a default symbol, if the symbol name is
repeated.)
perf is currently unable to deal with this, and is unable to create user
probes at such symbols:
--
$ nm /lib/powerpc64le-linux-gnu/libpthread.so.0 | grep pthread_create
0000000000008d30 t __pthread_create_2_1
0000000000008d30 T pthread_create@@GLIBC_2.17
$ /usr/bin/sudo perf probe -v -x /lib/powerpc64le-linux-gnu/libpthread.so.0 pthread_create
probe-definition(0): pthread_create
symbol:pthread_create file:(null) line:0 offset:0 return:0 lazy:(null)
0 arguments
Open Debuginfo file: /usr/lib/debug/lib/powerpc64le-linux-gnu/libpthread-2.19.so
Try to find probe point from debuginfo.
Probe point 'pthread_create' not found.
Error: Failed to add events. Reason: No such file or directory (Code: -2)
--
One is not able to specify the fully versioned symbol, either, due to
syntactic conflicts with other uses of "@" by perf:
--
$ /usr/bin/sudo perf probe -v -x /lib/powerpc64le-linux-gnu/libpthread.so.0 pthread_create@@GLIBC_2.17
probe-definition(0): pthread_create@@GLIBC_2.17
Semantic error :SRC@SRC is not allowed.
0 arguments
Error: Command Parse Error. Reason: Invalid argument (Code: -22)
--
This patch ignores versioning for default symbols, thus allowing probes to be
created for these symbols:
--
$ /usr/bin/sudo ./perf probe -x /lib/powerpc64le-linux-gnu/libpthread.so.0 pthread_create
Added new event:
probe_libpthread:pthread_create (on pthread_create in /lib/powerpc64le-linux-gnu/libpthread-2.19.so)
You can now use it in all perf tools, such as:
perf record -e probe_libpthread:pthread_create -aR sleep 1
$ /usr/bin/sudo ./perf record -e probe_libpthread:pthread_create -aR ./test 2
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.052 MB perf.data (2 samples) ]
$ /usr/bin/sudo ./perf script
test 2915 [000] 19124.260729: probe_libpthread:pthread_create: (3fff99248d38)
test 2916 [000] 19124.260962: probe_libpthread:pthread_create: (3fff99248d38)
$ /usr/bin/sudo ./perf probe --del=probe_libpthread:pthread_create
Removed event: probe_libpthread:pthread_create
--
Committer note:
Change the variable storing the result of strlen() to 'int', to fix the build
on debian:experimental-x-mipsel, fedora:24-x-ARC-uClibc, ubuntu:16.04-x-arm,
etc:
util/symbol.c: In function 'symbol__match_symbol_name':
util/symbol.c:422:11: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
if (len < versioning - name)
^
Signed-off-by: Paul A. Clarke <pc@us.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Link: http://lkml.kernel.org/r/c2b18d9c-17f8-9285-4868-f58b6359ccac@us.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Often it is interesting to know how costly a given source line is in
total. Previously, one had to build these sums manually based on all
addresses that pointed to the same source line. This patch introduces
srcline as a sort key, which will do the aggregation for us.
Paired with the recent addition of showing inline frames, this makes
perf report much more useful for many C++ work loads.
The following shows the new feature in action. First, let's show the
status quo output when we sort by address. The result contains many hist
entries that generate the same output:
~~~~~~~~~~~~~~~~
$ perf report --stdio --inline -g address
# Children Self Command Shared Object Symbol
# ........ ........ ............ ................... .........................................
#
99.89% 35.34% cpp-inlining cpp-inlining [.] main
|
|--64.55%--main complex:655
| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
| /usr/include/c++/6.3.1/complex:664 (inline)
| |
| |--60.31%--hypot +20
| | |
| | |--8.52%--__hypot_finite +273
| | |
| | |--7.32%--__hypot_finite +411
...
--35.34%--_start +4194346
__libc_start_main +241
|
|--6.65%--main random.tcc:3326
| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
| /usr/include/c++/6.3.1/bits/random.h:185 (inline)
|
|--2.70%--main random.tcc:3326
| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
| /usr/include/c++/6.3.1/bits/random.h:185 (inline)
|
|--1.69%--main random.tcc:3326
| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
| /usr/include/c++/6.3.1/bits/random.h:185 (inline)
...
~~~~~~~~~~~~~~~~
With this patch and `-g srcline` we instead get the following output:
~~~~~~~~~~~~~~~~
$ perf report --stdio --inline -g srcline
# Children Self Command Shared Object Symbol
# ........ ........ ............ ................... .........................................
#
99.89% 35.34% cpp-inlining cpp-inlining [.] main
|
|--64.55%--main complex:655
| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
| /usr/include/c++/6.3.1/complex:664 (inline)
| |
| |--64.02%--hypot
| | |
| | --59.81%--__hypot_finite
| |
| --0.53%--cabs
|
--35.34%--_start
__libc_start_main
|
|--12.48%--main random.tcc:3326
| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
| /usr/include/c++/6.3.1/bits/random.h:185 (inline)
...
~~~~~~~~~~~~~~~~
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Yao Jin <yao.jin@linux.intel.com>
Link: http://lkml.kernel.org/r/20170318214928.9047-1-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As it will always evaluate to 'true', as reported by clang:
util/map.c:390:36: error: address of array 'map->dso->name' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
if (map && map->dso && (map->dso->name || map->dso->long_name)) {
~~~~~~~~~~^~~~ ~~
util/map.c:393:22: error: address of array 'map->dso->name' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
else if (map->dso->name)
~~ ~~~~~~~~~~^~~~
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-x8cu007cly40kfp8xnpi9kya@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Here is a small patch which tries to fulfill a point in the perf todo
list:
* Make pressing 'V' multiple times to go on cycling thru various
verbosity levels in 'perf top', so that info that is present in
'perf top -v' can be obtained without having to restart the tool
(acme).
After a small grep in the code, the max verbosity level seems 3; so,
we cycle at 4; I did not dare define a MAX_VERBOSE_LEVEL constant.
Signed-off-by: Alexis Berlemont <alexis.berlemont@gmail.com>
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161012214823.14324-2-alexis.berlemont@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The csets:
0ac3348e50 ("perf tools: Recognize hugetlb mapping as anon mapping")
d7e404af11 ("perf record: Mark MAP_HUGETLB when synthesizing mmap events")
Added code conditional on MAP_HUGETLB, to make it build in older systems
where that define wasn't available. Now that we grabbed copies of
uapi/linux/mmap.h to have all those definitions in tools/, use it so
that we can support building the tools for older systems (without the
MAP_HUGETLB define in its libc headers) using new kernels that support
such maps.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/n/tip-wv6oqbfkpxbix4umj2kcfmaz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Warn unmatched function filter correctly instead of warning
"symbol-loading error", since that can be a filter issue.
From the technical point of view, this adds a filter chech in map__load
and if there is a filter, it returns -2 (filter-out), instead of -1
(error), and perf-probe checks it and change message.
E.g. without this fix:
# perf probe -F rt_sp*
no symbols found in [kernel.kallsyms], maybe install a debug package?
Failed to load symbols in kernel
With this fix:
# perf probe -F rt_sp*
no symbols passed the given filter.
Failed to find symbols matched to "rt_sp*"
Error: Failed to show functions.
Reported-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/146885835596.16106.2293540792775552481.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In this patch, the offset of '.text' section is stored into dso
and used here to re-calculate address to objdump.
In most of the cases, executable code is in '.text' section, so the
adjustment made to a symbol in dso__load_sym (using
sym.st_value -= shdr.sh_addr - shdr.sh_offset) should equal to
'sym.st_value -= dso->text_offset'. Therefore, adding text_offset back
get objdump address from symbol address (rip). However, it is not true
for kernel and kernel module since there could be multiple executable
sections with different offset. Exclude kernel for this reason.
After this patch, even dso->adjust_symbols is set to true for shared
objects, map__rip_2objdump() and map__objdump_2mem() would return
correct result, so perf behavior of annotate won't be changed.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1460024671-64774-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When new maps are cloned out of split map they are added into origin
map's group, but their groups pointer is not updated.
This could lead to a segfault, because map->groups is expected to be
always set as reported by Markus:
__map__is_kernel (map=map@entry=0x1abb7a0) at util/map.c:238
238 return __machine__kernel_map(map->groups->machine, map->type) =
(gdb) bt
#0 __map__is_kernel (map=map@entry=0x1abb7a0) at util/map.c:238
#1 0x00000000004393e4 in symbol_filter (map=map@entry=0x1abb7a0, sym=sym@entry
#2 0x00000000004fcd4d in dso__load_sym (dso=dso@entry=0x166dae0, map=map@entry
#3 0x00000000004a64e0 in dso__load (dso=0x166dae0, map=map@entry=0x1abb7a0, fi
#4 0x00000000004b941f in map__load (filter=0x4393c0 <symbol_filter>, map=<opti
#5 map__find_symbol (map=0x1abb7a0, addr=40188, filter=0x4393c0 <symbol_filter
...
Adding __map_groups__insert function to add map into groups together
with map->groups pointer update. It takes no lock as opposed to existing
map_groups__insert, as maps__fixup_overlappings(), where it is being
called, already has the necessary lock held.
Using __map_groups__insert to add new maps after map split.
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20151104140811.GA32664@krava.brq.redhat.com
Fixes: cfc5acd4c8 ("perf top: Filter symbols based on __map__is_kernel(map)")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This has a different model than the 'thread' and 'map' struct lifetimes:
there is not a definitive "don't use this DSO anymore" event, i.e. we may
get many 'struct map' holding references to the '/usr/lib64/libc-2.20.so'
DSO but then at some point some DSO may have no references but we still
don't want to straight away release its resources, because "soon" we may
get a new 'struct map' that needs it and we want to reuse its symtab or
other resources.
So we need some way to garbage collect it when crossing some memory
usage threshold, which is left for anoter patch, for now it is
sufficient to release it when calling dsos__exit(), i.e. when deleting
the whole list as part of deleting the 'struct machine' containing it,
which will leave only referenced objects being used.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/n/tip-majzgz07cm90t2tejrjy4clf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To make it consistent with the other dso lifetime routines.
For instance:
struct dso *vdso__new(struct machine *machine, const char *short_name,
const char *long_name)
Becomes:
struct dso *machine__addnew_vdso(struct machine *machine, const
char *short_name, const char *long_name)
Because:
1) There is no 'struct vdso' for us to have vdso__ prefixed routines.
2) Because it will not really just create a new instance of 'struct
dso', it'll call dso__new() but it will also insert it into the
DSO's list/rbtree, and we have a method name for that: 'addnew',
just like we have dsos__addnew().
3) So it is really a 'struct machine' operation, it is the first
argument, etc.
This way the place where this is used gets consistent:
if (vdso) {
pgoff = 0;
- dso = vdso__dso_findnew(machine, thread);
+ dso = machine__findnew_vdso(machine, thread);
} else
dso = machine__findnew_dso(machine, filename);
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/n/tip-r3w3tvh8exm9xfz3p4tz9qbz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We have pointers to struct map instances in several places, like in the
hist_entry instances, so we need a way to know when we can destroy them,
otherwise we may either keep leaking them or end up referencing deleted
instances.
Start fixing it by reference counting them.
This patch puts the reference count for struct map in place, replacing
direct map__delete() calls with map__put() ones and then grabbing a
reference count when adding it to the maps struct where maps for a
struct thread are kept.
Next we'll grab reference counts when setting pointers to struct map
instances, in places like in the hist_entry code.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-wi19xczk0t2a41r1i2chuio5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch fixes off-by-one errors in the management of maps.
A map is defined by start address and length as implemented by
map__new():
map__init(map, type, start, start + len, pgoff, dso);
map->start = addr;
map->end = end;
Consequently, the actual address range is [start; end[ map->end is the
first byte outside the range.
This patch fixes two bugs where upper bound checking was off-by-one.
In V2, we fix map_groups__fixup_overlappings() some more where
map->start was off-by-one as reported by Jiri.
Signed-off-by: Stephane Eranian <eranian@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20141006083532.GA4850@quad
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>